[dpdk-dev] [External] Re: [PATCH v2] app/testpmd: flowgen support ip and udp fields

Ferruh Yigit ferruh.yigit at intel.com
Tue Aug 10 11:12:37 CEST 2021


On 8/10/2021 8:57 AM, 王志宏 wrote:
> Thanks for the review Ferruh :)
> 
> On Mon, Aug 9, 2021 at 11:18 PM Ferruh Yigit <ferruh.yigit at intel.com> wrote:
>>
>> On 8/9/2021 7:52 AM, Zhihong Wang wrote:
>>> This patch aims to:
>>>  1. Add flexibility by supporting IP & UDP src/dst fields
>>
>> What is the reason/"use case" of this flexibility?
> 
> The purpose is to emulate pkt generator behaviors.
> 

'flowgen' forwarding already emulates a pkt generator, but it only changed the
destination IP.

What additional benefit does changing the UDP ports of the packets bring? What
is your use case for this change?

>>
>>>  2. Improve multi-core performance by using per-core vars
>>
>> On multi core this also has a synchronization problem, so OK to make it
>> per-core. Do you have any observed performance difference, and if so how
>> much is it?
> 
> Huge difference, one example: 8 core flowgen -> rxonly results: 43
> Mpps (per-core) vs. 9.3 Mpps (shared), of course the numbers "varies
> depending on system configuration".
> 

Thanks for the clarification.

>>
>> And can you please separate this to its own patch? This can be before ip/udp update.
> 
> Will do.
> 
>>
>>> v2: fix assigning ip header cksum
>>>
>>
>> +1 to update, can you please make it a separate patch?
> 
> Sure.
> 
>>
>> So overall this can be a patchset with 4 patches:
>> 1- Fix retry logic (nb_rx -> nb_pkt)
>> 2- Use 'rte_ipv4_cksum()' API (instead of static 'ip_sum()')
>> 3- Use per-core variable (for 'next_flow')
>> 4- Support ip/udp src/dst variety of packets
>>
> 
> Great summary. Thanks a lot.
> 
>>> Signed-off-by: Zhihong Wang <wangzhihong.wzh at bytedance.com>
>>> ---
>>>  app/test-pmd/flowgen.c | 137 +++++++++++++++++++++++++++++++------------------
>>>  1 file changed, 86 insertions(+), 51 deletions(-)
>>>
>>
>> <...>
>>
>>> @@ -185,30 +193,57 @@ pkt_burst_flow_gen(struct fwd_stream *fs)
>>>               }
>>>               pkts_burst[nb_pkt] = pkt;
>>>
>>> -             next_flow = (next_flow + 1) % cfg_n_flows;
>>> +             if (++next_udp_dst < cfg_n_udp_dst)
>>> +                     continue;
>>> +             next_udp_dst = 0;
>>> +             if (++next_udp_src < cfg_n_udp_src)
>>> +                     continue;
>>> +             next_udp_src = 0;
>>> +             if (++next_ip_dst < cfg_n_ip_dst)
>>> +                     continue;
>>> +             next_ip_dst = 0;
>>> +             if (++next_ip_src < cfg_n_ip_src)
>>> +                     continue;
>>> +             next_ip_src = 0;
>>
>> What is the logic here, can you please clarify the packet generation logic both
>> in a comment here and in the commit log?
> 
> It's round-robin field by field. Will add the comments.
> 

Thanks. If the receiving end is doing RSS based on IP address, the dst address
will change every 100 packets and the src address every 10000 packets. This is
a slight behavior change.

When only the dst ip changed, it was simple to just increment it; I am not sure
that still holds in this case. I wonder if we should set all fields randomly for
each packet. I don't know what the better logic is here, we can discuss it more
in the next version.
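
For illustration only (not code from the patch): the field-by-field round-robin
described above behaves like an odometer, where the innermost field changes on
every packet and each outer field changes only when the one before it wraps.
The struct names and the per-field counts used below are hypothetical; they are
not the patch's cfg_n_* values.

```c
#include <stdint.h>

/* Hypothetical names: one counter per header field, innermost first. */
struct flow_idx {
	uint32_t udp_dst;
	uint32_t udp_src;
	uint32_t ip_dst;
	uint32_t ip_src;
};

/* Hypothetical stand-ins for the per-field value counts. */
struct flow_cfg {
	uint32_t n_udp_dst;
	uint32_t n_udp_src;
	uint32_t n_ip_dst;
	uint32_t n_ip_src;
};

/* Advance by one packet: bump the innermost field, carry outward on wrap. */
static void
flow_idx_next(struct flow_idx *idx, const struct flow_cfg *cfg)
{
	if (++idx->udp_dst < cfg->n_udp_dst)
		return;
	idx->udp_dst = 0;
	if (++idx->udp_src < cfg->n_udp_src)
		return;
	idx->udp_src = 0;
	if (++idx->ip_dst < cfg->n_ip_dst)
		return;
	idx->ip_dst = 0;
	if (++idx->ip_src < cfg->n_ip_src)
		return;
	idx->ip_src = 0;
}
```

With, say, 10 values for each UDP port field, the dst IP index changes once
every 10 * 10 = 100 packets, and with 100 dst IP values the src IP index
changes once every 10000 packets. An RSS hash computed only on IP addresses
would therefore see long runs of identical input.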

>>
>>>       }
>>>
>>>       nb_tx = rte_eth_tx_burst(fs->tx_port, fs->tx_queue, pkts_burst, nb_pkt);
>>>       /*
>>>        * Retry if necessary
>>>        */
>>> -     if (unlikely(nb_tx < nb_rx) && fs->retry_enabled) {
>>> +     if (unlikely(nb_tx < nb_pkt) && fs->retry_enabled) {
>>>               retry = 0;
>>> -             while (nb_tx < nb_rx && retry++ < burst_tx_retry_num) {
>>> +             while (nb_tx < nb_pkt && retry++ < burst_tx_retry_num) {
>>>                       rte_delay_us(burst_tx_delay_time);
>>>                       nb_tx += rte_eth_tx_burst(fs->tx_port, fs->tx_queue,
>>> -                                     &pkts_burst[nb_tx], nb_rx - nb_tx);
>>> +                                     &pkts_burst[nb_tx], nb_pkt - nb_tx);
>>>               }
>>
>> +1 to this fix, thanks for it. But can you please make a separate patch for
>> this, with a proper 'Fixes:' tag etc.
> 
> Ok.
> 
>>
>>>       }
>>> -     fs->tx_packets += nb_tx;
>>>
>>>       inc_tx_burst_stats(fs, nb_tx);
>>> -     if (unlikely(nb_tx < nb_pkt)) {
>>> -             /* Back out the flow counter. */
>>> -             next_flow -= (nb_pkt - nb_tx);
>>> -             while (next_flow < 0)
>>> -                     next_flow += cfg_n_flows;
>>> +     fs->tx_packets += nb_tx;
>>> +     /* Catch up flow idx by actual sent. */
>>> +     for (i = 0; i < nb_tx; ++i) {
>>> +             RTE_PER_LCORE(_next_udp_dst) = RTE_PER_LCORE(_next_udp_dst) + 1;
>>> +             if (RTE_PER_LCORE(_next_udp_dst) < cfg_n_udp_dst)
>>> +                     continue;
>>> +             RTE_PER_LCORE(_next_udp_dst) = 0;
>>> +             RTE_PER_LCORE(_next_udp_src) = RTE_PER_LCORE(_next_udp_src) + 1;
>>> +             if (RTE_PER_LCORE(_next_udp_src) < cfg_n_udp_src)
>>> +                     continue;
>>> +             RTE_PER_LCORE(_next_udp_src) = 0;
>>> +             RTE_PER_LCORE(_next_ip_dst) = RTE_PER_LCORE(_next_ip_dst) + 1;
>>> +             if (RTE_PER_LCORE(_next_ip_dst) < cfg_n_ip_dst)
>>> +                     continue;
>>> +             RTE_PER_LCORE(_next_ip_dst) = 0;
>>> +             RTE_PER_LCORE(_next_ip_src) = RTE_PER_LCORE(_next_ip_src) + 1;
>>> +             if (RTE_PER_LCORE(_next_ip_src) < cfg_n_ip_src)
>>> +                     continue;
>>> +             RTE_PER_LCORE(_next_ip_src) = 0;
>>> +     }
>>
>> Why are the per-core variables not used in the forward function, while local
>> variables (like 'next_ip_src' etc.) are used instead? Is it for performance,
>> and if so what is the impact?
>>
>> And why not assign directly from the local variables to the per-core
>> variables, instead of having the above catch up loop?
>>
>>
> 
> Local vars are for generating pkts, global ones catch up finally when
> nb_tx is clear.

Why are you not using the global ones to generate packets? Wouldn't that remove
the need for the catch up?

> So flow indexes only increase by actual sent pkt number.
> It serves the same purpose of the original "/* backout the flow counter */".
> My math isn't good enough to make it look more intelligent though.
> 

Maybe I am missing something, but for this case why not just assign back from
the locals to the globals?
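
For illustration only (not code from the patch): if the local indexes have
already advanced by nb_pkt while only nb_tx packets were sent, copying them back
would skip the unsent flows. One possible way to move the saved indexes by
exactly nb_tx without a per-packet loop is a mixed-radix addition with carry.
All names below are hypothetical, and the sketch assumes every count is at
least 1 and every index is already below its count.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names: current index per field, innermost first. */
struct flow_pos {
	uint32_t udp_dst;
	uint32_t udp_src;
	uint32_t ip_dst;
	uint32_t ip_src;
};

/* Hypothetical stand-ins for the per-field value counts. */
struct flow_lim {
	uint32_t n_udp_dst;
	uint32_t n_udp_src;
	uint32_t n_ip_dst;
	uint32_t n_ip_src;
};

/* Add n to one digit of the counter; return the carry for the next digit. */
static uint32_t
digit_add(uint32_t *digit, uint32_t base, uint32_t n)
{
	uint64_t sum = (uint64_t)*digit + n;

	*digit = (uint32_t)(sum % base);
	return (uint32_t)(sum / base);
}

/* Advance the whole index by n packets, e.g. n = nb_tx after a burst. */
static void
flow_pos_advance(struct flow_pos *pos, const struct flow_lim *lim, uint32_t n)
{
	assert(lim->n_udp_dst && lim->n_udp_src &&
	       lim->n_ip_dst && lim->n_ip_src);
	n = digit_add(&pos->udp_dst, lim->n_udp_dst, n);
	n = digit_add(&pos->udp_src, lim->n_udp_src, n);
	n = digit_add(&pos->ip_dst, lim->n_ip_dst, n);
	/* The outermost field simply wraps; its carry is dropped. */
	(void)digit_add(&pos->ip_src, lim->n_ip_src, n);
}
```

Under these assumptions, the forward function could generate packets from a
local copy of the indexes and then call flow_pos_advance() on the saved copy
with nb_tx once the transmit count is known. Whether the divisions cost less
than the loop for typical burst sizes would need to be measured.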

