[dpdk-dev] [External] Re: [PATCH v2] app/testpmd: flowgen support ip and udp fields

王志宏 wangzhihong.wzh at bytedance.com
Thu Aug 12 11:32:52 CEST 2021


On Wed, Aug 11, 2021 at 6:31 PM Ferruh Yigit <ferruh.yigit at intel.com> wrote:
>
> On 8/11/2021 3:48 AM, 王志宏 wrote:
> > On Tue, Aug 10, 2021 at 5:12 PM Ferruh Yigit <ferruh.yigit at intel.com> wrote:
> >>
> >> On 8/10/2021 8:57 AM, 王志宏 wrote:
> >>> Thanks for the review Ferruh :)
> >>>
> >>> On Mon, Aug 9, 2021 at 11:18 PM Ferruh Yigit <ferruh.yigit at intel.com> wrote:
> >>>>
> >>>> On 8/9/2021 7:52 AM, Zhihong Wang wrote:
> >>>>> This patch aims to:
> >>>>>  1. Add flexibility by supporting IP & UDP src/dst fields
> >>>>
> >>>> What is the reason/"use case" of this flexibility?
> >>>
> >>> The purpose is to emulate pkt generator behaviors.
> >>>
> >>
> >> 'flowgen' forwarding already emulates a pkt generator, but it was only
> >> changing the destination IP.
> >>
> >> What additional benefit does changing the UDP ports of the packets bring? What is
> >> your use case for this change?
> >
> > Pkt generators like pktgen/trex/ixia/spirent can change various fields
> > including ip/udp src/dst.
> >
>
> But testpmd is not a packet generator; it has a very simple 'flowgen' forwarding
> engine, and I would like to understand the motivation for making it more complex.

I agree with this *simplicity* point. In fact, my sole intention is to make
flowgen usable for multi-core tests. I'll keep the original setup in
the next patch.

>
> > Keeping the cfg_n_* while setting cfg_n_ip_dst = 1024 and others = 1
> > makes the default behavior exactly unchanged. Do you think it makes
> > sense?
> >
> >>
> >>>>
> >>>>>  2. Improve multi-core performance by using per-core vars
> >>>>
> >>>> On multi-core this also has a synchronization problem, so OK to make it per-core. Do
> >>>> you have any observed performance difference? If so, how much is it?
> >>>
> >>> Huge difference. One example: 8-core flowgen -> rxonly results: 43
> >>> Mpps (per-core vars) vs. 9.3 Mpps (shared vars). Of course the numbers "vary
> >>> depending on system configuration".
> >>>
> >>
> >> Thanks for clarification.
> >>
> >>>>
> >>>> And can you please separate this into its own patch? This can go before the ip/udp update.
> >>>
> >>> Will do.
> >>>
> >>>>
> >>>>> v2: fix assigning ip header cksum
> >>>>>
> >>>>
> >>>> +1 to the update; can you please make it a separate patch?
> >>>
> >>> Sure.
> >>>
> >>>>
> >>>> So overall this can be a patchset with 4 patches:
> >>>> 1- Fix retry logic (nb_rx -> nb_pkt)
> >>>> 2- Use 'rte_ipv4_cksum()' API (instead of static 'ip_sum()')
> >>>> 3- Use per-core variable (for 'next_flow')
> >>>> 4- Support ip/udp src/dst variety of packets
> >>>>
> >>>
> >>> Great summary. Thanks a lot.
> >>>
> >>>>> Signed-off-by: Zhihong Wang <wangzhihong.wzh at bytedance.com>
> >>>>> ---
> >>>>>  app/test-pmd/flowgen.c | 137 +++++++++++++++++++++++++++++++------------------
> >>>>>  1 file changed, 86 insertions(+), 51 deletions(-)
> >>>>>
> >>>>
> >>>> <...>
> >>>>
> >>>>> @@ -185,30 +193,57 @@ pkt_burst_flow_gen(struct fwd_stream *fs)
> >>>>>               }
> >>>>>               pkts_burst[nb_pkt] = pkt;
> >>>>>
> >>>>> -             next_flow = (next_flow + 1) % cfg_n_flows;
> >>>>> +             if (++next_udp_dst < cfg_n_udp_dst)
> >>>>> +                     continue;
> >>>>> +             next_udp_dst = 0;
> >>>>> +             if (++next_udp_src < cfg_n_udp_src)
> >>>>> +                     continue;
> >>>>> +             next_udp_src = 0;
> >>>>> +             if (++next_ip_dst < cfg_n_ip_dst)
> >>>>> +                     continue;
> >>>>> +             next_ip_dst = 0;
> >>>>> +             if (++next_ip_src < cfg_n_ip_src)
> >>>>> +                     continue;
> >>>>> +             next_ip_src = 0;
> >>>>
> >>>> What is the logic here? Can you please clarify the packet generation logic both
> >>>> in a comment here and in the commit log?
> >>>
> >>> It's round-robin field by field. Will add the comments.
> >>>
> >>
> >> Thanks. If the receiving end is doing RSS based on IP address, the dst address will
> >> change every 100 packets and the src will change every 10000 packets. This is
> >> a slight behavior change.
> >>
> >> When it was only dst ip, it was simple to just increment it, not sure about it
> >> in this case. I wonder if we should set all randomly for each packet. I don't
> >> know what is the better logic here, we can discuss it more in the next version.
> >
> > A more sophisticated pkt generator provides various options among
> > "step-by-step" / "random" / etc.
> >
> > But supporting multiple fields naturally brings this implicitly. It
> > won't be a problem as it can be configured by setting the cfg_n_* as
> > we discussed above.
> >
> > I think rte_rand() is a good option; anyway, this can be tweaked easily
> > once the framework takes shape.
> >
>
> Can be done, but do we really want to add more packet generator capability to
> testpmd?
>
> >>
> >>>>
> >>>>>       }
> >>>>>
> >>>>>       nb_tx = rte_eth_tx_burst(fs->tx_port, fs->tx_queue, pkts_burst, nb_pkt);
> >>>>>       /*
> >>>>>        * Retry if necessary
> >>>>>        */
> >>>>> -     if (unlikely(nb_tx < nb_rx) && fs->retry_enabled) {
> >>>>> +     if (unlikely(nb_tx < nb_pkt) && fs->retry_enabled) {
> >>>>>               retry = 0;
> >>>>> -             while (nb_tx < nb_rx && retry++ < burst_tx_retry_num) {
> >>>>> +             while (nb_tx < nb_pkt && retry++ < burst_tx_retry_num) {
> >>>>>                       rte_delay_us(burst_tx_delay_time);
> >>>>>                       nb_tx += rte_eth_tx_burst(fs->tx_port, fs->tx_queue,
> >>>>> -                                     &pkts_burst[nb_tx], nb_rx - nb_tx);
> >>>>> +                                     &pkts_burst[nb_tx], nb_pkt - nb_tx);
> >>>>>               }
> >>>>
> >>>> +1 to this fix, thanks for it. But can you please make a separate patch for
> >>>> this, with a proper 'Fixes:' tag etc.?
> >>>
> >>> Ok.
> >>>
> >>>>
> >>>>>       }
> >>>>> -     fs->tx_packets += nb_tx;
> >>>>>
> >>>>>       inc_tx_burst_stats(fs, nb_tx);
> >>>>> -     if (unlikely(nb_tx < nb_pkt)) {
> >>>>> -             /* Back out the flow counter. */
> >>>>> -             next_flow -= (nb_pkt - nb_tx);
> >>>>> -             while (next_flow < 0)
> >>>>> -                     next_flow += cfg_n_flows;
> >>>>> +     fs->tx_packets += nb_tx;
> >>>>> +     /* Catch up flow idx by actual sent. */
> >>>>> +     for (i = 0; i < nb_tx; ++i) {
> >>>>> +             RTE_PER_LCORE(_next_udp_dst) = RTE_PER_LCORE(_next_udp_dst) + 1;
> >>>>> +             if (RTE_PER_LCORE(_next_udp_dst) < cfg_n_udp_dst)
> >>>>> +                     continue;
> >>>>> +             RTE_PER_LCORE(_next_udp_dst) = 0;
> >>>>> +             RTE_PER_LCORE(_next_udp_src) = RTE_PER_LCORE(_next_udp_src) + 1;
> >>>>> +             if (RTE_PER_LCORE(_next_udp_src) < cfg_n_udp_src)
> >>>>> +                     continue;
> >>>>> +             RTE_PER_LCORE(_next_udp_src) = 0;
> >>>>> +             RTE_PER_LCORE(_next_ip_dst) = RTE_PER_LCORE(_next_ip_dst) + 1;
> >>>>> +             if (RTE_PER_LCORE(_next_ip_dst) < cfg_n_ip_dst)
> >>>>> +                     continue;
> >>>>> +             RTE_PER_LCORE(_next_ip_dst) = 0;
> >>>>> +             RTE_PER_LCORE(_next_ip_src) = RTE_PER_LCORE(_next_ip_src) + 1;
> >>>>> +             if (RTE_PER_LCORE(_next_ip_src) < cfg_n_ip_src)
> >>>>> +                     continue;
> >>>>> +             RTE_PER_LCORE(_next_ip_src) = 0;
> >>>>> +     }
> >>>>
> >>>> Why are per-core variables not used in the forward function, while local variables
> >>>> (like 'next_ip_src' etc.) are? Is it for performance? If so, what is the
> >>>> impact?
> >>>>
> >>>> And why not assign directly from the local variables to the per-core variables,
> >>>> instead of having the catch-up loop above?
> >>>>
> >>>>
> >>>
> >>> Local vars are for generating pkts; the global ones catch up at the end, once
> >>> nb_tx is known.
> >>
> >> Why are you not using the global ones to generate packets? Wouldn't this remove
> >> the need for the catch-up?
> >
> > When there are multiple fields, backing out the overrun index caused by
> > dropped packets is not that straightforward -- it's the "carry" issue
> > in addition.
> >
> >>
> >>> So flow indexes only increase by actual sent pkt number.
> >>> It serves the same purpose of the original "/* backout the flow counter */".
> >>> My math isn't good enough to make it look more intelligent though.
> >>>
> >>
> >> Maybe I am missing something, for this case why not just assign back from locals
> >> to globals?
> >
> > As above.
> >
> > However, this can be simplified if we discard the "back out"
> > mechanism. Say we generate 32 pkts and send 20 of them while the remaining
> > 12 are dropped: the only difference is whether the idx starts from 21 or 33
> > next time.
> >
>
> I am not sure about the point of "back out". I think we can remove it unless there
> is an objection, so the receiving end can recognize failed packets.
>
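For reference, a minimal sketch of the "carry" issue discussed above, which may help for the next version. It treats the four indexes as digits of a mixed-radix counter, with udp_dst as the fastest digit, matching the order in the patch. The struct and function names (flow_idx, flow_cfg, flow_step, flow_advance) are hypothetical and not part of testpmd, and it assumes every count is at least 1 and that the product of the four counts fits in 64 bits. flow_step() is the per-packet round-robin; flow_advance() moves the index forward by the number of packets actually sent in one go, which is one possible replacement for the catch-up loop.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the next_* indexes in the patch. */
struct flow_idx {
	uint32_t udp_dst, udp_src, ip_dst, ip_src;
};

/* Hypothetical stand-ins for the cfg_n_* counts; each must be >= 1. */
struct flow_cfg {
	uint32_t n_udp_dst, n_udp_src, n_ip_dst, n_ip_src;
};

/* Advance by one packet: increment the fastest field, carry on wrap. */
static void
flow_step(struct flow_idx *f, const struct flow_cfg *c)
{
	if (++f->udp_dst < c->n_udp_dst)
		return;
	f->udp_dst = 0;
	if (++f->udp_src < c->n_udp_src)
		return;
	f->udp_src = 0;
	if (++f->ip_dst < c->n_ip_dst)
		return;
	f->ip_dst = 0;
	if (++f->ip_src < c->n_ip_src)
		return;
	f->ip_src = 0;
}

/* Advance by n packets without looping: flatten, add, split again. */
static void
flow_advance(struct flow_idx *f, const struct flow_cfg *c, uint32_t n)
{
	/* Assumes the product of the four counts fits in 64 bits. */
	uint64_t total = (uint64_t)c->n_udp_dst * c->n_udp_src *
			 c->n_ip_dst * c->n_ip_src;
	uint64_t flat;

	assert(total > 0);

	/* Flatten the four digits into a single index in [0, total). */
	flat = f->ip_src;
	flat = flat * c->n_ip_dst + f->ip_dst;
	flat = flat * c->n_udp_src + f->udp_src;
	flat = flat * c->n_udp_dst + f->udp_dst;

	flat = (flat + n) % total;

	/* Split the flat index back into digits, fastest field first. */
	f->udp_dst = (uint32_t)(flat % c->n_udp_dst);
	flat /= c->n_udp_dst;
	f->udp_src = (uint32_t)(flat % c->n_udp_src);
	flat /= c->n_udp_src;
	f->ip_dst = (uint32_t)(flat % c->n_ip_dst);
	flat /= c->n_ip_dst;
	f->ip_src = (uint32_t)flat;
}
```

With something like this, a burst that generates 32 pkts but sends only 20 could call flow_advance(&idx, &cfg, 20) on the per-core index. If the "back out" is dropped as suggested, the local index after generation can simply be stored back instead.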

