[dpdk-dev] [External] Re: [PATCH v2] app/testpmd: flowgen support ip and udp fields
王志宏
wangzhihong.wzh at bytedance.com
Tue Aug 10 09:57:27 CEST 2021
Thanks for the review, Ferruh :)
On Mon, Aug 9, 2021 at 11:18 PM Ferruh Yigit <ferruh.yigit at intel.com> wrote:
>
> On 8/9/2021 7:52 AM, Zhihong Wang wrote:
> > This patch aims to:
> > 1. Add flexibility by supporting IP & UDP src/dst fields
>
> What is the reason/"use case" of this flexibility?
The purpose is to emulate packet generator behavior.
>
> > 2. Improve multi-core performance by using per-core vars
>
> On multi core this also has a synchronization problem, OK to make it per-core.
> Do you have any observed performance difference, if so how much is it?
Huge difference. One example: with 8-core flowgen -> rxonly, the result is
43 Mpps (per-core vars) vs. 9.3 Mpps (shared var). Of course, the numbers
vary depending on system configuration.
>
> And can you please separate this to its own patch? This can be before ip/udp update.
Will do.
>
> > v2: fix assigning ip header cksum
> >
>
> +1 to update, can you please make it a separate patch?
Sure.
>
> So overall this can be a patchset with 4 patches:
> 1- Fix retry logic (nb_rx -> nb_pkt)
> 2- Use 'rte_ipv4_cksum()' API (instead of static 'ip_sum()')
> 3- Use per-core variable (for 'next_flow')
> 4- Support ip/udp src/dst variety of packets
>
Great summary. Thanks a lot.
> > Signed-off-by: Zhihong Wang <wangzhihong.wzh at bytedance.com>
> > ---
> > app/test-pmd/flowgen.c | 137 +++++++++++++++++++++++++++++++------------------
> > 1 file changed, 86 insertions(+), 51 deletions(-)
> >
>
> <...>
>
> > @@ -185,30 +193,57 @@ pkt_burst_flow_gen(struct fwd_stream *fs)
> > }
> > pkts_burst[nb_pkt] = pkt;
> >
> > - next_flow = (next_flow + 1) % cfg_n_flows;
> > + if (++next_udp_dst < cfg_n_udp_dst)
> > + continue;
> > + next_udp_dst = 0;
> > + if (++next_udp_src < cfg_n_udp_src)
> > + continue;
> > + next_udp_src = 0;
> > + if (++next_ip_dst < cfg_n_ip_dst)
> > + continue;
> > + next_ip_dst = 0;
> > + if (++next_ip_src < cfg_n_ip_src)
> > + continue;
> > + next_ip_src = 0;
>
> What is the logic here, can you please clarify the packet generation logic both
> in a comment here and in the commit log?
It's round-robin field by field. Will add the comments.
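A minimal sketch of the "round-robin field by field" scheme, written as a standalone helper rather than inline in the burst loop. The struct and function names are hypothetical and only for illustration; the patch itself uses plain local variables and the cfg_n_* counts. The indexes behave like an odometer: udp_dst is the fastest-moving digit, and each wrap carries into the next field up to ip_src.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the four flow indexes (not from the patch). */
struct flow_idx {
	uint32_t ip_src, ip_dst, udp_src, udp_dst;
};

/* Hypothetical container for the per-field counts (cfg_n_* in the patch). */
struct flow_cfg {
	uint32_t n_ip_src, n_ip_dst, n_udp_src, n_udp_dst;
};

/* Advance the indexes by one packet; a wrap in one field carries to the next. */
static void
flow_idx_next(struct flow_idx *f, const struct flow_cfg *c)
{
	assert(c->n_ip_src > 0 && c->n_ip_dst > 0 &&
	       c->n_udp_src > 0 && c->n_udp_dst > 0);

	if (++f->udp_dst < c->n_udp_dst)
		return;
	f->udp_dst = 0;
	if (++f->udp_src < c->n_udp_src)
		return;
	f->udp_src = 0;
	if (++f->ip_dst < c->n_ip_dst)
		return;
	f->ip_dst = 0;
	if (++f->ip_src < c->n_ip_src)
		return;
	f->ip_src = 0;
}
```

With counts of 2, 2, 2 and 3 (ip_src, ip_dst, udp_src, udp_dst), the sequence visits all 24 combinations before returning to the all-zero state.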
>
> > }
> >
> > nb_tx = rte_eth_tx_burst(fs->tx_port, fs->tx_queue, pkts_burst, nb_pkt);
> > /*
> > * Retry if necessary
> > */
> > - if (unlikely(nb_tx < nb_rx) && fs->retry_enabled) {
> > + if (unlikely(nb_tx < nb_pkt) && fs->retry_enabled) {
> > retry = 0;
> > - while (nb_tx < nb_rx && retry++ < burst_tx_retry_num) {
> > + while (nb_tx < nb_pkt && retry++ < burst_tx_retry_num) {
> > rte_delay_us(burst_tx_delay_time);
> > nb_tx += rte_eth_tx_burst(fs->tx_port, fs->tx_queue,
> > - &pkts_burst[nb_tx], nb_rx - nb_tx);
> > + &pkts_burst[nb_tx], nb_pkt - nb_tx);
> > }
>
> +1 to this fix, thanks for it. But can you please make a separate patch for
> this, with proper 'Fixes:' tag etc..
Ok.
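A minimal sketch of the retry fix in isolation, to show why the bound has to be nb_pkt: flowgen builds its own packets, so the number to transmit is the number generated, not the number received. The tx_burst_fn callback and fake_tx_burst() are hypothetical stand-ins for rte_eth_tx_burst() so the sketch runs without DPDK, and the rte_delay_us() call between retries is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for rte_eth_tx_burst(): returns packets accepted. */
typedef uint16_t (*tx_burst_fn)(void **pkts, uint16_t n);

/* Send nb_pkt packets, retrying the unsent tail up to max_retry times. */
static uint16_t
tx_with_retry(tx_burst_fn tx, void **pkts, uint16_t nb_pkt, uint32_t max_retry)
{
	uint16_t nb_tx;
	uint32_t retry = 0;

	assert(tx != NULL);
	nb_tx = tx(pkts, nb_pkt);
	/* Compare against nb_pkt (generated), not nb_rx (received). */
	while (nb_tx < nb_pkt && retry++ < max_retry)
		nb_tx += tx(&pkts[nb_tx], (uint16_t)(nb_pkt - nb_tx));
	return nb_tx;
}

/* Fake busy TX queue for illustration: accepts at most 2 packets per call. */
static uint16_t
fake_tx_burst(void **pkts, uint16_t n)
{
	(void)pkts;
	return n < 2 ? n : 2;
}
```

With this fake queue, a burst of 8 needs three retries to drain completely; with retries disabled only the first 2 packets go out.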
>
> > }
> > - fs->tx_packets += nb_tx;
> >
> > inc_tx_burst_stats(fs, nb_tx);
> > - if (unlikely(nb_tx < nb_pkt)) {
> > - /* Back out the flow counter. */
> > - next_flow -= (nb_pkt - nb_tx);
> > - while (next_flow < 0)
> > - next_flow += cfg_n_flows;
> > + fs->tx_packets += nb_tx;
> > + /* Catch up flow idx by actual sent. */
> > + for (i = 0; i < nb_tx; ++i) {
> > + RTE_PER_LCORE(_next_udp_dst) = RTE_PER_LCORE(_next_udp_dst) + 1;
> > + if (RTE_PER_LCORE(_next_udp_dst) < cfg_n_udp_dst)
> > + continue;
> > + RTE_PER_LCORE(_next_udp_dst) = 0;
> > + RTE_PER_LCORE(_next_udp_src) = RTE_PER_LCORE(_next_udp_src) + 1;
> > + if (RTE_PER_LCORE(_next_udp_src) < cfg_n_udp_src)
> > + continue;
> > + RTE_PER_LCORE(_next_udp_src) = 0;
> > + RTE_PER_LCORE(_next_ip_dst) = RTE_PER_LCORE(_next_ip_dst) + 1;
> > + if (RTE_PER_LCORE(_next_ip_dst) < cfg_n_ip_dst)
> > + continue;
> > + RTE_PER_LCORE(_next_ip_dst) = 0;
> > + RTE_PER_LCORE(_next_ip_src) = RTE_PER_LCORE(_next_ip_src) + 1;
> > + if (RTE_PER_LCORE(_next_ip_src) < cfg_n_ip_src)
> > + continue;
> > + RTE_PER_LCORE(_next_ip_src) = 0;
> > + }
>
> Why are per-core variables not used in the forward function, while local
> variables (like 'next_ip_src' etc.) are used? Is it for performance, if so
> what is the impact?
>
> And why not directly assign from local variables to per-core variables,
> instead of having the above catch-up loop?
>
>
Local vars are used for generating pkts; the per-core (global) ones catch up
at the end, once nb_tx is known.
So the flow indexes only advance by the number of pkts actually sent.
It serves the same purpose as the original "/* Back out the flow counter. */".
My math isn't good enough to make it look more intelligent though.
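One possible closed-form alternative to the catch-up loop, as a sketch only and not what the patch does: treat the four indexes as digits of a mixed-radix number, add nb_tx, and split the result back into fields. The struct and function names are hypothetical, and the sketch assumes the product of the four counts fits in 64 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical containers; the patch uses RTE_PER_LCORE vars and cfg_n_*. */
struct flow_state {
	uint32_t ip_src, ip_dst, udp_src, udp_dst;
};

struct flow_limits {
	uint32_t n_ip_src, n_ip_dst, n_udp_src, n_udp_dst;
};

/* Advance the flow indexes by nb_tx packets without a per-packet loop. */
static void
flow_state_advance(struct flow_state *f, const struct flow_limits *c,
		   uint32_t nb_tx)
{
	/* Assumption: the total number of flows does not overflow 64 bits. */
	uint64_t total = (uint64_t)c->n_ip_src * c->n_ip_dst *
			 c->n_udp_src * c->n_udp_dst;
	uint64_t lin;

	assert(total > 0);

	/* Flatten: ip_src is the most significant digit, udp_dst the least. */
	lin = (((uint64_t)f->ip_src * c->n_ip_dst + f->ip_dst) *
	       c->n_udp_src + f->udp_src) * c->n_udp_dst + f->udp_dst;
	lin = (lin + nb_tx) % total;

	/* Split back into fields, least significant digit first. */
	f->udp_dst = (uint32_t)(lin % c->n_udp_dst);
	lin /= c->n_udp_dst;
	f->udp_src = (uint32_t)(lin % c->n_udp_src);
	lin /= c->n_udp_src;
	f->ip_dst = (uint32_t)(lin % c->n_ip_dst);
	lin /= c->n_ip_dst;
	f->ip_src = (uint32_t)lin;
}
```

Whether this is actually faster than the loop is not obvious: it trades up to nb_tx cheap increments for a handful of 64-bit divisions per burst, so it would need measuring.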