[dpdk-dev] [PATCH v2 06/15] app/testpmd: add common fwd wrapper function
Xueming(Steven) Li
xuemingl at nvidia.com
Wed Aug 18 16:08:53 CEST 2021
> -----Original Message-----
> From: Jerin Jacob <jerinjacobk at gmail.com>
> Sent: Wednesday, August 18, 2021 7:48 PM
> To: Xueming(Steven) Li <xuemingl at nvidia.com>
> Cc: Jack Min <jackmin at nvidia.com>; dpdk-dev <dev at dpdk.org>; Xiaoyun Li <xiaoyun.li at intel.com>
> Subject: Re: [dpdk-dev] [PATCH v2 06/15] app/testpmd: add common fwd wrapper function
>
> On Wed, Aug 18, 2021 at 4:57 PM Xueming(Steven) Li <xuemingl at nvidia.com> wrote:
> >
> >
> >
> > > -----Original Message-----
> > > From: Jerin Jacob <jerinjacobk at gmail.com>
> > > Sent: Tuesday, August 17, 2021 5:37 PM
> > > To: Xueming(Steven) Li <xuemingl at nvidia.com>
> > > Cc: Jack Min <jackmin at nvidia.com>; dpdk-dev <dev at dpdk.org>; Xiaoyun
> > > Li <xiaoyun.li at intel.com>
> > > Subject: Re: [dpdk-dev] [PATCH v2 06/15] app/testpmd: add common fwd
> > > wrapper function
> > >
> > > On Wed, Aug 11, 2021 at 7:35 PM Xueming Li <xuemingl at nvidia.com> wrote:
> > > >
> > > > From: Xiaoyu Min <jackmin at nvidia.com>
> > > >
> > > > Added a common inline wrapper function for all fwd engines,
> > > > which does the following common steps:
> > > >
> > > > 1. get_start_cycles
> > > > 2. rte_eth_rx_burst(..., nb_pkt_per_burst)
> > > > 3. if rxq_share, do forward_shared_rxq(), otherwise do fwd directly
> > > > 4. get_end_cycles
> > > >
> > > > Signed-off-by: Xiaoyu Min <jackmin at nvidia.com>
> > > > ---
> > > > app/test-pmd/testpmd.h | 24 ++++++++++++++++++++++++
> > > > 1 file changed, 24 insertions(+)
> > > >
> > > > diff --git a/app/test-pmd/testpmd.h b/app/test-pmd/testpmd.h
> > > > index 13141dfed9..b685ac48d6 100644
> > > > --- a/app/test-pmd/testpmd.h
> > > > +++ b/app/test-pmd/testpmd.h
> > > > @@ -1022,6 +1022,30 @@ void add_tx_dynf_callback(portid_t portid);
> > > >  void remove_tx_dynf_callback(portid_t portid);
> > > >  int update_jumbo_frame_offload(portid_t portid);
> > > >
> > > > +static inline void
> > > > +do_burst_fwd(struct fwd_stream *fs, packet_fwd_cb fwd)
> > > > +{
> > > > + struct rte_mbuf *pkts_burst[MAX_PKT_BURST];
> > > > + uint16_t nb_rx;
> > > > + uint64_t start_tsc = 0;
> > > > +
> > > > + get_start_cycles(&start_tsc);
> > > > +
> > > > + /*
> > > > + * Receive a burst of packets and forward them.
> > > > + */
> > > > + nb_rx = rte_eth_rx_burst(fs->rx_port, fs->rx_queue,
> > > > + pkts_burst, nb_pkt_per_burst);
> > > > + inc_rx_burst_stats(fs, nb_rx);
> > > > + if (unlikely(nb_rx == 0))
> > > > + return;
> > > > + if (unlikely(rxq_share > 0))
> > >
> > > See below. It reads a global memory.
> > >
> > > > + forward_shared_rxq(fs, nb_rx, pkts_burst, fwd);
> > > > + else
> > > > + (*fwd)(fs, nb_rx, pkts_burst);
> > >
> > > New function pointer in fastpath.
> > >
> > > IMO, We should not create performance regression for the existing forward engine.
> > > Can we have a new forward engine just for shared memory testing?
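To make the fast-path concern above concrete, here is a minimal standalone sketch (the struct and function names are hypothetical stand-ins, not the real testpmd definitions): the wrapper reaches the engine body through a function pointer, which generally blocks inlining, and adds a per-burst read of a global, while a dedicated engine can call directly.

```c
#include <assert.h>

struct fwd_stream { int rx_port; }; /* hypothetical stand-in */

/* Global read in the fast path, as in the patch under discussion. */
static int rxq_share;

/* Hypothetical engine body: returns how many packets it "forwarded". */
static int engine_fwd(struct fwd_stream *fs, int nb_rx)
{
	(void)fs;
	return nb_rx;
}

/* Wrapper style under review: a runtime branch on a global plus an
 * indirect call per burst; the compiler generally cannot inline fwd. */
static int do_burst_fwd(struct fwd_stream *fs, int nb_rx,
			int (*fwd)(struct fwd_stream *, int))
{
	if (rxq_share > 0)
		return 0; /* would take the forward_shared_rxq() path instead */
	return (*fwd)(fs, nb_rx); /* indirect call */
}

/* Dedicated-engine alternative: a direct, inlinable call and no branch. */
static int do_burst_fwd_direct(struct fwd_stream *fs, int nb_rx)
{
	return engine_fwd(fs, nb_rx);
}
```

With a dedicated engine the compiler can inline engine_fwd into the burst loop; through the pointer it usually cannot, which is the regression risk being weighed against code reuse here.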
> >
> > Yes, I'm fully aware of the performance concern; the global could be defined around record_core_cycles to minimize the impact.
> > Based on test data, the impact is almost invisible in legacy mode.
>
> Are you saying there is zero % regression? If not, could you share the data?
Almost zero. Here is a quick single-core result for rxonly:
32.2 Mpps, 58.9 cycles/packet
After reverting the patch in rxonly.c:
32.1 Mpps, 59.9 cycles/packet
That result didn't make sense; I realized the batch mbuf free was missing, so I applied it:
32.2 Mpps, 58.9 cycles/packet
There were small fluctuations between testpmd restarts, so I picked the best run of each.
The results are almost the same; the per-packet cost of the wrapper seems small enough.
BTW, I'm testing with the default burst size and queue depth.
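As a back-of-the-envelope cross-check of the figures above (the two constants are copied from the reported numbers; everything else is illustrative arithmetic): cycles/packet times packets/second gives the effective core frequency the counters imply, and from that one can estimate how much throughput a single extra cycle per packet would cost.

```c
#include <assert.h>

/* Reported single-core rxonly figures from the discussion above. */
#define RXONLY_MPPS           32.2 /* millions of packets per second */
#define RXONLY_CYCLES_PER_PKT 58.9 /* reported per-packet cost */

/* cycles/packet * packets/second = cycles/second: the effective core
 * frequency implied by the counters, in GHz (~1.9 GHz here). */
static double implied_ghz(double mpps, double cycles_per_pkt)
{
	return cycles_per_pkt * mpps / 1000.0;
}

/* Throughput lost, in Mpps, if each packet costs one extra cycle at the
 * same core frequency (~0.5 Mpps at this rate). */
static double mpps_lost_per_cycle(double mpps, double cycles_per_pkt)
{
	double ghz = implied_ghz(mpps, cycles_per_pkt);

	return ghz * 1000.0 / cycles_per_pkt
	     - ghz * 1000.0 / (cycles_per_pkt + 1.0);
}
```

So at this packet rate even a one-cycle-per-packet regression would be on the order of half a Mpps, which bounds how visible the wrapper's overhead can be in these runs.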
>
> >
> > From a test perspective, it's better to have all forward engines able to verify shared rxq; the test team wants to run the regression with minimal extra impact. I hope to find a solution that utilizes all forwarding engines
> > seamlessly.
>
> Yes, it's a good goal. testpmd forwarding performance is used as a synthetic benchmark by everyone.
> I think we are aligned on not having any regression for the generic forward engines.
>
> >
> > >
> > > > +	get_end_cycles(fs, start_tsc);
> > > > +}
> > > > +
> > > > /*
> > > > * Work-around of a compilation error with ICC on invocations of the
> > > > * rte_be_to_cpu_16() function.
> > > > --
> > > > 2.25.1
> > > >