[dpdk-dev] [PATCH] net/i40e: add additional prefetch instructions for bulk rx

Vladyslav Buslov Vladyslav.Buslov at harmonicinc.com
Mon Oct 10 19:05:30 CEST 2016


> -----Original Message-----
> From: Wu, Jingjing [mailto:jingjing.wu at intel.com]
> Sent: Monday, October 10, 2016 4:26 PM
> To: Yigit, Ferruh; Vladyslav Buslov; Zhang, Helin
> Cc: dev at dpdk.org
> Subject: RE: [dpdk-dev] [PATCH] net/i40e: add additional prefetch
> instructions for bulk rx
> 
> 
> 
> > -----Original Message-----
> > From: Yigit, Ferruh
> > Sent: Wednesday, September 14, 2016 9:25 PM
> > To: Vladyslav Buslov <vladyslav.buslov at harmonicinc.com>; Zhang, Helin
> > <helin.zhang at intel.com>; Wu, Jingjing <jingjing.wu at intel.com>
> > Cc: dev at dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH] net/i40e: add additional prefetch
> > instructions for bulk rx
> >
> > On 7/14/2016 6:27 PM, Vladyslav Buslov wrote:
> > > Added prefetch of first packet payload cacheline in
> > > i40e_rx_scan_hw_ring Added prefetch of second mbuf cacheline in
> > > i40e_rx_alloc_bufs
> > >
> > > Signed-off-by: Vladyslav Buslov <vladyslav.buslov at harmonicinc.com>
> > > ---
> > >  drivers/net/i40e/i40e_rxtx.c | 7 +++++--
> > >  1 file changed, 5 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/drivers/net/i40e/i40e_rxtx.c
> > > b/drivers/net/i40e/i40e_rxtx.c index d3cfb98..e493fb4 100644
> > > --- a/drivers/net/i40e/i40e_rxtx.c
> > > +++ b/drivers/net/i40e/i40e_rxtx.c
> > > @@ -1003,6 +1003,7 @@ i40e_rx_scan_hw_ring(struct i40e_rx_queue
> *rxq)
> > >                 /* Translate descriptor info to mbuf parameters */
> > >                 for (j = 0; j < nb_dd; j++) {
> > >                         mb = rxep[j].mbuf;
> > > +                       rte_prefetch0(RTE_PTR_ADD(mb->buf_addr,
> > RTE_PKTMBUF_HEADROOM));
> 
> Why prefetch here? I think if the application needs to deal with the packet, it is
> more suitable to put the prefetch in the application.
> 
> > >                         qword1 = rte_le_to_cpu_64(\
> > >                                 rxdp[j].wb.qword1.status_error_len);
> > >                         pkt_len = ((qword1 &
> > I40E_RXD_QW1_LENGTH_PBUF_MASK) >>
> > > @@ -1086,9 +1087,11 @@ i40e_rx_alloc_bufs(struct i40e_rx_queue
> *rxq)
> > >
> > >         rxdp = &rxq->rx_ring[alloc_idx];
> > >         for (i = 0; i < rxq->rx_free_thresh; i++) {
> > > -               if (likely(i < (rxq->rx_free_thresh - 1)))
> > > +               if (likely(i < (rxq->rx_free_thresh - 1))) {
> > >                         /* Prefetch next mbuf */
> > > -                       rte_prefetch0(rxep[i + 1].mbuf);
> > > +                       rte_prefetch0(&rxep[i + 1].mbuf->cacheline0);
> > > +                       rte_prefetch0(&rxep[i + 1].mbuf->cacheline1);
> > > +               }
> I agree with this change. However, when I tested it with testpmd in iofwd mode, no
> performance increase was observed, only a minor decrease.
> Can you share with us when it will benefit performance in your scenario?
> 
> 
> Thanks
> Jingjing

Hello Jingjing,

Thanks for the code review.

My use case: we have a simple distributor thread that receives packets from a port and distributes them among worker threads according to a hash of the VLAN and MAC address.
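
For illustration, the distribution step could look roughly like the sketch below. It is plain C, not DPDK code. The names pick_worker and fnv1a, the use of the destination MAC, the single 802.1Q tag, and the FNV-1a hash are all assumptions made for the example and are not taken from the application described here.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: FNV-1a over a byte range, continuing from state h. */
static uint32_t fnv1a(uint32_t h, const uint8_t *p, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		h ^= p[i];
		h *= 16777619u;
	}
	return h;
}

/*
 * Pick a worker index for one Ethernet frame.
 * Assumptions: the destination MAC is hashed (the mail does not say which
 * MAC is used), the VLAN ID comes from a single 802.1Q tag (TPID 0x8100),
 * and untagged or too-short frames are treated as VLAN 0 / worker 0.
 */
static unsigned pick_worker(const uint8_t *frame, size_t len, unsigned nb_workers)
{
	uint16_t vlan_id = 0;

	if (len < 14 || nb_workers == 0)
		return 0;
	if (len >= 18 && frame[12] == 0x81 && frame[13] == 0x00)
		vlan_id = (uint16_t)(((frame[14] << 8) | frame[15]) & 0x0FFF);

	uint8_t v[2] = { (uint8_t)(vlan_id >> 8), (uint8_t)vlan_id };
	uint32_t h = fnv1a(2166136261u, frame, 6); /* destination MAC */
	h = fnv1a(h, v, 2);                        /* VLAN ID */
	return h % nb_workers;
}
```

The first read of frame[] is the access that shows up as a cache miss when the packet data has not been prefetched.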

While working on performance optimization we determined that most of the CPU usage of this thread is in DPDK.
As an optimization we decided to switch to the rx burst alloc function; however, that caused additional performance degradation compared to scatter rx mode.
In the profiler the two major culprits were:
  1. Access to the packet data (Eth header) in application code. (cache miss)
  2. Setting the next packet descriptor field to NULL in the DPDK i40e_rx_alloc_bufs code. (this field is in the second descriptor cache line, which was not prefetched)
After applying my fixes, performance improved compared to scatter rx mode.
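
The second culprit can be sketched outside DPDK as follows. struct fake_mbuf is a hypothetical two-cache-line stand-in, not rte_mbuf, and reset_batch is an invented name. The 64-byte line size and the GCC/Clang __builtin_prefetch builtin and aligned attribute are assumptions of the example.

```c
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE 64 /* assumed cache line size */

/* Hypothetical descriptor spanning two cache lines; not the real rte_mbuf. */
struct fake_mbuf {
	void *buf_addr;                                   /* first cache line */
	uint8_t pad0[CACHE_LINE - sizeof(void *)];
	struct fake_mbuf *next;                           /* second cache line */
	uint8_t pad1[CACHE_LINE - sizeof(struct fake_mbuf *)];
} __attribute__((aligned(CACHE_LINE)));

/* Clear the next pointer of every descriptor in the batch. */
static void reset_batch(struct fake_mbuf **bufs, unsigned n)
{
	for (unsigned i = 0; i < n; i++) {
		if (i + 1 < n) {
			/* Prefetch both lines of the following descriptor for writing. */
			__builtin_prefetch(&bufs[i + 1]->buf_addr, 1, 3);
			__builtin_prefetch(&bufs[i + 1]->next, 1, 3);
		}
		bufs[i]->next = NULL; /* this store lands in the second line */
	}
}
```

Prefetching only the start of the descriptor would leave the store to next waiting on the second line, which matches what the profiler pointed at.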

I assumed that the prefetch of the first cache line of packet data belongs in DPDK because it is already done in scatter rx mode (in i40e_recv_scattered_pkts).
It can be moved to the application side, but IMO it is better to be consistent across all rx modes.
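
If the prefetch were moved to the application side instead, the receive loop might look roughly like this. Again this is plain C with hypothetical names (struct pkt, process_burst); HEADROOM is an assumed constant standing in for RTE_PKTMBUF_HEADROOM, and __builtin_prefetch is the GCC/Clang builtin rather than rte_prefetch0.

```c
#include <stddef.h>
#include <stdint.h>

#define HEADROOM 128 /* assumed headroom before packet data */

/* Hypothetical minimal packet handle; not the real rte_mbuf. */
struct pkt {
	uint8_t *buf_addr;
};

/* Walk a burst, prefetching the next packet's first data line one step ahead. */
static uint64_t process_burst(struct pkt **pkts, unsigned n)
{
	uint64_t sum = 0;

	for (unsigned i = 0; i < n; i++) {
		if (i + 1 < n)
			__builtin_prefetch(pkts[i + 1]->buf_addr + HEADROOM, 0, 3);

		const uint8_t *eth = pkts[i]->buf_addr + HEADROOM;
		sum += eth[0]; /* stand-in for real Ethernet header parsing */
	}
	return sum;
}
```

One limitation of this placement is that the first packet of each burst is never prefetched by the loop, whereas a prefetch issued in the driver while descriptors are translated covers every packet.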

Regards,
Vladyslav

