[dpdk-dev] [PATCH] app/testpmd: improve MAC swap performance

Zhang, Qi Z qi.z.zhang at intel.com
Wed Nov 21 22:24:26 CET 2018



> -----Original Message-----
> From: Ananyev, Konstantin
> Sent: Tuesday, November 20, 2018 2:54 PM
> To: Zhang, Qi Z <qi.z.zhang at intel.com>; Richardson, Bruce
> <bruce.richardson at intel.com>; Wiles, Keith <keith.wiles at intel.com>
> Cc: dev at dpdk.org; Lu, Wenzhuo <wenzhuo.lu at intel.com>; Iremonger, Bernard
> <bernard.iremonger at intel.com>; stable at dpdk.org
> Subject: RE: [dpdk-dev] [PATCH] app/testpmd: improve MAC swap performance
> 
> 
> 
> > -----Original Message-----
> > From: Ananyev, Konstantin
> > Sent: Tuesday, November 20, 2018 5:26 PM
> > To: Zhang, Qi Z <qi.z.zhang at intel.com>; Richardson, Bruce
> > <bruce.richardson at intel.com>; Wiles, Keith <keith.wiles at intel.com>
> > Cc: dev at dpdk.org; Lu, Wenzhuo <wenzhuo.lu at intel.com>; Iremonger,
> > Bernard <bernard.iremonger at intel.com>; stable at dpdk.org
> > Subject: RE: [dpdk-dev] [PATCH] app/testpmd: improve MAC swap
> > performance
> >
> >
> >
> > > -----Original Message-----
> > > From: Zhang, Qi Z
> > > Sent: Tuesday, November 20, 2018 4:58 PM
> > > To: Ananyev, Konstantin <konstantin.ananyev at intel.com>; Richardson,
> > > Bruce <bruce.richardson at intel.com>; Wiles, Keith
> > > <keith.wiles at intel.com>
> > > Cc: dev at dpdk.org; Lu, Wenzhuo <wenzhuo.lu at intel.com>; Iremonger,
> > > Bernard <bernard.iremonger at intel.com>; stable at dpdk.org
> > > Subject: RE: [dpdk-dev] [PATCH] app/testpmd: improve MAC swap
> > > performance
> > >
> > >
> > >
> > > > -----Original Message-----
> > > > From: Ananyev, Konstantin
> > > > Sent: Tuesday, November 20, 2018 1:17 AM
> > > > To: Zhang, Qi Z <qi.z.zhang at intel.com>; Richardson, Bruce
> > > > <bruce.richardson at intel.com>; Wiles, Keith <keith.wiles at intel.com>
> > > > Cc: dev at dpdk.org; Lu, Wenzhuo <wenzhuo.lu at intel.com>; Iremonger,
> > > > Bernard <bernard.iremonger at intel.com>; Zhang, Qi Z
> > > > <qi.z.zhang at intel.com>; stable at dpdk.org
> > > > Subject: RE: [dpdk-dev] [PATCH] app/testpmd: improve MAC swap
> > > > performance
> > > >
> > > > Hi Qi,
> > > >
> > > > > -----Original Message-----
> > > > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Qi Zhang
> > > > > Sent: Tuesday, November 20, 2018 4:46 AM
> > > > > To: Richardson, Bruce <bruce.richardson at intel.com>; Wiles, Keith
> > > > > <keith.wiles at intel.com>
> > > > > Cc: dev at dpdk.org; Lu, Wenzhuo <wenzhuo.lu at intel.com>; Iremonger,
> > > > > Bernard <bernard.iremonger at intel.com>; Zhang, Qi Z
> > > > > <qi.z.zhang at intel.com>; stable at dpdk.org
> > > > > Subject: [dpdk-dev] [PATCH] app/testpmd: improve MAC swap
> > > > > performance
> > > > >
> > > > > > The patch optimizes the MAC swap operation by taking advantage
> > > > > > of SSE instructions; it only impacts x86 platforms.
> > > > >
> > > > > Cc: stable at dpdk.org
> > > > >
> > > > > Signed-off-by: Qi Zhang <qi.z.zhang at intel.com>
> > > > > ---
> > > > >  app/test-pmd/macswap.c | 16 +++++++++++++++-
> > > > >  1 file changed, 15 insertions(+), 1 deletion(-)
> > > > >
> > > > > diff --git a/app/test-pmd/macswap.c b/app/test-pmd/macswap.c
> > > > > > index a8384d5b8..0722782b0 100644
> > > > > --- a/app/test-pmd/macswap.c
> > > > > +++ b/app/test-pmd/macswap.c
> > > > > @@ -78,7 +78,6 @@ pkt_burst_mac_swap(struct fwd_stream *fs)
> > > > >  	struct rte_port  *txp;
> > > > >  	struct rte_mbuf  *mb;
> > > > >  	struct ether_hdr *eth_hdr;
> > > > > -	struct ether_addr addr;
> > > > >  	uint16_t nb_rx;
> > > > >  	uint16_t nb_tx;
> > > > >  	uint16_t i;
> > > > > @@ -95,6 +94,15 @@ pkt_burst_mac_swap(struct fwd_stream *fs)
> > > > >  	start_tsc = rte_rdtsc();
> > > > >  #endif
> > > > >
> > > > > +#ifdef RTE_ARCH_X86
> > > > > +	__m128i addr;
> > > > > +	__m128i shfl_msk = _mm_set_epi8(15, 14, 13, 12,
> > > > > +					5, 4, 3, 2,
> > > > > +					1, 0, 11, 10,
> > > > > +					9, 8, 7, 6);
> > > > > +#else
> > > > > +	struct ether_addr addr;
> > > > > +#endif
> > > >
> > > > I think it would be better to place IA specific code into a separate
> > > > function (and probably into a separate .h file).
> > >
> > > OK, I will think about how to rework this.
> >
> > Ideally it would be good to have a generic one and an IA optimized version.
> >
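
On the generic vs. IA optimized split, a minimal sketch of how it could look (not the reworked patch itself). The names mac_swap_generic, mac_swap_x86, mac_swap and MAC_ADDR_LEN are hypothetical, the header is treated as a plain byte pointer instead of a struct ether_hdr, and the unaligned 16-byte load/store plus the GCC/Clang target attribute are assumptions of this sketch. Only the shuffle mask is taken from the patch.

```c
#include <stdint.h>
#include <string.h>

#define MAC_ADDR_LEN 6 /* hypothetical local constant */

/* Generic version: swap destination (bytes 0-5) and source (bytes 6-11)
 * MAC addresses through a small temporary. */
static inline void
mac_swap_generic(uint8_t *hdr)
{
	uint8_t tmp[MAC_ADDR_LEN];

	memcpy(tmp, hdr, MAC_ADDR_LEN);
	memcpy(hdr, hdr + MAC_ADDR_LEN, MAC_ADDR_LEN);
	memcpy(hdr + MAC_ADDR_LEN, tmp, MAC_ADDR_LEN);
}

#if defined(__x86_64__) || defined(__i386__)
#include <tmmintrin.h> /* SSSE3: _mm_shuffle_epi8 */

/* x86 version: one 16-byte load, one byte shuffle, one 16-byte store.
 * Assumption: hdr[0..15] is valid memory (16 bytes are read and written). */
__attribute__((target("ssse3")))
static inline void
mac_swap_x86(uint8_t *hdr)
{
	/* Result byte i takes source byte mask[i]; arguments are listed
	 * from byte 15 down to byte 0, so bytes 0-5 <- 6-11 and 6-11 <- 0-5. */
	const __m128i shfl_msk = _mm_set_epi8(15, 14, 13, 12,
					      5, 4, 3, 2,
					      1, 0, 11, 10,
					      9, 8, 7, 6);
	__m128i addr = _mm_loadu_si128((const __m128i *)hdr);

	addr = _mm_shuffle_epi8(addr, shfl_msk);
	_mm_storeu_si128((__m128i *)hdr, addr);
}
#define mac_swap(hdr) mac_swap_x86(hdr)
#else
#define mac_swap(hdr) mac_swap_generic(hdr)
#endif
```

Note that the x86 variant touches 16 bytes, i.e. 2 bytes past the 14-byte Ethernet header, so it relies on the packet buffer holding data there; bytes 12-15 are written back unchanged.
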
> > >
> > > > BTW, just curious what % of improvement it gives?
> > >
> > > > So far, the only server I can test on is a 1.6GHz Broadwell server with
> > > > 2 ports on 1 i40e 25G.
> > > > The macswap performance increased from 16.8mpps to 20mpps (about
> > > > 19% improvement).

I need to add a note here: I found that the previous test was running on a CPU from the remote socket.
When the test runs on a CPU from the local socket of the same server, the MAC swap performance actually improves from 23.34mpps to 26.36mpps. That is about a 12.9% increase, which is still considerable.

> >
> > Quite a lot, definitely looks like worth it.
> 
> You probably can squeeze a few more cycles by doing it in bulks of 4 or so.

It's a good idea. In my test I can get more than a 4% increase by batching 4 packets:
it can reach 27.46mpps, so now it's a 17.7% increase in total. I will send a patch later, please help to polish :)
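
Roughly, the batching idea can be sketched as below. This is only an illustration under stated assumptions, not the patch to be sent: mac_swap_one and mac_swap_burst are hypothetical names, packets are represented as plain byte pointers to the Ethernet header rather than mbufs, and the per-packet swap is the generic copy-based one.

```c
#include <stdint.h>
#include <string.h>

#define MAC_ADDR_LEN 6 /* hypothetical local constant */

/* Swap destination and source MAC addresses of one Ethernet header. */
static inline void
mac_swap_one(uint8_t *hdr)
{
	uint8_t tmp[MAC_ADDR_LEN];

	memcpy(tmp, hdr, MAC_ADDR_LEN);
	memcpy(hdr, hdr + MAC_ADDR_LEN, MAC_ADDR_LEN);
	memcpy(hdr + MAC_ADDR_LEN, tmp, MAC_ADDR_LEN);
}

/* Swap MACs for a burst of nb headers, four per loop iteration. */
static void
mac_swap_burst(uint8_t **hdrs, uint16_t nb)
{
	uint16_t i = 0;

	/* Main loop: four independent swaps per iteration, which gives the
	 * compiler and CPU more room to overlap loads and stores. */
	for (; i + 4 <= nb; i += 4) {
		mac_swap_one(hdrs[i]);
		mac_swap_one(hdrs[i + 1]);
		mac_swap_one(hdrs[i + 2]);
		mac_swap_one(hdrs[i + 3]);
	}

	/* Tail: handle the remaining 0-3 headers one by one. */
	for (; i < nb; i++)
		mac_swap_one(hdrs[i]);
}
```
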

Thanks
Qi

> Konstantin


