[dpdk-dev] [PATCH v4 4/5] net/ice: fix vector rx burst for ice

Zhang, Qi Z qi.z.zhang at intel.com
Thu Sep 17 13:03:19 CEST 2020



> -----Original Message-----
> From: Guo, Jia <jia.guo at intel.com>
> Sent: Thursday, September 17, 2020 3:59 PM
> To: Yang, Qiming <qiming.yang at intel.com>; Xing, Beilei
> <beilei.xing at intel.com>; Zhang, Qi Z <qi.z.zhang at intel.com>; Wu, Jingjing
> <jingjing.wu at intel.com>; Wang, Haiyue <haiyue.wang at intel.com>
> Cc: Zhao1, Wei <wei.zhao1 at intel.com>; Richardson, Bruce
> <bruce.richardson at intel.com>; dev at dpdk.org; Guo, Jia <jia.guo at intel.com>;
> Zhang, Helin <helin.zhang at intel.com>; mb at smartsharesystems.com; Yigit,
> Ferruh <ferruh.yigit at intel.com>; stephen at networkplumber.org;
> barbette at kth.se; Han, YingyaX <yingyax.han at intel.com>
> Subject: [PATCH v4 4/5] net/ice: fix vector rx burst for ice
> 
> The limitation of burst size in vector rx was removed, since it should retrieve as
> much received packets as possible. And also the scattered receive path should
> use a wrapper function to achieve the goal of burst maximizing. And do some
> code cleaning for vector rx path.
> 
> Bugzilla ID: 516
> Fixes: c68a52b8b38c ("net/ice: support vector SSE in Rx")
> Fixes: ae60d3c9b227 ("net/ice: support Rx AVX2 vector")
> 
> Signed-off-by: Jeff Guo <jia.guo at intel.com>
> Tested-by: Yingya Han <yingyax.han at intel.com>
> ---
>  drivers/net/ice/ice_rxtx.h          |  1 +
>  drivers/net/ice/ice_rxtx_vec_avx2.c | 23 ++++++------
> drivers/net/ice/ice_rxtx_vec_sse.c  | 56 +++++++++++++++++++----------
>  3 files changed, 49 insertions(+), 31 deletions(-)
> 
> diff --git a/drivers/net/ice/ice_rxtx.h b/drivers/net/ice/ice_rxtx.h index
> 2fdcfb7d0..3ef5f300d 100644
> --- a/drivers/net/ice/ice_rxtx.h
> +++ b/drivers/net/ice/ice_rxtx.h
> @@ -35,6 +35,7 @@
>  #define ICE_MAX_RX_BURST            ICE_RXQ_REARM_THRESH
>  #define ICE_TX_MAX_FREE_BUF_SZ      64
>  #define ICE_DESCS_PER_LOOP          4
> +#define ICE_DESCS_PER_LOOP_AVX	    8

No need to expose this if no external link, better to keep all avx stuff inside avx.c

> 
>  #define ICE_FDIR_PKT_LEN	512
> 
> diff --git a/drivers/net/ice/ice_rxtx_vec_avx2.c
> b/drivers/net/ice/ice_rxtx_vec_avx2.c
> index be50677c2..843e4f32a 100644
> --- a/drivers/net/ice/ice_rxtx_vec_avx2.c
> +++ b/drivers/net/ice/ice_rxtx_vec_avx2.c
> @@ -29,7 +29,7 @@ ice_rxq_rearm(struct ice_rx_queue *rxq)
>  			__m128i dma_addr0;
> 
>  			dma_addr0 = _mm_setzero_si128();
> -			for (i = 0; i < ICE_DESCS_PER_LOOP; i++) {
> +			for (i = 0; i < ICE_DESCS_PER_LOOP_AVX; i++) {
>  				rxep[i].mbuf = &rxq->fake_mbuf;
>  				_mm_store_si128((__m128i *)&rxdp[i].read,
>  						dma_addr0);
> @@ -132,12 +132,17 @@ ice_rxq_rearm(struct ice_rx_queue *rxq)
>  	ICE_PCI_REG_WRITE(rxq->qrx_tail, rx_id);  }
> 
> +/**
> + * vPMD raw receive routine, only accept(nb_pkts >=
> +ICE_DESCS_PER_LOOP_AVX)
> + *
> + * Notice:
> + * - nb_pkts < ICE_DESCS_PER_LOOP_AVX, just return no packet
> + * - floor align nb_pkts to a ICE_DESCS_PER_LOOP_AVX power-of-two  */

The comment is misleading, it looks like we are going to floor align nb_pkts to 2^8, better to reword .

>  static inline uint16_t
>  _ice_recv_raw_pkts_vec_avx2(struct ice_rx_queue *rxq, struct rte_mbuf
> **rx_pkts,
>  			    uint16_t nb_pkts, uint8_t *split_packet)  { -#define
> ICE_DESCS_PER_LOOP_AVX 8
> -
>  	const uint32_t *ptype_tbl = rxq->vsi->adapter->ptype_tbl;
>  	const __m256i mbuf_init = _mm256_set_epi64x(0, 0,
>  			0, rxq->mbuf_initializer);
> @@ -603,10 +608,6 @@ _ice_recv_raw_pkts_vec_avx2(struct ice_rx_queue
> *rxq, struct rte_mbuf **rx_pkts,
>  	return received;
>  }
> 
> -/*
> - * Notice:
> - * - nb_pkts < ICE_DESCS_PER_LOOP, just return no packet
> - */
>  uint16_t
>  ice_recv_pkts_vec_avx2(void *rx_queue, struct rte_mbuf **rx_pkts,
>  		       uint16_t nb_pkts)
> @@ -616,8 +617,6 @@ ice_recv_pkts_vec_avx2(void *rx_queue, struct
> rte_mbuf **rx_pkts,
> 
>  /**
>   * vPMD receive routine that reassembles single burst of 32 scattered
> packets
> - * Notice:
> - * - nb_pkts < ICE_DESCS_PER_LOOP, just return no packet
>   */

Why we need to remove this? is it still true for this function?

>  static uint16_t
>  ice_recv_scattered_burst_vec_avx2(void *rx_queue, struct rte_mbuf
> **rx_pkts, @@ -626,6 +625,9 @@ ice_recv_scattered_burst_vec_avx2(void
> *rx_queue, struct rte_mbuf **rx_pkts,
>  	struct ice_rx_queue *rxq = rx_queue;
>  	uint8_t split_flags[ICE_VPMD_RX_BURST] = {0};
> 
> +	/* split_flags only can support max of ICE_VPMD_RX_BURST */
> +	nb_pkts = RTE_MIN(nb_pkts, ICE_VPMD_RX_BURST);

Is this necessary?  the only consumer of this function is ice_recv_scattered_pkts_vec_avx2, 
I think nb_pkts <= ICE_VPMD_RX_BURST it already be guaranteed.
> +
>  	/* get some new buffers */
>  	uint16_t nb_bufs = _ice_recv_raw_pkts_vec_avx2(rxq, rx_pkts, nb_pkts,
>  						       split_flags);
> @@ -657,9 +659,6 @@ ice_recv_scattered_burst_vec_avx2(void *rx_queue,
> struct rte_mbuf **rx_pkts,
> 
>  /**
>   * vPMD receive routine that reassembles scattered packets.
> - * Main receive routine that can handle arbitrary burst sizes
> - * Notice:
> - * - nb_pkts < ICE_DESCS_PER_LOOP, just return no packet
>   */

Why we need to remove this? isn't it the main routine that be able to handle arbitrary burst size?

Btw, I will suggest all AVX2 changes can be in a separate patch, because this looks like some code clean and fix.
its not related with the main purpose of the patch set.


More information about the dev mailing list