[dpdk-dev] [PATCH v9 7/9] net/virtio: add vectorized packed ring Tx path

Liu, Yong yong.liu at intel.com
Fri Apr 24 15:33:45 CEST 2020



> -----Original Message-----
> From: Maxime Coquelin <maxime.coquelin at redhat.com>
> Sent: Friday, April 24, 2020 8:30 PM
> To: Liu, Yong <yong.liu at intel.com>; Ye, Xiaolong <xiaolong.ye at intel.com>;
> Wang, Zhihong <zhihong.wang at intel.com>
> Cc: dev at dpdk.org; Van Haaren, Harry <harry.van.haaren at intel.com>
> Subject: Re: [PATCH v9 7/9] net/virtio: add vectorized packed ring Tx path
> 
> 
> 
> On 4/24/20 11:24 AM, Marvin Liu wrote:
> > Optimize packed ring Tx path alike Rx path. Split Tx path into batch and
> 
> s/alike/like/ ?
> 
> > single Tx functions. Batch function is further optimized by AVX512
> > instructions.
> >
> > Signed-off-by: Marvin Liu <yong.liu at intel.com>
> >
> > diff --git a/drivers/net/virtio/virtio_ethdev.h
> b/drivers/net/virtio/virtio_ethdev.h
> > index 5c112cac7..b7d52d497 100644
> > --- a/drivers/net/virtio/virtio_ethdev.h
> > +++ b/drivers/net/virtio/virtio_ethdev.h
> > @@ -108,6 +108,9 @@ uint16_t virtio_recv_pkts_vec(void *rx_queue,
> struct rte_mbuf **rx_pkts,
> >  uint16_t virtio_recv_pkts_packed_vec(void *rx_queue, struct rte_mbuf
> **rx_pkts,
> >  		uint16_t nb_pkts);
> >
> > +uint16_t virtio_xmit_pkts_packed_vec(void *tx_queue, struct rte_mbuf
> **tx_pkts,
> > +		uint16_t nb_pkts);
> > +
> >  int eth_virtio_dev_init(struct rte_eth_dev *eth_dev);
> >
> >  void virtio_interrupt_handler(void *param);
> > diff --git a/drivers/net/virtio/virtio_rxtx.c b/drivers/net/virtio/virtio_rxtx.c
> > index cf18fe564..f82fe8d64 100644
> > --- a/drivers/net/virtio/virtio_rxtx.c
> > +++ b/drivers/net/virtio/virtio_rxtx.c
> > @@ -2175,3 +2175,11 @@ virtio_recv_pkts_packed_vec(void *rx_queue
> __rte_unused,
> >  {
> >  	return 0;
> >  }
> > +
> > +__rte_weak uint16_t
> > +virtio_xmit_pkts_packed_vec(void *tx_queue __rte_unused,
> > +			    struct rte_mbuf **tx_pkts __rte_unused,
> > +			    uint16_t nb_pkts __rte_unused)
> > +{
> > +	return 0;
> > +}
> > diff --git a/drivers/net/virtio/virtio_rxtx_packed_avx.c
> b/drivers/net/virtio/virtio_rxtx_packed_avx.c
> > index 8a7b459eb..c023ace4e 100644
> > --- a/drivers/net/virtio/virtio_rxtx_packed_avx.c
> > +++ b/drivers/net/virtio/virtio_rxtx_packed_avx.c
> > @@ -23,6 +23,24 @@
> >  #define PACKED_FLAGS_MASK ((0ULL |
> VRING_PACKED_DESC_F_AVAIL_USED) << \
> >  	FLAGS_BITS_OFFSET)
> >
> > +/* reference count offset in mbuf rearm data */
> > +#define REFCNT_BITS_OFFSET ((offsetof(struct rte_mbuf, refcnt) - \
> > +	offsetof(struct rte_mbuf, rearm_data)) * BYTE_SIZE)
> > +/* segment number offset in mbuf rearm data */
> > +#define SEG_NUM_BITS_OFFSET ((offsetof(struct rte_mbuf, nb_segs) - \
> > +	offsetof(struct rte_mbuf, rearm_data)) * BYTE_SIZE)
> > +
> > +/* default rearm data */
> > +#define DEFAULT_REARM_DATA (1ULL << SEG_NUM_BITS_OFFSET | \
> > +	1ULL << REFCNT_BITS_OFFSET)
> > +
> > +/* id bits offset in packed ring desc higher 64bits */
> > +#define ID_BITS_OFFSET ((offsetof(struct vring_packed_desc, id) - \
> > +	offsetof(struct vring_packed_desc, len)) * BYTE_SIZE)
> > +
> > +/* net hdr short size mask */
> > +#define NET_HDR_MASK 0x3F
> > +
> >  #define PACKED_BATCH_SIZE (RTE_CACHE_LINE_SIZE / \
> >  	sizeof(struct vring_packed_desc))
> >  #define PACKED_BATCH_MASK (PACKED_BATCH_SIZE - 1)
> > @@ -47,6 +65,48 @@
> >  	for (iter = val; iter < num; iter++)
> >  #endif
> >
> > +static inline void
> > +virtio_xmit_cleanup_packed_vec(struct virtqueue *vq)
> > +{
> > +	struct vring_packed_desc *desc = vq->vq_packed.ring.desc;
> > +	struct vq_desc_extra *dxp;
> > +	uint16_t used_idx, id, curr_id, free_cnt = 0;
> > +	uint16_t size = vq->vq_nentries;
> > +	struct rte_mbuf *mbufs[size];
> > +	uint16_t nb_mbuf = 0, i;
> > +
> > +	used_idx = vq->vq_used_cons_idx;
> > +
> > +	if (!desc_is_used(&desc[used_idx], vq))
> > +		return;
> > +
> > +	id = desc[used_idx].id;
> > +
> > +	do {
> > +		curr_id = used_idx;
> > +		dxp = &vq->vq_descx[used_idx];
> > +		used_idx += dxp->ndescs;
> > +		free_cnt += dxp->ndescs;
> > +
> > +		if (dxp->cookie != NULL) {
> > +			mbufs[nb_mbuf] = dxp->cookie;
> > +			dxp->cookie = NULL;
> > +			nb_mbuf++;
> > +		}
> > +
> > +		if (used_idx >= size) {
> > +			used_idx -= size;
> > +			vq->vq_packed.used_wrap_counter ^= 1;
> > +		}
> > +	} while (curr_id != id);
> > +
> > +	for (i = 0; i < nb_mbuf; i++)
> > +		rte_pktmbuf_free(mbufs[i]);
> > +
> > +	vq->vq_used_cons_idx = used_idx;
> > +	vq->vq_free_cnt += free_cnt;
> > +}
> > +
> 
> 
> I think you can re-use the inlined non-vectorized cleanup function here.
> Or use your implementation in non-vectorized path.
> BTW, do you know why we have to pass the num argument in the
> non-vectorized case? I don't remember.
> 

Maxime,
This is a simplified version of the xmit cleanup function. It is based on the assumption that the backend updates used ids in bursts, which also matches the frontend's requirement.
I just found that the original version works better in the loopback case. Will adapt it in the next version.

Thanks,
Marvin

> Maxime


