[dpdk-dev] [PATCH v1 2/2] vhost: introduce async enqueue for split ring
Fu, Patrick
patrick.fu at intel.com
Thu Jun 18 13:36:43 CEST 2020
Hi,
> -----Original Message-----
> From: Liu, Yong <yong.liu at intel.com>
> Sent: Thursday, June 18, 2020 2:57 PM
> To: Fu, Patrick <patrick.fu at intel.com>
> Cc: Fu, Patrick <patrick.fu at intel.com>; Jiang, Cheng1
> <cheng1.jiang at intel.com>; Liang, Cunming <cunming.liang at intel.com>;
> dev at dpdk.org; maxime.coquelin at redhat.com; Xia, Chenbo
> <chenbo.xia at intel.com>; Wang, Zhihong <zhihong.wang at intel.com>; Ye,
> Xiaolong <xiaolong.ye at intel.com>
> Subject: RE: [dpdk-dev] [PATCH v1 2/2] vhost: introduce async enqueue for
> split ring
>
> Thanks, Patrick. Some comments are inline.
>
> >
> > From: Patrick <patrick.fu at intel.com>
> >
> > This patch implement async enqueue data path for split ring.
> >
> > Signed-off-by: Patrick <patrick.fu at intel.com>
> > ---
> > lib/librte_vhost/rte_vhost_async.h | 38 +++
> > lib/librte_vhost/virtio_net.c | 538 ++++++++++++++++++++++++++++++++++++-
> > 2 files changed, 574 insertions(+), 2 deletions(-)
> >
> > diff --git a/lib/librte_vhost/rte_vhost_async.h
> > b/lib/librte_vhost/rte_vhost_async.h
> > index 82f2ebe..efcba0a 100644
> > --- a/lib/librte_vhost/rte_vhost_async.h
> > +++ b/lib/librte_vhost/rte_vhost_async.h
> > +
> > +static __rte_always_inline int
> > +async_mbuf_to_desc(struct virtio_net *dev, struct vhost_virtqueue *vq,
> > + struct rte_mbuf *m, struct buf_vector *buf_vec,
> > + uint16_t nr_vec, uint16_t num_buffers,
> > + struct iovec *src_iovec, struct iovec *dst_iovec,
> > + struct iov_it *src_it, struct iov_it *dst_it) {
>
> There are too many arguments in this function; please check whether it will
> impact performance.
>
I guess src_iovec & dst_iovec could be removed from the parameter list.
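Something along these lines (just a sketch; it assumes the iov_it definition in patch 1/2 keeps a
pointer to its backing iovec array, so the 'iov' field name below is an assumption on my side):

static __rte_always_inline int
async_mbuf_to_desc(struct virtio_net *dev, struct vhost_virtqueue *vq,
		struct rte_mbuf *m, struct buf_vector *buf_vec,
		uint16_t nr_vec, uint16_t num_buffers,
		struct iov_it *src_it, struct iov_it *dst_it);

with the body picking up the arrays at the top via src_it->iov / dst_it->iov, so the hot-path call
carries two fewer arguments.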
> > + uint32_t vec_idx = 0;
> > + uint32_t mbuf_offset, mbuf_avail;
> > + uint32_t buf_offset, buf_avail;
> > + uint64_t buf_addr, buf_iova, buf_len;
> > + uint32_t cpy_len, cpy_threshold;
> > + uint64_t hdr_addr;
> > + struct rte_mbuf *hdr_mbuf;
> > + struct batch_copy_elem *batch_copy = vq->batch_copy_elems;
> > + struct virtio_net_hdr_mrg_rxbuf tmp_hdr, *hdr = NULL;
> > + int error = 0;
> > +
> > + uint32_t tlen = 0;
> > + int tvec_idx = 0;
> > + void *hpa;
> > +
> > + if (unlikely(m == NULL)) {
> > + error = -1;
> > + goto out;
> > + }
> > +
> > + cpy_threshold = vq->async_threshold;
> > +
> > + buf_addr = buf_vec[vec_idx].buf_addr;
> > + buf_iova = buf_vec[vec_idx].buf_iova;
> > + buf_len = buf_vec[vec_idx].buf_len;
> > +
> > + if (unlikely(buf_len < dev->vhost_hlen && nr_vec <= 1)) {
> > + error = -1;
> > + goto out;
> > + }
> > +
> > + hdr_mbuf = m;
> > + hdr_addr = buf_addr;
> > + if (unlikely(buf_len < dev->vhost_hlen))
> > + hdr = &tmp_hdr;
> > + else
> > + hdr = (struct virtio_net_hdr_mrg_rxbuf *)(uintptr_t)hdr_addr;
> > +
> > + VHOST_LOG_DATA(DEBUG, "(%d) RX: num merge buffers %d\n",
> > + dev->vid, num_buffers);
> > +
> > + if (unlikely(buf_len < dev->vhost_hlen)) {
> > + buf_offset = dev->vhost_hlen - buf_len;
> > + vec_idx++;
> > + buf_addr = buf_vec[vec_idx].buf_addr;
> > + buf_iova = buf_vec[vec_idx].buf_iova;
> > + buf_len = buf_vec[vec_idx].buf_len;
> > + buf_avail = buf_len - buf_offset;
> > + } else {
> > + buf_offset = dev->vhost_hlen;
> > + buf_avail = buf_len - dev->vhost_hlen;
> > + }
> > +
> > + mbuf_avail = rte_pktmbuf_data_len(m);
> > + mbuf_offset = 0;
> > +
> > + while (mbuf_avail != 0 || m->next != NULL) {
> > + /* done with current buf, get the next one */
> > + if (buf_avail == 0) {
> > + vec_idx++;
> > + if (unlikely(vec_idx >= nr_vec)) {
> > + error = -1;
> > + goto out;
> > + }
> > +
> > + buf_addr = buf_vec[vec_idx].buf_addr;
> > + buf_iova = buf_vec[vec_idx].buf_iova;
> > + buf_len = buf_vec[vec_idx].buf_len;
> > +
> > + buf_offset = 0;
> > + buf_avail = buf_len;
> > + }
> > +
> > + /* done with current mbuf, get the next one */
> > + if (mbuf_avail == 0) {
> > + m = m->next;
> > +
> > + mbuf_offset = 0;
> > + mbuf_avail = rte_pktmbuf_data_len(m);
> > + }
> > +
> > + if (hdr_addr) {
> > + virtio_enqueue_offload(hdr_mbuf, &hdr->hdr);
> > + if (rxvq_is_mergeable(dev))
> > + ASSIGN_UNLESS_EQUAL(hdr->num_buffers,
> > + num_buffers);
> > +
> > + if (unlikely(hdr == &tmp_hdr)) {
> > + copy_vnet_hdr_to_desc(dev, vq, buf_vec, hdr);
> > + } else {
> > + PRINT_PACKET(dev, (uintptr_t)hdr_addr,
> > + dev->vhost_hlen, 0);
> > + vhost_log_cache_write_iova(dev, vq,
> > + buf_vec[0].buf_iova,
> > + dev->vhost_hlen);
> > + }
> > +
> > + hdr_addr = 0;
> > + }
> > +
> > + cpy_len = RTE_MIN(buf_avail, mbuf_avail);
> > +
> > + if (unlikely(cpy_len >= cpy_threshold)) {
> > + hpa = (void *)(uintptr_t)gpa_to_hpa(dev,
> > + buf_iova + buf_offset, cpy_len);
>
> I have one question here. If the user has called async copy directly, should
> the vhost library still check the copy threshold for software fallback?
> If software fallback is needed, IMHO it would be more suitable to handle it
> in the copy device driver.
>
Technically, we could delegate the threshold judgement from vhost to the application callbacks.
However, this would significantly increase the complexity of the callback implementations, which
would have to maintain correct ordering between DMA-copied data and CPU-copied data. Meanwhile, it
doesn't actually help to boost performance compared with the current vhost implementation.
Considering this threshold is a generic design, I would still prefer to keep it in vhost.
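To put it concretely, keeping the check in vhost means the DMA/CPU split lives in one place inside
async_mbuf_to_desc(), roughly like below. This is a simplified sketch based on the hunk above, not
the exact patch code; the failed gpa_to_hpa() translation case is omitted for brevity:

	cpy_len = RTE_MIN(buf_avail, mbuf_avail);

	if (cpy_len >= vq->async_threshold) {
		/* large copy: record the segment for the async (DMA) channel */
		hpa = (void *)(uintptr_t)gpa_to_hpa(dev,
				buf_iova + buf_offset, cpy_len);
		src_iovec[tvec_idx].iov_base =
			rte_pktmbuf_mtod_offset(m, void *, mbuf_offset);
		src_iovec[tvec_idx].iov_len = cpy_len;
		dst_iovec[tvec_idx].iov_base = hpa;
		dst_iovec[tvec_idx].iov_len = cpy_len;
		tlen += cpy_len;
		tvec_idx++;
	} else {
		/* small copy: fall back to a CPU copy through the batch copy
		 * list, so ordering between DMA and CPU copies is resolved
		 * inside vhost rather than in the application callback.
		 */
		batch_copy[vq->batch_copy_nb_elems].dst =
			(void *)((uintptr_t)(buf_addr + buf_offset));
		batch_copy[vq->batch_copy_nb_elems].src =
			rte_pktmbuf_mtod_offset(m, void *, mbuf_offset);
		batch_copy[vq->batch_copy_nb_elems].len = cpy_len;
		vq->batch_copy_nb_elems++;
	}

Pushing the threshold out to the application would mean the callback has to reproduce exactly this
kind of interleaving logic on its own.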
> IMHO, the cost of checking and fixing the virtio header in the async copy
> function will be too high.
> Since this is the async copy datapath, could it be possible to eliminate the
> cost of calculating segmented addresses?
>
Yes, I believe the async data path certainly brings new opportunities for further optimization.
However, in the current time frame, settling down the overall async framework should be the priority.
Thanks,
Patrick