[dpdk-dev] [PATCH v5 3/4] vhost: support async dequeue for split ring

Hu, Jiayu jiayu.hu at intel.com
Fri Jul 16 15:45:07 CEST 2021



> -----Original Message-----
> From: David Marchand <david.marchand at redhat.com>
> Sent: Friday, July 16, 2021 4:15 PM
> To: Hu, Jiayu <jiayu.hu at intel.com>
> Cc: Maxime Coquelin <maxime.coquelin at redhat.com>; Ma, WenwuX
> <wenwux.ma at intel.com>; dev at dpdk.org; Xia, Chenbo
> <chenbo.xia at intel.com>; Jiang, Cheng1 <cheng1.jiang at intel.com>; Wang,
> YuanX <yuanx.wang at intel.com>
> Subject: Re: [dpdk-dev] [PATCH v5 3/4] vhost: support async dequeue for
> split ring
> 
> On Wed, Jul 14, 2021 at 8:50 AM Hu, Jiayu <jiayu.hu at intel.com> wrote:
> > > Are we ensuring packets are not reordered with this way of working?
> >
> > There is a threshold that can be set by users. If it is set to 0, which
> > means all packet copies are assigned to the DMA, the packets sent from
> > the guest will not be reordered.
> 
> - I find the rte_vhost_async_channel_register() signature with a bitfield quite
> ugly.
> We are writing sw, this is not mapped to hw stuff... but ok this is a different
> topic.

I have reworked the structure. Here is the link:
http://patches.dpdk.org/project/dpdk/patch/1626465089-17052-3-git-send-email-jiayu.hu@intel.com/

> 
> 
> - I don't like this threshold, this is too low level and most users will only see
> the shiny aspect "better performance" without understanding the
> consequences.
> By default, it leaves the door open to a _bad_ behavior, that is packet
> reordering.
> At a very minimum, strongly recommend to use 0 in the API.

That's a good point. But there are some reasons to expose this value to users:
- large packets will block small packets, like TCP control packets.
- DMA efficiency. We usually see 20~30% drops when 64B copies are offloaded to the
DMA engine.
- the threshold is related not only to the hardware, but also to the application. The value
decides which copies are assigned to which worker, the CPU or the DMA. As async vhost works
asynchronously, the threshold value decides how much work can be done in
parallel. It depends not only on which DMA engine and platform we use, but also on what
computation the CPU has been assigned. Different users will need different values.
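
To make the last point concrete, here is a rough sketch of how a length
threshold splits the copies of one burst between the two workers. All names
below (copy_worker, pick_worker, split_burst) are made up for illustration and
are not the vhost API; the sketch assumes copies shorter than the threshold
stay on the CPU and the rest go to the DMA, so a threshold of 0 sends
everything to the DMA.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names for illustration only; not the DPDK vhost API. */
enum copy_worker { COPY_BY_CPU, COPY_BY_DMA };

/*
 * Assumed policy: copies shorter than the threshold are done by the CPU,
 * all others are handed to the DMA engine. With threshold == 0 no length
 * is below it, so every copy goes to the DMA.
 */
static enum copy_worker
pick_worker(uint32_t copy_len, uint32_t threshold)
{
	return copy_len < threshold ? COPY_BY_CPU : COPY_BY_DMA;
}

/* Count how many copies of a burst end up on each worker. */
static void
split_burst(const uint32_t *lens, size_t n, uint32_t threshold,
	    size_t *cpu_copies, size_t *dma_copies)
{
	assert(lens != NULL || n == 0);

	*cpu_copies = 0;
	*dma_copies = 0;
	for (size_t i = 0; i < n; i++) {
		if (pick_worker(lens[i], threshold) == COPY_BY_CPU)
			(*cpu_copies)++;
		else
			(*dma_copies)++;
	}
}
```

With a non-zero threshold the two workers complete their copies independently,
which is where both the parallelism and the reordering risk come from; with 0
there is a single worker and a single completion order.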

I totally understand the worry about reordering. But simple iperf tests in our lab show
positive results with a threshold set. We need more careful tests before modifying
it, IMHO.

Thanks,
Jiayu
> 
> 
> 
> --
> David Marchand


