[dpdk-dev] [PATCH v5 3/4] vhost: support async dequeue for split ring

Maxime Coquelin maxime.coquelin at redhat.com
Fri Jul 16 09:45:50 CEST 2021


Hi,

On 7/16/21 3:10 AM, Hu, Jiayu wrote:
> Hi, Maxime,
> 
>> -----Original Message-----
>> From: Maxime Coquelin <maxime.coquelin at redhat.com>
>> Sent: Thursday, July 15, 2021 9:18 PM
>> To: Hu, Jiayu <jiayu.hu at intel.com>; Ma, WenwuX <wenwux.ma at intel.com>;
>> dev at dpdk.org
>> Cc: Xia, Chenbo <chenbo.xia at intel.com>; Jiang, Cheng1
>> <cheng1.jiang at intel.com>; Wang, YuanX <yuanx.wang at intel.com>
>> Subject: Re: [PATCH v5 3/4] vhost: support async dequeue for split ring
>>
>>
>>
>> On 7/14/21 8:50 AM, Hu, Jiayu wrote:
>>> Hi Maxime,
>>>
>>> Thanks for your comments. Replies are inline.
>>>
>>>> -----Original Message-----
>>>> From: Maxime Coquelin <maxime.coquelin at redhat.com>
>>>> Sent: Tuesday, July 13, 2021 10:30 PM
>>>> To: Ma, WenwuX <wenwux.ma at intel.com>; dev at dpdk.org
>>>> Cc: Xia, Chenbo <chenbo.xia at intel.com>; Jiang, Cheng1
>>>> <cheng1.jiang at intel.com>; Hu, Jiayu <jiayu.hu at intel.com>; Wang, YuanX
>>>> <yuanx.wang at intel.com>
>>>> Subject: Re: [PATCH v5 3/4] vhost: support async dequeue for split
>>>> ring
>>>>>  struct async_inflight_info {
>>>>>  	struct rte_mbuf *mbuf;
>>>>> -	uint16_t descs; /* num of descs inflight */
>>>>> +	union {
>>>>> +		uint16_t descs; /* num of descs in-flight */
>>>>> +		struct async_nethdr nethdr;
>>>>> +	};
>>>>>  	uint16_t nr_buffers; /* num of buffers inflight for packed ring */
>>>>> -};
>>>>> +} __rte_cache_aligned;
>>>>
>>>> Does it really need to be cache aligned?
>>>
>>> How about changing to 32-byte alignment? Then a cacheline can hold 2 objects.
>>
>> Or not forcing any alignment at all? Would there really be a performance
>> regression?
>>
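For reference, a minimal sketch of the 32-byte variant being discussed
(assuming a 64-byte cacheline; __rte_aligned() is the generic alignment
attribute from rte_common.h):

struct async_inflight_info {
	struct rte_mbuf *mbuf;
	union {
		uint16_t descs; /* num of descs in-flight */
		struct async_nethdr nethdr;
	};
	uint16_t nr_buffers; /* num of buffers in-flight for packed ring */
} __rte_aligned(32); /* two objects fit in one 64-byte cacheline */
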
>>>>
>>>>>
>>>>>  /**
>>>>>   *  dma channel feature bit definition
>>>>>
>>>>> @@ -193,4 +201,34 @@ __rte_experimental
>>>>>  uint16_t rte_vhost_poll_enqueue_completed(int vid, uint16_t queue_id,
>>>>>  		struct rte_mbuf **pkts, uint16_t count);
>>>>>
>>>>> +/**
>>>>> + * This function tries to receive packets from the guest with offloading
>>>>> + * large copies to the DMA engine. Successfully dequeued packets are
>>>>> + * transfer completed, either by the CPU or the DMA engine, and they are
>>>>> + * returned in "pkts". There may be other packets that are sent from
>>>>> + * the guest but being transferred by the DMA engine, called in-flight
>>>>> + * packets. The amount of in-flight packets by now is returned in
>>>>> + * "nr_inflight". This function will return in-flight packets only after
>>>>> + * the DMA engine finishes transferring.
>>>>
>>>> I am not sure to understand that comment. Is it still "in-flight" if
>>>> the DMA transfer is completed?
>>>
>>> "in-flight" means packet copies are submitted to the DMA, but the DMA
>>> hasn't completed copies.
>>>
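OK, thanks for clarifying. If I read the proposed API correctly, the
expected usage would be something like the sketch below (the function
name, signature and MAX_PKT_BURST constant are my assumptions about this
series, not verbatim from the patch):

/* Hypothetical polling loop: only transfer-completed packets are
 * returned in "pkts"; copies still owned by the DMA engine remain
 * in-flight and are counted in "nr_inflight" until a later call
 * returns them. */
int nr_inflight = 0;
uint16_t nb_rx;

do {
	nb_rx = rte_vhost_async_try_dequeue_burst(vid, queue_id,
			mbuf_pool, pkts, MAX_PKT_BURST, &nr_inflight);
	/* ... process nb_rx packets ... */
} while (nb_rx > 0 || nr_inflight > 0);
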
>>>>
>>>> Are we ensuring packets are not reordered with this way of working?
>>>
>>> There is a threshold that can be set by users. If it is set to 0, which
>>> means all packet copies are assigned to the DMA, the packets sent from
>>> the guest will not be reordered.
>>
>> Reordering packets is bad in my opinion. We cannot expect the user to know
>> that he should set the threshold to zero to have packets ordered.
>>
>> Maybe we should consider not having a threshold, and so have every
>> descriptor handled either by the CPU (sync datapath) or by the DMA (async
>> datapath). Doing so would simplify the code a lot, and would make
>> performance/latency more predictable.
>>
>> I understand that we might not get the best performance for every packet
>> size doing that, but that may be a tradeoff we would make to have the
>> feature maintainable and easily usable by the user.
> 
> I understand and agree in some way. But before changing the existing design
> in async enqueue and dequeue, we need more careful tests, as the current
> design is well validated and performance looks good. So I suggest doing it
> in 21.11.

My understanding was that packets were not reordered on the enqueue path,
as I thought the used ring was written in order, but it seems I was wrong.

What kind of validation and performance testing has been done? I can
imagine reordering having a bad impact on L4+ benchmarks.
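
To make the concern concrete, here is a rough sketch of the threshold
logic as I understand it (names are illustrative, not the patch's actual
code):

/* Illustrative only: with a size threshold, completion order can
 * differ from submission order. */
for (i = 0; i < count; i++) {
	if (pkt_len[i] < threshold)
		copy_by_cpu(desc[i]);    /* completes immediately */
	else
		submit_to_dma(desc[i]);  /* completes later, async */
}
/* A small packet queued behind a large one can be delivered first,
 * so the stream is only guaranteed ordered when threshold == 0. */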

Let's first fix this for the enqueue path, then submit a new revision for
the dequeue path without packet reordering.

Regards,
Maxime

> Thanks,
> Jiayu
> 


