[dpdk-dev] [PATCH v2 3/3] net/vhost: fix interrupt mode

Maxime Coquelin maxime.coquelin at redhat.com
Wed Jul 29 11:10:24 CEST 2020



On 7/29/20 10:43 AM, Matan Azrad wrote:
> Hi Maxime
> 
> From: Maxime Coquelin:
>> On 7/28/20 9:03 PM, Matan Azrad wrote:
>>>
>>>
>>> From: Maxime Coquelin:
>>>> At .new_device() time, only the first vring pair is now ready, other
>>>> vrings are consfigured later.
>>>>
>>>> Problem is that when application will setup and enable interrupts,
>>>> only the first queue pair Rx interrupt will be enabled.
>>>>
>>>> This patches fixes the issue by setting the number of max interrupts
>>>> to the number of Rx queues that will be later initialized. Then, as
>>>> soon as a Rx vring is ready, it removes the corresponding
>>>> uninitialized epoll event, and install a new one with the valid FD.
>>>
>>> Doesn't it race condition to the application decision?
>>> App may change the configuration per queue in any time by the app
>> control thread.
>>> The vhost PMD may change it usynchronically from the vhost control
>> thread in the vring state callback.
>>
>> Yes you are right there could be a race here,I'm looking into getting it done in
>> a safe way. Yet it is good to get the confirmation from Intel that it does fix the
>> problem on their side.
> 
> Intel case\l3fw-power is only one case, we can put out the fire now by solving specific case but we need a fix for the global usage.
> 
>> Based on David suggestion, it might be made safe by relying on
>> eth_rxq_intr_enable()/eth_rxq_intr_disable().
> 
> Yes, maybe.
> 
> One more option to adjust the vhost PMD is to do the new_device callback logic when the last queue is ready as was done before,
> By this way, the vhost PMD will use the same timing as before.

Yes, that's one option but that would not be ideal because we would
lose the benefits of what the initial series brings, and it is not a
light change that late in the release.

Trying to fix interrupts, we mainly risk to break interrupt support
which is not widely used. Trying to wait for all queue event to be
received may cause other regressions in all use-cases.

>> If we cannot solve it in a safe way, then we'll have no other choice than
>> reverting partially your patch.
> 
> I'm sure we can find a solution.
> 
> As you probably remember we did the design for the readiness series together (a long discussion in other thread),
> It came to solve an architecture issue in QEMU last versions which might affect vhost lib users in other cases.
> We took it into account that some vhost lib users should do adjustment for the new behavior.
> 
> 
> Matan 
> 
>>
>> Maxime
>>
>>> I already mentioned it in other thread on this topic but didn't get reply.
>>>
>>>> Fixes: 604052ae5395 ("net/vhost: support queue update")
>>>>
>>>> Signed-off-by: Maxime Coquelin <maxime.coquelin at redhat.com>
>>>
> 



More information about the dev mailing list