[dpdk-dev] [PATCH 3/4] vhost: avoid deadlock on async register

Maxime Coquelin maxime.coquelin at redhat.com
Tue Apr 13 11:37:58 CEST 2021



On 3/30/21 3:20 AM, Hu, Jiayu wrote:
> Hi Maxime,
> 
>> -----Original Message-----
>> From: Maxime Coquelin <maxime.coquelin at redhat.com>
>> Sent: Monday, March 29, 2021 11:19 PM
>> To: Hu, Jiayu <jiayu.hu at intel.com>; dev at dpdk.org
>> Cc: Xia, Chenbo <chenbo.xia at intel.com>; Wang, Yinan
>> <yinan.wang at intel.com>; Jiang, Cheng1 <cheng1.jiang at intel.com>; Pai G,
>> Sunil <sunil.pai.g at intel.com>
>> Subject: Re: [PATCH 3/4] vhost: avoid deadlock on async register
>>
>>
>>
>> On 3/17/21 1:56 PM, Jiayu Hu wrote:
>>> Users register async copy device when vhost queue is enabled.
>>> However, if VHOST_USER_F_PROTOCOL_FEATURES is not supported,
>>> a deadlock occurs inside rte_vhost_async_channel_register(),
>>> as vhost_user_msg_handler() already takes vq->access_lock
>>> before processing VHOST_USER_SET_VRING_KICK message.
>>>
>>> This patch removes calling vring_state_changed() in
>>> vhost_user_set_vring_kick() to avoid deadlock on async register.
>>>
>>> Signed-off-by: Jiayu Hu <jiayu.hu at intel.com>
>>> ---
>>>  lib/librte_vhost/vhost_user.c | 3 ---
>>>  1 file changed, 3 deletions(-)
>>>
>>> diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c
>>> index 399675c..a319c1c 100644
>>> --- a/lib/librte_vhost/vhost_user.c
>>> +++ b/lib/librte_vhost/vhost_user.c
>>> @@ -1919,9 +1919,6 @@ vhost_user_set_vring_kick(struct virtio_net
>> **pdev, struct VhostUserMsg *msg,
>>>  	 */
>>>  	if (!(dev->features & (1ULL <<
>> VHOST_USER_F_PROTOCOL_FEATURES))) {
>>>  		vq->enabled = 1;
>>> -		if (dev->notify_ops->vring_state_changed)
>>> -			dev->notify_ops->vring_state_changed(
>>> -				dev->vid, file.index, 1);
>>
>> That looks very wrong, as:
>> 1. The apps want to receive this notification. It looks like breaking
>> existing apps in order to support the experimental async datapath. E.g.
>> OVS needs it to start polling the queues when protocol features is not
>> negotiated.
> 
> IMHO, if the protocol feature is not negotiated, vring_state_changed will also
> be called in vhost_user_msg_handler. In the case you mentioned,
> vq->enabled is set to true in set_vring_kick, and in vhost_user_msg_handler,
> "cur_ready != (vq && vq->ready)" is true, as vq->ready is false at init. So
> vhost_user_msg_handler will call vhost_user_notify_queue_state, which
> calls vring_state_changed inside.

OK, I agree, we can drop this one.
But it is not enough, as vhost_user_notify_queue_state() is called in
several places with the lock taken.

> In addition, calling vring_state_changed in set_vring_kick is protected by the lock,
> but it's not in vhost_user_msg_handler. It looks confusing to me. Is there
> any special reason for this design?

I think we need the lock held every time the callback is called, to
avoid the case where an application calls a Vhost API that would modify
the vq struct. We could get undefined behavior if that happened.
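
To make the deadlock concrete, here is a minimal self-contained sketch. It is
not the vhost code: toy_vq, toy_register and toy_msg_handler are hypothetical
stand-ins for the vq, rte_vhost_async_channel_register() and
vhost_user_msg_handler(), and the lock is modelled as a non-recursive
spinlock built on C11 atomics. It uses a trylock so that the re-entrant
acquire reports a failure instead of spinning forever.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a vq protected by a non-recursive spinlock. */
struct toy_vq {
	atomic_flag access_lock;
	bool async_registered;
};

void toy_vq_init(struct toy_vq *vq)
{
	assert(vq != NULL);
	atomic_flag_clear(&vq->access_lock);
	vq->async_registered = false;
}

/* Returns true if the lock was free and is now owned by the caller. */
bool toy_trylock(struct toy_vq *vq)
{
	/* test_and_set returns the previous value: false means it was free. */
	return !atomic_flag_test_and_set_explicit(&vq->access_lock,
						  memory_order_acquire);
}

void toy_unlock(struct toy_vq *vq)
{
	atomic_flag_clear_explicit(&vq->access_lock, memory_order_release);
}

/* Stand-in for a registration API that takes the lock itself. */
int toy_register(struct toy_vq *vq)
{
	if (!toy_trylock(vq))
		return -1; /* a plain spinlock would spin forever here */
	vq->async_registered = true;
	toy_unlock(vq);
	return 0;
}

/* Stand-in for a message handler that invokes a callback with the lock held. */
int toy_msg_handler(struct toy_vq *vq, int (*cb)(struct toy_vq *))
{
	int ret;

	if (!toy_trylock(vq))
		return -1;
	ret = cb(vq); /* the callback runs under the lock */
	toy_unlock(vq);
	return ret;
}
```

Calling toy_register() directly succeeds, while passing it as the callback to
toy_msg_handler() fails on the second acquire. With a blocking spinlock, that
failure is the hang described in the commit message.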

> 
>>
>> 2. The fix in your case seems to indicate that your app's
>> vring_state_changed callback called rte_vhost_async_channel_register.
>> And your fix consists in no longer calling the callback, and so no longer
>> calling rte_vhost_async_channel_register?
> 
> It is recommended to call rte_vhost_async_channel_register in
> vring_state_changed, and vring_state_changed will be called
> by vhost_user_msg_handler.

You might want to schedule a thread to call channel registration. Maybe
using rte_eal_alarm_set()?
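
A rough sketch of that deferral idea, under stated assumptions: the callback
that runs with the lock held only records a request, and the registration
runs once the lock has been released. Everything here is hypothetical and
self-contained (sk_vq, sk_pending, sk_msg_handler are illustration names, not
vhost symbols). In a real application, DPDK's alarm API (rte_eal_alarm_set())
could play the role of the single-slot queue below, with the job running
later from another thread.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct sk_vq {
	atomic_flag access_lock; /* non-recursive lock, as in the thread */
	bool async_registered;
};

typedef void (*sk_deferred_fn)(void *arg);

/* Hypothetical single-slot deferred job; zero-initialised (no job). */
struct sk_deferred {
	sk_deferred_fn fn;
	void *arg;
};
struct sk_deferred sk_pending;

void sk_vq_init(struct sk_vq *vq)
{
	assert(vq != NULL);
	atomic_flag_clear(&vq->access_lock);
	vq->async_registered = false;
}

bool sk_trylock(struct sk_vq *vq)
{
	return !atomic_flag_test_and_set_explicit(&vq->access_lock,
						  memory_order_acquire);
}

void sk_unlock(struct sk_vq *vq)
{
	atomic_flag_clear_explicit(&vq->access_lock, memory_order_release);
}

/* Stand-in for a registration API that takes the lock itself. */
int sk_register(struct sk_vq *vq)
{
	if (!sk_trylock(vq))
		return -1;
	vq->async_registered = true;
	sk_unlock(vq);
	return 0;
}

/* Deferred job: runs after the handler has dropped the lock. */
void sk_do_register(void *arg)
{
	int ret = sk_register(arg);

	assert(ret == 0);
	(void)ret;
}

/* Callback invoked with the lock held: it only schedules the work. */
void sk_state_changed_cb(struct sk_vq *vq)
{
	sk_pending.fn = sk_do_register;
	sk_pending.arg = vq;
}

/* Handler: call back under the lock, unlock, then run any deferred job. */
int sk_msg_handler(struct sk_vq *vq)
{
	if (!sk_trylock(vq))
		return -1;
	sk_state_changed_cb(vq);
	sk_unlock(vq);

	if (sk_pending.fn != NULL) {
		sk_deferred_fn fn = sk_pending.fn;

		sk_pending.fn = NULL;
		fn(sk_pending.arg);
	}
	return 0;
}
```

With a real alarm or worker thread, the deferred job would also have to cope
with the queue being disabled or the device going away before it runs, which
this sketch does not model.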

Regards,
Maxime

> 
> Thanks,
> Jiayu
>>
>>>  	}
>>>
>>>  	if (vq->ready) {
>>>
> 


