[dpdk-dev] [PATCH v2 3/4] vhost: avoid deadlock on async register

Hu, Jiayu jiayu.hu at intel.com
Mon Apr 19 06:10:14 CEST 2021


Hi Maxime,

> -----Original Message-----
> From: Maxime Coquelin <maxime.coquelin at redhat.com>
> Sent: Friday, April 16, 2021 4:34 PM
> To: Hu, Jiayu <jiayu.hu at intel.com>; dev at dpdk.org
> Cc: Xia, Chenbo <chenbo.xia at intel.com>; Wang, Yinan
> <yinan.wang at intel.com>; Pai G, Sunil <sunil.pai.g at intel.com>; Jiang, Cheng1
> <cheng1.jiang at intel.com>
> Subject: Re: [PATCH v2 3/4] vhost: avoid deadlock on async register
> 
> 
> 
> On 4/16/21 10:18 AM, Hu, Jiayu wrote:
> > Hi Maxime,
> >
> >> -----Original Message-----
> >> From: Maxime Coquelin <maxime.coquelin at redhat.com>
> >> Sent: Friday, April 16, 2021 2:35 PM
> >> To: Hu, Jiayu <jiayu.hu at intel.com>; dev at dpdk.org
> >> Cc: Xia, Chenbo <chenbo.xia at intel.com>; Wang, Yinan
> >> <yinan.wang at intel.com>; Pai G, Sunil <sunil.pai.g at intel.com>; Jiang,
> Cheng1
> >> <cheng1.jiang at intel.com>
> >> Subject: Re: [PATCH v2 3/4] vhost: avoid deadlock on async register
> >>
> >> Hi Jiayu,
> >>
> >> On 4/16/21 4:19 AM, Hu, Jiayu wrote:
> >>> Hi Maxime,
> >>>>>> On 4/14/21 3:40 AM, Hu, Jiayu wrote:
> >>>>>>> Hi Maxime,
> >>>>>>>
> >>>>>>>> -----Original Message-----
> >>>>>>>> From: Maxime Coquelin <maxime.coquelin at redhat.com>
> >>>>>>>> Sent: Tuesday, April 13, 2021 7:33 PM
> >>>>>>>> To: Hu, Jiayu <jiayu.hu at intel.com>; dev at dpdk.org
> >>>>>>>> Cc: Xia, Chenbo <chenbo.xia at intel.com>; Wang, Yinan
> >>>>>>>> <yinan.wang at intel.com>; Pai G, Sunil <sunil.pai.g at intel.com>;
> Jiang,
> >>>>>> Cheng1
> >>>>>>>> <cheng1.jiang at intel.com>
> >>>>>>>> Subject: Re: [PATCH v2 3/4] vhost: avoid deadlock on async register
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On 4/2/21 3:04 PM, Jiayu Hu wrote:
> >>>>>>>>> Users can register the async copy device in
> >>>>>>>>> vring_state_changed(), when the vhost queue is enabled. However,
> >>>>>>>>> a deadlock occurs inside rte_vhost_async_channel_register() if
> >>>>>>>>> VHOST_USER_F_PROTOCOL_FEATURES is not supported, as
> >>>>>>>>> vhost_user_msg_handler() takes vq->access_lock before calling
> >>>>>>>>> vhost_user_set_vring_kick().
> >>>>>>>>>
> >>>>>>>>> This patch avoids the async register deadlock by removing the
> >>>>>>>>> call to vring_state_changed() in vhost_user_set_vring_kick().
> >>>>>>>>> It's safe, as vhost_user_msg_handler() will call
> >>>>>>>>> vring_state_changed() anyway.
> >>>>>>>>>
> >>>>>>>>> Signed-off-by: Jiayu Hu <jiayu.hu at intel.com>
> >>>>>>>>> ---
> >>>>>>>>>  lib/librte_vhost/vhost_user.c | 3 ---
> >>>>>>>>>  1 file changed, 3 deletions(-)
> >>>>>>>>>
> >>>>>>>>> diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c
> >>>>>>>>> index 44c0452..8f0eba6 100644
> >>>>>>>>> --- a/lib/librte_vhost/vhost_user.c
> >>>>>>>>> +++ b/lib/librte_vhost/vhost_user.c
> >>>>>>>>> @@ -1918,9 +1918,6 @@ vhost_user_set_vring_kick(struct virtio_net **pdev, struct VhostUserMsg *msg,
> >>>>>>>>>  	 */
> >>>>>>>>>  	if (!(dev->features & (1ULL << VHOST_USER_F_PROTOCOL_FEATURES))) {
> >>>>>>>>>  		vq->enabled = true;
> >>>>>>>>> -		if (dev->notify_ops->vring_state_changed)
> >>>>>>>>> -			dev->notify_ops->vring_state_changed(
> >>>>>>>>> -				dev->vid, file.index, 1);
> >>>>>>>>>  	}
> >>>>>>>>>
> >>>>>>>>>  	if (vq->ready) {
> >>>>>>>>>
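For reference, the registration pattern that hits this deadlock looks roughly
like the sketch below; app_register_async_channel() is an illustrative
stand-in for the application's call to rte_vhost_async_channel_register(),
whose exact arguments are omitted here because they depend on the DPDK
version.

#include <stdint.h>

/* Illustrative application code, not part of the vhost library. */
static int
app_register_async_channel(int vid, uint16_t queue_id)
{
        /*
         * A real application would call rte_vhost_async_channel_register()
         * here; that function takes vq->access_lock internally.
         */
        (void)vid;
        (void)queue_id;
        return 0;
}

static int
app_vring_state_changed(int vid, uint16_t queue_id, int enable)
{
        if (enable) {
                /*
                 * Without VHOST_USER_F_PROTOCOL_FEATURES, this callback was
                 * reached from vhost_user_set_vring_kick() while the control
                 * thread in vhost_user_msg_handler() already held
                 * vq->access_lock; registering here then spins on the same
                 * non-recursive lock, hence the deadlock.
                 */
                return app_register_async_channel(vid, queue_id);
        }
        return 0;
}
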
> >>>>>>>>
> >>>>>>>> As replied earlier on v1, I agree the call to vring_state_changed
> >>>>>>>> here is not needed. But it might not be enough; there are other
> >>>>>>>> cases where you could have issues.
> >>>>>>>
> >>>>>>> vhost_user_notify_queue_state() can be called in three cases:
> >>>>>>> 1. When the vq ready status changes, vhost_user_msg_handler()
> >>>>>>> calls it to notify the backend. But vhost_user_msg_handler()
> >>>>>>> doesn't take the lock before calling it, so in this case no
> >>>>>>> deadlock occurs in async register.
> >>>>>>>
> >>>>>>> 2. If vq->ready is true, vhost_user_set_vring_call() calls it to
> >>>>>>> notify the backend that the vq is not enabled. Although
> >>>>>>> vhost_user_set_vring_call() is protected by the lock, async
> >>>>>>> register is called only if the vq is enabled, so async register
> >>>>>>> will not be called in this case.
> >>>>>>>
> >>>>>>> 3. If vq->ready is true, vhost_user_set_vring_kick() calls it to
> >>>>>>> notify the backend that the vq is not enabled. Same as #2, async
> >>>>>>> register is called only when the vq is enabled. Even though
> >>>>>>> vhost_user_set_vring_kick() is protected by the lock, there is no
> >>>>>>> deadlock in async register, as it will not be called in this case.
> >>>>>>>
> >>>>>>> In summary, I think there is no deadlock issue in async register if
> >>>>>>> we remove the call to vring_state_changed() in
> >>>>>>> vhost_user_set_vring_kick().
> >>>>>>
> >>>>>>
> >>>>>> But the unregister one could be called in theory, no? Otherwise it
> >>>>>> would look unbalanced. At least on the disabled notification, the
> >>>>>> app should make sure the DMA transfers to and from the vring are
> >>>>>> stopped before it returns from the callback. Otherwise it could
> >>>>>> lead to undefined behavior.
> >>>>>
> >>>>> Right, users need to call unregister, but we cannot remove the call
> >>>>> to vhost_user_notify_queue_state() in cases #2 and #3, IMHO. So to
> >>>>> avoid the deadlock, we recommend that users call async unregister in
> >>>>> destroy_device(), instead of on the vring disabled notification.
> >>>>> Does it make sense to you?
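As a rough sketch of that on the application side (the queue count and the
wiring of the callback into struct vhost_device_ops are illustrative, and the
header name follows the experimental async API of this period):

#include <stdint.h>
#include <rte_vhost_async.h>

#define APP_NB_VRINGS 2 /* illustrative queue count */

/* Illustrative destroy_device callback: tear the async channels down here
 * rather than from the vring_state_changed() callback. */
static void
app_destroy_device(int vid)
{
        uint16_t qid;

        for (qid = 0; qid < APP_NB_VRINGS; qid++)
                rte_vhost_async_channel_unregister(vid, qid);
}
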
> >>>>
> >>>> Calling async unregister in destroy_device is fine by me. But I'm
> >>>> more concerned about DMA transactions not being stopped when the
> >>>> ring becomes disabled.
> >>> If the ring becomes disabled, no new packets go into the async data
> >>> path, as virtio_dev_rx_async_submit() returns if vq->enabled is false.
> >>>>
> >>>> I cannot say if you are doing it right, because the vhost example
> >>>> does not implement the vring_state_changed callback.
> >>>> It is not a problem for the sync datapath, as the lock protection
> >>>> plus the enabled variable prevent processing the rings once they are
> >>>> stopped.
> >>>> But for the async path, if you have programmed DMA transfers, you
> >>>> need to rely on vring_state_changed to block the control path until
> >>>> the transfers are cancelled or done.
> >>>
> >>> I am not sure if I understand your concern correctly, but for the
> >>> async data path, the enabled variable prevents it from enqueueing new
> >>> packets when the ring is disabled, as virtio_dev_rx_async_submit()
> >>> checks the enabled variable before processing the ring; in addition,
> >>> the lock protection can stop the async data path, as
> >>> virtio_dev_rx_async_submit() acquires the lock too. For example, in
> >>> the case where the front-end updates the kickfd, set_vring_kick()
> >>> will notify the backend that the vring is stopped by calling
> >>> vhost_user_notify_queue_state(). In this case, the sync data path
> >>> stops as a result of the lock protection, and I think it's the same
> >>> for the async data path, as it acquires the lock before processing
> >>> the ring.
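The guard I am referring to is roughly the following; this is only a
simplified sketch of the library-internal check (struct vhost_virtqueue and
its access_lock/enabled fields come from the internal vhost.h), not the
exact code in virtio_net.c:

/* Simplified sketch of the entry check shared by the enqueue paths. */
static uint32_t
enqueue_entry_sketch(struct vhost_virtqueue *vq)
{
        uint32_t nb_enqueued = 0;

        rte_spinlock_lock(&vq->access_lock);

        /* Ring disabled (e.g. kickfd/callfd being updated): do nothing. */
        if (unlikely(!vq->enabled))
                goto out;

        /* ... build descriptors and submit the copies here ... */

out:
        rte_spinlock_unlock(&vq->access_lock);
        return nb_enqueued;
}
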
> >>
> >>
> >> What about memory hotplug happening while DMA transfers are on-going,
> >> for example?
> >>
> >> In this case, the lock is enough for the sync datapath, but it is not
> >> for the async one.
> >>
> >> If a VHOST_USER_SET_MEM_TABLE request is received while
> >> virtio_dev_rx_async_submit() is running, it will be blocked until
> >> virtio_dev_rx_async_submit() returns, but the DMA transfers may not be
> >> finished yet.
> >> Once unblocked, the control thread will call
> >> vhost_user_set_mem_table(), which will munmap the current memory
> >> regions before mmapping the new ones while DMA transfers are still
> >> on-going.
> >>
> >> To fix this, we need to call the vring_state_changed callback for all
> >> the virtqueues with disabled state. In your app, you need to stop the
> >> DMA transfers when the disabled state notification happens, or block
> >> in the callback until the transfers are done.
> >
> > Let me summarize the problem:
> > when frontend memory is hot-plugged, the host application needs to stop
> > DMA transfers for all virtqueues, which is done by emptying the
> > in-flight DMA copies and unregistering async inside
> > vring_state_changed().
> 
> I don't think unregistering async channels is mandatory.
> On the disable notification, the callback has to block until all on-going
> DMA transfers are done. New transfers won't be scheduled since
> virtio_dev_rx_async_submit() will be blocked by the lock while the
> memory regions update is being done.

Yes, unregister is not a must, but the problem is that
rte_vhost_poll_enqueue_completed() takes the lock too.
In the VM memory hot-plug case, emptying in-flight packets in
vring_state_changed() may be OK, as it is called by
vhost_user_msg_handler() (please correct me if I am wrong),
which doesn't take the lock. But if it is called inside
set_vring_kick()/_call(), a deadlock occurs in
rte_vhost_poll_enqueue_completed(), even if the user app doesn't call
the async unregister API.
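To make that concrete, a draining helper along these lines (illustrative
names; app_nb_inflight() stands for however the application tracks
submitted-but-not-yet-completed copies) would hang if it were invoked from
vring_state_changed() in the set_vring_kick()/_call() path, because
rte_vhost_poll_enqueue_completed() takes vq->access_lock as well:

#include <stdint.h>
#include <rte_mbuf.h>
#include <rte_vhost_async.h>

#define APP_BURST 32 /* illustrative burst size */

/* Illustrative helper: number of async copies still in flight on a ring. */
static uint32_t app_nb_inflight(int vid, uint16_t queue_id);

/* Wait for all in-flight async copies on a disabled ring to complete. */
static void
app_drain_ring(int vid, uint16_t queue_id)
{
        struct rte_mbuf *pkts[APP_BURST];
        uint16_t n;

        while (app_nb_inflight(vid, queue_id) > 0) {
                /* Takes vq->access_lock internally: deadlocks if the
                 * control thread calling us already holds it. */
                n = rte_vhost_poll_enqueue_completed(vid, queue_id,
                                                     pkts, APP_BURST);
                rte_pktmbuf_free_bulk(pkts, n);
        }
}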

> 
> > But the vhost library takes the lock before entering
> > vring_state_changed(), and async unregister and
> > rte_vhost_poll_enqueue_completed() both acquire the lock, so a deadlock
> > occurs inside rte_vhost_poll_enqueue_completed() or async unregister.
> > (please correct me if I am wrong)
> >
> > There may be two possible solutions:
> > 1. remove the lock before calling vring_state_changed() in the vhost
> > library;
> 
> I don't think so. If you release the lock in vring_state_changed(), it
> will allow the app to enqueue/dequeue descriptors in the middle of
> handling the request.

Sorry, I didn't express my opinion clearly. I mean removing the following
code in set_vring_kick() and set_vring_call():
-       if (vq->ready) {
-               vq->ready = false;
-               vhost_user_notify_queue_state(dev, file.index, 0);
-       }

IMHO, in both cases the purpose of calling vhost_user_notify_queue_state()
is to tell the backend the vring is not ready to use as a result of the
callfd/kickfd update; after that, vhost_user_msg_handler() calls it again to
notify the backend that the vring is ready to process. But there is lock
protection before entering set_vring_kick()/_call(), so data path threads
will not process the vring while the callfd/kickfd is being updated anyway.
After updating the callfd/kickfd, I think data path threads are already safe
to start processing the vring, so there is no need for a ready notification
in vhost_user_msg_handler().

BTW, the lock protection around vhost_user_notify_queue_state() exists only
in set_vring_kick()/_call(), not in vhost_user_msg_handler().

Thanks,
Jiayu
> 
> > 2. provide a new callback, called without acquiring the lock, for the
> > DMA accelerated case. The callback is used to notify the backend that
> > it needs to stop DMA transfers.
> 
> No, in your app just wait for DMA transfers to be finished before
> returning from the callback.
> 
> Please implement the vring_state_changed callback in the Vhost example;
> it is now mandatory with the async datapath, and it will help to have an
> example of what real applications should do.
> 
> Thanks,
> Maxime
> 
> > Thanks,
> > Jiayu
> >


