[PATCH] vhost: add VDUSE virtqueue ready state polling workaround
Maxime Coquelin
maxime.coquelin at redhat.com
Tue Sep 16 10:47:04 CEST 2025
On 9/15/25 11:42 AM, David Marchand wrote:
> On Thu, 11 Sept 2025 at 10:36, Maxime Coquelin
> <maxime.coquelin at redhat.com> wrote:
>>
>> Add workaround to poll virtqueue ready states before starting device
>> when VIRTIO_DEVICE_STATUS_DRIVER_OK is set in vduse_events_handler().
>>
>> For each virtqueue, poll using VDUSE_VQ_GET_INFO ioctl to check
>> vq_info->ready state with configurable retry limit. This addresses
>> timing issues where device start was attempted before all virtqueues
>> were properly initialized and ready.
>>
>> A notification mechanism will be introduced in the next version of
>> the VDUSE uAPI. When it lands, we would only apply this workaround
>> when the kernel does not support it.
>>
>> Fixes: a9120db8b98b ("vhost: add VDUSE device startup")
>> Cc: stable at dpdk.org
>>
>> Signed-off-by: Maxime Coquelin <maxime.coquelin at redhat.com>
>> ---
>> lib/vhost/vduse.c | 62 +++++++++++++++++++++++++++++++++++++++++++++--
>> 1 file changed, 60 insertions(+), 2 deletions(-)
>>
>> diff --git a/lib/vhost/vduse.c b/lib/vhost/vduse.c
>> index 9de7f04a4f..5a6025d702 100644
>> --- a/lib/vhost/vduse.c
>> +++ b/lib/vhost/vduse.c
>> @@ -272,6 +272,56 @@ vduse_vring_cleanup(struct virtio_net *dev, unsigned int index)
>> vq->last_avail_idx = 0;
>> }
>>
>> +
>
> Nit: no need for double empty lines.
>
>> +/*
>> + * Tests show that it succeeds at the first retry at worst,
>
> it?
Changing to:
"Tests show that virtqueues get ready at the first retry at worst..."
>
>> + * but let's be on the safe side and allow more retries.
>> + */
>> +#define VDUSE_VQ_READY_POLL_MAX_RETRIES 100
>> +
>> +static int
>> +vduse_wait_for_virtqueues_ready(struct virtio_net *dev)
>> +{
>> + struct vduse_vq_info vq_info;
>> + unsigned int i;
>> + int ret;
>> +
>> + for (i = 0; i < dev->nr_vring; i++) {
>> + int retry_count = 0;
>> +
>> + while (retry_count < VDUSE_VQ_READY_POLL_MAX_RETRIES) {
>> + vq_info.index = i;
>
> It is not clear which part of the vduse_vq_info structure is r/o, r/w
> or w/o in uapi header
> I see that vduse_vring_setup() does nothing more than setting index.
> I am probably paranoid but do we need an explicit reset of the whole
> vq_info on retry?
>
> Moving the definition of vq_info in this loop (right before setting
> vq_info.index) seems better on that topic.
>
The kernel side only looks at the index field (for now at least), but I
agree that could change, so vq_info should be zeroed on each retry.
I will also send a separate patch for vduse_vring_setup().
>> + ret = ioctl(dev->vduse_dev_fd, VDUSE_VQ_GET_INFO, &vq_info);
>> + if (ret) {
>> + VHOST_CONFIG_LOG(dev->ifname, ERR,
>> + "Failed to get VQ %u info while polling ready state: %s",
>> + i, strerror(errno));
>> + return -1;
>> + }
>> +
>> + if (vq_info.ready) {
>> + VHOST_CONFIG_LOG(dev->ifname, DEBUG,
>> + "VQ %u is ready after %u retries", i, retry_count);
>> + break;
>> + }
>> +
>> + retry_count++;
>> + /* Small delay between retries */
>
> I would remove this Lapalissade comment.
>
>
>> + usleep(1000);
>> + }
>> +
>> + if (retry_count >= VDUSE_VQ_READY_POLL_MAX_RETRIES) {
>> + VHOST_CONFIG_LOG(dev->ifname, ERR,
>> + "VQ %u ready state polling timeout after %u retries",
>> + i, VDUSE_VQ_READY_POLL_MAX_RETRIES);
>> + return -1;
>> + }
>> + }
>> +
>> + VHOST_CONFIG_LOG(dev->ifname, INFO, "All virtqueues are ready after polling");
>> + return 0;
>> +}
>> +
>> static void
>> vduse_device_start(struct virtio_net *dev, bool reconnect)
>> {
>> @@ -414,10 +464,18 @@ vduse_events_handler(int fd, void *arg, int *close __rte_unused)
>> }
>>
>> if ((old_status ^ dev->status) & VIRTIO_DEVICE_STATUS_DRIVER_OK) {
>> - if (dev->status & VIRTIO_DEVICE_STATUS_DRIVER_OK)
>> + if (dev->status & VIRTIO_DEVICE_STATUS_DRIVER_OK) {
>> + /* Poll virtqueues ready states before starting device */
>> + ret = vduse_wait_for_virtqueues_ready(dev);
>> + if (ret < 0) {
>> + VHOST_CONFIG_LOG(dev->ifname, ERR,
>> + "Failed to wait for virtqueues ready, aborting device start");
>> + return;
>> + }
>> vduse_device_start(dev, false);
>> - else
>> + } else {
>> vduse_device_stop(dev);
>> + }
>> }
>>
>> VHOST_CONFIG_LOG(dev->ifname, INFO, "Request %s (%u) handled successfully",
>> --
>> 2.51.0
>>
>
> Aside from those nits, it looks an acceptable workaround for now.
> Reviewed-by: David Marchand <david.marchand at redhat.com>
>
>