|SUCCESS| vhost: fix rte_vhost_vring_call coredump caused by numa_realloc free virtio_net

dpdklab at iol.unh.edu dpdklab at iol.unh.edu
Thu Mar 26 18:25:56 CET 2026


Test-Label: ai-code-review
Test-Status: SUCCESS
https://dpdk.org/patch/162725

_AI Code Review_

=== Patch Review: series.patch (via Claude) ===
Review date: 2026-03-26

Patch Review for DPDK: vhost NUMA Reallocation Fix

git sha: (not provided in patch)
Author: YuanXin <yuanxin123456 at 126.com>

Subject: vhost: fix rte_vhost_vring_call coredump caused by numa_realloc free
virtio_net

This patch attempts to address a race condition where numa_realloc()
frees and reallocates virtio_net structures while backend poller threads
may still be accessing them.

> diff --git a/lib/vhost/vhost_user.c b/lib/vhost/vhost_user.c
> index 4bfb13fb98..1312f95cbc 100644
> --- a/lib/vhost/vhost_user.c
> +++ b/lib/vhost/vhost_user.c
> @@ -681,6 +681,9 @@ numa_realloc(struct virtio_net **pdev, struct vhost_virtqueue **pvq)
>  
>  out_dev_realloc:
>  
> +	if (vq->index > 0)
> +		return;
> +

Does this condition correctly protect against all concurrent access
patterns?  The check only prevents reallocation when vq->index > 0,
but what happens if multiple threads simultaneously process vq index 0?

Can the race condition still occur if two threads both see vq->index == 0
and both proceed past this check?  There appears to be no locking around
this new check to make it atomic with the subsequent reallocation code.

Is the assumption that only vq index 0 should trigger device reallocation
universally correct?  The commit message mentions "it's no necessary to
do dev_realloc for each vq kick" but doesn't explain why index 0 is
special or whether this maintains correctness for all NUMA configurations.

>  	if (dev->flags & VIRTIO_DEV_RUNNING)
>  		return;

According to the commit message, "there is no lock protecting dev
(virtio_net), only vq is protected by lock."  Does this mean the
VIRTIO_DEV_RUNNING check is also racy and insufficient to prevent the
described use-after-free?

The patch description states that backend poller threads can access
virtio_net concurrently with numa_realloc operations.  Does this fix
address the fundamental issue that virtio_net can be freed while still
in use, or does it only reduce the window for the race condition?

Would a more robust solution involve reference counting or RCU-style
protection for the virtio_net structure itself, rather than trying to
limit when reallocation occurs?
