[dpdk-dev] [PATCH] vhost: Fix wrong handling of virtqueue array index

Tetsuya Mukawa mukawa at igel.co.jp
Tue Oct 27 10:25:13 CET 2015


On 2015/10/27 17:39, Yuanhan Liu wrote:
> On Tue, Oct 27, 2015 at 08:24:00AM +0000, Xie, Huawei wrote:
>> On 10/27/2015 3:52 PM, Tetsuya Mukawa wrote:
>>> The patch fixes wrong handling of the virtqueue array index when
>>> a GET_VRING_BASE message arrives.
>>> The vhost backend receives the message once per virtqueue.
>>> Also, we should call the destroy callback handler once both RXQ
>>> and TXQ have received the message.
>>>
>>> Signed-off-by: Tetsuya Mukawa <mukawa at igel.co.jp>
>>> ---
>>>  lib/librte_vhost/vhost_user/virtio-net-user.c | 20 ++++++++++----------
>>>  1 file changed, 10 insertions(+), 10 deletions(-)
>>>
>>> diff --git a/lib/librte_vhost/vhost_user/virtio-net-user.c b/lib/librte_vhost/vhost_user/virtio-net-user.c
>>> index a998ad8..99c075f 100644
>>> --- a/lib/librte_vhost/vhost_user/virtio-net-user.c
>>> +++ b/lib/librte_vhost/vhost_user/virtio-net-user.c
>>> @@ -283,12 +283,10 @@ user_get_vring_base(struct vhost_device_ctx ctx,
>>>  	struct vhost_vring_state *state)
>>>  {
>>>  	struct virtio_net *dev = get_device(ctx);
>>> +	uint16_t base_idx = state->index / VIRTIO_QNUM * VIRTIO_QNUM;
>>>  
>>>  	if (dev == NULL)
>>>  		return -1;
>>> -	/* We have to stop the queue (virtio) if it is running. */
>>> -	if (dev->flags & VIRTIO_DEV_RUNNING)
>>> -		notify_ops->destroy_device(dev);
>> Hi Tetsuya:
>> I don't understand why we move it to the end of the function.
>> If we don't tell the application to remove the virtio device from the
> As you stated, he just moved it to the end of the function: it
> still invokes notify_ops->destroy_device() in the end.
>
> And the reason he moved it to the end is that he wants to invoke the
> callback only when the second GET_VRING_BASE message is received
> for the queue pair. On second thought, that is not necessary:
> since we do the "flags & VIRTIO_DEV_RUNNING" check first, it
> doesn't matter on which virtqueue we invoke the callback.

I thought we had 2 choices.
1. Call the callback handler at the start of this function when the 1st
GET_VRING_BASE message comes.
2. Call the callback handler at the end of this function when the 2nd
GET_VRING_BASE message comes.

I chose the 2nd because, with the 1st choice, until the 2nd message is
sent QEMU assumes one of the queues is still alive, while in the DPDK
application it has already been closed.
I thought this inconsistency might cause an issue.
But yes, if we choose the 2nd, we may have the issue Xie described.
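
To make the two choices concrete, here is a toy sketch. It is not the
librte_vhost code: struct toy_dev, toy_destroy(), toy_get_vring_base() and
enum destroy_policy are hypothetical names, the model has a single queue
pair, and VIRTIO_RXQ = 0, VIRTIO_TXQ = 1, VIRTIO_QNUM = 2 are assumed values.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed values for the queue offsets within one pair. */
enum { VIRTIO_RXQ = 0, VIRTIO_TXQ = 1, VIRTIO_QNUM = 2 };

/* Hypothetical switch between choice 1 and choice 2. */
enum destroy_policy { DESTROY_ON_FIRST, DESTROY_ON_LAST };

/* Toy stand-in for struct virtio_net with one queue pair. */
struct toy_dev {
	bool running;               /* models the VIRTIO_DEV_RUNNING flag */
	bool stopped[VIRTIO_QNUM];  /* queue has received GET_VRING_BASE */
	int destroy_calls;          /* how often the destroy callback fired */
};

/* Models notify_ops->destroy_device(): removes the device from the data plane. */
static void toy_destroy(struct toy_dev *dev)
{
	dev->running = false;
	dev->destroy_calls++;
}

/* Handle one GET_VRING_BASE message for queue 'index'. */
static void toy_get_vring_base(struct toy_dev *dev, uint16_t index,
			       enum destroy_policy policy)
{
	assert(index < VIRTIO_QNUM);

	/* Choice 1: destroy at the start, on the first message. */
	if (policy == DESTROY_ON_FIRST && dev->running)
		toy_destroy(dev);

	dev->stopped[index] = true;

	/* Choice 2: destroy at the end, once both queues have been stopped. */
	if (policy == DESTROY_ON_LAST && dev->running &&
	    dev->stopped[VIRTIO_RXQ] && dev->stopped[VIRTIO_TXQ])
		toy_destroy(dev);
}
```

With DESTROY_ON_FIRST the callback fires while QEMU has stopped only one
queue; with DESTROY_ON_LAST the application keeps the device on the data
plane between the two messages. In both cases the running check keeps the
callback from firing twice.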

>
> 	--yliu
>
>> data plane, then the vhost application is still operating on that
>> device, we shouldn't do anything to the virtio_net device.
>> In this case, as vhost doesn't use kickfd, it will not cause an issue, but
>> I think it is best practice to first remove it from the data plane through
>> destroy_device.
>>
>> I think we could call destroy_device the first time we receive this
>> message. Currently we don't have per-queue granularity control to
>> remove only one queue from the data plane.
>>
>> I am okay with closing only the kickfd for the specified queue index.
>>
>> Btw, did you hit an issue with the previous implementation?

Yes, I hit an illegal memory access.
For example, if we have one RX and one TX queue, we get 2 GET_VRING_BASE
messages when the virtio-net device is finalized.
While handling these messages, 'dev->virtqueue[2]' is accessed,
which is an illegal access.
(We only have 2 queues, so that entry is NULL.)
So we do need to change the function a bit.
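
A minimal sketch of the index arithmetic behind this, assuming
VIRTIO_RXQ = 0, VIRTIO_TXQ = 1 and VIRTIO_QNUM = 2, and assuming the
unpatched code used state->index itself as the base of the queue pair
(an assumption about the failure, not quoted from the source). The helper
names are hypothetical; only the base_idx expression comes from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values for the queue offsets within one pair. */
enum { VIRTIO_RXQ = 0, VIRTIO_TXQ = 1, VIRTIO_QNUM = 2 };

/* Round a virtqueue index down to the first queue of its RX/TX pair,
 * the same expression the patch uses for base_idx. */
static uint16_t pair_base_idx(uint16_t index)
{
	return index / VIRTIO_QNUM * VIRTIO_QNUM;
}

/* Highest array slot touched if the message index is used as the base
 * directly (assumed pre-patch behaviour). */
static uint16_t max_slot_unfixed(uint16_t index)
{
	return index + VIRTIO_TXQ;
}

/* Highest array slot touched when the base is rounded down first. */
static uint16_t max_slot_fixed(uint16_t index)
{
	return pair_base_idx(index) + VIRTIO_TXQ;
}
```

For the second message (index 1, the TX queue), the unrounded base reaches
slot 2, which does not exist with a single queue pair, while the rounded
base stays within slots 0 and 1.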

Thanks,
Tetsuya

