[dpdk-dev] [PATCH v2] vhost: add virtio configuration space messages

Maxime Coquelin maxime.coquelin at redhat.com
Wed Mar 28 12:56:18 CEST 2018



On 03/28/2018 12:23 PM, Liu, Changpeng wrote:
> 
> 
>> -----Original Message-----
>> From: Maxime Coquelin [mailto:maxime.coquelin at redhat.com]
>> Sent: Wednesday, March 28, 2018 6:11 PM
>> To: Liu, Changpeng <changpeng.liu at intel.com>; Kulasek, TomaszX
>> <tomaszx.kulasek at intel.com>; yliu at fridaylinux.org
>> Cc: Verkamp, Daniel <daniel.verkamp at intel.com>; Harris, James R
>> <james.r.harris at intel.com>; Wodkowski, PawelX
>> <pawelx.wodkowski at intel.com>; dev at dpdk.org; Tan, Jianfeng
>> <jianfeng.tan at intel.com>
>> Subject: Re: [dpdk-dev] [PATCH v2] vhost: add virtio configuration space
>> messages
>>
>>
>>
>> On 03/28/2018 12:03 PM, Liu, Changpeng wrote:
>>>
>>>
>>>> -----Original Message-----
>>>> From: Maxime Coquelin [mailto:maxime.coquelin at redhat.com]
>>>> Sent: Wednesday, March 28, 2018 5:58 PM
>>>> To: Liu, Changpeng <changpeng.liu at intel.com>; Kulasek, TomaszX
>>>> <tomaszx.kulasek at intel.com>; yliu at fridaylinux.org
>>>> Cc: Verkamp, Daniel <daniel.verkamp at intel.com>; Harris, James R
>>>> <james.r.harris at intel.com>; Wodkowski, PawelX
>>>> <pawelx.wodkowski at intel.com>; dev at dpdk.org; Tan, Jianfeng
>>>> <jianfeng.tan at intel.com>
>>>> Subject: Re: [dpdk-dev] [PATCH v2] vhost: add virtio configuration space
>>>> messages
>>>>
>>>>
>>>>
>>>> On 03/28/2018 11:50 AM, Liu, Changpeng wrote:
>>>>>
>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Maxime Coquelin [mailto:maxime.coquelin at redhat.com]
>>>>>> Sent: Wednesday, March 28, 2018 5:12 PM
>>>>>> To: Kulasek, TomaszX <tomaszx.kulasek at intel.com>; yliu at fridaylinux.org
>>>>>> Cc: Verkamp, Daniel <daniel.verkamp at intel.com>; Harris, James R
>>>>>> <james.r.harris at intel.com>; Wodkowski, PawelX
>>>>>> <pawelx.wodkowski at intel.com>; dev at dpdk.org; Liu, Changpeng
>>>>>> <changpeng.liu at intel.com>; Tan, Jianfeng <jianfeng.tan at intel.com>
>>>>>> Subject: Re: [dpdk-dev] [PATCH v2] vhost: add virtio configuration space
>>>>>> messages
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 03/27/2018 05:35 PM, Tomasz Kulasek wrote:
>>>>>>> This patch adds new vhost user messages GET_CONFIG and SET_CONFIG
>>>> used
>>>>>>> for get/set virtio device's configuration space.
>>>>>>>
>>>>>>> Signed-off-by: Changpeng Liu <changpeng.liu at intel.com>
>>>>>>> Signed-off-by: Tomasz Kulasek <tomaszx.kulasek at intel.com>
>>>>>>> ---
>>>>>>> Changes in v2:
>>>>>>>      - code cleanup
>>>>>>>
>>>>>>>      lib/librte_vhost/rte_vhost.h  |  4 ++++
>>>>>>>      lib/librte_vhost/vhost_user.c | 22 ++++++++++++++++++++++
>>>>>>>      lib/librte_vhost/vhost_user.h | 16 ++++++++++++++++
>>>>>>>      3 files changed, 42 insertions(+)
>>>>>>>
>>>>>>> diff --git a/lib/librte_vhost/rte_vhost.h b/lib/librte_vhost/rte_vhost.h
>>>>>>> index d332069..fe30518 100644
>>>>>>> --- a/lib/librte_vhost/rte_vhost.h
>>>>>>> +++ b/lib/librte_vhost/rte_vhost.h
>>>>>>> @@ -84,6 +84,10 @@ struct vhost_device_ops {
>>>>>>>      	int (*new_connection)(int vid);
>>>>>>>      	void (*destroy_connection)(int vid);
>>>>>>>
>>>>>>> +	int (*get_config)(int vid, uint8_t *config, uint32_t config_len);
>>>>>>> +	int (*set_config)(int vid, uint8_t *config, uint32_t offset,
>>>>>>> +			uint32_t len, uint32_t flags);
>>>>>>> +
>>>>>>>      	void *reserved[2]; /**< Reserved for future extension */
>>>>>>
>>>>>> You are breaking the ABI, as you grow the size of the ops struct.
>>>>>>
>>>>>> Also, I'm wondering if we shouldn't have a different ops for external
>>>>>> backends. Here these ops are more intended to the application, we could
>>>>>> have a specific ops struct for external backends IMHO.
>>>>>>
>>>>>>>      };
>>>>>>>
>>>>>>> diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c
>>>>>>> index 90ed211..0ed6a5a 100644
>>>>>>> --- a/lib/librte_vhost/vhost_user.c
>>>>>>> +++ b/lib/librte_vhost/vhost_user.c
>>>>>>> @@ -50,6 +50,8 @@ static const char
>>>> *vhost_message_str[VHOST_USER_MAX]
>>>>>> = {
>>>>>>>      	[VHOST_USER_NET_SET_MTU]  = "VHOST_USER_NET_SET_MTU",
>>>>>>>      	[VHOST_USER_SET_SLAVE_REQ_FD]  =
>>>>>> "VHOST_USER_SET_SLAVE_REQ_FD",
>>>>>>>      	[VHOST_USER_IOTLB_MSG]  = "VHOST_USER_IOTLB_MSG",
>>>>>>> +	[VHOST_USER_GET_CONFIG] = "VHOST_USER_GET_CONFIG",
>>>>>>> +	[VHOST_USER_SET_CONFIG] = "VHOST_USER_SET_CONFIG",
>>>>>>>      };
>>>>>>>
>>>>>>>      static uint64_t
>>>>>>> @@ -1355,6 +1357,7 @@ vhost_user_msg_handler(int vid, int fd)
>>>>>>>      	 * would cause a dead lock.
>>>>>>>      	 */
>>>>>>>      	switch (msg.request.master) {
>>>>>>> +	case VHOST_USER_SET_CONFIG:
>>>>>>
>>>>>> It seems VHOST_USER_GET_CONFIG is missing here.
>>>>>>
>>>>>>>      	case VHOST_USER_SET_FEATURES:
>>>>>>>      	case VHOST_USER_SET_PROTOCOL_FEATURES:
>>>>>>>      	case VHOST_USER_SET_OWNER:
>>>>>>> @@ -1380,6 +1383,25 @@ vhost_user_msg_handler(int vid, int fd)
>>>>>>>      	}
>>>>>>>
>>>>>>>      	switch (msg.request.master) {
>>>>>>> +	case VHOST_USER_GET_CONFIG:
>>>>>>> +		if (dev->notify_ops->get_config(dev->vid,
>>>>>> Please check ->get_config is set before calling it.
>>>>>>
>>>>>>> +				msg.payload.config.region,
>>>>>>> +				msg.payload.config.size) != 0) {
>>>>>>> +			msg.size = sizeof(uint64_t);
>>>>>>> +		}
>>>>>>> +		send_vhost_reply(fd, &msg);
>>>>>>> +		break;
>>>>>>> +	case VHOST_USER_SET_CONFIG:
>>>>>>> +		if ((dev->notify_ops->set_config(dev->vid,
>>>>>> Ditto.
>>>>>>
>>>>>>> +				msg.payload.config.region,
>>>>>>> +				msg.payload.config.offset,
>>>>>>> +				msg.payload.config.size,
>>>>>>> +				msg.payload.config.flags)) != 0) {
>>>>>>> +			ret = 1;
>>>>>>> +		} else {
>>>>>>> +			ret = 0;
>>>>>>> +		}
>>>>>>
>>>>>> ret = dev->notify_ops->set_config instead?
>>>>>>> +		break;
>>>>>>>      	case VHOST_USER_GET_FEATURES:
>>>>>>>      		msg.payload.u64 = vhost_user_get_features(dev);
>>>>>>>      		msg.size = sizeof(msg.payload.u64);
>>>>>>> diff --git a/lib/librte_vhost/vhost_user.h b/lib/librte_vhost/vhost_user.h
>>>>>>> index d4bd604..25cc026 100644
>>>>>>> --- a/lib/librte_vhost/vhost_user.h
>>>>>>> +++ b/lib/librte_vhost/vhost_user.h
>>>>>>> @@ -14,6 +14,11 @@
>>>>>>>
>>>>>>>      #define VHOST_MEMORY_MAX_NREGIONS 8
>>>>>>>
>>>>>>> +/*
>>>>>>> + * Maximum size of virtio device config space
>>>>>>> + */
>>>>>>> +#define VHOST_USER_MAX_CONFIG_SIZE 256
>>>>>>> +
>>>>>>>      #define VHOST_USER_PROTOCOL_F_MQ	0
>>>>>>>      #define VHOST_USER_PROTOCOL_F_LOG_SHMFD	1
>>>>>>>      #define VHOST_USER_PROTOCOL_F_RARP	2
>>>>>>
>>>>>> Shouldn't there be a protocol feature associated to these new messages?
>>>>>> Else how QEMU knows the backend supports it or not?
>>>>>>
>>>>>> I looked at QEMU code and indeed no protocol feature associated, that's
>>>>>> strange...
>>>>> Nice to have, for now not all the QEMU host driver need to get this
>>>> configuration space from slave backend
>>>>> when getting start. This message can be used for migration of vhost-user
>>>> devices.
>>>>
>>>> So if QEMU sends this message but the DPDK version does not support it
>>>> yet, vhost_user_msg_handler() will return an error ("vhost read
>>>> incorrect message") and the socket will be closed.
>>>>
>>>> How do we overcome this? I think we really need a spec update ASAP,
>>>> before QEMU v2.12 is out (-rc1 already).
>>>>
>>>> Do you have time to take care of this?
>>> For now there are no other users except us care about this message, :), it's no
>> hurry.
>>> I can take this after QEMU 2.12 release adding a new protocol feature bit.
>>
>> Are you sure?
>> If I understand the code correctly, as the guest writes in config regs
>> of a virtio-blk device, .set_config callback will be called.
> Exactly.
>>
>> If you have a vhost-user backend, it will receive the SET_CONFIG
>> request, no?
> For now this only enabled for QEMU vhost-user-blk driver, QEMU virtio-blk driver didn't have such issue.

Right.
But it will be really painful to manage for example for cross-version
live migration. Or when you'll want to use QEMU v2.13+ with a DPDK
v18.05 backend, the protocol feature won't be negotiated.

Really, this is important to get it right at the beginning.

Thanks,
Maxime
>>
>> Cheers,
>> Maxime
>>
>>>>
>>>> Thanks,
>>>> Maxime


More information about the dev mailing list