[dpdk-dev] [PATCH v2] vhost: add virtio configuration space messages
Maxime Coquelin
maxime.coquelin at redhat.com
Thu Apr 19 16:39:49 CEST 2018
Hi Changpeng, Tomasz,
Any chance that you resubmit the series now that the Qemu changes adding
a protocol feature flag has been accepted?
Cheers,
Maxime
On 03/28/2018 12:56 PM, Maxime Coquelin wrote:
>
>
> On 03/28/2018 12:23 PM, Liu, Changpeng wrote:
>>
>>
>>> -----Original Message-----
>>> From: Maxime Coquelin [mailto:maxime.coquelin at redhat.com]
>>> Sent: Wednesday, March 28, 2018 6:11 PM
>>> To: Liu, Changpeng <changpeng.liu at intel.com>; Kulasek, TomaszX
>>> <tomaszx.kulasek at intel.com>; yliu at fridaylinux.org
>>> Cc: Verkamp, Daniel <daniel.verkamp at intel.com>; Harris, James R
>>> <james.r.harris at intel.com>; Wodkowski, PawelX
>>> <pawelx.wodkowski at intel.com>; dev at dpdk.org; Tan, Jianfeng
>>> <jianfeng.tan at intel.com>
>>> Subject: Re: [dpdk-dev] [PATCH v2] vhost: add virtio configuration space
>>> messages
>>>
>>>
>>>
>>> On 03/28/2018 12:03 PM, Liu, Changpeng wrote:
>>>>
>>>>
>>>>> -----Original Message-----
>>>>> From: Maxime Coquelin [mailto:maxime.coquelin at redhat.com]
>>>>> Sent: Wednesday, March 28, 2018 5:58 PM
>>>>> To: Liu, Changpeng <changpeng.liu at intel.com>; Kulasek, TomaszX
>>>>> <tomaszx.kulasek at intel.com>; yliu at fridaylinux.org
>>>>> Cc: Verkamp, Daniel <daniel.verkamp at intel.com>; Harris, James R
>>>>> <james.r.harris at intel.com>; Wodkowski, PawelX
>>>>> <pawelx.wodkowski at intel.com>; dev at dpdk.org; Tan, Jianfeng
>>>>> <jianfeng.tan at intel.com>
>>>>> Subject: Re: [dpdk-dev] [PATCH v2] vhost: add virtio configuration
>>>>> space
>>>>> messages
>>>>>
>>>>>
>>>>>
>>>>> On 03/28/2018 11:50 AM, Liu, Changpeng wrote:
>>>>>>
>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: Maxime Coquelin [mailto:maxime.coquelin at redhat.com]
>>>>>>> Sent: Wednesday, March 28, 2018 5:12 PM
>>>>>>> To: Kulasek, TomaszX <tomaszx.kulasek at intel.com>;
>>>>>>> yliu at fridaylinux.org
>>>>>>> Cc: Verkamp, Daniel <daniel.verkamp at intel.com>; Harris, James R
>>>>>>> <james.r.harris at intel.com>; Wodkowski, PawelX
>>>>>>> <pawelx.wodkowski at intel.com>; dev at dpdk.org; Liu, Changpeng
>>>>>>> <changpeng.liu at intel.com>; Tan, Jianfeng <jianfeng.tan at intel.com>
>>>>>>> Subject: Re: [dpdk-dev] [PATCH v2] vhost: add virtio
>>>>>>> configuration space
>>>>>>> messages
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On 03/27/2018 05:35 PM, Tomasz Kulasek wrote:
>>>>>>>> This patch adds new vhost user messages GET_CONFIG and SET_CONFIG
>>>>> used
>>>>>>>> for get/set virtio device's configuration space.
>>>>>>>>
>>>>>>>> Signed-off-by: Changpeng Liu <changpeng.liu at intel.com>
>>>>>>>> Signed-off-by: Tomasz Kulasek <tomaszx.kulasek at intel.com>
>>>>>>>> ---
>>>>>>>> Changes in v2:
>>>>>>>> - code cleanup
>>>>>>>>
>>>>>>>> lib/librte_vhost/rte_vhost.h | 4 ++++
>>>>>>>> lib/librte_vhost/vhost_user.c | 22 ++++++++++++++++++++++
>>>>>>>> lib/librte_vhost/vhost_user.h | 16 ++++++++++++++++
>>>>>>>> 3 files changed, 42 insertions(+)
>>>>>>>>
>>>>>>>> diff --git a/lib/librte_vhost/rte_vhost.h
>>>>>>>> b/lib/librte_vhost/rte_vhost.h
>>>>>>>> index d332069..fe30518 100644
>>>>>>>> --- a/lib/librte_vhost/rte_vhost.h
>>>>>>>> +++ b/lib/librte_vhost/rte_vhost.h
>>>>>>>> @@ -84,6 +84,10 @@ struct vhost_device_ops {
>>>>>>>> int (*new_connection)(int vid);
>>>>>>>> void (*destroy_connection)(int vid);
>>>>>>>>
>>>>>>>> + int (*get_config)(int vid, uint8_t *config, uint32_t
>>>>>>>> config_len);
>>>>>>>> + int (*set_config)(int vid, uint8_t *config, uint32_t offset,
>>>>>>>> + uint32_t len, uint32_t flags);
>>>>>>>> +
>>>>>>>> void *reserved[2]; /**< Reserved for future extension */
>>>>>>>
>>>>>>> You are breaking the ABI, as you grow the size of the ops struct.
>>>>>>>
>>>>>>> Also, I'm wondering if we shouldn't have a different ops for
>>>>>>> external
>>>>>>> backends. Here these ops are more intended to the application, we
>>>>>>> could
>>>>>>> have a specific ops struct for external backends IMHO.
>>>>>>>
>>>>>>>> };
>>>>>>>>
>>>>>>>> diff --git a/lib/librte_vhost/vhost_user.c
>>>>>>>> b/lib/librte_vhost/vhost_user.c
>>>>>>>> index 90ed211..0ed6a5a 100644
>>>>>>>> --- a/lib/librte_vhost/vhost_user.c
>>>>>>>> +++ b/lib/librte_vhost/vhost_user.c
>>>>>>>> @@ -50,6 +50,8 @@ static const char
>>>>> *vhost_message_str[VHOST_USER_MAX]
>>>>>>> = {
>>>>>>>> [VHOST_USER_NET_SET_MTU] = "VHOST_USER_NET_SET_MTU",
>>>>>>>> [VHOST_USER_SET_SLAVE_REQ_FD] =
>>>>>>> "VHOST_USER_SET_SLAVE_REQ_FD",
>>>>>>>> [VHOST_USER_IOTLB_MSG] = "VHOST_USER_IOTLB_MSG",
>>>>>>>> + [VHOST_USER_GET_CONFIG] = "VHOST_USER_GET_CONFIG",
>>>>>>>> + [VHOST_USER_SET_CONFIG] = "VHOST_USER_SET_CONFIG",
>>>>>>>> };
>>>>>>>>
>>>>>>>> static uint64_t
>>>>>>>> @@ -1355,6 +1357,7 @@ vhost_user_msg_handler(int vid, int fd)
>>>>>>>> * would cause a dead lock.
>>>>>>>> */
>>>>>>>> switch (msg.request.master) {
>>>>>>>> + case VHOST_USER_SET_CONFIG:
>>>>>>>
>>>>>>> It seems VHOST_USER_GET_CONFIG is missing here.
>>>>>>>
>>>>>>>> case VHOST_USER_SET_FEATURES:
>>>>>>>> case VHOST_USER_SET_PROTOCOL_FEATURES:
>>>>>>>> case VHOST_USER_SET_OWNER:
>>>>>>>> @@ -1380,6 +1383,25 @@ vhost_user_msg_handler(int vid, int fd)
>>>>>>>> }
>>>>>>>>
>>>>>>>> switch (msg.request.master) {
>>>>>>>> + case VHOST_USER_GET_CONFIG:
>>>>>>>> + if (dev->notify_ops->get_config(dev->vid,
>>>>>>> Please check ->get_config is set before calling it.
>>>>>>>
>>>>>>>> + msg.payload.config.region,
>>>>>>>> + msg.payload.config.size) != 0) {
>>>>>>>> + msg.size = sizeof(uint64_t);
>>>>>>>> + }
>>>>>>>> + send_vhost_reply(fd, &msg);
>>>>>>>> + break;
>>>>>>>> + case VHOST_USER_SET_CONFIG:
>>>>>>>> + if ((dev->notify_ops->set_config(dev->vid,
>>>>>>> Ditto.
>>>>>>>
>>>>>>>> + msg.payload.config.region,
>>>>>>>> + msg.payload.config.offset,
>>>>>>>> + msg.payload.config.size,
>>>>>>>> + msg.payload.config.flags)) != 0) {
>>>>>>>> + ret = 1;
>>>>>>>> + } else {
>>>>>>>> + ret = 0;
>>>>>>>> + }
>>>>>>>
>>>>>>> ret = dev->notify_ops->set_config instead?
>>>>>>>> + break;
>>>>>>>> case VHOST_USER_GET_FEATURES:
>>>>>>>> msg.payload.u64 = vhost_user_get_features(dev);
>>>>>>>> msg.size = sizeof(msg.payload.u64);
>>>>>>>> diff --git a/lib/librte_vhost/vhost_user.h
>>>>>>>> b/lib/librte_vhost/vhost_user.h
>>>>>>>> index d4bd604..25cc026 100644
>>>>>>>> --- a/lib/librte_vhost/vhost_user.h
>>>>>>>> +++ b/lib/librte_vhost/vhost_user.h
>>>>>>>> @@ -14,6 +14,11 @@
>>>>>>>>
>>>>>>>> #define VHOST_MEMORY_MAX_NREGIONS 8
>>>>>>>>
>>>>>>>> +/*
>>>>>>>> + * Maximum size of virtio device config space
>>>>>>>> + */
>>>>>>>> +#define VHOST_USER_MAX_CONFIG_SIZE 256
>>>>>>>> +
>>>>>>>> #define VHOST_USER_PROTOCOL_F_MQ 0
>>>>>>>> #define VHOST_USER_PROTOCOL_F_LOG_SHMFD 1
>>>>>>>> #define VHOST_USER_PROTOCOL_F_RARP 2
>>>>>>>
>>>>>>> Shouldn't there be a protocol feature associated to these new
>>>>>>> messages?
>>>>>>> Else how QEMU knows the backend supports it or not?
>>>>>>>
>>>>>>> I looked at QEMU code and indeed no protocol feature associated,
>>>>>>> that's
>>>>>>> strange...
>>>>>> Nice to have, for now not all the QEMU host driver need to get this
>>>>> configuration space from slave backend
>>>>>> when getting start. This message can be used for migration of
>>>>>> vhost-user
>>>>> devices.
>>>>>
>>>>> So if QEMU sends this message but the DPDK version does not support it
>>>>> yet, vhost_user_msg_handler() will return an error ("vhost read
>>>>> incorrect message") and the socket will be closed.
>>>>>
>>>>> How do we overcome this? I think we really need a spec update ASAP,
>>>>> before QEMU v2.12 is out (-rc1 already).
>>>>>
>>>>> Do you have time to take care of this?
>>>> For now there are no other users except us care about this message,
>>>> :), it's no
>>> hurry.
>>>> I can take this after QEMU 2.12 release adding a new protocol
>>>> feature bit.
>>>
>>> Are you sure?
>>> If I understand the code correctly, as the guest writes in config regs
>>> of a virtio-blk device, .set_config callback will be called.
>> Exactly.
>>>
>>> If you have a vhost-user backend, it will receive the SET_CONFIG
>>> request, no?
>> For now this only enabled for QEMU vhost-user-blk driver, QEMU
>> virtio-blk driver didn't have such issue.
>
> Right.
> But it will be really painful to manage for example for cross-version
> live migration. Or when you'll want to use QEMU v2.13+ with a DPDK
> v18.05 backend, the protocol feature won't be negotiated.
>
> Really, this is important to get it right at the beginning.
>
> Thanks,
> Maxime
>>>
>>> Cheers,
>>> Maxime
>>>
>>>>>
>>>>> Thanks,
>>>>> Maxime
More information about the dev
mailing list