[dpdk-dev] [PATCH v3 00/22] vhost: generic vhost API

Yuanhan Liu yuanhan.liu at linux.intel.com
Tue Mar 28 14:45:20 CEST 2017

This patchset makes the DPDK vhost library generic enough that other
vhost-user drivers can be built on top of it. For example, SPDK (Storage
Performance Development Kit) is trying to enable vhost-user SCSI.

The basic idea is to let DPDK vhost be a vhost-user agent. It stores all
the info about the virtio device (i.e. vring address, negotiated features,
etc) and lets the specific vhost-user driver fetch it (through the API
provided by the DPDK vhost lib). With that info provided, the vhost-user
driver can then get/put vring entries and thus exchange data between the
guest and host.

The last patch demonstrates how to use these new APIs to implement a
very simple vhost-user net driver, without any fancy features enabled.

Change log

v2: - rebase
    - updated release note
    - updated API comments
    - renamed rte_vhost_get_vhost_memory to rte_vhost_get_mem_table

    - added a new device callback: features_changed(), basically for live
      migration support
    - introduced rte_vhost_driver_start() to start a specific driver
    - misc fixes

v3: - rebase on top of vhost-user socket fix
    - fix reconnect
    - fix shared build
    - fix typos

Major API/ABI Changes summary

- some renames
  * "struct virtio_net_device_ops" ==> "struct vhost_device_ops"
  * "rte_virtio_net.h"  ==> "rte_vhost.h"

- driver related APIs are bound to the socket file
  * rte_vhost_driver_set_features(socket_file, features);
  * rte_vhost_driver_get_features(socket_file, features);
  * rte_vhost_driver_enable_features(socket_file, features)
  * rte_vhost_driver_disable_features(socket_file, features)
  * rte_vhost_driver_callback_register(socket_file, notify_ops);
  * rte_vhost_driver_start(socket_file);
    This function replaces rte_vhost_driver_session_start(). Check patch
    18 for more information.

- new APIs to fetch guest and vring info
  * rte_vhost_get_mem_table(vid, mem);
  * rte_vhost_get_negotiated_features(vid);
  * rte_vhost_get_vhost_vring(vid, vring_idx, vring);

- new exported structures 
  * struct rte_vhost_vring
  * struct rte_vhost_mem_region
  * struct rte_vhost_memory

- a new device ops callback: features_changed().
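
Binding the feature APIs to a socket file means each vhost-user driver
carries its own feature set. The sketch below models that with plain
bitmask operations. It is only an illustration: struct sketch_vsocket and
the sketch_* helpers are hypothetical names, not library code, and the real
library keeps this state internally and may validate it further. The two
feature bits use the positions the virtio spec assigns to
VIRTIO_NET_F_MRG_RXBUF and VIRTIO_F_VERSION_1.

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions taken from the virtio spec; the macro names are local. */
#define SKETCH_F_MRG_RXBUF	(1ULL << 15)	/* VIRTIO_NET_F_MRG_RXBUF */
#define SKETCH_F_VERSION_1	(1ULL << 32)	/* VIRTIO_F_VERSION_1 */

/* Hypothetical per-socket state: one feature mask per socket file. */
struct sketch_vsocket {
	uint64_t features;
};

/* Replace the whole feature set offered on this socket. */
static void
sketch_set_features(struct sketch_vsocket *s, uint64_t features)
{
	assert(s != 0);
	s->features = features;
}

/* Turn on additional feature bits, leaving the others untouched. */
static void
sketch_enable_features(struct sketch_vsocket *s, uint64_t features)
{
	assert(s != 0);
	s->features |= features;
}

/* Turn off the given feature bits, leaving the others untouched. */
static void
sketch_disable_features(struct sketch_vsocket *s, uint64_t features)
{
	assert(s != 0);
	s->features &= ~features;
}

/* The negotiated set is whatever the guest acked out of what was offered. */
static uint64_t
sketch_negotiate(const struct sketch_vsocket *s, uint64_t guest_acked)
{
	assert(s != 0);
	return s->features & guest_acked;
}
```

With this model, disabling a feature on one socket leaves every other
socket unaffected, which is the point of keying the calls by socket_file.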

Some design choices

While making this patchset, I faced quite a few design choices. Here are
two of them, with the issue and the reason I made each choice. Please let
me know if you have any comments (or better ideas).

Export public structures or not

I made an ABI refactor last time (v16.07): moved all the structures
internally and let applications use a "vid" to reference the internal
struct. With that, I hoped we would never have to worry about the annoying
ABI issues.

It has worked great (and as expected) since then, as long as we only
support virtio-net and can handle all the descs inside the vhost lib. It
becomes problematic when a user wants to implement a vhost-user driver
somewhere else. For example, the driver needs to do the GPA to VVA
translation. Without any structs exported, functions like gpa_to_vva()
can't be inlined. Calling it would be costly, especially since it has to be
invoked for each vring desc processed.

For that reason, the guest memory regions are exported. With that, the
gpa_to_vva could be inlined.
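
A minimal sketch of what such an inlined translation could look like once
the region table is visible to the driver. The struct and field names below
(sketch_mem_region, guest_phys_addr, host_user_addr, size) are assumptions
made for illustration and are not the definitions exported by this series.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one exported guest memory region. */
struct sketch_mem_region {
	uint64_t guest_phys_addr;	/* start of the region in guest physical space */
	uint64_t host_user_addr;	/* where the region is mapped in this process */
	uint64_t size;			/* region length in bytes */
};

/* Hypothetical stand-in for the exported region table. */
struct sketch_memory {
	uint32_t nregions;
	const struct sketch_mem_region *regions;
};

/*
 * Translate a guest physical address to a host virtual address.
 * Returns 0 when no region covers the address.
 */
static inline uint64_t
sketch_gpa_to_vva(const struct sketch_memory *mem, uint64_t gpa)
{
	uint32_t i;

	assert(mem != NULL);
	for (i = 0; i < mem->nregions; i++) {
		const struct sketch_mem_region *r = &mem->regions[i];

		/* Offset form avoids overflow of guest_phys_addr + size. */
		if (gpa >= r->guest_phys_addr &&
		    gpa - r->guest_phys_addr < r->size)
			return gpa - r->guest_phys_addr + r->host_user_addr;
	}
	return 0;
}
```

Because the loop only touches fields of a structure the caller already
holds, the compiler can inline it into the per-desc processing path.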

Add helper functions to fetch/update descs or not

I intended to do it this way: introduce one function to get @count descs
from a specific vring and another one to update the used descs. Something
like
    rte_vhost_vring_get_descs(vid, vring_idx, count, offset, iov, descs);
    rte_vhost_vring_update_used_descs(vid, vring_idx, count, offset, descs);

With that, vhost-user driver programmer's task would be easier, as he/she
doesn't have to parse the descs any more (such as to handle indirect desc).

But given that virtio 1.1 has just emerged and proposes a completely new
ring layout, and most importantly, the vring desc structure is also changed,
I'd like to hold off on introducing these two functions. Otherwise, it's
very likely the two will be invalid when virtio 1.1 is out. Though I think
it could be addressed with a careful design, something like making the IOV
generic enough:

	struct rte_vhost_iov {
		uint64_t	gpa;
		uint64_t	vva;
		uint64_t	len;
	};

Instead, I went the other way: introduce a few APIs to export all the vring
info (vring size, vring addr, callfd, etc), and let the vhost-user driver
read and update the descs. That info could be passed to the vhost-user
driver by introducing one API for each field, but to save a few APIs and a
few calls for the programmer, I packed the key fields into a new structure,
so that it can be fetched with one call:
        struct rte_vhost_vring {
                struct vring_desc       *desc;
                struct vring_avail      *avail;
                struct vring_used       *used;
                uint64_t                log_guest_addr;
                int                     callfd;
                int                     kickfd;
                uint16_t                size;
        };

When virtio 1.1 comes out, a simple change like the following would likely
just work:
        struct rte_vhost_vring {
                union {
                        struct {
                                struct vring_desc       *desc;
                                struct vring_avail      *avail;
                                struct vring_used       *used;
                                uint64_t                log_guest_addr;
                        };
                        struct desc     *desc_1_1;      /* vring addr for virtio 1.1 */
                };
                int                     callfd;
                int                     kickfd;
                uint16_t                size;
        };

AFAIK, it's not an ABI breakage. Even if it is, we could introduce a new
API to get the virtio 1.1 ring address.
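
The layout claim can be checked at compile time. The sketch below declares
both variants side by side under hypothetical names (sketch_vring_old,
sketch_vring_new) with opaque ring types, and uses C11 static_assert to
confirm that the size and the member offsets are unchanged. It only covers
structure layout, assumes a compiler with C11 anonymous struct/union
support, and says nothing about other aspects of ABI compatibility.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Opaque stand-ins: only the pointer sizes matter for a layout check. */
struct vring_desc;
struct vring_avail;
struct vring_used;
struct desc;

struct sketch_vring_old {	/* layout proposed in this series */
	struct vring_desc	*desc;
	struct vring_avail	*avail;
	struct vring_used	*used;
	uint64_t		log_guest_addr;
	int			callfd;
	int			kickfd;
	uint16_t		size;
};

struct sketch_vring_new {	/* possible virtio 1.1 extension */
	union {
		struct {
			struct vring_desc	*desc;
			struct vring_avail	*avail;
			struct vring_used	*used;
			uint64_t		log_guest_addr;
		};
		struct desc	*desc_1_1;
	};
	int			callfd;
	int			kickfd;
	uint16_t		size;
};

/* Same offset for a member in both layouts. */
#define SKETCH_SAME_OFFSET(m) \
	(offsetof(struct sketch_vring_old, m) == \
	 offsetof(struct sketch_vring_new, m))

static_assert(sizeof(struct sketch_vring_old) ==
	      sizeof(struct sketch_vring_new), "struct size changed");
static_assert(SKETCH_SAME_OFFSET(desc), "desc moved");
static_assert(SKETCH_SAME_OFFSET(avail), "avail moved");
static_assert(SKETCH_SAME_OFFSET(used), "used moved");
static_assert(SKETCH_SAME_OFFSET(log_guest_addr), "log_guest_addr moved");
static_assert(SKETCH_SAME_OFFSET(callfd), "callfd moved");
static_assert(SKETCH_SAME_OFFSET(kickfd), "kickfd moved");
static_assert(SKETCH_SAME_OFFSET(size), "size moved");

/* Runtime form of the same check, returning 1 when the layouts agree. */
static int
sketch_layout_unchanged(void)
{
	return sizeof(struct sketch_vring_old) ==
	       sizeof(struct sketch_vring_new) &&
	       SKETCH_SAME_OFFSET(callfd) && SKETCH_SAME_OFFSET(size);
}
```

The union is never larger than its anonymous struct member, so the fields
after it keep their offsets.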

Those fields are the minimum set I came up with for a specific vring, with
the idea that a minimal set brings the least chance of breaking ABI on
future extension. If we need more info, we could introduce a new API.
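
To illustrate what "read and update the descs" means for a driver holding
the desc/avail/used pointers and the ring size, here is a minimal sketch of
consuming the avail ring and filling the used ring. The ring structures
follow the virtio 1.0 split-ring layout but are redefined locally with a
fixed ring length (SKETCH_QSZ) so the sketch compiles without any virtio or
DPDK headers. struct sketch_vq and both helpers are hypothetical names, and
the memory barriers and guest notification a real driver needs are left out.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_QSZ 8	/* fixed here; real rings are sized by the queue size */

struct vring_desc {
	uint64_t addr;		/* guest physical address of the buffer */
	uint32_t len;
	uint16_t flags;
	uint16_t next;
};

struct vring_avail {
	uint16_t flags;
	uint16_t idx;		/* free-running index written by the guest */
	uint16_t ring[SKETCH_QSZ];
};

struct vring_used_elem {
	uint32_t id;		/* head index of the returned desc chain */
	uint32_t len;		/* bytes written into the chain */
};

struct vring_used {
	uint16_t flags;
	uint16_t idx;		/* free-running index written by the host */
	struct vring_used_elem ring[SKETCH_QSZ];
};

/* Driver-side view: ring info plus private bookkeeping. */
struct sketch_vq {
	struct vring_desc  *desc;
	struct vring_avail *avail;
	struct vring_used  *used;
	uint16_t size;		/* number of descs, a power of two */
	uint16_t last_avail;	/* next avail index not yet consumed */
};

/* Return the head index of the next available chain, or -1 if none. */
static int
sketch_pop_avail(struct sketch_vq *vq)
{
	uint16_t head;

	assert(vq->size != 0 && vq->size <= SKETCH_QSZ);
	assert((vq->size & (vq->size - 1)) == 0);
	if (vq->last_avail == vq->avail->idx)
		return -1;
	/* A real driver needs a read barrier before reading the ring slot. */
	head = vq->avail->ring[vq->last_avail & (vq->size - 1)];
	vq->last_avail++;
	return head;
}

/* Hand a chain back to the guest, reporting the bytes written. */
static void
sketch_push_used(struct sketch_vq *vq, uint16_t head, uint32_t len)
{
	struct vring_used_elem *e;

	e = &vq->used->ring[vq->used->idx & (vq->size - 1)];
	e->id = head;
	e->len = len;
	/* A real driver needs a write barrier before publishing the index. */
	vq->used->idx++;
}
```

A driver would call sketch_pop_avail() in a loop, process the chain that
starts at desc[head], and then call sketch_push_used() for it.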

OTOH, for getting the best performance, the two functions also have to be
inlined ("vid + vring_idx" combo is replaced with "vring"):
    rte_vhost_vring_get_descs(vring, count, offset, iov, descs);
    rte_vhost_vring_update_used_descs(vring, count, offset, descs);

That said, one way or another, we have to export rte_vhost_vring struct.
For this reason, I didn't rush into introducing the two APIs.


Yuanhan Liu (22):
  vhost: introduce driver features related APIs
  net/vhost: remove feature related APIs
  vhost: use new APIs to handle features
  vhost: make notify ops per vhost driver
  vhost: export guest memory regions
  vhost: introduce API to fetch negotiated features
  vhost: export vhost vring info
  vhost: export API to translate gpa to vva
  vhost: turn queue pair to vring
  vhost: export the number of vrings
  vhost: move the device ready check at proper place
  vhost: drop the Rx and Tx queue macro
  vhost: do not include net specific headers
  vhost: rename device ops struct
  vhost: rename virtio-net to vhost
  vhost: add features changed callback
  vhost: export APIs for live migration support
  vhost: introduce API to start a specific driver
  vhost: rename header file
  vhost: workaround the build dependency on mbuf header
  vhost: do not destroy device on repeat mem table message
  examples/vhost: demonstrate the new generic vhost APIs

 doc/guides/prog_guide/vhost_lib.rst         |  42 +--
 doc/guides/rel_notes/deprecation.rst        |   9 -
 doc/guides/rel_notes/release_17_05.rst      |  40 +++
 drivers/net/vhost/rte_eth_vhost.c           | 101 ++-----
 drivers/net/vhost/rte_eth_vhost.h           |  32 +--
 drivers/net/vhost/rte_pmd_vhost_version.map |   3 -
 examples/tep_termination/main.c             |  23 +-
 examples/tep_termination/main.h             |   2 +
 examples/tep_termination/vxlan_setup.c      |   2 +-
 examples/vhost/Makefile                     |   2 +-
 examples/vhost/main.c                       | 100 +++++--
 examples/vhost/main.h                       |  33 ++-
 examples/vhost/virtio_net.c                 | 405 ++++++++++++++++++++++++++
 lib/librte_vhost/Makefile                   |   4 +-
 lib/librte_vhost/fd_man.c                   |   9 +-
 lib/librte_vhost/fd_man.h                   |   2 +-
 lib/librte_vhost/rte_vhost.h                | 423 ++++++++++++++++++++++++++++
 lib/librte_vhost/rte_vhost_version.map      |  16 +-
 lib/librte_vhost/rte_virtio_net.h           | 208 --------------
 lib/librte_vhost/socket.c                   | 227 ++++++++++++---
 lib/librte_vhost/vhost.c                    | 229 ++++++++-------
 lib/librte_vhost/vhost.h                    | 113 +++++---
 lib/librte_vhost/vhost_user.c               | 115 ++++----
 lib/librte_vhost/vhost_user.h               |   2 +-
 lib/librte_vhost/virtio_net.c               |  71 ++---
 25 files changed, 1526 insertions(+), 687 deletions(-)
 create mode 100644 examples/vhost/virtio_net.c
 create mode 100644 lib/librte_vhost/rte_vhost.h
 delete mode 100644 lib/librte_vhost/rte_virtio_net.h

