[dpdk-dev] [PATCH 0/4 for 2.3] vhost-user live migration support

Thibaut Collet thibaut.collet at 6wind.com
Tue Dec 15 12:43:15 CET 2015


On Tue, Dec 15, 2015 at 11:05 AM, Peter Xu <peterx at redhat.com> wrote:

> On Tue, Dec 15, 2015 at 11:45:56AM +0300, Pavel Fedin wrote:
> >  To tell the truth, i don't know. I am also learning qemu internals on
> > the fly. Indeed, i see that it should announce itself. But this brings up
> > a question: why do we need special announce procedure in vhost-user then?
>
> I have the same question. Here is my guess...
>
> In customized networks, maybe people are not using ARP at all? When
> we use DPDK, we directly pass through the network logic inside
> kernel itself. So logically all the network protocols could be
> customized by the user of it. In the customized network, maybe there
> is some other protocol (rather than RARP) that would do the same
> thing as what ARP/RARP does. So, this SEND_RARP request could give
> the vhost-user backend a chance to format its own announce packet
> and broadcast (in the SEND_RARP request, the guest's mac address
> will be appended).
>
> CCing Victor to better know the truth...
>
> Peter
>


Hi,

After a migration, to avoid a network outage, the guest must announce its new
location to the L2 layer, typically with a GARP. Otherwise requests sent to
the guest arrive at the old host until an ARP request is sent (after 30
seconds) or the guest sends some data.

The QEMU implementation of self announce after a migration with a vhost backend
is the following:
 - If the VIRTIO_GUEST_ANNOUNCE feature has been negotiated, the guest
automatically sends a GARP.
 - Else if the vhost backend implements VHOST_USER_SEND_RARP, this request
is sent to the vhost backend. When this message is received, the vhost
backend must act as if it had received a RARP from the guest (the purpose of
this RARP is to update the switches' MAC->port mapping, as a GARP does). This
RARP is a fake one, created by the vhost backend.
 - Else nothing is done and we have a network outage until an ARP is sent or
the guest sends some data.
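
The three cases above can be sketched as a small decision function. This is
only an illustration of the ordering described here, not QEMU code: the names
choose_announce and Announce are hypothetical, and the two booleans stand for
"VIRTIO_GUEST_ANNOUNCE was negotiated" and "the backend implements
VHOST_USER_SEND_RARP".

```python
from enum import Enum


class Announce(Enum):
    """Possible outcomes after a migration (hypothetical names)."""
    GUEST_GARP = "guest sends a GARP itself"
    BACKEND_RARP = "QEMU sends VHOST_USER_SEND_RARP to the vhost backend"
    NONE = "nothing is done; outage until an ARP or guest traffic"


def choose_announce(guest_announce_negotiated: bool,
                    backend_supports_send_rarp: bool) -> Announce:
    """Pick the announce path in the priority order described above."""
    if guest_announce_negotiated:
        # The guest driver handles the announce on its own.
        return Announce.GUEST_GARP
    if backend_supports_send_rarp:
        # The backend must create a fake RARP on behalf of the guest.
        return Announce.BACKEND_RARP
    # No announce mechanism is available.
    return Announce.NONE
```

Note that the guest announce takes priority: the backend is only asked to send
a RARP when the guest cannot announce itself.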


The VIRTIO_GUEST_ANNOUNCE feature is negotiated if:
  - the vhost backend announces support for this feature. Maybe QEMU can
be updated to support this feature unconditionally.
  - the virtio driver of the guest implements this feature. This is not the
case for old kernels or the dpdk virtio pmd.
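
As a rough illustration of these two conditions: a feature is negotiated only
when both sides offer it, which amounts to a bitwise AND of the two feature
masks. The helper names below are hypothetical. The bit number 21 is the value
of VIRTIO_NET_F_GUEST_ANNOUNCE in the virtio specification, which I assume is
the feature called VIRTIO_GUEST_ANNOUNCE here.

```python
# Bit position of the guest announce feature (VIRTIO_NET_F_GUEST_ANNOUNCE
# in the virtio spec; assumed to be the VIRTIO_GUEST_ANNOUNCE of this mail).
VIRTIO_NET_F_GUEST_ANNOUNCE = 21


def negotiate(backend_features: int, driver_features: int) -> int:
    """Return the features offered by the backend AND accepted by the driver."""
    return backend_features & driver_features


def has_guest_announce(features: int) -> bool:
    """Check whether the guest announce bit is set in a feature mask."""
    return bool((features >> VIRTIO_NET_F_GUEST_ANNOUNCE) & 1)


# Example: the backend offers the feature, but an old guest driver does not.
backend = 1 << VIRTIO_NET_F_GUEST_ANNOUNCE
old_driver = 0
new_driver = 1 << VIRTIO_NET_F_GUEST_ANNOUNCE
```

With these values, has_guest_announce(negotiate(backend, old_driver)) is False
and has_guest_announce(negotiate(backend, new_driver)) is True.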

Regarding dpdk, to migrate a vhost interface with limited network
outage we have to:

  - Implement handling of the VHOST_USER_SEND_RARP request to emulate a fake
RARP for the guest

To do that we have to consider two kinds of guest:
  1. Guest with a virtio driver implementing the VIRTIO_GUEST_ANNOUNCE feature
  2. Guest with a virtio driver that does not have the VIRTIO_GUEST_ANNOUNCE
feature. This is the case with old kernels or a guest running dpdk (the virtio
pmd of dpdk does not have this feature)

A guest with the VIRTIO_GUEST_ANNOUNCE feature automatically sends some GARPs
after a migration if this feature has been negotiated. So the only thing to
do is to negotiate the VIRTIO_GUEST_ANNOUNCE feature between QEMU, the vhost
backend and the guest.
For this kind of guest the vhost backend must announce support for the
VIRTIO_GUEST_ANNOUNCE feature. As the vhost backend has no particular action
to take in this case, support for the VIRTIO_GUEST_ANNOUNCE feature could be
set unconditionally in QEMU in the future.

For a guest without the VIRTIO_GUEST_ANNOUNCE feature we have to send a fake
RARP: QEMU knows the MAC address of the guest and can create and broadcast
a RARP. But with a vhost backend QEMU is not able to broadcast this
fake RARP itself and must ask the vhost backend to do it through the
VHOST_USER_SEND_RARP request. When the vhost backend receives this message
it must create a fake RARP message (as done by QEMU) and process it as if
this message had been sent by the guest through the virtio
rings.
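
To make the fake RARP concrete, here is a minimal sketch of how such a frame
could be built from the guest MAC address. The function name build_fake_rarp
is hypothetical. The layout (broadcast destination, EtherType 0x8035, opcode 3
"reverse request", zeroed IP fields, padding to 60 bytes) is my assumption of
what QEMU's self announce packet looks like, so check it against the QEMU
source before relying on it.

```python
import struct

ETH_P_RARP = 0x8035        # EtherType for RARP
ARP_HTYPE_ETHERNET = 1     # hardware type: Ethernet
ARP_PTYPE_IPV4 = 0x0800    # protocol type: IPv4
RARP_OP_REQUEST = 3        # RARP "request reverse" opcode
MIN_FRAME_LEN = 60         # minimum Ethernet frame length without FCS


def build_fake_rarp(mac: bytes) -> bytes:
    """Build a broadcast RARP frame announcing the given 6-byte MAC."""
    if len(mac) != 6:
        raise ValueError("MAC address must be exactly 6 bytes")

    # Ethernet header: broadcast destination, guest MAC as source.
    eth = b"\xff" * 6 + mac + struct.pack("!H", ETH_P_RARP)

    # ARP/RARP payload: fixed header, then sender and target fields.
    arp = struct.pack("!HHBBH", ARP_HTYPE_ETHERNET, ARP_PTYPE_IPV4,
                      6, 4, RARP_OP_REQUEST)
    arp += mac + b"\x00" * 4    # sender MAC, sender IP (unknown)
    arp += mac + b"\x00" * 4    # target MAC, target IP (unknown)

    # Pad with zeros up to the minimum Ethernet frame size.
    return (eth + arp).ljust(MIN_FRAME_LEN, b"\x00")


frame = build_fake_rarp(bytes.fromhex("525400123456"))
```

The backend would then inject this frame into its switching path as if it had
been read from the guest TX ring, so that the MAC->port mapping gets updated.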


To solve this point, 2 solutions are implemented:
 - After the migration the guest automatically sends a GARP. This solution
applies if the VIRTIO_GUEST_ANNOUNCE feature has been negotiated between QEMU
and the guest.
         * VIRTIO_GUEST_ANNOUNCE

