[dpdk-dev] Issues met while running openvswitch/dpdk/virtio inside the VM

Oleg Strikov oleg.strikov at canonical.com
Tue May 12 17:57:28 CEST 2015


Hi Pravin, Kevin, and others!
Many thanks for your comments.

>> This approach works fine on the real hardware but causes some issues when we
>> run openvswitch/dpdk inside the virtual machine. I tried both emulated
>> e1000 NIC and virtio NIC and neither of them worked out of the box.
>> Emulated e1000 NIC doesn't support multiple tx queues at all (see
>> http://dpdk.org/browse/dpdk/tree/lib/librte_pmd_e1000/em_ethdev.c#n884) and
>> virtio NIC doesn't support multiple tx queues by default. To enable
>> multiple tx queues for virtio NIC I had to add the following line to the
>> interface section of my libvirt config: '<driver name="vhost" queues="4"/>'

> Good point. We should document this.
> Can you send patch to update README.DPDK?

http://openvswitch.org/pipermail/dev/2015-May/055132.html
Could you review it please?

> It will be nice to make OVS-DPDK work in VM. As I said I am also
> planning on working on it. Thanks for the heads up.

I applied a few dirty hacks and made ovs/dpdk work inside a VM.
That's definitely not something we want to merge. It's just a way to
better localize the issues:

http://pastebin.ubuntu.com/11096400/

You may see that I did two things:

First, virtio_dev_start() doesn't (re-)initialize the queues and vq_ring
when openvswitch calls dev_stop() and then dev_start() again. This fails
if the number of tx/rx queues changes between the calls, and that's exactly
what openvswitch does: it initializes the device in 1rx/1tx mode, starts it,
stops it, re-initializes it in 1rx/Ntx mode and starts it again. By removing
the 'hw->started' check I force the virtio driver to initialize the new
(N-1) tx queues during the second startup.
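
To make the restart problem easier to see, here is a minimal, self-contained
sketch in plain C. It is only a mock: every name in it (mock_dev, mock_start
and so on) is hypothetical and is not part of the DPDK API, and it assumes
that the 'started' flag stays set across a stop, which is how I read the
behaviour described above.

```c
/* Mock model of the start/stop/start sequence; none of these are DPDK APIs. */
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MOCK_MAX_TXQ 8

struct mock_dev {
    bool started;                 /* plays the role of hw->started */
    int nb_tx_queues;             /* queue count requested by the application */
    bool txq_ready[MOCK_MAX_TXQ]; /* true once a queue's ring has been set up */
};

/* Reconfigure the device with a new number of tx queues. */
static void mock_configure(struct mock_dev *dev, int nb_tx_queues)
{
    assert(nb_tx_queues >= 1 && nb_tx_queues <= MOCK_MAX_TXQ);
    dev->nb_tx_queues = nb_tx_queues;
}

/*
 * Start the device. When skip_if_started is true the function returns early
 * on any start after the first one, mirroring the 'hw->started' check.
 */
static void mock_start(struct mock_dev *dev, bool skip_if_started)
{
    if (skip_if_started && dev->started)
        return;
    for (int q = 0; q < dev->nb_tx_queues; q++)
        dev->txq_ready[q] = true;
    dev->started = true;
}

/* Stop the device. Assumption: the 'started' flag is not cleared here. */
static void mock_stop(struct mock_dev *dev)
{
    (void)dev;
}

/*
 * Replay the sequence described above (1 tx queue, start, stop, N tx queues,
 * start) and return how many of the N tx queues ended up initialized.
 */
static int mock_ready_after_restart(int nb_tx_queues, bool skip_if_started)
{
    struct mock_dev dev;
    int ready = 0;

    memset(&dev, 0, sizeof(dev));
    mock_configure(&dev, 1);
    mock_start(&dev, skip_if_started);
    mock_stop(&dev);
    mock_configure(&dev, nb_tx_queues);
    mock_start(&dev, skip_if_started);

    for (int q = 0; q < dev.nb_tx_queues; q++)
        ready += dev.txq_ready[q] ? 1 : 0;
    return ready;
}
```

With the early return in place, mock_ready_after_restart(4, true) leaves only
the original queue initialized; with the check removed,
mock_ready_after_restart(4, false) initializes all four.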

The second issue is that neither rte_eth_tx_queue_setup() nor
rte_eth_rx_queue_setup() can be called twice for the same port_id/queue_id.
The obvious reason is that rte_memzone_reserve_aligned() returns an error
when it is called a second time with the same memzone name. I suspect that
other issues of this sort exist there as well. To deal with that I skip
initialization of the tx/rx queues which were already initialized.
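
The idea of this workaround can be sketched with a small mock in plain C.
Again, this is not DPDK code: the zone table and all mock_* functions are
hypothetical stand-ins for the memzone calls, and only the portN_tvqN naming
is taken from the driver behaviour discussed in this thread.

```c
/* Mock of a named-zone allocator; none of these are DPDK APIs. */
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define MOCK_MAX_ZONES 16
#define MOCK_NAME_LEN 32

struct mock_zone {
    char name[MOCK_NAME_LEN];
};

static struct mock_zone mock_zones[MOCK_MAX_ZONES];
static int mock_zone_count;

/* Return the zone with the given name, or NULL if there is none. */
static struct mock_zone *mock_zone_lookup(const char *name)
{
    for (int i = 0; i < mock_zone_count; i++)
        if (strcmp(mock_zones[i].name, name) == 0)
            return &mock_zones[i];
    return NULL;
}

/* Reserve a named zone; fails (NULL) if the name is taken or the table is full. */
static struct mock_zone *mock_zone_reserve(const char *name)
{
    assert(strlen(name) < MOCK_NAME_LEN);
    if (mock_zone_lookup(name) != NULL || mock_zone_count == MOCK_MAX_ZONES)
        return NULL;
    struct mock_zone *zone = &mock_zones[mock_zone_count++];
    strcpy(zone->name, name);
    return zone;
}

/* Naive setup: always reserves, so a second call for the same queue fails. */
static int mock_queue_setup_naive(int port, int queue)
{
    char name[MOCK_NAME_LEN];
    snprintf(name, sizeof(name), "port%d_tvq%d", port, queue);
    return mock_zone_reserve(name) != NULL ? 0 : -1;
}

/* Guarded setup: an existing zone means the queue is already initialized. */
static int mock_queue_setup_guarded(int port, int queue)
{
    char name[MOCK_NAME_LEN];
    snprintf(name, sizeof(name), "port%d_tvq%d", port, queue);
    if (mock_zone_lookup(name) != NULL)
        return 0; /* skip re-initialization */
    return mock_zone_reserve(name) != NULL ? 0 : -1;
}
```

Calling mock_queue_setup_naive() twice for the same port and queue returns 0
and then -1, while mock_queue_setup_guarded() returns 0 both times because
the second call finds the existing zone and does nothing.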

After applying these changes I get ovs/dpdk working in packet forwarding
mode (from the dpdk0 device to the dpdk1 device), which means that the PMDs
are working fine.

An additional functional issue is that it seems to be impossible to run dpdk
with the vfio backend inside a qemu-kvm virtual machine. To my understanding,
vfio requires vt-d (iommu) but qemu doesn't provide this functionality
to the guest. I would really like to be wrong, but that's what I have learned
so far. If that's true, igb_uio is the only option for us.

> Daniele's patch http://openvswitch.org/pipermail/dev/2015-March/052344.html
> also allows for having a limited set of queues available.

Many thanks Kevin for mentioning this.
I like the patch and will definitely give it a try.
Hope we'll have it merged soon.

Thanks for helping,
Oleg

On Mon, May 11, 2015 at 3:10 PM, Traynor, Kevin <kevin.traynor at intel.com> wrote:
>
>> -----Original Message-----
>> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Pravin Shelar
>> Sent: Friday, May 8, 2015 2:20 AM
>> To: Oleg Strikov
>> Cc: dev at dpdk.org
>> Subject: Re: [dpdk-dev] Issues met while running openvswitch/dpdk/virtio
>> inside the VM
>>
>> On Thu, May 7, 2015 at 9:22 AM, Oleg Strikov <oleg.strikov at canonical.com>
>> wrote:
>> > Hi DPDK users and developers,
>> >
>> > A few weeks ago I came up with the idea to run openvswitch with dpdk backend
>> > inside qemu-kvm virtual machine. I don't have enough supported NICs yet and
>> > my plan was to start experimenting inside the virtualized environment,
>> > achieve functional state of all the components and then switch to the real
>> > hardware. Additional useful side-effect of doing things inside the vm is
>> > that issues can be easily reproduced by someone else in a different
>> > environment.
>> >
>> > I (fondly) hoped that running openvswitch/dpdk inside the vm would be
>> > simpler than running the same set of components on the real hardware.
>> > Unfortunately I met a bunch of issues on the way. All these issues lie on a
>> > borderline between dpdk and openvswitch but I think that you might be
>> > interested in my story. Please note that I still don't have
>> > openvswitch/dpdk working inside the vm. I have definitely made some
>> > progress though.
>> >
>> Thanks for summarizing all the issues.
>> DPDK testing is done on real hardware and we are planning to test it
>> in a VM. This will certainly help in fixing issues sooner.
>>
>> > Q: Does it sound okay from functional (not performance) standpoint to run
>> > openvswitch/dpdk inside the vm? Do we want to be able to do this? Does
>> > anyone from the dpdk development team do this?
>> >
>> > ## Issue 1 ##
>> >
>> > Openvswitch requires backend pmd driver to provide N_CORES tx queues where
>> > N_CORES is the amount of cores available on the machine (openvswitch counts
>> > the amount of cpu* entries inside /sys/devices/system/node/node0/ folder).
>> > To my understanding it doesn't take into account the actual amount of cores
>> > used by dpdk and just allocates tx queue for each available core. You may
>> > refer to this chunk of code for details:
>> > https://github.com/openvswitch/ovs/blob/master/lib/dpif-netdev.c#L1067
>> >
>> In case of OVS DPDK, there is no dpdk thread. Therefore all polling
>> cores are managed by OVS and there is no need to account cores for
>> DPDK. You can assign specific cores for OVS to limit number of cores
>> used by OVS.
>>
>> > This approach works fine on the real hardware but causes some issues when we
>> > run openvswitch/dpdk inside the virtual machine. I tried both emulated
>> > e1000 NIC and virtio NIC and neither of them worked out of the box.
>> > Emulated e1000 NIC doesn't support multiple tx queues at all (see
>> > http://dpdk.org/browse/dpdk/tree/lib/librte_pmd_e1000/em_ethdev.c#n884) and
>> > virtio NIC doesn't support multiple tx queues by default. To enable
>> > multiple tx queues for virtio NIC I had to add the following line to the
>> > interface section of my libvirt config: '<driver name="vhost" queues="4"/>'
>> >
>> Good point. We should document this. Can you send patch to update
>> README.DPDK?
>
> Daniele's patch http://openvswitch.org/pipermail/dev/2015-March/052344.html
> also allows for having a limited set of queues available. The documentation
> patch is a good idea too.
>
>>
>> > ## Issue 2 ##
>> >
>> > Openvswitch calls rte_eth_tx_queue_setup() twice for the same
>> > port_id/queue_id. First call takes place during device initialization (see
>> > call to dpdk_eth_dev_init() inside netdev_dpdk_init():
>> > https://github.com/openvswitch/ovs/blob/master/lib/netdev-dpdk.c#L522).
>> > Second call takes place when openvswitch tries to add more tx queues to the
>> > device (see call to dpdk_eth_dev_init() inside netdev_dpdk_set_multiq():
>> > https://github.com/openvswitch/ovs/blob/master/lib/netdev-dpdk.c#L697).
>> > The second call not only initializes new queues but also tries to
>> > re-initialize existing ones.
>> >
>> > Unfortunately virtio driver can't handle second call of
>> > rte_eth_tx_queue_setup() and returns error here:
>> > http://dpdk.org/browse/dpdk/tree/lib/librte_pmd_virtio/virtio_ethdev.c#n316
>> > This happens because memzone with the name portN_tvqN already exists when
>> > second call takes place (memzone has been created during the first call).
>> > To deal with this issue I had to manually add rte_memzone_lookup-based
>> > check for this situation and avoid allocation of a new memzone if it
>> > already exists.
>> >
>> This sounds like issue with virtIO driver. I think we need to fix DPDK
>> upstream for this to work correctly.
>>
>> > Q: Is it okay that openvswitch calls rte_eth_tx_queue_setup() twice? Right
>> > now I can't understand if it's the issue with the virtio pmd driver or
>> > incorrect API usage by openvswitch? Could someone shed some light on this
>> > so I can move forward and maybe propose a fix.
>> >
>> > ## Issue 3 ##
>> >
>> > This issue is also (somehow) related to the fact that openvswitch calls
>> > rte_eth_tx_queue_setup() twice. I fix the previous issue by the method
>> > described above and initialization finishes. The whole machinery starts to
>> > work but crashes at the very beginning (while fetching the first packet
>> > from the NIC maybe). This crash happens here:
>> > http://dpdk.org/browse/dpdk/tree/lib/librte_pmd_virtio/virtio_rxtx.c#n588
>> > It takes place because vq_ring structure contains zeros instead of correct
>> > values:
>> > vq_ring = {num = 0, desc = 0x0, avail = 0x0, used = 0x0}
>> > My understanding is that vq_ring gets initialized after the first call to
>> > rte_eth_tx_queue_setup(), then overwritten by the second call to
>> > rte_eth_tx_queue_setup() but without an appropriate initialization for the
>> > second time. I'm trying to fix this issue right now.
>> >
>> This also sounds like DPDK issue.
>>
>> > Q: Does it sound like a realistic goal to make virtio driver work in
>> > openvswitch-like scenarios? I'm definitely not an expert in the area of
>> > dpdk and can't estimate time and resources required. Maybe it's better to
>> > wait until I get a proper hardware?
>> >
>> It will be nice to make OVS-DPDK work in VM. As I said I am also
>> planning on working on it. Thanks for the heads up.

