[dpdk-dev] How do you setup a VM in Promiscuous Mode using PCI Pass-Through (SR-IOV)?

Qiu, Michael michael.qiu at intel.com
Wed May 20 11:14:20 CEST 2015


You should refer to this patch about VFIO in the kernel:

https://lkml.org/lkml/2014/6/13/394

That patch makes the kernel refuse to attach devices covered by a platform
RMRR to an IOMMU domain, which matches the error you are seeing. It seems
your device is unable to do PCI pass-through using VFIO. Could you try the
pci-stub driver instead?
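
For reference, a minimal sketch of binding the device to pci-stub and
assigning it with legacy KVM device assignment (assuming your 8086:1528
device at 04:00.0; whether pci-stub and the legacy pci-assign device are
available, and whether this avoids the RMRR restriction, depends on your
kernel and qemu-kvm build):

# modprobe pci-stub
# echo "8086 1528" > /sys/bus/pci/drivers/pci-stub/new_id
# echo 0000:04:00.0 > /sys/bus/pci/devices/0000:04:00.0/driver/unbind
# echo 0000:04:00.0 > /sys/bus/pci/drivers/pci-stub/bind
# qemu-system-x86_64 -m 2048 -enable-kvm -device pci-assign,host=04:00.0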

Thanks,
Michael

On 5/20/2015 3:24 AM, Assaad, Sami (Sami) wrote:
> Hello Michael,
>
> I've updated the kernel and QEMU. Here are the packages I'm using:
>
> --> CentOS 7 - 3.10.0-229.4.2.el7.x86_64
>     - qemu-kvm-1.5.3-86.el7_1.2.x86_64
>     - libvirt-1.2.8-16.el7_1.3.x86_64
>     - virt-manager-1.1.0-12.el7.noarch
>     - virt-what-1.13-5.el7.x86_64
>     - libvirt-glib-0.1.7-3.el7.x86_64
>
> I've modified the virtual machine XML file to include the following:
>
> <hostdev mode='subsystem' type='pci' managed='yes'>
>   <driver name='vfio'/>
>   <source>
>     <address domain='0x0000' bus='0x04' slot='0x10' function='0x0'/>
>   </source>
> </hostdev>
> <hostdev mode='subsystem' type='pci' managed='yes'>
>   <driver name='vfio'/>
>   <source>
>     <address domain='0x0000' bus='0x04' slot='0x10' function='0x1'/>
>   </source>
> </hostdev>
>
>
> The syslog error I'm getting related to the IOMMU is the following:
> # dmesg | grep -e DMAR -e IOMMU
>
> [ 3362.370564] vfio-pci 0000:04:00.0: Device is ineligible for IOMMU domain attach due to platform RMRR requirement.  Contact your platform vendor.
>
>
> From the /var/log/messages file, the complete VM log is the following:
>
> May 19 15:10:12 ni-nfvhost01 NetworkManager[1371]: <info>  (vnet0): carrier is OFF
> May 19 15:10:12 ni-nfvhost01 NetworkManager[1371]: <info>  (vnet0): new Tun device (driver: 'unknown' ifindex: 30)
> May 19 15:10:12 ni-nfvhost01 NetworkManager[1371]: <info>  (vnet0): exported as /org/freedesktop/NetworkManager/Devices/29
> May 19 15:10:12 ni-nfvhost01 NetworkManager[1371]: <info>  (virbr0): bridge port vnet0 was attached
> May 19 15:10:12 ni-nfvhost01 NetworkManager[1371]: <info>  (vnet0): enslaved to virbr0
> May 19 15:10:12 ni-nfvhost01 kernel: device vnet0 entered promiscuous mode
> May 19 15:10:12 ni-nfvhost01 NetworkManager[1371]: <info>  (vnet0): link connected
> May 19 15:10:12 ni-nfvhost01 kernel: virbr0: port 2(vnet0) entered listening state
> May 19 15:10:12 ni-nfvhost01 kernel: virbr0: port 2(vnet0) entered listening state
> May 19 15:10:12 ni-nfvhost01 NetworkManager[1371]: <info>  (vnet0): device state change: unmanaged -> unavailable (reason 'connection-assumed') [10 20 41]
> May 19 15:10:12 ni-nfvhost01 NetworkManager[1371]: <info>  (vnet0): device state change: unavailable -> disconnected (reason 'connection-assumed') [20 30 41]
> May 19 15:10:12 ni-nfvhost01 NetworkManager[1371]: <info>  (vnet0): Activation: starting connection 'vnet0'
> May 19 15:10:12 ni-nfvhost01 NetworkManager[1371]: <info>  (vnet0): Activation: Stage 1 of 5 (Device Prepare) scheduled...
> May 19 15:10:12 ni-nfvhost01 NetworkManager[1371]: <info>  (vnet0): Activation: Stage 1 of 5 (Device Prepare) started...
> May 19 15:10:12 ni-nfvhost01 NetworkManager[1371]: <info>  (vnet0): device state change: disconnected -> prepare (reason 'none') [30 40 0]
> May 19 15:10:12 ni-nfvhost01 NetworkManager[1371]: <info>  (vnet0): Activation: Stage 2 of 5 (Device Configure) scheduled...
> May 19 15:10:12 ni-nfvhost01 NetworkManager[1371]: <info>  (vnet0): Activation: Stage 1 of 5 (Device Prepare) complete.
> May 19 15:10:12 ni-nfvhost01 NetworkManager[1371]: <info>  (vnet0): Activation: Stage 2 of 5 (Device Configure) starting...
> May 19 15:10:12 ni-nfvhost01 NetworkManager[1371]: <info>  (vnet0): device state change: prepare -> config (reason 'none') [40 50 0]
> May 19 15:10:12 ni-nfvhost01 NetworkManager[1371]: <info>  (vnet0): Activation: Stage 2 of 5 (Device Configure) successful.
> May 19 15:10:12 ni-nfvhost01 NetworkManager[1371]: <info>  (vnet0): Activation: Stage 3 of 5 (IP Configure Start) scheduled.
> May 19 15:10:12 ni-nfvhost01 NetworkManager[1371]: <info>  (vnet0): Activation: Stage 2 of 5 (Device Configure) complete.
> May 19 15:10:12 ni-nfvhost01 NetworkManager[1371]: <info>  (vnet0): Activation: Stage 3 of 5 (IP Configure Start) started...
> May 19 15:10:12 ni-nfvhost01 NetworkManager[1371]: <info>  (vnet0): device state change: config -> ip-config (reason 'none') [50 70 0]
> May 19 15:10:12 ni-nfvhost01 NetworkManager[1371]: <info>  (vnet0): Activation: Stage 3 of 5 (IP Configure Start) complete.
> May 19 15:10:12 ni-nfvhost01 NetworkManager[1371]: <info>  (vnet0): device state change: ip-config -> secondaries (reason 'none') [70 90 0]
> May 19 15:10:12 ni-nfvhost01 NetworkManager[1371]: <info>  (vnet0): device state change: secondaries -> activated (reason 'none') [90 100 0]
> May 19 15:10:12 ni-nfvhost01 NetworkManager[1371]: <info>  (vnet0): Activation: successful, device activated.
> May 19 15:10:12 ni-nfvhost01 dbus-daemon: dbus[1295]: [system] Activating via systemd: service name='org.freedesktop.nm_dispatcher' unit='dbus-org.freedesktop.nm-dispatcher.service'
> May 19 15:10:12 ni-nfvhost01 dbus[1295]: [system] Activating via systemd: service name='org.freedesktop.nm_dispatcher' unit='dbus-org.freedesktop.nm-dispatcher.service'
> May 19 15:10:12 ni-nfvhost01 systemd: Starting Network Manager Script Dispatcher Service...
> May 19 15:10:12 ni-nfvhost01 systemd: Starting Virtual Machine qemu-vNIDS-VM1.
> May 19 15:10:12 ni-nfvhost01 systemd-machined: New machine qemu-vNIDS-VM1.
> May 19 15:10:12 ni-nfvhost01 systemd: Started Virtual Machine qemu-vNIDS-VM1.
> May 19 15:10:12 ni-nfvhost01 dbus-daemon: dbus[1295]: [system] Successfully activated service 'org.freedesktop.nm_dispatcher'
> May 19 15:10:12 ni-nfvhost01 dbus[1295]: [system] Successfully activated service 'org.freedesktop.nm_dispatcher'
> May 19 15:10:12 ni-nfvhost01 systemd: Started Network Manager Script Dispatcher Service.
> May 19 15:10:12 ni-nfvhost01 nm-dispatcher: Dispatching action 'up' for vnet0
> May 19 15:10:12 ni-nfvhost01 kvm: 1 guest now active
> May 19 15:10:12 ni-nfvhost01 systemd: Unit iscsi.service cannot be reloaded because it is inactive.
> May 19 15:10:12 ni-nfvhost01 kernel: vfio-pci 0000:04:00.0: Device is ineligible for IOMMU domain attach due to platform RMRR requirement.  Contact your platform vendor.
> May 19 15:10:12 ni-nfvhost01 kernel: virbr0: port 2(vnet0) entered disabled state
> May 19 15:10:12 ni-nfvhost01 kernel: device vnet0 left promiscuous mode
> May 19 15:10:12 ni-nfvhost01 kernel: virbr0: port 2(vnet0) entered disabled state
> May 19 15:10:12 ni-nfvhost01 avahi-daemon[1280]: Withdrawing workstation service for vnet0.
> May 19 15:10:12 ni-nfvhost01 NetworkManager[1371]: <info>  (vnet0): device state change: activated -> unmanaged (reason 'removed') [100 10 36]
> May 19 15:10:12 ni-nfvhost01 NetworkManager[1371]: <info>  (vnet0): deactivating device (reason 'removed') [36]
> May 19 15:10:12 ni-nfvhost01 NetworkManager[1371]: <warn>  (virbr0): failed to detach bridge port vnet0
> May 19 15:10:12 ni-nfvhost01 nm-dispatcher: Dispatching action 'down' for vnet0
> May 19 15:10:12 ni-nfvhost01 journal: Unable to read from monitor: Connection reset by peer
> May 19 15:10:12 ni-nfvhost01 journal: internal error: early end of file from monitor: possible problem:
> 2015-05-19T19:10:12.674077Z qemu-kvm: -device vfio-pci,host=04:00.0,id=hostdev0,bus=pci.0,addr=0x9: vfio: failed to set iommu for container: Operation not permitted
> 2015-05-19T19:10:12.674118Z qemu-kvm: -device vfio-pci,host=04:00.0,id=hostdev0,bus=pci.0,addr=0x9: vfio: failed to setup container for group 19
> 2015-05-19T19:10:12.674128Z qemu-kvm: -device vfio-pci,host=04:00.0,id=hostdev0,bus=pci.0,addr=0x9: vfio: failed to get group 19
> 2015-05-19T19:10:12.674141Z qemu-kvm: -device vfio-pci,host=04:00.0,id=hostdev0,bus=pci.0,addr=0x9: Device initialization failed.
> 2015-05-19T19:10:12.674155Z qemu-kvm: -device vfio-pci,host=04:00.0,id=hostdev0,bus=pci.0,addr=0x9: Device 'vfio-pci' could not be initialized
>
> May 19 15:10:12 ni-nfvhost01 kvm: 0 guests now active
> May 19 15:10:12 ni-nfvhost01 systemd-machined: Machine qemu-vNIDS-VM1 terminated.
> May 19 15:11:01 ni-nfvhost01 systemd: Created slice user-0.slice.
> May 19 15:11:01 ni-nfvhost01 systemd: Starting Session 329 of user root.
>
>
> Overall hypothesis: The issue seems to be related to the Ethernet controller interfaces I'm trying to bring into the VM. My Ethernet controller is an Intel 10G X540-AT2 (rev 01).
>                     The problem is associated with RMRR.
>                     Can this issue be attributed to my BIOS? My BIOS is: ProLiant System BIOS P89 V1.21 11/03/2014.
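>
> As a first check (a sketch; the exact output varies by platform), you can
> list the RMRR regions the BIOS publishes in the DMAR table and see whether
> they cover device 04:00.0:
>
> # dmesg | grep -i rmrr
>
> HP ProLiant servers of this generation are known to declare RMRRs for many
> PCI slots, and HP documents a BIOS-level workaround for excluding slots
> from RMRR, so the BIOS is a plausible culprit; your platform vendor is the
> right place to confirm.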
>
> Thanks in advance.
>
> Best Regards,
> Sami.
>
> -----Original Message-----
> From: Qiu, Michael [mailto:michael.qiu at intel.com] 
> Sent: Monday, May 18, 2015 6:01 AM
> To: Assaad, Sami (Sami); Richardson, Bruce
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] How do you setup a VM in Promiscuous Mode using PCI Pass-Through (SR-IOV)?
>
> Hi Sami,
>
> Would you mind supplying the syslog, especially the IOMMU-related parts?
>
> You could also update QEMU or the kernel to see if the issue still exists.
>
>
> Thanks,
> Michael
>
> On 5/16/2015 3:31 AM, Assaad, Sami (Sami) wrote:
>> On Fri, May 15, 2015 at 12:54:19PM +0000, Assaad, Sami (Sami) wrote:
>>> Thanks Bruce for your reply.
>>>
>>> Yes, your idea of bringing the PF into the VM looks like an option. However, how do you configure the physical interfaces within the VM when using SR-IOV?
>>> I had always believed that the VM needed to be associated with a virtual/emulated interface card. With your suggestion, I would actually be configuring the physical, non-emulated interface card within the VM.
>>>
>>> If you could provide me some example configuration commands, it would be really appreciated. 
>>>
>> You'd pass in the PF the same way as a VF, just skipping all the steps that create the VF on the host. To the system and hypervisor, both are just PCI devices!
>>
>> As for configuration, the setup and configuration of the PF in the guest is exactly the same as on the host - it's the same hardware with the same PCI BARs.
>> It's the IOMMU on your platform that takes care of memory isolation and address translation, and that should work with either the PF or a VF.
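>>
>> A minimal sketch of what that looks like on the qemu command line, assuming
>> the PF at 04:00.0 is already bound to vfio-pci on the host (the invocation
>> is identical to the one you would use for a VF):
>>
>> # qemu-system-x86_64 -m 2048 -enable-kvm -device vfio-pci,host=04:00.0,id=net0 ...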
>>
>> Regards,
>> /Bruce
>>
>>> Thanks in advance.
>>>
>>> Best Regards,
>>> Sami.
>>>
>>> -----Original Message-----
>>> From: Bruce Richardson [mailto:bruce.richardson at intel.com]
>>> Sent: Friday, May 15, 2015 5:27 AM
>>> To: Stephen Hemminger
>>> Cc: Assaad, Sami (Sami); dev at dpdk.org
>>> Subject: Re: [dpdk-dev] How do you setup a VM in Promiscuous Mode using PCI Pass-Through (SR-IOV)?
>>>
>>> On Thu, May 14, 2015 at 04:47:19PM -0700, Stephen Hemminger wrote:
>>>> On Thu, 14 May 2015 21:38:24 +0000
>>>> "Assaad, Sami (Sami)" <sami.assaad at alcatel-lucent.com> wrote:
>>>>
>>>>> Hello,
>>>>>
>>>>> My Hardware consists of the following:
>>>>>   - DL380 Gen 9 Server supporting two Haswell Processors (Xeon CPU E5-2680 v3 @ 2.50GHz)
>>>>>   - An x540 Ethernet Controller Card supporting 2x10G ports.
>>>>>
>>>>> Software:
>>>>>   - CentOS 7 (3.10.0-229.1.2.el7.x86_64)
>>>>>   - DPDK 1.8
>>>>>
>>>>> I want all the network traffic received on the two 10G ports to be delivered to my VM. The issue is that the virtual function / physical function setup configures the internal virtual switch to only forward Ethernet packets whose destination MAC address matches the VM's virtual interface MAC. How can I configure my virtual environment so that all network traffic reaches the VM, i.e. set the virtual functions for both PCI devices to promiscuous mode?
>>>>>
>>>>> [ If an l2fwd-vf example exists, this would actually solve the
>>>>> problem ... Is there a DPDK l2fwd-vf example available? ]
>>>>>
>>>>>
>>>>> Thanks in advance.
>>>>>
>>>>> Best Regards,
>>>>> Sami Assaad.
>>>> This is a host-side (not DPDK) issue.
>>>>
>>>> The Intel PF driver will not allow a guest (VF) to go into promiscuous
>>>> mode, since that would allow traffic stealing, which is a security violation.
>>> Could you maybe try passing the PF directly into the VM, rather than a VF based off it? Since you seem to want all traffic to go to the one VM, there seems little point in creating a VF on the device, and passing in the PF would let the VM control the whole NIC directly.
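>>>
>>> (Once the PF is inside the guest and driven by DPDK, promiscuous mode is
>>> under your control - for example, in the bundled testpmd application:
>>>
>>>     testpmd> set promisc all on
>>>
>>> or programmatically with rte_eth_promiscuous_enable() on the port.)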
>>>
>>> Regards,
>>> /Bruce
>> Hi Bruce,
>>
>> I was provided two options:
>> 1. Pass the PF directly into the VM
>> 2. Use ixgbe VF mirroring
>>
>> I decided to first try your proposal of passing the PF directly into the VM, but I ran into some issues.
>> Before getting to the problem details, here is my server environment
>> (CentOS 7 with KVM/QEMU):
>> [root at ni-nfvhost01 qemu]# uname -a
>> Linux ni-nfvhost01 3.10.0-229.1.2.el7.x86_64 #1 SMP Fri Mar 27 03:04:26 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
>>
>> [root at ni-nfvhost01 qemu]# lspci -n -s 04:00.0
>> 04:00.0 0200: 8086:1528 (rev 01)
>>
>> [root at ni-nfvhost01 qemu]# lspci | grep -i eth
>> 02:00.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5719 Gigabit Ethernet PCIe (rev 01)
>> 02:00.1 Ethernet controller: Broadcom Corporation NetXtreme BCM5719 Gigabit Ethernet PCIe (rev 01)
>> 02:00.2 Ethernet controller: Broadcom Corporation NetXtreme BCM5719 Gigabit Ethernet PCIe (rev 01)
>> 02:00.3 Ethernet controller: Broadcom Corporation NetXtreme BCM5719 Gigabit Ethernet PCIe (rev 01)
>> 04:00.0 Ethernet controller: Intel Corporation Ethernet Controller 10-Gigabit X540-AT2 (rev 01)
>> 04:00.1 Ethernet controller: Intel Corporation Ethernet Controller 10-Gigabit X540-AT2 (rev 01)
>>
>> - The following is my kernel command line:
>> [root at ni-nfvhost01 qemu]# cat  /proc/cmdline
>> BOOT_IMAGE=/vmlinuz-3.10.0-229.1.2.el7.x86_64 root=/dev/mapper/centos-root ro rd.lvm.lv=centos/swap vconsole.font=latarcyrheb-sun17 rd.lvm.lv=centos/root crashkernel=auto vconsole.keymap=us rhgb quiet iommu=pt intel_iommu=on hugepages=8192
>>
>>
>> This is the error I get when the VM is assigned one of the PCI devices of the Ethernet controller card:
>> [root at ni-nfvhost01 qemu]# qemu-system-x86_64 -m 2048 -vga std -vnc :0 -net none -enable-kvm -device vfio-pci,host=04:00.0,id=net0
>> qemu-system-x86_64: -device vfio-pci,host=04:00.0,id=net0: vfio: failed to set iommu for container: Operation not permitted
>> qemu-system-x86_64: -device vfio-pci,host=04:00.0,id=net0: vfio: failed to setup container for group 19
>> qemu-system-x86_64: -device vfio-pci,host=04:00.0,id=net0: vfio: failed to get group 19
>> qemu-system-x86_64: -device vfio-pci,host=04:00.0,id=net0: Device initialization failed.
>> qemu-system-x86_64: -device vfio-pci,host=04:00.0,id=net0: Device 'vfio-pci' could not be initialized
>>
>> Hence, I tried the following, but again with no success :-( I decided to
>> bind the PCI device associated with the Ethernet controller to vfio-pci
>> (to give the VM access to the PCI device and let the IOMMU operate
>> properly). Here are the commands I used to configure PCI pass-through for
>> the Ethernet device:
>>
>> # modprobe vfio-pci
>>
>> 1) Device I want to assign as passthrough:
>> 04:00.0
>>
>> 2) Find the vfio group of this device
>>
>> # readlink /sys/bus/pci/devices/0000:04:00.0/iommu_group
>> ../../../../kernel/iommu_groups/19
>>  
>> ( IOMMU Group = 19 )
>>
>> 3) Check the devices in the group:
>> # ls /sys/bus/pci/devices/0000:04:00.0/iommu_group/devices/
>> 0000:04:00.0
>>  
>> (so this group has only 1 device)
>>  
>> 4) Unbind from device driver
>> # echo 0000:04:00.0 >/sys/bus/pci/devices/0000:04:00.0/driver/unbind
>>  
>> 5) Find vendor & device ID
>> $ lspci -n -s 04:00.0
>> 04:00.0 0200: 8086:1528 (rev 01)
>>  
>> 6) Bind to vfio-pci
>> $ echo 8086 1528 > /sys/bus/pci/drivers/vfio-pci/new_id
>>  
>> (this results in a new device node "/dev/vfio/19", which is what qemu will use to set up the device for passthrough)
>>  
>> 7) chown the device node so it is accessible by qemu user:
>> # chown qemu /dev/vfio/19; chgrp qemu /dev/vfio/19
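>>
>> As a quick sanity check (assuming group 19 as above), you can confirm the
>> binding and the node ownership before starting the guest:
>>
>> # readlink /sys/bus/pci/devices/0000:04:00.0/driver
>> # ls -l /dev/vfio/19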
>>
>> Now, on the VM side, using virt-manager, I removed the initial PCI device and re-added it.
>> After rebooting the VM, I hit the same issue.
>>
>> What am I doing wrong?
>>
>> Thanks a million!
>>
>> Best Regards,
>> Sami.
>>
>>
>


