[dpdk-dev] [ovs-dev] ovs-vswitchd with DPDK crashed when guest VM restarts network service

Marco Varlese marco.varlese at suse.com
Mon Feb 21 15:19:50 CET 2022


Hello,

I have been seeing the same issue with several different DPDK-OVS 
versions as well as QEMU versions.

It looks like an issue with the handling of the VHOST_USER_GET_VRING_BASE 
message once the application in the guest is restarted. It may have to do 
with QEMU's asynchronous message passing...
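
To make the suspected sequence concrete, here is a toy Python model. It is 
not DPDK or OVS code, and every name in it is hypothetical. The only 
protocol fact it relies on is that a vhost-user back-end stops the ring 
when it handles VHOST_USER_GET_VRING_BASE. The assumption (unconfirmed) is 
that the crash corresponds to a dequeue that still reads guest buffers 
after that point, so the sketch only shows the guarded behaviour one would 
expect.

```python
# Toy model only: hypothetical names, not the DPDK vhost API.
from dataclasses import dataclass


@dataclass
class Vring:
    enabled: bool = True      # back-end may touch guest memory only while True
    last_avail_idx: int = 0   # index reported back on GET_VRING_BASE


def get_vring_base(ring: Vring) -> int:
    """Model of handling VHOST_USER_GET_VRING_BASE: stop the ring, return its index."""
    ring.enabled = False
    return ring.last_avail_idx


def dequeue_burst(ring: Vring, guest_buffers: list) -> list:
    """Model of a dequeue that refuses to read once the ring has been stopped."""
    if not ring.enabled:
        return []             # ring stopped: nothing is read from guest memory
    pkts = guest_buffers[ring.last_avail_idx:]
    ring.last_avail_idx += len(pkts)
    return pkts
```

In this model, a dequeue issued after get_vring_base() returns an empty 
burst instead of reading buffers the guest may already have torn down. 
Whether the real code path can miss such a check is exactly the open 
question.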

I am not an expert on vhost/virtio, so I am asking for your help with 
this. Has anybody had the chance to look into this issue and found a 
solution or workaround?


Cheers,
Marco


On 11/26/21 15:09, Bendror, Eran (Nokia - US) wrote:
> Hi,
> 
> Internally the VM is using DPDK 17.05 on CentOS 7.9, but this seems to 
> reproduce with DPDK 18.11 in the guest as well.
> 
> The issue occurs when the DPDK PMDs get started in the guest, so the 
> assumption is that this presents bad / inaccessible memory to the host.
> 
> We did notice some misuse of SELinux permissions in the guest, and 
> removing that helped reduce the frequency significantly.
> 
> Is there a way to map the shared memory between VM and host to see where 
> the segmentation fault is coming from?
> 
> I will see if I can upload the VM xml, but it is a multi-queue 4 port VM.
> 
> Thanks for the assistance,
> 
> Eran
> 
> *From:* Xia, Chenbo <chenbo.xia at intel.com>
> *Sent:* Friday, November 26, 2021 4:25 AM
> *To:* Bendror, Eran (Nokia - US) <eran.bendror at nokia.com>; 
> ktraynor at redhat.com
> *Cc:* ayeh at cisco.com; dev at dpdk.org; Stokes, Ian <ian.stokes at intel.com>; 
> maxime.coquelin at redhat.com; yega at cisco.com; Marco Varlese 
> <marco.varlese at suse.com>
> *Subject:* RE: [dpdk-dev] [ovs-dev] ovs-vswitchd with DPDK crashed when 
> guest VM restarts network service
> 
> Hi,
> 
> Could you provide more info about this issue? I mean: qemu 
> cmdline/libvirt xml, ovs cmdline, guest driver version, etc. Otherwise 
> it’s hard to reproduce the issue.
> 
> Thanks,
> 
> Chenbo
> 
> *From:* Bendror, Eran (Nokia - US) <eran.bendror at nokia.com>
> *Sent:* Wednesday, November 17, 2021 10:42 PM
> *To:* ktraynor at redhat.com
> *Cc:* ayeh at cisco.com; Xia, Chenbo <chenbo.xia at intel.com>; 
> dev at dpdk.org; Stokes, Ian <ian.stokes at intel.com>; 
> maxime.coquelin at redhat.com; yega at cisco.com
> *Subject:* Re: [dpdk-dev] [ovs-dev] ovs-vswitchd with DPDK crashed when 
> guest VM restarts network service
> 
> Hello,
> 
> I am wondering if there has been any progress on this topic. We are 
> seeing a very similar issue, where a VM-level application restart 
> triggers a segmentation fault and mbuf allocation failures at the host 
> level.
> 
> CentOS Linux release 7.8.2003 (Core)
> 
> dpdk-18.11.5-1.el7_8.x86_64
> 
> openvswitch-2.11.0-4.el7.x86_64
> 
> libvirt 4.5.0
> 
> QEMU 4.5.0 (API)
> 
> QEMU 2.12.0
> 
> 3.10.0-1127.13.1.el7.x86_64
> 
> And we get the same crash
> 
> #0  0x00007f96cb72e7ee in rte_memcpy_generic () from 
> /lib64/librte_vhost.so.4
> 
> #1  0x00007f96cb7350f2 in rte_vhost_dequeue_burst () from 
> /lib64/librte_vhost.so.4
> 
> #2  0x00007f96caf97f03 in netdev_dpdk_vhost_rxq_recv () from 
> /lib64/libopenvswitch-2.11.so.0
> 
> #3  0x00007f96caed21e6 in netdev_rxq_recv () from 
> /lib64/libopenvswitch-2.11.so.0
> 
> #4  0x00007f96caea07ca in dp_netdev_process_rxq_port () from 
> /lib64/libopenvswitch-2.11.so.0
> 
> #5  0x00007f96caea0ca5 in pmd_thread_main () from 
> /lib64/libopenvswitch-2.11.so.0
> 
> #6  0x00007f96caf2da3f in ovsthread_wrapper () from 
> /lib64/libopenvswitch-2.11.so.0
> 
> #7  0x00007f96c9ef3ea5 in start_thread () from /lib64/libpthread.so.0
> 
> #8  0x00007f96c94118dd in clone () from /lib64/libc.so.6
> 
> We have tried upgrading host level artifacts:
> 
> dpdk-20.11.3-1.el7.x86_64
> 
> openvswitch-2.16.1-1.el7.x86_64
> 
> With backtrace:
> 
> #0  0x00007f6b8b49748c in virtio_dev_tx_split_legacy () from 
> /lib64/librte_vhost.so.21
> 
> #1  0x00007f6b8b4c0fdb in rte_vhost_dequeue_burst () from 
> /lib64/librte_vhost.so.21
> 
> #2  0x000055bd714c2802 in netdev_dpdk_vhost_rxq_recv ()
> 
> #3  0x000055bd713f8e51 in netdev_rxq_recv ()
> 
> #4  0x000055bd713c9d2a in dp_netdev_process_rxq_port ()
> 
> #5  0x000055bd713ca1f9 in pmd_thread_main ()
> 
> #6  0x000055bd71455cdf in ovsthread_wrapper ()
> 
> #7  0x00007f6b8a6a9ea5 in start_thread () from /lib64/libpthread.so.0
> 
> #8  0x00007f6b89bc78dd in clone () from /lib64/libc.so.6
> 
> Regards,
> 
> Eran
> 


