[dpdk-dev] [ovs-dev] ovs-vswitchd with DPDK crashed when guest VM restarts network service
Marco Varlese
marco.varlese at suse.com
Mon Feb 21 15:19:50 CET 2022
Hello,
I have been seeing the same issue across several DPDK-OVS versions
and several QEMU versions.
It looks like a problem in handling the VHOST_USER_GET_VRING_BASE message
once the application in the guest is restarted. It may be related to
QEMU's asynchronous message passing.
I am not an expert on vhost/virtio, so I would appreciate your help with
this. Has anybody had a chance to look into this issue and found a
solution or workaround?
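
To illustrate why a stale or incomplete memory mapping could crash inside a
copy routine, here is a minimal sketch in Python of the guest-physical to
host-virtual lookup that a vhost-user backend has to perform before touching
guest buffers. The MemRegion class, its fields, the gpa_to_hva helper and the
addresses are all hypothetical and are not DPDK's API; this is a model of the
idea, not a diagnosis of this crash.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class MemRegion:
    """One guest memory region shared with the backend (hypothetical model)."""
    guest_phys_addr: int  # start of the region in guest physical address space
    size: int             # length of the region in bytes
    host_virt_addr: int   # address at which the backend mapped the region


def gpa_to_hva(regions: List[MemRegion], gpa: int, length: int) -> Optional[int]:
    """Translate a guest physical range to a host virtual address.

    Returns None when the range is not fully covered by a single region.
    A backend has to treat that case as an error instead of copying.
    """
    for r in regions:
        start = r.guest_phys_addr
        end = r.guest_phys_addr + r.size
        if start <= gpa and gpa + length <= end:
            return r.host_virt_addr + (gpa - start)
    return None


# Hypothetical table: one 1 GiB region mapped at an arbitrary host address.
regions = [MemRegion(guest_phys_addr=0x0, size=1 << 30,
                     host_virt_addr=0x7F0000000000)]

# A buffer inside the region translates to a usable host address.
ok = gpa_to_hva(regions, 0x1000, 2048)

# A buffer address that the table does not cover (for example one taken from
# a ring the guest has since reinitialized) must not be dereferenced.
stale = gpa_to_hva(regions, 0x40000000, 2048)

print(hex(ok) if ok is not None else None)
print(stale)
```

If the data path used a descriptor address without such a check, or against
a table that no longer matches the guest, the copy would read from an
unmapped host address, which would be consistent with a fault in a memcpy
under the dequeue path.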
Cheers,
Marco
On 11/26/21 15:09, Bendror, Eran (Nokia - US) wrote:
> Hi,
>
> Internally the VM is using DPDK 17.05 on CentOS 7.9, but this also
> seems to reproduce with DPDK 18.11 in the guest.
>
> The issue occurs when the DPDK PMDs are started in the guest, so our
> assumption is that this presents bad or inaccessible memory to the host.
>
> We did notice some misuse of SELinux permissions in the guest, and
> removing it significantly reduced the frequency of the issue.
>
> Is there a way to map the shared memory between VM and host to see where
> the segmentation fault is coming from?
>
> I will see if I can upload the VM XML, but it is a multi-queue, 4-port VM.
>
> Thanks for the assistance,
>
> Eran
>
> *From:* Xia, Chenbo <chenbo.xia at intel.com>
> *Sent:* Friday, November 26, 2021 4:25 AM
> *To:* Bendror, Eran (Nokia - US) <eran.bendror at nokia.com>;
> ktraynor at redhat.com
> *Cc:* ayeh at cisco.com; dev at dpdk.org; Stokes, Ian <ian.stokes at intel.com>;
> maxime.coquelin at redhat.com; yega at cisco.com; Marco Varlese
> <marco.varlese at suse.com>
> *Subject:* RE: [dpdk-dev] [ovs-dev] ovs-vswitchd with DPDK crashed when
> guest VM restarts network service
>
> Hi,
>
> Could you provide more info about this issue? I mean the QEMU
> cmdline/libvirt XML, OVS cmdline, guest driver version, etc. Otherwise
> it is hard to reproduce the issue.
>
> Thanks,
>
> Chenbo
>
> *From:* Bendror, Eran (Nokia - US) <eran.bendror at nokia.com>
> *Sent:* Wednesday, November 17, 2021 10:42 PM
> *To:* ktraynor at redhat.com
> *Cc:* ayeh at cisco.com; Xia, Chenbo <chenbo.xia at intel.com>; dev at dpdk.org;
> Stokes, Ian <ian.stokes at intel.com>; maxime.coquelin at redhat.com;
> yega at cisco.com
> *Subject:* Re: [dpdk-dev] [ovs-dev] ovs-vswitchd with DPDK crashed when
> guest VM restarts network service
>
> Hello,
>
> I am wondering if there has been any progress on this topic. We are
> seeing a very similar issue, where a VM-level application restart
> triggers a segmentation fault and a failure to allocate mbufs at the
> host level.
>
> CentOS Linux release 7.8.2003 (Core)
>
> dpdk-18.11.5-1.el7_8.x86_64
>
> openvswitch-2.11.0-4.el7.x86_64
>
> libvirt 4.5.0
>
> QEMU 4.5.0 (API)
>
> QEMU 2.12.0
>
> 3.10.0-1127.13.1.el7.x86_64
>
> And we get the same crash:
>
> #0 0x00007f96cb72e7ee in rte_memcpy_generic () from /lib64/librte_vhost.so.4
> #1 0x00007f96cb7350f2 in rte_vhost_dequeue_burst () from /lib64/librte_vhost.so.4
> #2 0x00007f96caf97f03 in netdev_dpdk_vhost_rxq_recv () from /lib64/libopenvswitch-2.11.so.0
> #3 0x00007f96caed21e6 in netdev_rxq_recv () from /lib64/libopenvswitch-2.11.so.0
> #4 0x00007f96caea07ca in dp_netdev_process_rxq_port () from /lib64/libopenvswitch-2.11.so.0
> #5 0x00007f96caea0ca5 in pmd_thread_main () from /lib64/libopenvswitch-2.11.so.0
> #6 0x00007f96caf2da3f in ovsthread_wrapper () from /lib64/libopenvswitch-2.11.so.0
> #7 0x00007f96c9ef3ea5 in start_thread () from /lib64/libpthread.so.0
> #8 0x00007f96c94118dd in clone () from /lib64/libc.so.6
>
> We have also tried upgrading the host-level artifacts:
>
> dpdk-20.11.3-1.el7.x86_64
>
> openvswitch-2.16.1-1.el7.x86_64
>
> With backtrace:
>
> #0 0x00007f6b8b49748c in virtio_dev_tx_split_legacy () from /lib64/librte_vhost.so.21
> #1 0x00007f6b8b4c0fdb in rte_vhost_dequeue_burst () from /lib64/librte_vhost.so.21
> #2 0x000055bd714c2802 in netdev_dpdk_vhost_rxq_recv ()
> #3 0x000055bd713f8e51 in netdev_rxq_recv ()
> #4 0x000055bd713c9d2a in dp_netdev_process_rxq_port ()
> #5 0x000055bd713ca1f9 in pmd_thread_main ()
> #6 0x000055bd71455cdf in ovsthread_wrapper ()
> #7 0x00007f6b8a6a9ea5 in start_thread () from /lib64/libpthread.so.0
> #8 0x00007f6b89bc78dd in clone () from /lib64/libc.so.6
>
> Regards,
>
> Eran
>
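
The two backtraces quoted above can be compared mechanically. The following
is a small Python sketch that extracts the function name of each frame and
checks where the traces differ. The regular expression and the
frame_functions helper are my own and assume the plain "#N 0xADDR in func ()"
frame format that gdb printed in this thread.

```python
import re

# Matches frames such as "#0 0x00007f96cb72e7ee in rte_memcpy_generic () ..."
FRAME_RE = re.compile(r"#(\d+)\s+0x[0-9a-f]+ in (\w+) \(\)")


def frame_functions(backtrace: str):
    """Return the function names of a gdb backtrace, innermost frame first."""
    return [m.group(2) for m in FRAME_RE.finditer(backtrace)]


# Backtrace with dpdk-18.11.5 / openvswitch-2.11.0, as quoted in the thread.
OLD_TRACE = """
#0 0x00007f96cb72e7ee in rte_memcpy_generic () from /lib64/librte_vhost.so.4
#1 0x00007f96cb7350f2 in rte_vhost_dequeue_burst () from /lib64/librte_vhost.so.4
#2 0x00007f96caf97f03 in netdev_dpdk_vhost_rxq_recv () from /lib64/libopenvswitch-2.11.so.0
#3 0x00007f96caed21e6 in netdev_rxq_recv () from /lib64/libopenvswitch-2.11.so.0
#4 0x00007f96caea07ca in dp_netdev_process_rxq_port () from /lib64/libopenvswitch-2.11.so.0
#5 0x00007f96caea0ca5 in pmd_thread_main () from /lib64/libopenvswitch-2.11.so.0
#6 0x00007f96caf2da3f in ovsthread_wrapper () from /lib64/libopenvswitch-2.11.so.0
#7 0x00007f96c9ef3ea5 in start_thread () from /lib64/libpthread.so.0
#8 0x00007f96c94118dd in clone () from /lib64/libc.so.6
"""

# Backtrace with dpdk-20.11.3 / openvswitch-2.16.1, as quoted in the thread.
NEW_TRACE = """
#0 0x00007f6b8b49748c in virtio_dev_tx_split_legacy () from /lib64/librte_vhost.so.21
#1 0x00007f6b8b4c0fdb in rte_vhost_dequeue_burst () from /lib64/librte_vhost.so.21
#2 0x000055bd714c2802 in netdev_dpdk_vhost_rxq_recv ()
#3 0x000055bd713f8e51 in netdev_rxq_recv ()
#4 0x000055bd713c9d2a in dp_netdev_process_rxq_port ()
#5 0x000055bd713ca1f9 in pmd_thread_main ()
#6 0x000055bd71455cdf in ovsthread_wrapper ()
#7 0x00007f6b8a6a9ea5 in start_thread () from /lib64/libpthread.so.0
#8 0x00007f6b89bc78dd in clone () from /lib64/libc.so.6
"""

old = frame_functions(OLD_TRACE)
new = frame_functions(NEW_TRACE)

# Everything from frame #1 outward is the same call path in both versions.
same_call_path = old[1:] == new[1:]
print("same path from frame #1:", same_call_path)
print("innermost frames:", old[0], "vs", new[0])
```

Both traces reach the fault through rte_vhost_dequeue_burst() on a PMD
thread; only the innermost frame differs between the two DPDK versions.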