[dpdk-dev] [Bug 532] af_xdp: kernel panic when freeing mbufs on lcore other than the receiving lcore

bugzilla at dpdk.org bugzilla at dpdk.org
Tue Sep 1 09:55:22 CEST 2020


https://bugs.dpdk.org/show_bug.cgi?id=532

            Bug ID: 532
           Summary: af_xdp: kernel panic when freeing mbufs on lcore other
                    than the receiving lcore
           Product: DPDK
           Version: 20.08
          Hardware: x86
                OS: Linux
            Status: UNCONFIRMED
          Severity: major
          Priority: Normal
         Component: ethdev
          Assignee: dev at dpdk.org
          Reporter: martin.weiser at allegro-packets.com
  Target Milestone: ---

Created attachment 119
  --> https://bugs.dpdk.org/attachment.cgi?id=119&action=edit
patch for distributor example application to reproduce the issue

We are seeing an issue in our DPDK application when using the af_xdp driver
with USB Ethernet devices.
As soon as the packet rate increases, the Linux kernel (5.7.6) panics in the
xsk_generic_rcv function (see the kernel output pasted below).

I have attached a patch that modifies the distributor example so that it works
with af_xdp devices (it uses only a single tx queue) and, instead of processing
the packets, simply frees them on the worker lcore. With about 1 Gbit/s of
traffic on the interfaces, the kernel panics immediately. We start the modified
distributor application like this:

./build/app/distributor_app -c 0x1f --vdev net_af_xdp0,iface=eth5 \
    --vdev net_af_xdp1,iface=eth6 -- -p 0x03
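
The two hexadecimal masks in the command above select the lcores (-c) and the
ports (-p). As a quick illustration, and not part of the original report, the
sketch below expands them into index lists. The helper name expand_mask is
hypothetical.

```python
def expand_mask(mask):
    """Return the indices of the bits set in a bitmask, lowest first."""
    return [bit for bit in range(mask.bit_length()) if mask >> bit & 1]


# Masks taken from the command line in the report.
print("lcores:", expand_mask(0x1f))
print("ports: ", expand_mask(0x03))
```

With these values the application runs on lcores 0 to 4 and enables ports 0
and 1, which are presumably the two af_xdp vdevs on eth5 and eth6, assuming no
other ports are probed.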



kernel panic output:


[  256.427389] #PF: supervisor write access in kernel mode
[  256.490000] #PF: error_code(0x0002) - not-present page
[  256.490002] PGD 265628067 P4D 265628067 PUD 0 
[  256.490008] Oops: 0002 [#1] SMP NOPTI
[  256.490012] CPU: 1 PID: 1458 Comm: lcore-slave-1 Tainted: G           OE     5.7.0-1-amd64 #1 Debian 5.7.6-1
[  256.490013] Hardware name: Supermicro SYS-E50-9AP-N5-AG050/A2SAP-H, BIOS 1.4 04/17/2020
[  256.490022] RIP: 0010:memcpy_erms+0x6/0x10
[  256.490025] Code: cc cc cc cc eb 1e 0f 1f 00 48 89 f8 48 89 d1 48 c1 e9 03 83 e2 07 f3 48 a5 89 d1 f3 a4 c3 66 0f 1f 44 00 00 48 89 f8 48 89 d1 <f3> a4 c3 0f 1f 80 00 00 00 00 48 89 f8 48 83 fa 20 72 7e 40 38 fe
[  256.490026] RSP: 0000:ffffa06c000d4c98 EFLAGS: 00010202
[  256.490028] RAX: 0000000000000640 RBX: ffffa06c000d4d60 RCX: 0000000000000041
[  256.490029] RDX: 0000000000000041 RSI: ffff8c6b63e0143a RDI: 0000000000000640
[  256.490030] RBP: ffff8c6b657a5000 R08: 0000ffffffffffff R09: ffff8c6935b19180
[  256.490031] R10: 0000000000008000 R11: ffff8c6b610e0000 R12: 0000000000000000
[  256.490033] R13: ffff8c6b657a535c R14: 01400000093fe500 R15: 0000000000000041
[  256.490034] FS:  00007fbafecfc400(0000) GS:ffff8c6b77c80000(0000) knlGS:0000000000000000
[  256.490036] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  256.490037] CR2: 0000000000000640 CR3: 0000000263aa2000 CR4: 00000000003406e0
[  256.490038] Call Trace:
[  256.490042]  <IRQ>
[  256.490048]  xsk_generic_rcv+0x190/0x2a0
[  256.490057]  xdp_do_generic_redirect+0x1f2/0x2e0
[  258.033361]  do_xdp_generic.part.0+0x313/0x4a0
[  258.033366]  __netif_receive_skb_core+0x1b5/0x1050
[  258.033369]  ? netif_rx_internal+0x41/0x100
[  258.033372]  __netif_receive_skb_one_core+0x3d/0xa0
[  258.033375]  process_backlog+0xa4/0x160
[  258.033377]  net_rx_action+0x148/0x3c0
[  258.033382]  __do_softirq+0xe6/0x2e9
[  258.033387]  ? handle_irq_event_percpu+0x72/0x80
[  258.033390]  irq_exit+0xa6/0xb0
[  258.033392]  do_IRQ+0x58/0xe0
[  258.033395]  common_interrupt+0xf/0xf
[  258.033402]  </IRQ>
[  258.583925] RIP: 0033:0x559c289a008b
[  258.583929] Code: 0f 87 59 02 00 00 85 d2 be 40 00 00 00 0f 84 54 01 00 00 44 8b a3 00 01 00 00 8b 83 84 00 00 00 ba 40 00 00 00 89 c5 44 29 e5 <83> fd 40 0f 47 ea 41 39 c4 74 ad 41 8d 0c 2c 89 8b 00 01 00 00 44
[  258.583930] RSP: 002b:00007fbafecf8660 EFLAGS: 00000246 ORIG_RAX: ffffffffffffffda
[  258.583933] RAX: 000000000000ff35 RBX: 0000000100214000 RCX: 00007fbafecf8588
[  258.583934] RDX: 0000000000000040 RSI: 0000000000000040 RDI: 0000000100230f00
[  258.583935] RBP: 0000000000000000 R08: 0000000100230f00 R09: ffffffffffffd808
[  258.583936] R10: ffffffff00000000 R11: 000000000001fe3e R12: 000000000000ff35
[  258.583937] R13: 00007fbafecf8690 R14: 00007fbafecf8f00 R15: 0000559c29969160
[  258.583940] Modules linked in: ipmi_devintf ipmi_msghandler igb_uio(OE) uio
ccm algif_aead cbc des_generic libdes ecb arc4 algif_skcipher cmac sha512_ssse3
sha512_generic md4 algif_hash af_alg snd_hda_codec_hdmi snd_hda_codec_realtek
snd_hda_codec_generic nls_ascii nls_cp437 intel_rapl_msr intel_rapl_common vfat
fat x86_pkg_temp_thermal coretemp kvm_intel kvm irqbypass i915 rtl8192cu
rtl_usb rtl8192c_common ghash_clmulni_intel rtlwifi snd_sof_pci
snd_sof_intel_hda_common mac80211 snd_sof_intel_hda snd_sof_intel_byt
snd_sof_intel_ipc snd_sof snd_sof_xtensa_dsp ledtrig_audio snd_soc_skl
snd_soc_hdac_hda snd_hda_ext_core snd_soc_sst_ipc snd_soc_sst_dsp
snd_soc_acpi_intel_match snd_soc_acpi snd_soc_core cfg80211 snd_compress
snd_hda_intel snd_intel_dspcfg aesni_intel snd_hda_codec snd_hda_core libaes
crypto_simd cryptd efi_pstore glue_helper intel_cstate intel_rapl_perf pcspkr
snd_hwdep efivars aqc111(OE) rfkill cdc_ether snd_pcm usbnet mii libarc4 evdev
joydev drm_kms_helper snd_timer snd
[  258.583993]  mei_me sg soundcore cec mei tpm_crb tpm_tis tpm_tis_core button
tpm rng_core pkcs8_key_parser ib_iser rdma_cm iw_cm ib_cm ib_core configfs
iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi psmouse drm efivarfs
ip_tables x_tables autofs4 hid_generic usbhid hid ext4 crc16 mbcache jbd2
crc32c_generic sd_mod t10_pi crc_t10dif crct10dif_generic xhci_pci ahci libahci
xhci_hcd spi_pxa2xx_platform dw_dmac dw_dmac_core libata usbcore
crct10dif_pclmul crct10dif_common crc32_pclmul scsi_mod crc32c_intel igb
i2c_i801 lpc_ich intel_lpss_pci intel_lpss idma64 mfd_core usb_common
i2c_algo_bit dca ptp pps_core fan video
[  260.415193] CR2: 0000000000000640
[  260.415200] ---[ end trace 21af094956430aad ]---
[  262.321592] RIP: 0010:memcpy_erms+0x6/0x10
[  263.078482] Code: cc cc cc cc eb 1e 0f 1f 00 48 89 f8 48 89 d1 48 c1 e9 03 83 e2 07 f3 48 a5 89 d1 f3 a4 c3 66 0f 1f 44 00 00 48 89 f8 48 89 d1 <f3> a4 c3 0f 1f 80 00 00 00 00 48 89 f8 48 83 fa 20 72 7e 40 38 fe
[  263.078484] RSP: 0000:ffffa06c000d4c98 EFLAGS: 00010202
[  263.078487] RAX: 0000000000000640 RBX: ffffa06c000d4d60 RCX: 0000000000000041
[  263.078488] RDX: 0000000000000041 RSI: ffff8c6b63e0143a RDI: 0000000000000640
[  263.078489] RBP: ffff8c6b657a5000 R08: 0000ffffffffffff R09: ffff8c6935b19180
[  263.078490] R10: 0000000000008000 R11: ffff8c6b610e0000 R12: 0000000000000000
[  263.078491] R13: ffff8c6b657a535c R14: 01400000093fe500 R15: 0000000000000041
[  263.078493] FS:  00007fbafecfc400(0000) GS:ffff8c6b77c80000(0000) knlGS:0000000000000000
[  263.078494] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  263.078495] CR2: 0000000000000640 CR3: 0000000263aa2000 CR4: 00000000003406e0
[  263.078498] Kernel panic - not syncing: Fatal exception in interrupt
[  263.078522] Kernel Offset: 0x10000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[  264.393968] ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]---
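
For readers triaging the trace above, the following is a minimal sketch (not
part of the original report) that decodes the page-fault error code and reads
the memcpy arguments out of the first register dump. The helper names
decode_pf_error_code and memcpy_args are hypothetical. The bit meanings follow
the x86 page-fault error code layout, and the register-to-argument mapping
assumes the standard x86-64 calling convention (dest in RDI, src in RSI,
length in RDX).

```python
# (mask, meaning when the bit is set, meaning when it is clear)
PF_BITS = [
    (0x01, "protection violation", "not-present page"),
    (0x02, "write access", "read access"),
    (0x04, "user mode", "kernel (supervisor) mode"),
    (0x08, "reserved bit set", None),
    (0x10, "instruction fetch", None),
]


def decode_pf_error_code(code):
    """Translate an x86 page-fault error code into readable flags."""
    flags = []
    for mask, if_set, if_clear in PF_BITS:
        if code & mask:
            flags.append(if_set)
        elif if_clear is not None:
            flags.append(if_clear)
    return flags


# Register values copied from the first register dump in the report.
regs = {
    "RDI": 0x0000000000000640,
    "RSI": 0xFFFF8C6B63E0143A,
    "RDX": 0x0000000000000041,
    "RCX": 0x0000000000000041,
    "CR2": 0x0000000000000640,
}


def memcpy_args(regs):
    """Assumption: memcpy(dest, src, n) arguments are in RDI, RSI, RDX."""
    return {"dest": regs["RDI"], "src": regs["RSI"], "len": regs["RDX"]}


print(decode_pf_error_code(0x0002))
args = memcpy_args(regs)
print("dest=%#x len=%d faulting address=%#x"
      % (args["dest"], args["len"], regs["CR2"]))
```

With the values from this report, the fault decodes as a kernel-mode write to
a not-present page, and the memcpy destination equals CR2 (0x640). Since RCX
still equals RDX, the copy of the 65-byte frame appears to have faulted on its
first byte. That would be consistent with xsk_generic_rcv being handed a
near-NULL buffer address, although the trace alone does not establish the
cause.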
