<html>
<head>
<base href="https://bugs.dpdk.org/">
</head>
<body><table border="1" cellspacing="0" cellpadding="8" class="bz_new_table">
<tr>
<th>Bug ID</th>
<td><a class="bz_bug_link
bz_status_UNCONFIRMED "
title="UNCONFIRMED - memif disconnect thread vs. rx_burst worker thread(s) crash"
href="https://bugs.dpdk.org/show_bug.cgi?id=1298">1298</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>memif disconnect thread vs. rx_burst worker thread(s) crash
</td>
</tr>
<tr>
<th>Product</th>
<td>DPDK
</td>
</tr>
<tr>
<th>Version</th>
<td>21.11
</td>
</tr>
<tr>
<th>Hardware</th>
<td>All
</td>
</tr>
<tr>
<th>OS</th>
<td>All
</td>
</tr>
<tr>
<th>Status</th>
<td>UNCONFIRMED
</td>
</tr>
<tr>
<th>Severity</th>
<td>normal
</td>
</tr>
<tr>
<th>Priority</th>
<td>Normal
</td>
</tr>
<tr>
<th>Component</th>
<td>ethdev
</td>
</tr>
<tr>
<th>Assignee</th>
<td>dev@dpdk.org
</td>
</tr>
<tr>
<th>Reporter</th>
<td>bly454@gmail.com
</td>
</tr>
<tr>
<th>Target Milestone</th>
<td>---
</td>
</tr></table>
<p>
<div class="bz_comment_block">
<pre class="bz_comment_text">Hello,
We have run into a timing issue between threads when using the memif interface
type and need some guidance.
Our application has a DPDK based process operating (among other things) a memif
server interface. The problem is exposed when this memif interface receives a
memif.disconnect message from the remote client, while in the middle of an
rte_eth_rx_burst() on this same memif interface. As the IRQ message handling is
on its own thread as compared to the DPDK worker thread doing the rx_burst,
this resulted in a crash. The backtraces for which have been shared below.
How does one ensure there are guard rails in place to gracefully exit the
rx-burst when a disconnect occurs? Or, how do we properly modify the code such
that we defer responding to the disconnect CB after the rx-burst operation has
completed?
We are utilizing DPDK 21.11.2. I have diff’d dpdks-stable:22.11.3 in
./drivers/net/memif, but I do not see anything obvious that would address this.
I did a similar diff for dpdk:23.07, but do not see anything obvious there
either.
-Mike
(gdb) thread 1
[Switching to thread 1 (Thread 0x7f17e2813600 (LWP 470))]
#0 0x00007f17e374d225 in eth_memif_rx (queue=0x1189023b00,
bufs=0x7f17e28100e8, nb_pkts=32) at
../git/drivers/net/memif/rte_eth_memif.c:338
338 last_slot = __atomic_load_n(&ring->head,
__ATOMIC_ACQUIRE);
(gdb) bt
#0 0x00007f17e374d225 in eth_memif_rx (queue=0x1189023b00,
bufs=0x7f17e28100e8, nb_pkts=32) at
../git/drivers/net/memif/rte_eth_memif.c:338
#1 0x000000000047e6fb in rte_eth_rx_burst (nb_pkts=32, rx_pkts=0x7f17e28100e8,
queue_id=0, port_id=<optimized out>) at /usr/include/rte_ethdev.h:5368
#2 pmd_main_loop () at ../git/swfw/api/src/swfwPmd.c:1086
#3 0x000000000047f309 in pmd_launch_one_lcore (dummy=<optimized out>) at
../git/my_process.c:1157
#4 0x00007f17f7070e7c in eal_thread_loop (arg=<optimized out>) at
../git/lib/eal/linux/eal_thread.c:146
#5 0x00007f17f4c3da72 in start_thread (arg=<optimized out>) at
pthread_create.c:442
#6 0x00007f17f4cbf930 in clone3 () at
../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
(gdb) l
333 ring_size = 1 << mq->log2_ring_size;
334 mask = ring_size - 1;
335
336 if (type == MEMIF_RING_C2S) {
337 cur_slot = mq->last_head;
338 last_slot = __atomic_load_n(&ring->head,
__ATOMIC_ACQUIRE);
339 } else {
340 cur_slot = mq->last_tail;
341 last_slot = __atomic_load_n(&ring->tail,
__ATOMIC_ACQUIRE);
342 }
(gdb) p ring->head
Cannot access memory at address 0x7f17d8e58006
(gdb) thread 19
[Switching to thread 19 (Thread 0x7f17f0804600 (LWP 468))]
#0 0x00007f17f4caf97b in __GI___close (fd=494) at
../sysdeps/unix/sysv/linux/close.c:27
27 return SYSCALL_CANCEL (close, fd);
(gdb) bt
#0 0x00007f17f4caf97b in __GI___close (fd=494) at
../sysdeps/unix/sysv/linux/close.c:27
#1 0x00007f17e374f01f in memif_free_regions (dev=dev@entry=0x7f17f727f000
<rte_eth_devices+99072>) at ../git/drivers/net/memif/rte_eth_memif.c:882
#2 0x00007f17e37475d0 in memif_disconnect (dev=0x7f17f727f000
<rte_eth_devices+99072>) at ../git/drivers/net/memif/memif_socket.c:623
#3 0x00007f17f7091bd2 in eal_intr_process_interrupts (nfds=<optimized out>,
events=<optimized out>) at ../git/lib/eal/linux/eal_interrupts.c:1026
#4 eal_intr_handle_interrupts (totalfds=<optimized out>, pfd=20) at
../git/lib/eal/linux/eal_interrupts.c:1100
#5 eal_intr_thread_main (arg=<optimized out>) at
../git/lib/eal/linux/eal_interrupts.c:1172
#6 0x00007f17f4c3da72 in start_thread (arg=<optimized out>) at
pthread_create.c:442
#7 0x00007f17f4cbf930 in clone3 () at
../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
</pre>
</div>
</p>
<hr>
<span>You are receiving this mail because:</span>
<ul>
<li>You are the assignee for the bug.</li>
</ul>
<div itemscope itemtype="http://schema.org/EmailMessage">
<div itemprop="action" itemscope itemtype="http://schema.org/ViewAction">
<link itemprop="url" href="https://bugs.dpdk.org/show_bug.cgi?id=1298">
<meta itemprop="name" content="View bug">
</div>
<meta itemprop="description" content="Bugzilla bug update notification">
</div>
</body>
</html>