<html>
    <head>
      <base href="https://bugs.dpdk.org/">
    </head>
    <body><table border="1" cellspacing="0" cellpadding="8" class="bz_new_table">
        <tr>
          <th>Bug ID</th>
          <td><a class="bz_bug_link 
          bz_status_UNCONFIRMED "
   title="UNCONFIRMED - examples/l3fwd: in event-mode hash.txadapter.txq is not always updated"
   href="https://bugs.dpdk.org/show_bug.cgi?id=1391">1391</a>
          </td>
        </tr>

        <tr>
          <th>Summary</th>
          <td>examples/l3fwd: in event-mode hash.txadapter.txq is not always updated
          </td>
        </tr>

        <tr>
          <th>Product</th>
          <td>DPDK
          </td>
        </tr>

        <tr>
          <th>Version</th>
          <td>unspecified
          </td>
        </tr>

        <tr>
          <th>Hardware</th>
          <td>All
          </td>
        </tr>

        <tr>
          <th>OS</th>
          <td>All
          </td>
        </tr>

        <tr>
          <th>Status</th>
          <td>UNCONFIRMED
          </td>
        </tr>

        <tr>
          <th>Severity</th>
          <td>normal
          </td>
        </tr>

        <tr>
          <th>Priority</th>
          <td>Normal
          </td>
        </tr>

        <tr>
          <th>Component</th>
          <td>examples
          </td>
        </tr>

        <tr>
          <th>Assignee</th>
          <td>dev@dpdk.org
          </td>
        </tr>

        <tr>
          <th>Reporter</th>
          <td>konstantin.v.ananyev@yandex.ru
          </td>
        </tr>

        <tr>
          <th>CC</th>
          <td>pbhagavatula@marvell.com
          </td>
        </tr>

        <tr>
          <th>Target Milestone</th>
          <td>---
          </td>
        </tr></table>
      <p>
        <div class="bz_comment_block">
          <pre class="bz_comment_text">Reproducible with latest main branch.

l3fwd in event-mode with SW eventdev on mlx5 PMDs can crash:

./dpdk-l3fwd --lcores=49,51,53,55,57 -n 6 -a ca:00.0 -a ca:00.1 -a cb:00.0 \
-a cb:00.1 -s 0x8000000000000 --vdev event_sw0 -- -L -P -p f \
--rx-queue-size 1024 --tx-queue-size 1024 --mode eventdev \
--eventq-sched=ordered --rule_ipv4=test/l3fwd_lpm_v4_u1.cfg \
--rule_ipv6=test/l3fwd_lpm_v6_u1.cfg --no-numa

Thread 4 "dpdk-worker51" received signal SIGSEGV, Segmentation fault.
0x000000000135d27f in rte_eth_tx_buffer (tx_pkt=0x17f3ea780, buffer=0x10,
queue_id=43, port_id=1) at ../lib/ethdev/rte_ethdev.h:6637
6637            buffer->pkts[buffer->length++] = tx_pkt;
(gdb) bt
#0  0x000000000135d27f in rte_eth_tx_buffer (tx_pkt=0x17f3ea780, buffer=0x10,
    queue_id=43, port_id=1) at ../lib/ethdev/rte_ethdev.h:6637
#1  txa_service_tx (txa=0x11f89959c0, ev=0x7ffff2f23e10, n=16)
    at ../lib/eventdev/rte_event_eth_tx_adapter.c:631
#2  0x000000000135d3ef in txa_service_func (args=0x11f89959c0)
    at ../lib/eventdev/rte_event_eth_tx_adapter.c:666
#3  0x00000000015d30e1 in service_runner_do_callback (s=0x11ffffe100,
    cs=0x11fffe8500, service_idx=2) at ../lib/eal/common/rte_service.c:405
#4  0x00000000015d3429 in service_run (i=2, cs=0x11fffe8500, service_mask=7,
    s=0x11ffffe100, serialize_mt_unsafe=1)
    at ../lib/eal/common/rte_service.c:441
#5  0x00000000015d363f in service_runner_func (arg=0x0)
    at ../lib/eal/common/rte_service.c:513
#6  0x00000000015c12c1 in eal_thread_loop (arg=0x33)
    at ../lib/eal/common/eal_common_thread.c:212
#7  0x00000000015e1b98 in eal_worker_thread_loop (arg=0x33)
    at ../lib/eal/linux/eal.c:916
#8  0x00007ffff5ff76ea in start_thread () from /lib64/libpthread.so.0
#9  0x00007ffff5d0fa8f in clone () from /lib64/libc.so.6

Obviously 'queue_id=43' is wrong here: the crash happens on an attempt to
access an unconfigured TX queue.

Two different problems coincide here:
1. The EVENT framework silently and unconditionally re-uses mbuf::hash.fdir for
its own purposes:
                        struct {
                                uint32_t reserved1;
                                uint16_t reserved2;
                                uint16_t txq;
                                /**< The event eth Tx adapter uses this field
                                 * to store Tx queue id.
                                 * @see rte_event_eth_tx_adapter_txq_set()
                                 */
                        } txadapter; /**< Eventdev ethdev Tx adapter */
In particular, txa_service_tx() expects hash.txadapter.txq to contain a valid
TX queue index.
However, l3fwd does not always set it properly.
Usually that is fine for this particular app, as only queue 0 is in use and it
doesn't configure PMDs
to overwrite the mbuf::hash.fdir.hi value (RTE_MBUF_F_RX_FDIR).
But if for whatever reason a PMD overwrites mbuf::hash.fdir.hi with some
non-zero value, then we are in trouble.
2. That's exactly what is happening here: the mlx5 driver sometimes
superfluously updates mbuf::hash.fdir.hi.
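
To illustrate the overlap, here is a minimal sketch built on a simplified
stand-in for the mbuf hash union. mock_hash, mock_pmd_rx() and mock_txq_get()
are hypothetical names made up for this illustration, not DPDK API, and the
real union has more members than shown:

```c
/* Standard header, in quoted form so its name stays literal text in this
 * HTML-formatted report. */
#include "stdint.h"

/* Simplified stand-in for the mbuf hash union: only the two views that
 * matter here are kept. This is NOT the real struct rte_mbuf layout. */
union mock_hash {
	struct {
		uint32_t lo;
		uint32_t hi;	/* written by the PMD on RX in this scenario */
	} fdir;
	struct {
		uint32_t reserved1;
		uint16_t reserved2;
		uint16_t txq;	/* read by the Tx adapter; overlaps fdir.hi */
	} txadapter;
};

/* Hypothetical helper: mimics a PMD storing a value in hash.fdir.hi. */
static void
mock_pmd_rx(union mock_hash *h, uint32_t fdir_hi)
{
	h->fdir.hi = fdir_hi;
}

/* Hypothetical helper: returns the queue id as seen through txadapter. */
static uint16_t
mock_txq_get(const union mock_hash *h)
{
	return h->txadapter.txq;
}
```

With fdir_hi = 0x002B002B (both 16-bit halves equal 43, so byte order does not
matter), mock_txq_get() returns 43 although nobody ever stored a queue id.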

The fix I applied locally is obvious: *always* set hash.txadapter.txq to a
proper value before calling rte_event_enqueue_burst().
See below for details.
Note that it is not the 'complete' fix, as the same needs to be done for the
other codepaths (em, fib, acl, etc.).
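
Reduced to its essence, the change in behaviour looks roughly like this. It is
only a sketch with made-up names: mock_mbuf, the MOCK_TX_* flags and both
mock_prepare_*() helpers are hypothetical stand-ins for the l3fwd code, not
the actual implementation:

```c
/* Standard header, in quoted form so its name stays literal text in this
 * HTML-formatted report. */
#include "stdint.h"

#define MOCK_TX_ENQ	(1u << 0)	/* stand-in for L3FWD_EVENT_TX_ENQ */
#define MOCK_TX_DIRECT	(1u << 1)	/* stand-in for L3FWD_EVENT_TX_DIRECT */

/* Stand-in for an mbuf: only the Tx adapter queue field is modelled. */
struct mock_mbuf {
	uint16_t txq;
};

/* Behaviour before the fix: txq is refreshed on the TX_DIRECT path only,
 * so on the TX_ENQ path a stale value left by the PMD survives. */
static void
mock_prepare_conditional(struct mock_mbuf *m, unsigned int flags)
{
	if (flags & MOCK_TX_DIRECT)
		m->txq = 0;
}

/* Behaviour with the fix: txq is refreshed on every path. */
static void
mock_prepare_always(struct mock_mbuf *m, unsigned int flags)
{
	(void)flags;	/* deliberately ignored */
	m->txq = 0;
}
```

Starting from a stale txq of 43 and flags == MOCK_TX_ENQ, the conditional
variant leaves 43 in place, while the unconditional one resets it to 0.
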
As a more general point, I don't understand why the EVENT framework keeps using
hash.fdir for its own purposes,
especially in a completely silent and unconditional way.
I think it would be much cleaner to switch to an mbuf dynfield/dynflag based
approach.

diff --git a/examples/l3fwd/l3fwd_lpm.c b/examples/l3fwd/l3fwd_lpm.c
index a484a33089..ef9838aef3 100644
--- a/examples/l3fwd/l3fwd_lpm.c
+++ b/examples/l3fwd/l3fwd_lpm.c
@@ -285,6 +285,8 @@ lpm_event_loop_single(struct l3fwd_event_resources *evt_rsrc,
                        continue;
                }

+               rte_event_eth_tx_adapter_txq_set(ev.mbuf, 0);
+
                if (flags & L3FWD_EVENT_TX_ENQ) {
                        ev.queue_id = tx_q_id;
                        ev.op = RTE_EVENT_OP_FORWARD;
@@ -295,7 +297,6 @@ lpm_event_loop_single(struct l3fwd_event_resources *evt_rsrc,
                }

                if (flags & L3FWD_EVENT_TX_DIRECT) {
-                       rte_event_eth_tx_adapter_txq_set(ev.mbuf, 0);
                        do {
                                enq = rte_event_eth_tx_adapter_enqueue(
                                        event_d_id, event_p_id, &ev, 1, 0);
@@ -344,11 +345,8 @@ lpm_event_loop_burst(struct l3fwd_event_resources *evt_rsrc,
                                events[i].op = RTE_EVENT_OP_FORWARD;
                        }

-                       if (flags & L3FWD_EVENT_TX_DIRECT)
-                               rte_event_eth_tx_adapter_txq_set(events[i].mbuf,
-                                                                0);
-
                        lpm_process_event_pkt(lconf, events[i].mbuf);
+                       rte_event_eth_tx_adapter_txq_set(events[i].mbuf, 0);
                }

                if (flags & L3FWD_EVENT_TX_ENQ) {
          </pre>
        </div>
      </p>


      <div itemscope itemtype="http://schema.org/EmailMessage">
        <div itemprop="action" itemscope itemtype="http://schema.org/ViewAction">
          
          <link itemprop="url" href="https://bugs.dpdk.org/show_bug.cgi?id=1391">
          <meta itemprop="name" content="View bug">
        </div>
        <meta itemprop="description" content="Bugzilla bug update notification">
      </div>
    </body>
</html>