<html>
    <head>
      <base href="https://bugs.dpdk.org/">
    </head>
    <body><table border="1" cellspacing="0" cellpadding="8" class="bz_new_table">
        <tr>
          <th>Bug ID</th>
          <td><a class="bz_bug_link 
          bz_status_UNCONFIRMED "
   title="UNCONFIRMED - net/tap: crash in tap pmd when using more than RTE_MP_MAX_FD_NUM rx queues"
   href="https://bugs.dpdk.org/show_bug.cgi?id=1536">1536</a>
          </td>
        </tr>

        <tr>
          <th>Summary</th>
          <td>net/tap: crash in tap pmd when using more than RTE_MP_MAX_FD_NUM rx queues
          </td>
        </tr>

        <tr>
          <th>Product</th>
          <td>DPDK
          </td>
        </tr>

        <tr>
          <th>Version</th>
          <td>22.03
          </td>
        </tr>

        <tr>
          <th>Hardware</th>
          <td>All
          </td>
        </tr>

        <tr>
          <th>OS</th>
          <td>All
          </td>
        </tr>

        <tr>
          <th>Status</th>
          <td>UNCONFIRMED
          </td>
        </tr>

        <tr>
          <th>Severity</th>
          <td>normal
          </td>
        </tr>

        <tr>
          <th>Priority</th>
          <td>Normal
          </td>
        </tr>

        <tr>
          <th>Component</th>
          <td>ethdev
          </td>
        </tr>

        <tr>
          <th>Assignee</th>
          <td>dev@dpdk.org
          </td>
        </tr>

        <tr>
          <th>Reporter</th>
          <td>edwin.brossette@6wind.com
          </td>
        </tr>

        <tr>
          <th>Target Milestone</th>
          <td>---
          </td>
        </tr></table>
      <p>
        <div class="bz_comment_block">
          <pre class="bz_comment_text">Hello,

I have recently stumbled into an issue with my DPDK-based application running
the failsafe pmd. This pmd uses a tap device, with which my application fails
to start if more than 8 rx queues are used. This issue appears to be related to
this patch:
<a href="https://git.dpdk.org/dpdk/commit/?id=c36ce7099c2187926cd62cff7ebd479823554929">https://git.dpdk.org/dpdk/commit/?id=c36ce7099c2187926cd62cff7ebd479823554929</a>

I have seen in the documentation that there was a limitation to 8 max queues
shared when using a tap device shared between multiple processes. However, my
application uses a single primary process, with no secondary process, but it
appears that I am still running into this limitation.

Now if we look at this small chunk of code:

memset(&msg, 0, sizeof(msg));
strlcpy(msg.name, TAP_MP_REQ_START_RXTX, sizeof(msg.name));
strlcpy(request_param->port_name, dev->data->name,
sizeof(request_param->port_name));
msg.len_param = sizeof(*request_param);
for (i = 0; i < dev->data->nb_tx_queues; i++) {
    msg.fds[fd_iterator++] = process_private->txq_fds[i];
    msg.num_fds++;
    request_param->txq_count++;
}
for (i = 0; i < dev->data->nb_rx_queues; i++) {
    msg.fds[fd_iterator++] = process_private->rxq_fds[i];
    msg.num_fds++;
    request_param->rxq_count++;
}
(Note that I am not using the latest DPDK version, but stable v23.11.1. But I
believe the issue is still present on latest.)

There are no checks on the maximum value i can take in the for loops. Since the
size of msg.fds is limited by the maximum of 8 queues shared between process
because of the IPC API, there is a potential buffer overflow which can happen
here.

See the struct declaration:
struct rte_mp_msg {
     char name[RTE_MP_MAX_NAME_LEN];
     int len_param;
     int num_fds;
     uint8_t param[RTE_MP_MAX_PARAM_LEN];
     int fds[RTE_MP_MAX_FD_NUM];
};

This means that if the number of queues used is more than 8, the program will
crash. This is what happens on my end as I get the following log:
*** stack smashing detected ***: terminated

Reverting the commit mentioned above fixes my issue. Also setting a check like
this works for me:

if (dev->data->nb_tx_queues + dev->data->nb_rx_queues > RTE_MP_MAX_FD_NUM)
     return -1;

I've made the changes on my local branch to fix my issue.

----------

Potential fixes discussed: 

1. Add "nb_rx_queues > RTE_MP_MAX_FD_NUM" check to not blindly update the
'msg.fds[]'

2. Prevent this to be a limit for tap PMD when there is only a primary process.
          </pre>
        </div>
      </p>


      <hr>
      <span>You are receiving this mail because:</span>

      <ul>
          <li>You are the assignee for the bug.</li>
      </ul>
      <div itemscope itemtype="http://schema.org/EmailMessage">
        <div itemprop="action" itemscope itemtype="http://schema.org/ViewAction">
          
          <link itemprop="url" href="https://bugs.dpdk.org/show_bug.cgi?id=1536">
          <meta itemprop="name" content="View bug">
        </div>
        <meta itemprop="description" content="Bugzilla bug update notification">
      </div>
    </body>
</html>