<div dir="ltr"><div>Hello,</div><div><br></div><div>I created a Bugzilla PR, just as you requested:</div><div><a href="https://bugs.dpdk.org/show_bug.cgi?id=1536">https://bugs.dpdk.org/show_bug.cgi?id=1536</a></div><div><br></div><div>As for the bug resolution, I have other matters to attend to and I'm afraid I cannot spend more time on this issue, so I was only planning to report it.</div><div><br></div><div>Regards,<br>Edwin Brossette.</div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Fri, Sep 6, 2024 at 1:16 PM Ferruh Yigit <<a href="mailto:ferruh.yigit@amd.com">ferruh.yigit@amd.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">On 9/5/2024 1:55 PM, Edwin Brossette wrote:<br>
> Hello,<br>
> <br>
> I have recently stumbled into an issue with my DPDK-based application<br>
> running the failsafe pmd. This pmd uses a tap device, with which my<br>
> application fails to start if more than 8 rx queues are used. This issue<br>
> appears to be related to this patch:<br>
> <a href="https://git.dpdk.org/dpdk/commit/" rel="noreferrer" target="_blank">https://git.dpdk.org/dpdk/commit/</a>?<br>
> id=c36ce7099c2187926cd62cff7ebd479823554929 <<a href="https://git.dpdk.org/dpdk/" rel="noreferrer" target="_blank">https://git.dpdk.org/dpdk/</a><br>
> commit/?id=c36ce7099c2187926cd62cff7ebd479823554929><br>
> <br>
> I have seen in the documentation that there was a limitation to 8 max<br>
> queues shared when using a tap device shared between multiple processes.<br>
> However, my application uses a single primary process, with no secondary<br>
> process, but it appears that I am still running into this limitation.<br>
> <br>
> Now if we look at this small chunk of code:<br>
> <br>
> memset(&msg, 0, sizeof(msg));<br>
> strlcpy(<a href="http://msg.name" rel="noreferrer" target="_blank">msg.name</a> <<a href="http://msg.name" rel="noreferrer" target="_blank">http://msg.name</a>>, TAP_MP_REQ_START_RXTX,<br>
> sizeof(<a href="http://msg.name" rel="noreferrer" target="_blank">msg.name</a> <<a href="http://msg.name" rel="noreferrer" target="_blank">http://msg.name</a>>));<br>
> strlcpy(request_param->port_name, dev->data->name, sizeof(request_param-<br>
>>port_name));<br>
> msg.len_param = sizeof(*request_param);<br>
> for (i = 0; i < dev->data->nb_tx_queues; i++) {<br>
> msg.fds[fd_iterator++] = process_private->txq_fds[i];<br>
> msg.num_fds++;<br>
> request_param->txq_count++;<br>
> }<br>
> for (i = 0; i < dev->data->nb_rx_queues; i++) {<br>
> msg.fds[fd_iterator++] = process_private->rxq_fds[i];<br>
> msg.num_fds++;<br>
> request_param->rxq_count++;<br>
> }<br>
> (Note that I am not using the latest DPDK version, but stable v23.11.1.<br>
> But I believe the issue is still present on latest.)<br>
> <br>
> There are no checks on the maximum value i can take in the for loops.<br>
> Since the size of msg.fds is limited by the maximum of 8 queues shared<br>
> between process because of the IPC API, there is a potential buffer<br>
> overflow which can happen here.<br>
> <br>
> See the struct declaration:<br>
> struct rte_mp_msg {<br>
> char name[RTE_MP_MAX_NAME_LEN];<br>
> int len_param;<br>
> int num_fds;<br>
> uint8_t param[RTE_MP_MAX_PARAM_LEN];<br>
> int fds[RTE_MP_MAX_FD_NUM];<br>
> };<br>
> <br>
> This means that if the number of queues used is more than 8, the program<br>
> will crash. This is what happens on my end as I get the following log:<br>
> *** stack smashing detected ***: terminated<br>
> <br>
> Reverting the commit mentionned above fixes my issue. Also setting a<br>
> check like this works for me:<br>
> <br>
> if (dev->data->nb_tx_queues + dev->data->nb_rx_queues > RTE_MP_MAX_FD_NUM)<br>
> return -1;<br>
> <br>
> I've made the changes on my local branch to fix my issue. This mail is<br>
> just to bring attention on this problem.<br>
> Thank you in advance for considering it.<br>
> <br>
<br>
Hi Edwin,<br>
<br>
Thanks for the report, I confirm issue is valid, although that code<br>
changed a little (to increase 8 limit) [3].<br>
<br>
And in this release Stephen put another patch [1] to increase the limit<br>
even more, but irrelevant from the limit, tap code needs to be fixed.<br>
<br>
To fix:<br>
1. We need to add "nb_rx_queues > RTE_MP_MAX_FD_NUM" check you<br>
mentioned, to not blindly update the 'msg.fds[]'<br>
2. We should prevent this to be a limit for tap PMD when there is only<br>
primary process, this seems was oversight in our end.<br>
<br>
<br>
Can you work on the issue or just reporting it?<br>
Can you please report the bug in Bugzilla [2], to record the issue?<br>
<br>
<br>
<br>
[1]<br>
<a href="https://patches.dpdk.org/project/dpdk/patch/20240905162018.74301-1-stephen@networkplumber.org/" rel="noreferrer" target="_blank">https://patches.dpdk.org/project/dpdk/patch/20240905162018.74301-1-stephen@networkplumber.org/</a><br>
<br>
[2]<br>
<a href="https://bugs.dpdk.org/" rel="noreferrer" target="_blank">https://bugs.dpdk.org/</a><br>
<br>
[3]<br>
<a href="https://git.dpdk.org/dpdk/commit/?id=72ab1dc1598e" rel="noreferrer" target="_blank">https://git.dpdk.org/dpdk/commit/?id=72ab1dc1598e</a><br>
<br>
</blockquote></div>