<div dir="ltr"><div>Hello Olivier,</div><div><br></div><div>Thanks, your response helped a lot. I managed to find the root cause of the instability, which is on our side.</div><div>It was due to other internal developments.</div><div>I'll still add an error check on the enqueue operations to catch potential problems earlier, if that suits you.</div><div><br></div><div>Best regards,</div><div><br></div><div>Julien<br></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Tue, Aug 22, 2023 at 10:34 AM Olivier Matz <<a href="mailto:olivier.matz@6wind.com">olivier.matz@6wind.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Hello Julien,<br>
<br>
On Tue, Aug 22, 2023 at 08:34:53AM +0200, jhascoet wrote:<br>
> From: Julien Hascoet <<a href="mailto:ju.hascoet@gmail.com" target="_blank">ju.hascoet@gmail.com</a>><br>
> <br>
> In case of ring full state, we retry the enqueue<br>
> operation in order to avoid mbuf loss.<br>
> <br>
> Fixes: af75078fece ("first public release")<br>
> <br>
> Signed-off-by: Julien Hascoet <<a href="mailto:ju.hascoet@gmail.com" target="_blank">ju.hascoet@gmail.com</a>><br>
> ---<br>
> app/test/test_mbuf.c | 15 ++++++++++++---<br>
> 1 file changed, 12 insertions(+), 3 deletions(-)<br>
> <br>
> diff --git a/app/test/test_mbuf.c b/app/test/test_mbuf.c<br>
> index efac01806b..ad18bf6378 100644<br>
> --- a/app/test/test_mbuf.c<br>
> +++ b/app/test/test_mbuf.c<br>
> @@ -1033,12 +1033,21 @@ test_refcnt_iter(unsigned int lcore, unsigned int iter,<br>
> tref += ref;<br>
> if ((ref & 1) != 0) {<br>
> rte_pktmbuf_refcnt_update(m, ref);<br>
> - while (ref-- != 0)<br>
> - rte_ring_enqueue(refcnt_mbuf_ring, m);<br>
> + while (ref-- != 0) {<br>
> + /* retry in case of failure */<br>
> + while (rte_ring_enqueue(refcnt_mbuf_ring, m) != 0) {<br>
> + /* let others consume */<br>
> + rte_pause();<br>
> + }<br>
> + }<br>
> } else {<br>
> while (ref-- != 0) {<br>
> rte_pktmbuf_refcnt_update(m, 1);<br>
> - rte_ring_enqueue(refcnt_mbuf_ring, m);<br>
> + /* retry in case of failure */<br>
> + while (rte_ring_enqueue(refcnt_mbuf_ring, m) != 0) {<br>
> + /* let others consume */<br>
> + rte_pause();<br>
> + }<br>
> }<br>
> }<br>
> rte_pktmbuf_free(m);<br>
> -- <br>
> 2.34.1<br>
> <br>
<br>
Can you give some more details about how to reproduce the issue?<br>
<br>
From what I see, the code does the following:<br>
<br>
main core:<br>
create a ring with at least (REFCNT_MBUF_NUM * REFCNT_MAX_REF) entries<br>
create an mbuf pool with REFCNT_MBUF_NUM entries<br>
start worker cores<br>
do REFCNT_MAX_ITER times:<br>
for each mbuf of the pool (REFCNT_MBUF_NUM entries):<br>
let r be a random number between 1 and REFCNT_MAX_REF<br>
increase mbuf references by r, and enqueue r times in the ring<br>
wait until the ring is empty (since worker cores are dequeuing mbufs)<br>
stop worker cores<br>
<br>
worker cores:<br>
dequeue packets from the ring and free them until asked to stop<br>
<br>
<br>
I may be mistaken, but I don't see how the number of mbufs in the ring could<br>
exceed REFCNT_MBUF_NUM * REFCNT_MAX_REF.<br>
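<br>
The bound above can be checked with a small standalone sketch. It is not the<br>
code from test_mbuf.c: the constant values are illustrative assumptions, the<br>
helper pass_occupancy() is hypothetical, and a simple LCG stands in for the<br>
test's random source. It counts how many entries one pass over the pool would<br>
leave in the ring if no worker dequeued anything, which is the worst case.<br>
<pre>
```c
/* Illustrative values only (assumption); the real test defines its own. */
#define REFCNT_MBUF_NUM 64u
#define REFCNT_MAX_REF 128u
#define REFCNT_RING_SIZE (REFCNT_MBUF_NUM * REFCNT_MAX_REF)

/*
 * Hypothetical helper: ring occupancy after one pass over the pool,
 * assuming no worker core dequeues anything during the pass.
 */
static unsigned int
pass_occupancy(unsigned int seed)
{
	unsigned int occupancy = 0;
	unsigned int state = seed;
	unsigned int i;

	for (i = 0; i < REFCNT_MBUF_NUM; i++) {
		unsigned int r;

		/* Small LCG used as a stand-in random source. */
		state = state * 1103515245u + 12345u;
		/* r is between 1 and REFCNT_MAX_REF, inclusive. */
		r = 1u + (state >> 16) % REFCNT_MAX_REF;
		/* The mbuf is enqueued r times. */
		occupancy += r;
	}
	return occupancy;
}
```
</pre>
For any seed, pass_occupancy(seed) stays at or below REFCNT_RING_SIZE, because<br>
each of the REFCNT_MBUF_NUM mbufs contributes at most REFCNT_MAX_REF entries.<br>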
<br>
Regards,<br>
Olivier<br>
<br>
<br>
Note: removing CC <a href="mailto:maintainers@dpdk.org" target="_blank">maintainers@dpdk.org</a><br>
</blockquote></div>