[dpdk-dev] [PATCH v2] mbuf: optimize rte_mbuf_refcnt_update

Hanoch Haim (hhaim) hhaim at cisco.com
Tue Jan 5 12:11:24 CET 2016


Hi Oliver, 
Thank you for the fast response and it would be great to open a discussion on that.
In general our project can leverage your optimization and I think it is great (we should have thought about it) . We can use it using the workaround I described.
However, for me it  seems odd that  rte_pktmbuf_attach () that does not *change* anything in m_const, except of the *atomic* ref counter does not work in parallel.
The example I gave is a classic use case of rte_pktmbuf_attach  (multicast ) and I don't see why it wouldn't work after your optimization. 

Do you have a pointer to the documentation that state that that you can't call the atomic ref counter from more than one thread?

Thanks,
Hanoh

-----Original Message-----
From: Olivier MATZ [mailto:olivier.matz at 6wind.com] 
Sent: Tuesday, January 05, 2016 12:58 PM
To: Hanoch Haim (hhaim); bruce.richardson at intel.com
Cc: dev at dpdk.org; Ido Barnea (ibarnea); Itay Marom (imarom)
Subject: Re: [dpdk-dev] [PATCH v2] mbuf: optimize rte_mbuf_refcnt_update

Hi Hanoch,

On 01/04/2016 03:43 PM, Hanoch Haim (hhaim) wrote:
> Hi Oliver,
>
> Let's take your drawing as a reference and add my question The use 
> case is sending a duplicate multicast packet by many threads.
> I can split it to x threads to do the job and with atomic-ref (my multicast not mbuf) count it until it reaches zero.
>
> In my following example the two cores (0 and 1) sending the indirect 
> m1/m2 do alloc/attach/send
>
>      core0			             |	core1
> ---------------------------------                         |---------------------------------------
> m_const=rte_pktmbuf_alloc(mp)             |
>                                                                    |
> while true:                                                 |  while True:
>    m1 =rte_pktmbuf_alloc(mp_64)             |    m2 =rte_pktmbuf_alloc(mp_64)
>    rte_pktmbuf_attach(m1, m_const)         |    rte_pktmbuf_attach(m1, m_const)
>    tx_burst(m1)                                           |    tx_burst(m2)
>
> Is this example is not valid?

For me, m_const is not expected to be used concurrently on several cores. By "used", I mean calling a function that modifies the mbuf, which is the case for rte_pktmbuf_attach().

> BTW this is our workaround
>
>
>    core0			                    |	core1
> ---------------------------------                  |---------------------------------------
> m_const=rte_pktmbuf_alloc(mp)      |
> rte_mbuf_refcnt_update(m_const,1)| <<-- workaround
>                                                             |
> while true:                                          |  while True:
>    m1 =rte_pktmbuf_alloc(mp_64)      |    m2 =rte_pktmbuf_alloc(mp_64)
>    rte_pktmbuf_attach(m1, m_const)  |    rte_pktmbuf_attach(m1, m_const)
>    tx_burst(m1)                                     |    tx_burst(m2)

This workaround indeed solves the issue. Another solution would be to protect the call to attach() with a lock, or call all the
rte_pktmbuf_attach() on the same core.

I'm open to discuss this behavior for rte_pktmbuf_attach() function (should concurrent calls be allowed or not). In any case, we may want to better document it in the doxygen API comments.


Regards,
Olivier


More information about the dev mailing list