[dpdk-dev] [PATCH] mbuf: remove inconsistent assert statements
    Bruce Richardson 
    bruce.richardson at intel.com
       
    Thu Jun  9 16:19:48 CEST 2016
    
    
  
On Thu, Jun 09, 2016 at 01:21:18PM +0000, Ananyev, Konstantin wrote:
> Hi Olivier,
> 
> > -----Original Message-----
> > From: Olivier Matz [mailto:olivier.matz at 6wind.com]
> > Sent: Thursday, June 09, 2016 8:47 AM
> > To: Ananyev, Konstantin; dev at dpdk.org; Adrien Mazarguil
> > Subject: Re: [dpdk-dev] [PATCH] mbuf: remove inconsistent assert statements
> > 
> > Hi Konstantin,
> > 
> > >>>>>> Yes, it refcnt supposed to be set to 0 by __rte_pktmbuf_prefree_seg().
> > >>>>>> Wright now, it is a user responsibility to make sure refcnt==0 before pushing
> > >>>>>> mbuf back to the pool.
> > >>>>>> Not sure why do you consider that wrong?
> > >>>>>
> > >>>>> I do not consider this wrong and I'm all for using assert() to catch
> > >>>>> programming errors, however in this specific case, I think they are
> > >>>>> inconsistent and misleading.
> > >>>>
> > >>>> Honestly, I don't understand why.
> > >>>> Right now the rule of thumb is - when mbuf is in the pool, it's refcnt should be equal zero.
> > >>
> > >> What is the purpose of this? Is there some code relying on this?
> > >
> > > The whole current implementation of mbuf_free code path relies on that.
> > > Straight here:
> > > if (likely(NULL != (m = __rte_pktmbuf_prefree_seg(m)))) {
> > >                 m->next = NULL;
> > >                 __rte_mbuf_raw_free(m);
> > >         }
> > >
> > > If we'll exclude indirect mbuf logic, all it does:
> > > if (rte_mbuf_refcnt_update(m, -1) == 0) {
> > > 	m->next = NULL;
> > > 	__rte_mbuf_raw_free(m);
> > > }
> > >
> > > I.E.:
> > > decrement mbuf->refcnt.
> > > If new value of refcnt is zero, then put it back into the pool.
> > >
> > > So having ASERT(mbuf->refcnt==0) inside
> > > __rte_mbuf_raw_free()/__rte_mbuf_raw_alloc()
> > > looks absolutely valid to me.
> > > I *has* to be zero at that point with current implementation,
> > > And if it is not then we probably have (or will have) a silent memory corruption.
> > 
> > This explains how the refcount is used, and why it is set
> > to zero before returning the mbuf to the pool with the mbuf
> > free functions.
> 
> From my point, that shows that rte_pktmbuf_free() relies on the value of the refcnt
> to make a decision is it ok to put mbuf back to the pool or not.
> Right now it puts mbuf to the pool *only* if it's refcnt==0.
> As discussed below, we probably can change it to be refcnt==1
> (if there really would be noticeable performance gain).
> But I think it still should be just one predefined value of refcnt (0 or 1).
> In theory it is possible to allow both (0 and 1),
> but that will make it hard to debug any alloc/free issues,
> plus would neglect any possible performance gain -
> as in that case raw_alloc (or it's analog) should still do
> mbuf->refcnt=1; 
> 
I think enforcing the rule of the refcnt being 1 inside the mempool is reasonable.
It saves setting the flag to zero and back to one again in the normal case. The
one case, that I can think of, where it does cause extra work is when two
threads simultaneously do the atomic decrement of the refcnt and one of them
gets zero and must free the buffer. However, in that case, the additional cost
of setting the count back to 1 on free is far less than the penalty paid on the
atomic decrement - and this should definitely be an edge-case anyway, since it
requires the two threads to free the same mbuf at the same time.
If we make this change, it also may open the possibility to look at moving the
refcount to cacheline 1 if we are reorganising the mbuf. It would no longer be
needed inside RX in our PMDs, and the path to free mbufs already uses the pool
pointer from the second cache-line anyway. That would free up 16-bits of space
to expand out the nb_segs and port values to 16-bit each if we wanted.
Regards,
/Bruce
    
    
More information about the dev
mailing list