[dpdk-dev] Accessing 2nd cacheline in rte_pktmbuf_prefree_seg()

Ananyev, Konstantin konstantin.ananyev at intel.com
Wed Feb 14 13:35:19 CET 2018



> -----Original Message-----
> From: Richardson, Bruce
> Sent: Wednesday, February 14, 2018 12:12 PM
> To: Ananyev, Konstantin <konstantin.ananyev at intel.com>
> Cc: Yongseok Koh <yskoh at mellanox.com>; Olivier Matz <olivier.matz at 6wind.com>; dev at dpdk.org
> Subject: Re: [dpdk-dev] Accessing 2nd cacheline in rte_pktmbuf_prefree_seg()
> 
> On Wed, Feb 14, 2018 at 12:03:55PM +0000, Ananyev, Konstantin wrote:
> >
> >
> > > -----Original Message-----
> > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Ananyev, Konstantin
> > > Sent: Wednesday, February 14, 2018 11:48 AM
> > > To: Yongseok Koh <yskoh at mellanox.com>; Olivier Matz <olivier.matz at 6wind.com>
> > > Cc: dev at dpdk.org
> > > Subject: Re: [dpdk-dev] Accessing 2nd cacheline in rte_pktmbuf_prefree_seg()
> > >
> > > Hi Yongseok,
> > >
> > > > > On Feb 13, 2018, at 2:45 PM, Yongseok Koh <yskoh at mellanox.com> wrote:
> > > > >
> > > > > Hi Olivier
> > > > >
> > > > > I'm wondering why rte_pktmbuf_prefree_seg() checks m->next instead of
> > > > > m->nb_segs? As 'next' is in the 2nd cacheline, checking nb_segs seems beneficial
> > > > > in cases where almost all mbufs have a single segment.
> > > > >
> > > > > A customer reported a high rate of cache misses in this code and I thought the
> > > > > following patch could be helpful. I haven't had them try it yet but just wanted
> > > > > to hear from you.
> > > > >
> > > > > I'd appreciate it if you could review this idea.
> > > > >
> > > > > diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
> > > > > index 62740254d..96edbcb9e 100644
> > > > > --- a/lib/librte_mbuf/rte_mbuf.h
> > > > > +++ b/lib/librte_mbuf/rte_mbuf.h
> > > > > @@ -1398,7 +1398,7 @@ rte_pktmbuf_prefree_seg(struct rte_mbuf *m)
> > > > >                if (RTE_MBUF_INDIRECT(m))
> > > > >                        rte_pktmbuf_detach(m);
> > > > >
> > > > > -               if (m->next != NULL) {
> > > > > +               if (m->nb_segs > 1) {
> > > > >                        m->next = NULL;
> > > > >                        m->nb_segs = 1;
> > > > >                }
> > > > > @@ -1410,7 +1410,7 @@ rte_pktmbuf_prefree_seg(struct rte_mbuf *m)
> > > > >                if (RTE_MBUF_INDIRECT(m))
> > > > >                        rte_pktmbuf_detach(m);
> > > > >
> > > > > -               if (m->next != NULL) {
> > > > > +               if (m->nb_segs > 1) {
> > > > >                        m->next = NULL;
> > > > >                        m->nb_segs = 1;
> > > > >                }
> > > >
> > > > Well, m->pool in the 2nd cacheline has to be accessed anyway in order to put the mbuf back into the mempool.
> > > > It looks like the cache miss is unavoidable.
> > >
> > > As a thought: in theory the PMD can store the pool pointer together with each mbuf it has to free,
> > > then the free path could be something like:
> > >
> > > if (rte_pktmbuf_prefree_seg(m[x]) != NULL)
> > >    rte_mempool_put(pool[x], m[x]);
> > >
> > > Then what you suggested above might help.
> >
> > After another thought - we have to check m->next, not m->nb_segs.
> > There could be situations where nb_segs == 1 but m->next != NULL
> > (the 2nd segment of a 3-segment packet, for example).
> > So probably we have to keep it as it is.
> > Sorry for the noise
> > Konstantin
> 
> It's still worth considering as an option. We could check nb_segs for
> the first segment of a packet and thereafter iterate using the next
> pointer.

In the multi-segment case the PMD frees segments (not packets).
It can happen that the first segment has already been freed while the second
has not yet been.
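
To illustrate with a made-up minimal example (free_one_seg() is a hypothetical
helper, not code from any PMD): only the head mbuf carries the real segment
count, so a middle segment can have nb_segs == 1 while its next pointer is
still set, and a check on nb_segs alone would skip clearing m->next for it:

#include <rte_mbuf.h>
#include <rte_mempool.h>

/* Free a single TX segment on completion, as PMDs typically do.
 * For the 2nd segment of a 3-segment packet, seg->nb_segs == 1 but
 * seg->next != NULL, which is why prefree_seg has to check next. */
static void
free_one_seg(struct rte_mbuf *seg)
{
	seg = rte_pktmbuf_prefree_seg(seg);
	if (seg != NULL)
		rte_mempool_put(seg->pool, seg);
}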

> It means that your idea of storing the pool pointer for each
> mbuf becomes useful for single-segment packets.

But then we'd have to support two different flavors of prefree_seg().
An alternative would be to change all PMDs' multi-segment TX paths so that when
the first segment is about to be freed we update nb_segs for the second, and so on.
Both options seem like too much hassle.
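
For reference, here is a rough sketch of the per-slot pool pointer idea in a
TX free path (struct tx_sw_entry and tx_free_bufs() are hypothetical names,
only for illustration). Caching the pool at enqueue time means the caller no
longer reads m->pool when returning the mbuf, although prefree_seg itself
still touches m->next today:

#include <stdint.h>
#include <rte_mbuf.h>
#include <rte_mempool.h>

/* Hypothetical TX software-ring entry: the mempool pointer is cached
 * when the mbuf is enqueued for transmission. */
struct tx_sw_entry {
	struct rte_mbuf *mbuf;
	struct rte_mempool *pool;
};

/* Free 'n' completed TX buffers starting at ring index 'first'
 * (ring size is a power of two, mask == size - 1). */
static void
tx_free_bufs(struct tx_sw_entry *ring, uint16_t first, uint16_t n,
	     uint16_t mask)
{
	uint16_t i;

	for (i = 0; i < n; i++) {
		struct tx_sw_entry *e = &ring[(uint16_t)(first + i) & mask];
		struct rte_mbuf *m = rte_pktmbuf_prefree_seg(e->mbuf);

		if (m != NULL)
			rte_mempool_put(e->pool, m);
		e->mbuf = NULL;
	}
}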

Konstantin

