[dpdk-dev] [RFC 0/8] mbuf: structure reorganization

Bruce Richardson bruce.richardson at intel.com
Thu Feb 16 16:46:19 CET 2017


On Thu, Feb 16, 2017 at 02:48:07PM +0100, Olivier Matz wrote:
> Hi Konstantin,
> 
> Thanks for the feedback.
> Comments inline.
> 
> 
> On Mon, 6 Feb 2017 18:41:27 +0000, "Ananyev, Konstantin"
> <konstantin.ananyev at intel.com> wrote:
> > Hi Olivier,
> > Looks good in general, some comments from me below.
> > Thanks
> > Konstantin
> > 
> > > 
> > > The main changes are:
> > > - reorder structure to increase vector performance on some non-ia
> > >   platforms.
> > > - add a 64bits timestamp field in the 1st cache line  
> > 
> > Wonder why it deserves to be in first cache line?
> > How it differs from seqn below (pure SW stuff right now).
> 
> In case the timestamp is set from a NIC value, it is set in the Rx
> path. So that's why I think it deserve to be located in the 1st cache
> line.
> 
> As you said, the seqn is a pure sw stuff right: it is set in a lib, not
> in a PMD rx path.
> 
> > > - m->next, m->nb_segs, and m->refcnt are always initialized for
> > > mbufs in the pool, avoiding the need of setting m->next (located in
> > > the 2nd cache line) in the Rx path for mono-segment packets.
> > > - change port and nb_segs to 16 bits  
> > 
> > Not that I am completely against it,
> > but changing nb_segs to 16 bits seems like an overkill to me.
> > I think we can keep and extra 8bits for something more useful in
> > future.
> 
> In my case, I use the m->next field to chain more than 256 segments for
> L4 socket buffers. It also updates nb_seg that can overflow. It's not
> a big issue since at the end, nb_seg is decremented for each segment.
> On the other hand, if I enable some sanity checks on mbufs, it
> complains because the number of segments is not equal to nb_seg.
> 
> There is also another use case with fragmentation as discussed recently:
> http://dpdk.org/dev/patchwork/patch/19819/
> 
> Of course, dealing with a long mbuf list is not that efficient,
> but the application can maintain another structure to accelerate the
> access to the middle/end of the list.
> 
> Finally, we have other ideas to get additional 8 bits if required in
> the future, so I don't think it's really a problem.
> 
> 
> > 
> > > - move seqn in the 2nd cache line
> > > 
> > > Things discussed but not done in the patchset:
> > > - move refcnt and nb_segs to the 2nd cache line: many drivers sets
> > >   them in the Rx path, so it could introduce a performance
> > > regression, or  
> > 
> > I wonder can refcnt only be moved into the 2-nd cacheline?
> > As I understand thanks to other change (from above) m->refcnt 
> > will already be initialized, so RX code don't need to touch it.
> > Though yes, it still would require changes in all PMDs.
> 
> Yes, I agree, some fields could be moved in the 2nd cache line once all
> PMDs stop to write them in RX path. I propose to issue some guidelines
> to PMD maintainers at the same time the patchset is pushed. Then we can
> consider changing it in a future version, in case we need more room in
> the 1st mbuf cache line.
>

If we are changing things, we should really do all that now, rather than
storing up future breaks to mbuf. Worst case, we should plan for it
immediately after the release where we make these changes. Have two
releases that break mbuf immediately after each other - and flagged as
such, but keep it stable thereafter. I don't like having technical debt
on mbuf just after we supposedly "fix" it.

/Bruce


More information about the dev mailing list