[dpdk-dev] [PATCH v4] node: switch IPv4 metadata to dynamic mbuf field
Van Haaren, Harry
harry.van.haaren at intel.com
Thu Oct 29 11:17:18 CET 2020
> -----Original Message-----
> From: Thomas Monjalon <thomas at monjalon.net>
> Sent: Wednesday, October 28, 2020 6:08 PM
> To: Nithin Dabilpuram <ndabilpuram at marvell.com>; Van Haaren, Harry
> <harry.van.haaren at intel.com>
> Cc: dev at dpdk.org; Pavan Nikhilesh <pbhagavatula at marvell.com>; Jerin Jacob
> <jerinj at marvell.com>; Ruifeng Wang <ruifeng.wang at arm.com>; Richardson, Bruce
> <bruce.richardson at intel.com>; Ananyev, Konstantin
> <konstantin.ananyev at intel.com>; kirankumark at marvell.com; dev at dpdk.org;
> david.marchand at redhat.com; olivier.matz at 6wind.com
> Subject: Re: [dpdk-dev] [PATCH v4] node: switch IPv4 metadata to dynamic mbuf
> field
>
> 28/10/2020 11:24, Van Haaren, Harry:
> > From: Thomas Monjalon
> > > > + IP4_LOOKUP_NODE_PRIV1_OFF(node->ctx) = node_mbuf_priv1_dynfield_offset;
> > >
> > > That's interesting.
> > > You copy the offset in the node context for better performance.
> > > How much is it better than with global offset variable?
> > > How much it decreases compared to a static mbuf field?
> >
> > Also interested in this topic, I'll offer the logical/theory point of view;
> >
> > With a static field, the offset into the mbuf can be encoded in the instruction
> > stream, meaning there are no d-cache loads to identify particular dynamic field.
> >
> > With a static/global variable, the cache line where the value resides is presumably
> > not hot in cache per burst (assuming an application that does significant work, so
> > not in cache since last burst). Hence overhead estimate could be 1x cache line load
> > per burst.
>
> Would it help to group all dynfields and dynflags offsets
> in the same cache line?
It could, but whether and how much it helps depends on the workload, I think.
Using each cache line fully is always good, so if grouping the offsets together is
reasonable to do, it seems a good idea.
My assumption is that registration of dynamic fields/flags is expected at init time,
and that the values remain constant at runtime. That would leave this cache line
in the "shared" state on each core that uses the mbuf dynfields, so every core can
keep its own read-only copy.
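
As a rough sketch of the grouping idea (not code from the patch): the offsets are
collected in one cache-line-aligned struct, written once at init and only read
afterwards. Everything here is hypothetical: `mock_mbuf`, `dyn_offsets`, the field
names and the 64-byte line size are assumptions for illustration. In real DPDK the
values would be what rte_mbuf_dynfield_register() / rte_mbuf_dynflag_register()
return, and access would go through RTE_MBUF_DYNFIELD() rather than memcpy().

```c
#include <assert.h>
#include <stdalign.h>
#include <stdint.h>
#include <string.h>

#define CACHE_LINE_SIZE 64 /* assumption: 64-byte cache lines */

/* Hypothetical stand-in for struct rte_mbuf: only a dynamic-field area. */
struct mock_mbuf {
	unsigned char dyn_area[32];
};

/*
 * All dynfield offsets and dynflag bit numbers used by the datapath,
 * packed into a single cache line. The alignas on the first member
 * raises the struct's alignment (and size) to one full line.
 */
struct dyn_offsets {
	alignas(CACHE_LINE_SIZE) int node_priv1_off; /* per-node metadata */
	int timestamp_off;                           /* hypothetical 2nd dynfield */
	int rx_flag_bit;                             /* hypothetical dynflag bit */
};

static_assert(sizeof(struct dyn_offsets) == CACHE_LINE_SIZE,
	      "offset table should occupy exactly one cache line");

/* Written once at init time, read-only in the fast path. */
static struct dyn_offsets dyn_offs;

/* Arguments stand in for the values returned by registration at init. */
static void
dyn_offsets_init(int priv1_off, int ts_off, int flag_bit)
{
	dyn_offs.node_priv1_off = priv1_off;
	dyn_offs.timestamp_off = ts_off;
	dyn_offs.rx_flag_bit = flag_bit;
}

/* Fast-path accessors: one load from the shared line, then the mbuf access. */
static void
priv1_set(struct mock_mbuf *m, uint32_t v)
{
	memcpy((unsigned char *)m + dyn_offs.node_priv1_off, &v, sizeof(v));
}

static uint32_t
priv1_get(const struct mock_mbuf *m)
{
	uint32_t v;

	memcpy(&v, (const unsigned char *)m + dyn_offs.node_priv1_off, sizeof(v));
	return v;
}
```

Since nothing writes to `dyn_offs` after init, no core ever has to take the line
exclusively, which is the situation described above.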
Overall, it is unlikely to have much impact on a real-world application, but DPDK
puts performance first! And packing a single cache line full of hot data is best practice :)
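
For reference, the three ways of locating the field discussed in this thread can be
put side by side. This is only an illustrative sketch with hypothetical names
(`demo_mbuf`, `demo_node`, `demo_register_priv1`), not the graph library code, and
it measures nothing; it just shows where the offset comes from in each case.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical mbuf with one static field and a dynamic-field area. */
struct demo_mbuf {
	uint32_t static_priv1;      /* static field: offset fixed at compile time */
	unsigned char dyn_area[16]; /* dynamic fields live somewhere in here */
};

/* Hypothetical node context holding a private copy of the offset. */
struct demo_node {
	int priv1_off;
};

/* Global offset variable, set once at init. */
static int global_priv1_off = -1;

/* Stand-in for dynfield registration: returns the offset to use. */
static int
demo_register_priv1(void)
{
	return (int)offsetof(struct demo_mbuf, dyn_area);
}

static uint32_t
read_u32(const struct demo_mbuf *m, int off)
{
	uint32_t v;

	memcpy(&v, (const unsigned char *)m + off, sizeof(v));
	return v;
}

static void
write_u32(struct demo_mbuf *m, int off, uint32_t v)
{
	memcpy((unsigned char *)m + off, &v, sizeof(v));
}

/* 1. Static field: the offset is part of the instruction stream. */
static uint32_t
sum_static(struct demo_mbuf **pkts, unsigned int n)
{
	uint32_t sum = 0;

	for (unsigned int i = 0; i < n; i++)
		sum += pkts[i]->static_priv1;
	return sum;
}

/* 2. Global offset: one extra load per burst, from the global's cache line. */
static uint32_t
sum_global(struct demo_mbuf **pkts, unsigned int n)
{
	const int off = global_priv1_off;
	uint32_t sum = 0;

	for (unsigned int i = 0; i < n; i++)
		sum += read_u32(pkts[i], off);
	return sum;
}

/* 3. Offset copied into the node context, which the node touches anyway. */
static uint32_t
sum_ctx(const struct demo_node *node, struct demo_mbuf **pkts, unsigned int n)
{
	const int off = node->priv1_off;
	uint32_t sum = 0;

	for (unsigned int i = 0; i < n; i++)
		sum += read_u32(pkts[i], off);
	return sum;
}
```

In variants 2 and 3 the offset is loaded once per burst and then reused for every
packet, so the difference between them comes down to which cache line that single
load hits.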