[dpdk-dev] [PATCH 1/6] ring: change head and tail to pointer-width size
Stephen Hemminger
stephen at networkplumber.org
Fri Jan 11 20:55:15 CET 2019
On Fri, 11 Jan 2019 19:12:40 +0000
"Eads, Gage" <gage.eads at intel.com> wrote:
> > -----Original Message-----
> > From: Burakov, Anatoly
> > Sent: Friday, January 11, 2019 4:25 AM
> > To: Eads, Gage <gage.eads at intel.com>; dev at dpdk.org
> > Cc: olivier.matz at 6wind.com; arybchenko at solarflare.com; Richardson, Bruce
> > <bruce.richardson at intel.com>; Ananyev, Konstantin
> > <konstantin.ananyev at intel.com>
> > Subject: Re: [dpdk-dev] [PATCH 1/6] ring: change head and tail to pointer-width
> > size
> >
> > On 10-Jan-19 9:01 PM, Gage Eads wrote:
> > > For 64-bit architectures, doubling the head and tail index widths
> > > greatly increases the time it takes for them to wrap-around (with
> > > current CPU speeds, it won't happen within the author's lifetime).
> > > This is important in avoiding the ABA problem -- in which a thread
> > > mistakes reading the same tail index in two accesses to mean that the
> > > ring was not modified in the intervening time -- in the upcoming
> > > non-blocking ring implementation. Using a 64-bit index makes the
> > > possibility of this occurring effectively zero.
> > >
> > > I tested this commit's performance impact with an x86_64 build on a
> > > dual-socket Xeon E5-2699 v4 using ring_perf_autotest, and the change
> > > made no significant difference -- the few differences appear to be
> > > system noise.
> > > (The test ran on isolcpus cores using a tickless scheduler, but some
> > > variation was still observed.) Each test was run three times and the
> > > results were averaged:
> > >
> > >                                   | 64b head/tail cycle cost minus
> > > Test                              | 32b head/tail cycle cost
> > > ------------------------------------------------------------------
> > > SP/SC single enq/dequeue          |  0.33
> > > MP/MC single enq/dequeue          |  0.00
> > > SP/SC burst enq/dequeue (size 8)  |  0.00
> > > MP/MC burst enq/dequeue (size 8)  |  1.00
> > > SP/SC burst enq/dequeue (size 32) |  0.00
> > > MP/MC burst enq/dequeue (size 32) | -1.00
> > > SC empty dequeue                  |  0.01
> > > MC empty dequeue                  |  0.00
> > >
> > > Single lcore:
> > > SP/SC bulk enq/dequeue (size 8) | -0.36
> > > MP/MC bulk enq/dequeue (size 8) | 0.99
> > > SP/SC bulk enq/dequeue (size 32) | -0.40
> > > MP/MC bulk enq/dequeue (size 32) | -0.57
> > >
> > > Two physical cores:
> > > SP/SC bulk enq/dequeue (size 8) | -0.49
> > > MP/MC bulk enq/dequeue (size 8) | 0.19
> > > SP/SC bulk enq/dequeue (size 32) | -0.28
> > > MP/MC bulk enq/dequeue (size 32) | -0.62
> > >
> > > Two NUMA nodes:
> > > SP/SC bulk enq/dequeue (size 8) | 3.25
> > > MP/MC bulk enq/dequeue (size 8) | 1.87
> > > SP/SC bulk enq/dequeue (size 32) | -0.44
> > > MP/MC bulk enq/dequeue (size 32) | -1.10
> > >
> > > An earlier version of this patch changed the head and tail indexes to
> > > uint64_t, but that caused a performance drop on 32-bit builds. With
> > > uintptr_t, no performance difference is observed on an i686 build.
> > >
> > > Signed-off-by: Gage Eads <gage.eads at intel.com>
> > > ---
> >
> > You're breaking the ABI - version bump for affected libraries is needed.
> >
> > --
> > Thanks,
> > Anatoly
>
> If I'm reading the versioning guidelines correctly, I'll need to gate the changes with the RTE_NEXT_ABI macro and provide a deprecation notice; then, after a full deprecation cycle, we can revert that and bump the library version. Not to mention the 3 ML ACKs.
>
> I'll address this in v2.
My understanding is that the RTE_NEXT_ABI method is not used any more; it was replaced
by rte_experimental. But this kind of change is more of a flag-day event, which means
it needs to be pushed off to a release that is planned as an ABI break (usually once
a year), and that would mean 19.11.