[dpdk-dev] [PATCH v2] ring: enforce reading the tails before ring operations

Gavin Hu (Arm Technology China) Gavin.Hu at arm.com
Fri Mar 8 16:05:14 CET 2019


Hi Konstantin,

> -----Original Message-----
> From: Ananyev, Konstantin <konstantin.ananyev at intel.com>
> Sent: Friday, March 8, 2019 8:13 PM
> To: Gavin Hu (Arm Technology China) <Gavin.Hu at arm.com>; Ilya
> Maximets <i.maximets at samsung.com>; dev at dpdk.org
> Cc: nd <nd at arm.com>; thomas at monjalon.net; jerinj at marvell.com;
> hemant.agrawal at nxp.com; Nipun.gupta at nxp.com; Honnappa Nagarahalli
> <Honnappa.Nagarahalli at arm.com>; olivier.matz at 6wind.com; Richardson,
> Bruce <bruce.richardson at intel.com>; chaozhu at linux.vnet.ibm.com
> Subject: RE: [PATCH v2] ring: enforce reading the tails before ring
> operations
> 
> 
> 
> > -----Original Message-----
> > From: Gavin Hu (Arm Technology China) [mailto:Gavin.Hu at arm.com]
> > Sent: Friday, March 8, 2019 4:23 AM
> > To: Ilya Maximets <i.maximets at samsung.com>; dev at dpdk.org
> > Cc: nd <nd at arm.com>; thomas at monjalon.net; jerinj at marvell.com;
> hemant.agrawal at nxp.com; Nipun.gupta at nxp.com; Honnappa
> > Nagarahalli <Honnappa.Nagarahalli at arm.com>;
> olivier.matz at 6wind.com; Richardson, Bruce
> <bruce.richardson at intel.com>; Ananyev,
> > Konstantin <konstantin.ananyev at intel.com>;
> chaozhu at linux.vnet.ibm.com
> > Subject: RE: [PATCH v2] ring: enforce reading the tails before ring
> operations
> >
> >
> >
> > > -----Original Message-----
> > > From: Gavin Hu (Arm Technology China)
> > > Sent: Thursday, March 7, 2019 6:45 PM
> > > To: Ilya Maximets <i.maximets at samsung.com>; dev at dpdk.org
> > > Cc: nd <nd at arm.com>; thomas at monjalon.net; jerinj at marvell.com;
> > > hemant.agrawal at nxp.com; Nipun.gupta at nxp.com; Honnappa
> Nagarahalli
> > > <Honnappa.Nagarahalli at arm.com>; olivier.matz at 6wind.com;
> > > bruce.richardson at intel.com; konstantin.ananyev at intel.com;
> > > chaozhu at linux.vnet.ibm.com
> > > Subject: RE: [PATCH v2] ring: enforce reading the tails before ring
> operations
> > >
> > >
> > >
> > > > -----Original Message-----
> > > > From: Ilya Maximets <i.maximets at samsung.com>
> > > > Sent: Thursday, March 7, 2019 5:48 PM
> > > > To: Gavin Hu (Arm Technology China) <Gavin.Hu at arm.com>;
> > > dev at dpdk.org
> > > > Cc: nd <nd at arm.com>; thomas at monjalon.net; jerinj at marvell.com;
> > > > hemant.agrawal at nxp.com; Nipun.gupta at nxp.com; Honnappa
> Nagarahalli
> > > > <Honnappa.Nagarahalli at arm.com>; olivier.matz at 6wind.com
> > > > Subject: Re: [PATCH v2] ring: enforce reading the tails before ring
> > > > operations
> > > >
> > > > On 07.03.2019 12:27, Gavin Hu (Arm Technology China) wrote:
> > > > >
> > > > >
> > > > >> -----Original Message-----
> > > > >> From: Ilya Maximets <i.maximets at samsung.com>
> > > > >> Sent: Thursday, March 7, 2019 4:52 PM
> > > > >> To: Gavin Hu (Arm Technology China) <Gavin.Hu at arm.com>;
> > > > >> dev at dpdk.org
> > > > >> Cc: nd <nd at arm.com>; thomas at monjalon.net;
> jerinj at marvell.com;
> > > > >> hemant.agrawal at nxp.com; Nipun.gupta at nxp.com; Honnappa
> > > Nagarahalli
> > > > >> <Honnappa.Nagarahalli at arm.com>; olivier.matz at 6wind.com
> > > > >> Subject: Re: [PATCH v2] ring: enforce reading the tails before ring
> > > > >> operations
> > > > >>
> > > > >> On 07.03.2019 9:45, gavin hu wrote:
> > > > > >>> In weak memory models, like arm64, reading the {prod,cons}.tail may get
> > > > > >>> reordered after reading or writing the ring slots, which corrupts the ring
> > > > > >>> and stale data is observed.
> > > > > >>>
> > > > > >>> This issue was reported by NXP on 8-A72 DPAA2 board. The problem is most
> > > > > >>> likely caused by missing the acquire semantics when reading cons.tail (in
> > > > > >>> SP enqueue) or prod.tail (in SC dequeue) which makes it possible to read a
> > > > > >>> stale value from the ring slots.
> > > > > >>>
> > > > > >>> For MP (and MC) case, rte_atomic32_cmpset() already provides the required
> > > > > >>> ordering. This patch is to prevent reading and writing the ring slots get
> > > > > >>> reordered before reading {prod,cons}.tail for SP (and SC) case.
> > > > >>
> > > > > >> Read barrier rte_smp_rmb() is OK to prevent reading the ring get reordered
> > > > > >> before reading the tail. However, to prevent *writing* the ring get reordered
> > > > > >> *before reading* the tail you need a full memory barrier, i.e. rte_smp_mb().
> > > > >
> > > > > > ISHLD (rte_smp_rmb() is DMB(ishld)) orders LD/LD and LD/ST, while
> > > > > > WMB (ST option) orders ST/ST.
> > > > > > For more details, please refer to Table B2-1, "Encoding of the DMB and
> > > > > > DSB <option> parameter", in
> > > > > > https://developer.arm.com/docs/ddi0487/latest/arm-architecture-reference-manual-armv8-for-armv8-a-architecture-profile
> > > >
> > > > > I see. But you have to change the rte_smp_rmb() function definition in
> > > > > lib/librte_eal/common/include/generic/rte_atomic.h and ensure that all
> > > > > other architectures follow the same rules.
> > > > > Otherwise, this change is logically wrong, because a read barrier, as
> > > > > currently defined, cannot be used to order a load with a store.
> > > >
> > >
> > > > Good points, let me rethink how to handle other architectures.
> > > > A full MB is required for other architectures (x86? PPC?), but for arm, a
> > > > read barrier (load/store and load/load) is enough.
> >
> > Hi Ilya,
> >
> > I would expand the rmb definition to cover load/store, in addition to load/load.
> > For X86, as a strong memory order model, rmb is actually equivalent to mb,
> 
> That's not exactly the case, on x86 we have:
> smp_rmb == compiler_barrier
> smp_mb is a proper memory barrier.
> 
> Konstantin

Sorry, I did not make that clear.
Anyway, on x86, smp_rmb, being a compiler barrier, applies to load/store as well as load/load.
This is also the case for arm, arm64, ppc32 and ppc64.
I will submit a patch to expand the definition of this API.
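
To illustrate why the load/store ordering matters for the ring, here is a
minimal sketch of an SP enqueue / SC dequeue path written with C11 atomics.
It is not the rte_ring code: the struct, the sketch_* names and RING_SIZE are
made up for illustration, and acquire/release accesses stand in for the
explicit barriers discussed in this thread.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define RING_SIZE 8u                  /* hypothetical size, power of two */
#define RING_MASK (RING_SIZE - 1u)
static_assert((RING_SIZE & RING_MASK) == 0, "RING_SIZE must be a power of two");

/* Hypothetical ring layout; must be zero-initialized before use. */
struct sketch_ring {
    _Atomic uint32_t prod_tail;       /* published by the producer */
    _Atomic uint32_t cons_tail;       /* published by the consumer */
    uint32_t prod_head;               /* private to the single producer */
    uint32_t cons_head;               /* private to the single consumer */
    void *slots[RING_SIZE];
};

/* Single-producer enqueue: returns 1 on success, 0 if the ring is full. */
static int sketch_sp_enqueue(struct sketch_ring *r, void *obj)
{
    uint32_t head = r->prod_head;
    /* Acquire load: the slot write below cannot move before it (LD/ST). */
    uint32_t cons_tail =
        atomic_load_explicit(&r->cons_tail, memory_order_acquire);

    if (head - cons_tail == RING_SIZE)
        return 0;

    r->slots[head & RING_MASK] = obj;
    r->prod_head = head + 1;
    /* Release store: the slot write is visible before the new tail. */
    atomic_store_explicit(&r->prod_tail, head + 1, memory_order_release);
    return 1;
}

/* Single-consumer dequeue: returns 1 on success, 0 if the ring is empty. */
static int sketch_sc_dequeue(struct sketch_ring *r, void **obj)
{
    uint32_t head = r->cons_head;
    /* Acquire load: the slot read below cannot move before it (LD/LD). */
    uint32_t prod_tail =
        atomic_load_explicit(&r->prod_tail, memory_order_acquire);

    if (prod_tail == head)
        return 0;

    *obj = r->slots[head & RING_MASK];
    r->cons_head = head + 1;
    /* Release store: the slot read completes before the slot is freed. */
    atomic_store_explicit(&r->cons_tail, head + 1, memory_order_release);
    return 1;
}
```

The acquire load of cons_tail in the enqueue path is what keeps the slot write
from being reordered before the tail read; that is the load/store case a
load/load-only barrier would not cover.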

> 
> > as implemented as a compiler barrier: rte_compiler_barrier,
> > arm32 is also this case.
> > For PPC, both 32 and 64-bit, rmb=wmb=mb, lwsync/sync orders
> > load/store, load/load, store/load, store/store, looking at the table on this page:
> > https://www.ibm.com/developerworks/systems/articles/powerpc.html
> >
> > In summary, we are safe to expand this definition for all the
> > architectures DPDK supports?
> > Any comments are welcome!
> >
> > BR. Gavin
> >
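
As a rough sketch of the semantics proposed above (a barrier that orders a
load against both later loads and later stores), a C11 acquire fence has that
shape. The helper names below are hypothetical and not part of DPDK. As far as
I know, GCC and Clang typically lower this fence to dmb ishld on arm64 and to
a plain compiler barrier on x86, but please treat that as an assumption to
verify against the generated code.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical helper, not a DPDK API: orders an earlier load before
 * later loads and later stores (LD/LD and LD/ST). */
static inline void sketch_load_barrier(void)
{
    atomic_thread_fence(memory_order_acquire);
}

/* Hypothetical consumer-side check: read a flag, then touch the payload.
 * Returns the payload value if the flag was set, 0 otherwise. */
static uint32_t sketch_read_after_flag(_Atomic uint32_t *flag,
                                       uint32_t *payload)
{
    if (atomic_load_explicit(flag, memory_order_relaxed) == 0)
        return 0;

    sketch_load_barrier();

    uint32_t v = *payload;   /* this load cannot move above the flag load */
    *payload = 0;            /* this store cannot move above it either */
    return v;
}
```

A full barrier would also order store/load, which is not needed for the tail
read in the SP/SC paths.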

