[dpdk-dev] [PATCH v2] ring: enforce reading the tails before ring operations

Gavin Hu (Arm Technology China) Gavin.Hu at arm.com
Fri Mar 8 06:27:31 CET 2019


Hi Konstantin,

> -----Original Message-----
> From: Honnappa Nagarahalli <Honnappa.Nagarahalli at arm.com>
> Sent: Friday, March 8, 2019 11:21 AM
> To: Ananyev, Konstantin <konstantin.ananyev at intel.com>; Gavin Hu (Arm
> Technology China) <Gavin.Hu at arm.com>; Ilya Maximets
> <i.maximets at samsung.com>; dev at dpdk.org
> Cc: nd <nd at arm.com>; thomas at monjalon.net; jerinj at marvell.com;
> hemant.agrawal at nxp.com; Nipun.gupta at nxp.com; olivier.matz at 6wind.com;
> Richardson, Bruce <bruce.richardson at intel.com>;
> chaozhu at linux.vnet.ibm.com; nd <nd at arm.com>
> Subject: RE: [PATCH v2] ring: enforce reading the tails before ring operations
> 
> > Hi Gavin,
> >
> > > > >>> In weak memory models, like arm64, reading the {prod,cons}.tail
> > > > >>> may get reordered after reading or writing the ring slots, which
> > > > >>> corrupts the ring and stale data is observed.
> > > > >>>
> > > > > >>> This issue was reported by NXP on 8-A72 DPAA2 board. The problem is
> > > > > >>> most likely caused by missing the acquire semantics when reading
> > > > > >>> cons.tail (in SP enqueue) or prod.tail (in SC dequeue) which
> > > > > >>> makes it possible to read a stale value from the ring slots.
> > > > > >>>
> > > > > >>> For MP (and MC) case, rte_atomic32_cmpset() already provides the
> > > > > >>> required ordering. This patch is to prevent reading and writing the
> > > > > >>> ring slots get reordered before reading {prod,cons}.tail for SP
> > > > > >>> (and SC) case.
> > > > >>
> > > > >> Read barrier rte_smp_rmb() is OK to prevent reading the ring get
> > > > >> reordered before reading the tail. However, to prevent *writing*
> > > > >> the ring get reordered *before reading* the tail you need a full
> > > > >> memory barrier, i.e.
> > > > >> rte_smp_mb().
> > > > >
> > > > > > ISHLD (rte_smp_rmb() is DMB(ishld)) orders LD/LD and LD/ST, while
> > > > > > WMB (ST option) orders ST/ST.
> > > > > > For more details, please refer to: Table B2-1 Encoding of the DMB
> > > > > > and DSB <option> parameter in
> > > > > > https://developer.arm.com/docs/ddi0487/latest/arm-architecture-reference-manual-armv8-for-armv8-a-architecture-profile
> > > >
> > > > I see. But you have to change the rte_smp_rmb() function definition
> > > > in lib/librte_eal/common/include/generic/rte_atomic.h and assure
> > > > that all other architectures follows same rules.
> > > > Otherwise, this change is logically wrong, because read barrier in
> > > > current definition could not be used to order Load with Store.
> > > >
> > >
> > > Good points, let me re-think how to handle for other architectures.
> > > > Full MB is required for other architectures (x86? PPC?), but for arm,
> > > > read barrier (load/store and load/load) is enough.
> >
> > > For x86, I don't think you need any barrier here, as with the IA memory model:
> > > - Reads are not reordered with other reads.
> > > - Writes are not reordered with older reads.
> Agree

I understand that no instruction-level barrier is required on IA here, but what
about a compiler barrier, i.e. rte_compiler_barrier()?
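
To make the question concrete, here is a minimal sketch of what I mean,
assuming a GCC/Clang toolchain. All sketch_* names are hypothetical and not
taken from the ring library; the macro only mirrors the usual empty-asm idiom
for a compiler barrier. It emits no instruction, but the compiler may not move
memory accesses across it, so the slot write stays after the tail load in the
generated code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a compiler-only barrier (GCC/Clang syntax). */
#define sketch_compiler_barrier() __asm__ volatile("" ::: "memory")

/* Hypothetical, simplified single-producer view of a ring. */
struct sketch_view {
    volatile uint32_t cons_tail;  /* written by the consumer */
    uint32_t size;                /* number of slots, power of two */
    uint32_t *slots;
};

/* Store one value at 'head' if the ring has room; returns 1 on success. */
static int sketch_write_if_room(struct sketch_view *v, uint32_t head,
                                uint32_t value)
{
    assert((v->size & (v->size - 1u)) == 0);

    uint32_t cons_tail = v->cons_tail;   /* load the tail first */

    /* No CPU barrier is emitted; this only pins the compiler's ordering. */
    sketch_compiler_barrier();

    if (head - cons_tail >= v->size)
        return 0;                        /* ring is full */

    v->slots[head & (v->size - 1u)] = value;
    return 1;
}
```

On a weakly ordered CPU this alone would not be enough, since the hardware can
still reorder the accesses; it only covers the compiler side of the problem.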

> 
> >
> > BTW, could you explain a bit more why barrier is necessary even on arm
> > here?
> > As I can see, there is a data dependency between the tail value and
> > subsequent address calculations for ring writes/reads.
> > Isn't that sufficient to prevent re-ordering even for weak memory model?
> The tail value affects 'n'. But, the value of 'n' can be speculated because of
> the following 'if' statement:
> 
> if (unlikely(n > *free_entries))
>         n = (behavior == RTE_RING_QUEUE_FIXED) ? 0 : *free_entries;
> 
> The address calculations for actual ring writes/reads do not depend on the
> tail value. Since 'n' can be speculated, the writes/reads can be moved up
> before the load of the tail value.

Good explanation. The address calculations do not depend on tail/n; only the
limit (the last index) depends on it, and that value can be speculated.
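
To illustrate the point, below is a minimal sketch of the SP enqueue and SC
dequeue paths written with C11 atomics rather than the rte_* barriers. The
struct layout and all sketch_* names are hypothetical and simplified, not the
actual rte_ring code. It shows that the slot addresses are derived from the
local head only, so nothing but the acquire load of the opposite tail keeps
the slot accesses from moving above it.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define SKETCH_RING_SIZE 8u                       /* hypothetical size */
#define SKETCH_RING_MASK (SKETCH_RING_SIZE - 1u)
static_assert((SKETCH_RING_SIZE & SKETCH_RING_MASK) == 0,
              "ring size must be a power of two");

struct sketch_ring {
    uint32_t prod_head;          /* private to the single producer */
    uint32_t cons_head;          /* private to the single consumer */
    _Atomic uint32_t prod_tail;  /* published by the producer */
    _Atomic uint32_t cons_tail;  /* published by the consumer */
    void *slots[SKETCH_RING_SIZE];
};

/* SP enqueue, variable-count behaviour: returns how many objects were stored. */
static unsigned sketch_sp_enqueue(struct sketch_ring *r, void *const *objs,
                                  unsigned n)
{
    uint32_t head = r->prod_head;
    /* Acquire: the slot writes below cannot be hoisted above this load. */
    uint32_t cons_tail = atomic_load_explicit(&r->cons_tail,
                                              memory_order_acquire);
    uint32_t free_entries = SKETCH_RING_MASK + cons_tail - head;

    if (n > free_entries)        /* only the limit depends on the tail */
        n = free_entries;

    for (unsigned i = 0; i < n; i++)  /* addresses come from head alone */
        r->slots[(head + i) & SKETCH_RING_MASK] = objs[i];

    r->prod_head = head + n;
    /* Release: the slot writes are visible before the new tail is. */
    atomic_store_explicit(&r->prod_tail, head + n, memory_order_release);
    return n;
}

/* SC dequeue, variable-count behaviour: returns how many objects were read. */
static unsigned sketch_sc_dequeue(struct sketch_ring *r, void **objs,
                                  unsigned n)
{
    uint32_t head = r->cons_head;
    /* Acquire: the slot reads below cannot be hoisted above this load. */
    uint32_t prod_tail = atomic_load_explicit(&r->prod_tail,
                                              memory_order_acquire);
    uint32_t entries = prod_tail - head;

    if (n > entries)
        n = entries;

    for (unsigned i = 0; i < n; i++)
        objs[i] = r->slots[(head + i) & SKETCH_RING_MASK];

    r->cons_head = head + n;
    /* Release: the slot reads complete before the slots are handed back. */
    atomic_store_explicit(&r->cons_tail, head + n, memory_order_release);
    return n;
}
```

As in the ring library, one slot is kept unused in this sketch, so a ring of 8
slots holds at most 7 objects.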

> > Konstantin
> >
> >
> <snip>

