[dpdk-dev] [PATCH v2 09/29] eal/arm64: define I/O device memory barriers for arm64

Jerin Jacob jerin.jacob at caviumnetworks.com
Thu Jan 5 07:24:31 CET 2017


On Thu, Jan 05, 2017 at 01:31:44PM +0800, Jianbo Liu wrote:
> On 4 January 2017 at 18:01, Jerin Jacob <jerin.jacob at caviumnetworks.com> wrote:
> > On Tue, Jan 03, 2017 at 03:48:32PM +0800, Jianbo Liu wrote:
> >> On 27 December 2016 at 17:49, Jerin Jacob
> >> <jerin.jacob at caviumnetworks.com> wrote:
> >> > CC: Jianbo Liu <jianbo.liu at linaro.org>
> >> > Signed-off-by: Jerin Jacob <jerin.jacob at caviumnetworks.com>
> >> > ---
> >> >  lib/librte_eal/common/include/arch/arm/rte_atomic_64.h | 6 ++++++
> >> >  1 file changed, 6 insertions(+)
> >> >
> >> > diff --git a/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h b/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h
> >> > index 78ebea2..ef0efc7 100644
> >> > --- a/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h
> >> > +++ b/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h
> >> > @@ -88,6 +88,12 @@ static inline void rte_rmb(void)
> >> >
> >> >  #define rte_smp_rmb() dmb(ishld)
> >> >
> >> > +#define rte_io_mb() rte_mb()
> >> > +
> >> > +#define rte_io_wmb() rte_wmb()
> >> > +
> >> > +#define rte_io_rmb() rte_rmb()
> >> > +
> >>
> >> I think it's better to use outer shareable dmb for io barrier, instead of dsb.
> >
> > Its is difficult to generalize. AFAIK, from the IO barrier perspective
> > dsb would be the right candidate. But just for the DMA barrier between IO may
> > be outer sharable dmb is enough. In-terms of performance implication, the
> > fastpath code(door bell write) has been changed to relaxed write in all
> > the drivers in this patchset and rte_io_* will be only
> > used by rte_[read/write]8/16/32/64 which will be in slow-path.
> > So, IMO, it better stick with dsb and its safe from the complete IO barrier
> > perspective.
> 
> If so, why not use *mb() directly?

Adding David Marchand, EAL Maintainer.

Instead of rte_io_?. I thought, IO specific constraints can be abstracted
here in rte_io_*. Apart from arm, there other arch like "arc" has similar
constraints. IMHO, no harm in keeping that abstraction.

Thoughts ?

http://lxr.free-electrons.com/ident?i=__iormb

> 
> >
> > At least on ThunderX, I couldn't see any performance difference between
> > using dsb(st) and dmb(oshst) for dma write barrier before the doorbell register
> > write in fastpath. In case there are platforms which has such performance difference,
> > may be could add rte_dma_wmb() and rte_dma_rmb() in future like Linux kernel
> > dma_wmb() and dma_rmb().(But i couldn't  see all the driver are using it,
> > though)
> >
> 
> But there is no io_*mb() in the kernel, so you want to be different?

It is their for arm,arm64,arc architectures in Linux kernel. Please check writel
implementation for arm64

http://lxr.free-electrons.com/source/arch/arm64/include/asm/io.h#L143



More information about the dev mailing list