[dpdk-dev] [PATCH v3 1/2] virtio: one way barrier for packed vring desc avail flags

Gavin Hu (Arm Technology China) Gavin.Hu at arm.com
Thu Sep 12 10:21:04 CEST 2019


Hi Bruce,
> -----Original Message-----
> From: Bruce Richardson <bruce.richardson at intel.com>
> Sent: Wednesday, September 11, 2019 6:03 PM
> To: Gavin Hu (Arm Technology China) <Gavin.Hu at arm.com>
> Cc: Liu, Yong <yong.liu at intel.com>; Wang, Yinan <yinan.wang at intel.com>;
> Maxime Coquelin <maxime.coquelin at redhat.com>; Joyce Kong (Arm
> Technology China) <Joyce.Kong at arm.com>; dev at dpdk.org; nd
> <nd at arm.com>; Bie, Tiwei <tiwei.bie at intel.com>; Wang, Zhihong
> <zhihong.wang at intel.com>; amorenoz at redhat.com; Wang, Xiao W
> <xiao.w.wang at intel.com>; jfreimann at redhat.com; Honnappa Nagarahalli
> <Honnappa.Nagarahalli at arm.com>; Steve Capper <Steve.Capper at arm.com>
> Subject: Re: [dpdk-dev] [PATCH v3 1/2] virtio: one way barrier for packed vring
> desc avail flags
> 
> On Wed, Sep 11, 2019 at 08:32:16AM +0000, Gavin Hu (Arm Technology China)
> wrote:
> > Thanks Marvin, my inline comments.
> >
> > > -----Original Message-----
> > > From: Liu, Yong <yong.liu at intel.com>
> > > Sent: Wednesday, September 11, 2019 2:30 PM
> > > To: Gavin Hu (Arm Technology China) <Gavin.Hu at arm.com>; Wang, Yinan
> > > <yinan.wang at intel.com>; Maxime Coquelin <maxime.coquelin at redhat.com>;
> > > Joyce Kong (Arm Technology China) <Joyce.Kong at arm.com>; dev at dpdk.org
> > > Cc: nd <nd at arm.com>; Bie, Tiwei <tiwei.bie at intel.com>; Wang, Zhihong
> > > <zhihong.wang at intel.com>; amorenoz at redhat.com; Wang, Xiao W
> > > <xiao.w.wang at intel.com>; jfreimann at redhat.com; Honnappa Nagarahalli
> > > <Honnappa.Nagarahalli at arm.com>; Steve Capper <Steve.Capper at arm.com>
> > > Subject: RE: [dpdk-dev] [PATCH v3 1/2] virtio: one way barrier for packed
> > > vring desc avail flags
> > >
> > > Thanks Gavin, my answers are inline.
> > >
> > > > -----Original Message-----
> > > > From: Gavin Hu (Arm Technology China) [mailto:Gavin.Hu at arm.com]
> > > > Sent: Wednesday, September 11, 2019 11:35 AM
> > > > To: Liu, Yong <yong.liu at intel.com>; Wang, Yinan <yinan.wang at intel.com>;
> > > > Maxime Coquelin <maxime.coquelin at redhat.com>; Joyce Kong (Arm Technology
> > > > China) <Joyce.Kong at arm.com>; dev at dpdk.org
> > > > Cc: nd <nd at arm.com>; Bie, Tiwei <tiwei.bie at intel.com>; Wang, Zhihong
> > > > <zhihong.wang at intel.com>; amorenoz at redhat.com; Wang, Xiao W
> > > > <xiao.w.wang at intel.com>; jfreimann at redhat.com; Honnappa Nagarahalli
> > > > <Honnappa.Nagarahalli at arm.com>; Steve Capper <Steve.Capper at arm.com>
> > > > Subject: RE: [dpdk-dev] [PATCH v3 1/2] virtio: one way barrier for packed
> > > > vring desc avail flags
> > > >
> > > > Hi Marvin,
> > > >
> > > > Thanks for your answers. One more question for x86:
> > > > 1. For CIO memory alone or MMIO memory (e.g. a PCI BAR) alone, the compiler
> > > > barrier is enough to keep ordering, and that's why both rte_io_mb and
> > > > rte_cio_mb are defined as compiler barriers, right?
> > >
> > > Yes, that's right for x86.
> > >
> > > > 2. How about the ordering of interleaved CIO and MMIO accesses? For example,
> > > > can a younger store to MMIO be reordered before an older store to CIO? CIO
> > > > may be faster than devices, but could store buffers or caching leave the CIO
> > > > update not visible to the device (in a common doorbell case)?
> > > >
> > >
> > > There's always one kind of cache-coherent engine in the x86 uncore subsystem.
> > > When a CIO write instruction is retired, the data will be in the CPU LLC.
> > > When a device does an inbound read, the request goes to the cache engine
> > > first, which then checks the memory state and retrieves the latest value.
> > I understand from this that the cache-coherent engine works like a
> > hub/coordinator/arbiter for all accesses to the three types of memory (1 -
> > normal memory, 2 - CIO memory, 3 - MMIO memory), and that the ordering
> > behaviors are no different?
> > Then in what scenarios should mfence/sfence/lfence be used? Maybe just
> > mfence is enough to keep the ordering of store/load (which is the only one
> > that might be reordered on x86)?
> > >
> 
> The fence types needed will depend on the memory types used. For example,
> any memory mapped as write-combining will have different behaviour and
> need different fences from the regular write-back memory we are most familiar
> with. For the situations we deal with in DPDK, for regular memory writes
> and MMIO writes, reads won't be reordered with other reads, and writes
> won't be reordered with other writes, so, as you point out, the
> mfence instruction is only rarely needed, and barriers to prevent compiler
> reordering are sufficient in nearly all cases.
Thanks for your explanation of the barriers on x86. It is really helpful for us
in optimizing PMDs for aarch64 with less restrictive barriers without breaking x86 platforms.
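
To illustrate what "less restrictive" means here, the one-way barrier on the
descriptor flags can be sketched in portable C11 roughly as below. This is only
a minimal sketch under assumptions, not the virtio PMD code: the struct layout,
the DEMO_F_AVAIL bit and the helpers desc_publish()/desc_poll() are
hypothetical names, and it uses C11 atomics instead of the rte_* barrier macros.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical descriptor, loosely modelled on a packed ring entry. */
struct demo_desc {
	uint64_t addr;
	uint32_t len;
	uint16_t id;
	_Atomic uint16_t flags;	/* written last by the producer */
};

/* Hypothetical "available" bit, chosen for this sketch only. */
#define DEMO_F_AVAIL ((uint16_t)(1u << 7))

/*
 * Producer side: fill in the payload fields, then publish them with a
 * store-release on flags. The release is a one-way barrier: the earlier
 * stores cannot be reordered after the flags store.
 */
static void
desc_publish(struct demo_desc *d, uint64_t addr, uint32_t len, uint16_t id)
{
	assert(d != NULL);
	d->addr = addr;
	d->len = len;
	d->id = id;
	atomic_store_explicit(&d->flags, DEMO_F_AVAIL, memory_order_release);
}

/*
 * Consumer side: load-acquire the flags first. The payload fields are
 * read only if the descriptor is marked available, and those reads
 * cannot be reordered before the flags load.
 */
static bool
desc_poll(struct demo_desc *d, uint64_t *addr, uint32_t *len, uint16_t *id)
{
	uint16_t flags = atomic_load_explicit(&d->flags, memory_order_acquire);

	if (!(flags & DEMO_F_AVAIL))
		return false;
	*addr = d->addr;
	*len = d->len;
	*id = d->id;
	return true;
}
```

On x86 the store-release and load-acquire would normally compile to plain
stores and loads with only compiler reordering constrained, which matches
Bruce's point above. On aarch64 they would normally become store-release and
load-acquire instructions rather than full barriers.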

> 
> /Bruce

