[dpdk-dev] memory barriers in rte_ring

Stephen Hemminger stephen at networkplumber.org
Thu Mar 27 20:06:20 CET 2014


On Thu, 27 Mar 2014 17:48:21 +0100
Olivier MATZ <olivier.matz at 6wind.com> wrote:

> Hi,
> 
> The commit 286bd05bf7 [1] removed the memory barriers in the ring
> functions. This patch is present in DPDK since version 1.4.0r0, so I
> guess it does not cause any issue.
> 
> But after checking the excellent Linux kernel documentation about memory
> barriers [2], I'm wondering why memory barriers would not be required in
> that case.
> 
> To illustrate the previous behavior (before dpdk 1.4):
> 
>    ring_enqueue()
>      - move producer_head to reserve space in ring (atomically if
>        multi producers)
>      - write objects between producer_head and producer_tail
>      - wmb() to ensure that STORE operations are issued
>      - write producer_tail
> 
>    ring_dequeue()
>      - move consumer_head (atomically if multi consumers)
>      - rmb() to ensure that LOAD operations are issued: the read of
>        consumer_head must occur before the reading of objects ptrs.
>        In fact, rmb() is probably not needed here because knowing the
>        value of consumer_head is required before reading the objects
>        table.
>      - read objects between consumer_head and consumer_tail
>      - write consumer_tail
> 
> The memory barriers have been removed, but in my understanding at least
> the wmb() would be needed according to the generic memory barrier
> documentation. Maybe this is not needed on newest Intel processors?
> Could anyone from Intel enlighten me on this?
> 
> Thanks & regards,
> Olivier
> 
> 
> [1] http://dpdk.org/browse/dpdk/commit/lib/librte_ring/rte_ring.h?id=286bd05bf70d1da1b6017007276c267a1e012c1d
> 
> [2] http://lxr.free-electrons.com/source/Documentation/memory-barriers.txt

Short answer: only a compiler barrier is necessary.

Long answer: for a ring accessed by multiple CPUs, the barriers are equivalent to smp_wmb()
and smp_rmb() in the Linux kernel. On x86, where DPDK is used, these can normally be replaced
by a simple compiler barrier. The kernel has a special flag, X86_OOSTORE, which is enabled
only for a few special cases; for most it is not. When the CPU doesn't do out-of-order
stores, there are no cases where another CPU will see the wrong state.
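
To make that concrete, here is a minimal sketch of a single-producer,
single-consumer ring that uses only a compiler barrier. It is not the
rte_ring code: struct sketch_ring, sketch_enqueue(), sketch_dequeue(),
compiler_barrier() and RING_SIZE are made-up names for illustration. It
assumes GCC or Clang inline-asm syntax, x86 with ordinary write-back memory,
and no non-temporal stores. On a CPU that does reorder stores, the barrier
would have to become a real smp_wmb()/smp_rmb() equivalent.

```c
#include <assert.h>
#include <stdint.h>

/* Compiler-only barrier (GCC/Clang syntax): emits no instruction, but
 * stops the compiler from moving memory accesses across this point. */
#define compiler_barrier() __asm__ __volatile__("" ::: "memory")

#define RING_SIZE 8u /* hypothetical size, must be a power of two */
static_assert((RING_SIZE & (RING_SIZE - 1u)) == 0,
              "RING_SIZE must be a power of two");

/* Hypothetical ring: one producer thread, one consumer thread. */
struct sketch_ring {
    volatile uint32_t prod_tail; /* written by producer, read by consumer */
    volatile uint32_t cons_tail; /* written by consumer, read by producer */
    void *objs[RING_SIZE];
};

/* Returns 0 on success, -1 if the ring is full. */
static int sketch_enqueue(struct sketch_ring *r, void *obj)
{
    uint32_t tail = r->prod_tail;

    if (tail - r->cons_tail == RING_SIZE)
        return -1;

    r->objs[tail & (RING_SIZE - 1u)] = obj;
    /* The object store must be emitted before the tail store. The CPU
     * (assumed x86) keeps stores in program order, so only the compiler
     * needs to be restrained. */
    compiler_barrier();
    r->prod_tail = tail + 1u;
    return 0;
}

/* Returns 0 on success, -1 if the ring is empty. */
static int sketch_dequeue(struct sketch_ring *r, void **obj)
{
    uint32_t head = r->cons_tail;

    if (head == r->prod_tail)
        return -1;

    /* Do not let the compiler hoist the slot load above the tail load. */
    compiler_barrier();
    *obj = r->objs[head & (RING_SIZE - 1u)];
    /* Finish reading the slot before handing it back to the producer. */
    compiler_barrier();
    r->cons_tail = head + 1u;
    return 0;
}
```

The barrier in sketch_enqueue() sits where the old wmb() was: between
writing the objects and publishing the new producer tail.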

