[dpdk-dev] [PATCH v5 1/1] ring: guarantee load/load order in enqueue and dequeue

Ananyev, Konstantin konstantin.ananyev at intel.com
Fri Nov 10 10:59:13 CET 2017



> -----Original Message-----
> From: Jia He [mailto:hejianet at gmail.com]
> Sent: Friday, November 10, 2017 1:51 AM
> To: jerin.jacob at caviumnetworks.com; dev at dpdk.org; olivier.matz at 6wind.com
> Cc: Ananyev, Konstantin <konstantin.ananyev at intel.com>; Richardson, Bruce <bruce.richardson at intel.com>; jianbo.liu at arm.com;
> hemant.agrawal at nxp.com; Jia He <hejianet at gmail.com>; Jia He <jia.he at hxt-semitech.com>; jie2.liu at hxt-semitech.com; bing.zhao at hxt-
> semitech.com
> Subject: [PATCH v5 1/1] ring: guarantee load/load order in enqueue and dequeue
> 
> We observed an rte panic in mbuf_autotest on our Qualcomm arm64 server,
> in __rte_ring_move_cons_head():
> ...
>         do {
>                 /* Restore n as it may change every loop */
>                 n = max;
> 
>                 *old_head = r->cons.head;                //1st load
>                 const uint32_t prod_tail = r->prod.tail; //2nd load
> 
> cpu1(producer)          cpu2(consumer)          cpu3(consumer)
>                         load r->prod.tail
> in enqueue:
> load r->cons.tail
> load r->prod.head
> 
> store r->prod.tail
> 
>                                                 load r->cons.head
>                                                 load r->prod.tail
>                                                 ...
>                                                 store r->cons.{head,tail}
>                         load r->cons.head
> 
> On weakly ordered architectures (powerpc, arm), the 2nd load might be
> reordered before the 1st load, which makes *entries larger than
> intended. This nasty reordering breaks enqueue/dequeue: with the
> interleaving above, r->cons.head ends up larger than the stale
> prod_tail, so *entries becomes huge and the consumer moves forward
> incorrectly.
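> 
> For illustration, here is a minimal stand-alone example of that
> unsigned underflow (the numbers are made up, not taken from the
> crash; the subtraction mirrors the one done on prod_tail and
> *old_head in the ring code):
> 
>     #include <inttypes.h>
>     #include <stdint.h>
>     #include <stdio.h>
> 
>     int main(void)
>     {
>         /* prod.tail as read too early (stale value) */
>         uint32_t prod_tail = 1000;
>         /* cons.head as read afterwards, already advanced past the
>          * stale tail by another consumer */
>         uint32_t cons_head = 1002;
> 
>         /* same kind of subtraction as in __rte_ring_move_cons_head() */
>         uint32_t entries = prod_tail - cons_head;
> 
>         /* prints 4294967294: the consumer believes the ring holds
>          * ~4 billion entries and walks far past the real tail */
>         printf("%" PRIu32 "\n", entries);
>         return 0;
>     }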
> 
> After this patch, even with the above context switches, the old
> cons.head is re-read and *entries recalculated after
> rte_atomic32_cmpset() fails, so no race condition is left.
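> 
> For context, the patched multi-consumer path then reads roughly like
> this (paraphrased and abridged, not a line-for-line copy of
> rte_ring.h):
> 
>     do {
>         n = max;                                  /* restore request */
> 
>         *old_head = r->cons.head;                 /* 1st load */
>         rte_smp_rmb();                            /* added by this patch */
>         const uint32_t prod_tail = r->prod.tail;  /* 2nd load */
> 
>         *entries = prod_tail - *old_head;
>         if (n > *entries)
>             n = (behavior == RTE_RING_QUEUE_FIXED) ? 0 : *entries;
>         if (n == 0)
>             return 0;
> 
>         *new_head = *old_head + n;
>         /* if another consumer moved cons.head first, cmpset fails and
>          * the loop retries, reloading both cons.head and prod.tail */
>     } while (rte_atomic32_cmpset(&r->cons.head,
>                                  *old_head, *new_head) == 0);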
> 
> There is no such issue on x86, because x86 has a strong memory
> ordering model. And since rte_smp_rmb() has no runtime performance
> impact on x86, the same code is kept everywhere, without
> architecture-specific special cases.
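> 
> For reference, the relevant definitions are roughly as follows at the
> time of this patch (paraphrased; see the per-arch rte_atomic headers
> in the EAL for the exact macros):
> 
>     /* x86: only a compiler barrier, no CPU fence is emitted */
>     #define rte_compiler_barrier() do {           \
>             asm volatile ("" : : : "memory");     \
>     } while (0)
>     #define rte_smp_rmb() rte_compiler_barrier()
> 
>     /* arm64: a real load/load fence */
>     #define dmb(opt) asm volatile("dmb " #opt : : : "memory")
>     #define rte_smp_rmb() dmb(ishld)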
> 
> Signed-off-by: Jia He <jia.he at hxt-semitech.com>
> Signed-off-by: jie2.liu at hxt-semitech.com
> Signed-off-by: bing.zhao at hxt-semitech.com
> ---
>  lib/librte_ring/rte_ring.h | 10 ++++++++++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/lib/librte_ring/rte_ring.h b/lib/librte_ring/rte_ring.h
> index 5e9b3b7..3e8085a 100644
> --- a/lib/librte_ring/rte_ring.h
> +++ b/lib/librte_ring/rte_ring.h
> @@ -409,6 +409,11 @@ __rte_ring_move_prod_head(struct rte_ring *r, int is_sp,
>  		n = max;
> 
>  		*old_head = r->prod.head;
> +
> +		/* add rmb barrier to avoid load/load reorder in weak
> +		 * memory model. It is noop on x86 */
> +		rte_smp_rmb();
> +
>  		const uint32_t cons_tail = r->cons.tail;
>  		/*
>  		 *  The subtraction is done between two unsigned 32bits value
> @@ -517,6 +522,11 @@ __rte_ring_move_cons_head(struct rte_ring *r, int is_sc,
>  		n = max;
> 
>  		*old_head = r->cons.head;
> +
> +		/* add rmb barrier to avoid load/load reorder in weak
> +		 * memory model. It is noop on x86 */
> +		rte_smp_rmb();
> +
>  		const uint32_t prod_tail = r->prod.tail;
>  		/* The subtraction is done between two unsigned 32bits value
>  		 * (the result is always modulo 32 bits even if we have
> --

Acked-by: Konstantin Ananyev <konstantin.ananyev at intel.com>

> 2.7.4
