[dpdk-dev] 回复: [RFC PATCH v3 1/5] eal: add new definitions for wait scheme

Feifei Wang Feifei.Wang2 at arm.com
Thu Oct 14 05:08:09 CEST 2021

Previous message (by thread): [dpdk-dev] 回复: [RFC PATCH v3 1/5] eal: add new definitions for wait scheme
Next message (by thread): [dpdk-dev] [PATCH v2 07/18] acl: fix spelling errors
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

> -----邮件原件-----
> 发件人: Ananyev, Konstantin <konstantin.ananyev at intel.com>
> 发送时间: Wednesday, October 13, 2021 11:04 PM
> 收件人: Feifei Wang <Feifei.Wang2 at arm.com>; Ruifeng Wang
> <Ruifeng.Wang at arm.com>
> 抄送: dev at dpdk.org; nd <nd at arm.com>; nd <nd at arm.com>
> 主题: RE: [dpdk-dev] [RFC PATCH v3 1/5] eal: add new definitions for wait
> scheme
> 
> >
> > [snip]
> >
> > > > diff --git a/lib/eal/include/generic/rte_pause.h
> > > > b/lib/eal/include/generic/rte_pause.h
> > > > index 668ee4a184..4e32107eca 100644
> > > > --- a/lib/eal/include/generic/rte_pause.h
> > > > +++ b/lib/eal/include/generic/rte_pause.h
> > > > @@ -111,6 +111,84 @@ rte_wait_until_equal_64(volatile uint64_t
> > > > *addr,
> > > uint64_t expected,
> > > >  	while (__atomic_load_n(addr, memorder) != expected)
> > > >  		rte_pause();
> > > >  }
> > > > +
> > > > +/*
> > > > + * Wait until a 16-bit *addr breaks the condition, with a relaxed
> > > > +memory
> > > > + * ordering model meaning the loads around this API can be reordered.
> > > > + *
> > > > + * @param addr
> > > > + *  A pointer to the memory location.
> > > > + * @param mask
> > > > + *  A mask of value bits in interest
> > > > + * @param expected
> > > > + *  A 16-bit expected value to be in the memory location.
> > > > + * @param cond
> > > > + *  A symbol representing the condition (==, !=).
> > > > + * @param memorder
> > > > + *  Two different memory orders that can be specified:
> > > > + *  __ATOMIC_ACQUIRE and __ATOMIC_RELAXED. These map to
> > > > + *  C++11 memory orders with the same names, see the C++11
> > > > +standard or
> > > > + *  the GCC wiki on atomic synchronization for detailed definition.
> > > > + */
> > >
> > > Hmm, so now we have 2 APIs doing similar thing:
> > > rte_wait_until_equal_n() and rte_wait_event_n().
> > > Can we probably unite them somehow?
> > > At least make rte_wait_until_equal_n() to use rte_wait_event_n()
> underneath.
> > >
> > You are right. We plan to change rte_wait_until_equal API after this
> > new scheme is achieved.  And then, we will merge wait_unil into
> > wait_event definition in the next new patch series.
> >
> > > > +#define rte_wait_event_16(addr, mask, expected, cond, memorder)
> > > 		       \
> > > > +do {
> 	       \
> > > > +	assert(memorder == __ATOMIC_ACQUIRE || memorder ==
> > > > +__ATOMIC_RELAXED);  \
> > >
> > > And why user is not allowed to use __ATOMIC_SEQ_CST here?
> > Actually this is just a load operation, and acquire here is enough to
> > make sure 'load addr value' can be before other operations.
> >
> > > BTW, if we expect memorder to always be a constant, might be better
> > > BUILD_BUG_ON()?
> > If I understand correctly, you means we can replace 'assert' by
> 'build_bug_on':
> > RTE_BUILD_BUG_ON(memorder != __ATOMIC_ACQUIRE && memorder
> > !=__ATOMIC_RELAXED);
> 
> Yes, that was my thought.
> In that case I think we should be able to catch wrong memorder at compilation
> stage.
> 
> >
> > >
> > > > +									       \
> > > > +	while ((__atomic_load_n(addr, memorder) & mask) cond expected)
> > > 	       \
> > > > +		rte_pause();						       \
> > > > +} while (0)
> > >
> > > Two thoughts with these macros:
> > > 1. It is a goof practise to put () around macro parameters in the macro
> body.
> > > Will save from a lot of unexpected troubles.
> > > 2. I think these 3 macros can be united into one.
> > > Something like:
> > >
> > > #define rte_wait_event(addr, mask, expected, cond, memorder) do {\
> > >         typeof (*(addr)) val = __atomic_load_n((addr), (memorder)); \
> > >         if ((val & (typeof(val))(mask)) cond (typeof(val))(expected)) \
> > >                 break; \
> > >         rte_pause(); \
> > > } while (1);
> > For this point, I think it is due to different size need to use
> > different assembly instructions in arm architecture. For example, load
> > 16 bits instruction is "ldxrh %w[tmp], [%x[addr]"
> > load 32 bits instruction is " ldxr %w[tmp], [%x[addr]"
> > load 64 bits instruction is " ldxr %x[tmp], [%x[addr] "
> 
> Ok, but it could be then something like that for arm specific code:
> if (sizeof(val) == sizeof(uint16_t)) \
> 	__LOAD_EXC_16(...); \
> else if (sizeof(val) == sizeof(uint32_t)) \
> 	__LOAD_EXC_32(...); \
> else if (sizeof(val) == sizeof(uint64_t)) \
> 	__LOAD_EXC_64(...); \
> ...
> 
I thinks we should use "addr" as judgement:

rte_wait_event(addr, mask, expected, cond, memorder)
if (sizeof(*addr)) == sizeof(uint16_t) 
	uint16_t value                                                         \
	assert(memorder == __ATOMIC_ACQUIRE || memorder == __ATOMIC_RELAXED);  \
	__LOAD_EXC_16(addr, value, memorder)				       \
	if ((value & mask) cond expected) {				       \
		__SEVL()						       \
		do {							       \
			__WFE()						       \
			__LOAD_EXC_16(addr, value, memorder)		       \
		} while ((value & mask) cond expected);			       \
	}	
if (sizeof(*addr)) == sizeof(uint32_t) 
	..........
if (sizeof(*addr)) == sizeof(uint64_t) 
	...........

> > And for consistency, we also use 3 APIs in generic path.
> Honestly, even one multi-line macro doesn't look nice.
> Having 3 identical ones looks even worse.

Previous message (by thread): [dpdk-dev] 回复: [RFC PATCH v3 1/5] eal: add new definitions for wait scheme
Next message (by thread): [dpdk-dev] [PATCH v2 07/18] acl: fix spelling errors
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

More information about the dev mailing list