[dpdk-dev] [PATCH v2 1/5] hash: add new toeplitz hash implementation

Medvedkin, Vladimir vladimir.medvedkin at intel.com
Tue Oct 19 17:42:41 CEST 2021


Hi Stephen,

On 19/10/2021 03:15, Stephen Hemminger wrote:
> On Mon, 18 Oct 2021 10:40:00 +0000
> "Ananyev, Konstantin" <konstantin.ananyev at intel.com> wrote:
> 
>>> On Fri, 15 Oct 2021 10:30:02 +0100
>>> Vladimir Medvedkin <vladimir.medvedkin at intel.com> wrote:
>>>    
>>>> +			m[i * 8 + j] = (rss_key[i] << j)|
>>>> +				(uint8_t)((uint16_t)(rss_key[i + 1]) >>
>>>> +				(8 - j));
>>>> +		}
>>>
>>> This ends up being harder than necessary to read. Maybe split into
>>> multiple statements and/or use temporary variable.
>>>    
>>>> +RTE_INIT(rte_thash_gfni_init)
>>>> +{
>>>> +	rte_thash_gfni_supported = 0;
>>>
>>> Not necessary in C globals are initialized to zero by default.
>>>
>>> By removing that the constructor can be totally behind #ifdef
>>>    
>>>> +__rte_internal
>>>> +static inline __m512i
>>>> +__rte_thash_gfni(const uint64_t *mtrx, const uint8_t *tuple,
>>>> +	const uint8_t *secondary_tuple, int len)
>>>> +{
>>>> +	__m512i permute_idx = _mm512_set_epi8(7, 6, 5, 4, 7, 6, 5, 4,
>>>> +						6, 5, 4, 3, 6, 5, 4, 3,
>>>> +						5, 4, 3, 2, 5, 4, 3, 2,
>>>> +						4, 3, 2, 1, 4, 3, 2, 1,
>>>> +						3, 2, 1, 0, 3, 2, 1, 0,
>>>> +						2, 1, 0, -1, 2, 1, 0, -1,
>>>> +						1, 0, -1, -2, 1, 0, -1, -2,
>>>> +						0, -1, -2, -3, 0, -1, -2, -3);
>>>
>>> NAK
>>>
>>> Please don't put the implementation in an inline. This makes it harder
>>> to support (API/ABI) and blocks other architectures from implementing
>>> same thing with different instructions.
>>
>> I don't really understand your reasoning here.
>> rte_thash_gfni.h is an arch-specific header, which provides
>> arch-specific optimizations for RSS hash calculation
>> (Vladimir pls correct me if I am wrong here).
> 
> Ok, but rte_thash_gfni.h is included on all architectures.
> 

Ok, I'll rework the patch to move x86 + avx512 related things into x86 
arch specific header. Would that suit?

>> We do have dozens of inline functions that do use arch-specific instructions (both x86 and arm)
>> for different purposes:
>> sync primitives, memory-ordering, cache manipulations, LPM lookup, TSX, power-saving, etc.
>> That's a usual trade-off taken for performance reasons, when extra function call
>> costs too much comparing to the operation itself.
>> Why it suddenly became a problem for that particular case and how exactly it blocks other architectures?
>> Also I don't understand how it makes things harder in terms of API/ABI stability.
>> As I can see this patch doesn't introduce any public structs/unions.
>> All functions take as arguments just raw data buffers and length.
>> To summarize - in general, I don't see any good reason why this patch shouldn't be allowed.
>> Konstantin
> 
> The comments about rte_thash_gfni_supported initialization still apply.
> Why not:
> 
> #ifdef __GFNI__
> RTE_INIT(rte_thash_gfni_init)
> {
> 	if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_GFNI))
> 		rte_thash_gfni_supported = 1;
> }
> #endif
> 

Agree, I'll reflect this changes in v3.


-- 
Regards,
Vladimir


More information about the dev mailing list