[RFC v2] non-temporal memcpy

Mattias Rönnblom hofors at lysator.liu.se
Wed Aug 10 13:55:37 CEST 2022


On 2022-08-09 17:26, Stephen Hemminger wrote:
> On Tue, 9 Aug 2022 11:46:19 +0200
> Morten Brørup <mb at smartsharesystems.com> wrote:
> 
>>>
>>> I don't think memcpy() functions should have alignment requirements.
>>> That's not very practical, and violates the principle of least
>>> surprise.
>>
>> I didn't make the CPUs with these alignment requirements.
>>
>> However, I will offer optimized performance in a generic NT memcpy() function in the cases where the individual alignment requirements of various CPUs happen to be met.
> 
> Rather than making a generic equivalent memcpy function, why not have
> something which only takes aligned data. And to avoid user confusion
> change the name to be something not suggestive of memcpy.
> 

Alignment seems like a non-issue to me. An NT-store memcpy() can be made 
free of alignment requirements, incurring only a very slight cost for 
the always-aligned case (who has their data always 16-byte aligned 
anyway?).
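
A minimal sketch of how that could look, assuming x86 with SSE2. The name
nt_copy_unaligned() is hypothetical and not a DPDK API. The head is copied
with regular stores until the destination is 16-byte aligned, the body uses
unaligned loads and aligned non-temporal stores, and the tail is again a
regular copy. Builds without SSE2 fall back to plain memcpy().

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical helper for illustration only; not part of DPDK. */
static void *
nt_copy_unaligned(void *dst, const void *src, size_t len)
{
	uint8_t *d = dst;
	const uint8_t *s = src;

#if defined(__SSE2__)
	/* Head: regular stores until dst reaches a 16-byte boundary. */
	size_t head = (size_t)(-(uintptr_t)d & 15);

	if (head > len)
		head = len;
	memcpy(d, s, head);
	d += head;
	s += head;
	len -= head;

	/* Body: unaligned loads, 16-byte aligned non-temporal stores. */
	while (len >= 16) {
		__m128i v = _mm_loadu_si128((const __m128i *)s);

		_mm_stream_si128((__m128i *)d, v);
		d += 16;
		s += 16;
		len -= 16;
	}

	/* Order the NT stores before any later stores. */
	_mm_sfence();
#endif
	/* Tail (or the whole copy on non-SSE2 builds): regular stores. */
	memcpy(d, s, len);

	return dst;
}
```

For a destination that is already aligned, the extra work is the head-length
computation and a zero-length memcpy(), which is the "very slight cost"
referred to above.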

The memory barrier required on x86 seems like a bigger issue.

> Maybe rte_non_cache_copy()?
> 

rte_memcpy_nt_weakly_ordered(), or rte_memcpy_nt_weak(). And a 
rte_memcpy_nt() with the sfence in place, which the user hopefully will 
find first? I don't know. I would prefer not having the weak variant at all.

Accepting weak memory ordering (i.e., no sfence) could also be one of 
the flags, assuming rte_memcpy_nt() would have a flags parameter. 
Default is safe (=memcpy() semantics), but potentially slower.
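
A rough sketch of the flags idea, again assuming x86 with SSE2. Both nt_copy()
and NT_COPY_F_WEAK_ORDER are hypothetical names, not the proposed rte_memcpy_nt()
signature. With flags == 0 the function issues the sfence itself; only a caller
that explicitly passes the weak-ordering flag skips it.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical flag: the caller accepts weakly ordered stores (no sfence). */
#define NT_COPY_F_WEAK_ORDER (1u << 0)

/* Hypothetical function for illustration only; not part of DPDK. */
static void *
nt_copy(void *dst, const void *src, size_t len, unsigned int flags)
{
	uint8_t *d = dst;
	const uint8_t *s = src;

#if defined(__SSE2__)
	/* Regular stores until dst is 16-byte aligned. */
	size_t head = (size_t)(-(uintptr_t)d & 15);

	if (head > len)
		head = len;
	memcpy(d, s, head);
	d += head;
	s += head;
	len -= head;

	/* Non-temporal stores for the aligned body. */
	for (; len >= 16; d += 16, s += 16, len -= 16)
		_mm_stream_si128((__m128i *)d,
				 _mm_loadu_si128((const __m128i *)s));

	/* Safe default: fence unless the caller opted out. */
	if (!(flags & NT_COPY_F_WEAK_ORDER))
		_mm_sfence();
#else
	(void)flags; /* Plain memcpy() below needs no extra fence. */
#endif
	/* Remaining bytes (or everything on non-SSE2 builds). */
	memcpy(d, s, len);

	return dst;
}
```

A caller that passes NT_COPY_F_WEAK_ORDER, for example to copy a burst of
buffers, would have to issue a single sfence itself before making the data
visible to another core.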

> Want to avoid the naive user just doing s/memcpy/rte_memcpy_nt/ and expect
> everything to work.

