[RFC v2] non-temporal memcpy
Mattias Rönnblom
hofors at lysator.liu.se
Wed Aug 10 13:55:37 CEST 2022
On 2022-08-09 17:26, Stephen Hemminger wrote:
> On Tue, 9 Aug 2022 11:46:19 +0200
> Morten Brørup <mb at smartsharesystems.com> wrote:
>
>>>
>>> I don't think memcpy() functions should have alignment requirements.
>>> That's not very practical, and violates the principle of least
>>> surprise.
>>
>> I didn't make the CPUs with these alignment requirements.
>>
>> However, I will offer optimized performance in a generic NT memcpy() function in the cases where the individual alignment requirements of various CPUs happen to be met.
>
> Rather than making a generic equivalent memcpy function, why not have
> something which only takes aligned data. And to avoid user confusion
> change the name to be something not suggestive of memcpy.
>
Alignment seems like a non-issue to me. An NT-store memcpy() can be made
free of alignment requirements, incurring only a very slight cost for
the always-aligned case (who has their data always 16-byte aligned
anyway?).
The memory barrier required on x86 seems like a bigger issue.
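
To make both points concrete, here is a minimal sketch (not the RFC patch, and not an existing DPDK function) of an NT-store copy that accepts any alignment. It assumes x86 with SSE2 and falls back to plain memcpy() elsewhere; the name nt_copy_sketch() is made up for illustration. The head and tail use regular stores, so only the 16-byte-aligned body is streamed, and the trailing sfence is the barrier discussed above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical helper for illustration; not a DPDK API. */
static void nt_copy_sketch(void *dst, const void *src, size_t len)
{
#if defined(__SSE2__)
    uint8_t *d = dst;
    const uint8_t *s = src;

    /* Head: regular stores until dst reaches a 16-byte boundary. */
    size_t head = (size_t)(-(uintptr_t)d & 15);
    if (head > len)
        head = len;
    memcpy(d, s, head);
    d += head;
    s += head;
    len -= head;

    /* _mm_stream_si128() needs a 16-byte aligned destination. */
    assert(len == 0 || ((uintptr_t)d & 15) == 0);

    /* Body: unaligned loads, aligned non-temporal stores. */
    while (len >= 16) {
        __m128i v = _mm_loadu_si128((const __m128i *)s);
        _mm_stream_si128((__m128i *)d, v);
        d += 16;
        s += 16;
        len -= 16;
    }

    /* Tail: fewer than 16 bytes left, regular stores again. */
    memcpy(d, s, len);

    /* Order the NT stores before any later store, as memcpy() users expect. */
    _mm_sfence();
#else
    /* No SSE2: nothing non-temporal to do in this sketch. */
    memcpy(dst, src, len);
#endif
}
```

For a destination that is already aligned, the extra work is one mask, one compare and a zero-length memcpy(), which is the "very slight cost" referred to above. The sfence is paid on every call regardless of alignment.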
> Maybe rte_non_cache_copy()?
>
rte_memcpy_nt_weakly_ordered(), or rte_memcpy_nt_weak(). And a
rte_memcpy_nt() with the sfence in place, which the user hopefully will
find first? I don't know. I would prefer not having the weak variant at all.
Accepting weak memory ordering (i.e., no sfence) could also be one of
the flags, assuming rte_memcpy_nt() would have a flags parameter.
Default is safe (=memcpy() semantics), but potentially slower.
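
A rough sketch of what such a flags parameter could look like follows. Everything named here is hypothetical: nt_copy_flags_sketch(), the NT_COPY_F_WEAK_ORDER flag and the uint64_t flags type are assumptions for illustration, not the RFC's actual signature. It assumes x86 with SSE2 and falls back to memcpy() elsewhere. With flags == 0 the sfence is issued; only a caller that explicitly passes the weak flag skips it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical flag: the caller accepts weak ordering and promises to
 * issue its own store fence before publishing the copied data. */
#define NT_COPY_F_WEAK_ORDER (UINT64_C(1) << 0)

/* Hypothetical signature for illustration; not a DPDK API. */
static void nt_copy_flags_sketch(void *dst, const void *src, size_t len,
                                 uint64_t flags)
{
    /* Reject flag bits this sketch does not know about. */
    assert((flags & ~NT_COPY_F_WEAK_ORDER) == 0);

#if defined(__SSE2__)
    uint8_t *d = dst;
    const uint8_t *s = src;

    /* Regular stores up to the first 16-byte boundary of dst. */
    size_t head = (size_t)(-(uintptr_t)d & 15);
    if (head > len)
        head = len;
    memcpy(d, s, head);
    d += head;
    s += head;
    len -= head;

    /* Non-temporal stores for the aligned body. */
    for (; len >= 16; d += 16, s += 16, len -= 16)
        _mm_stream_si128((__m128i *)d,
                         _mm_loadu_si128((const __m128i *)s));

    /* Regular stores for the remaining bytes. */
    memcpy(d, s, len);

    /* Default (flags == 0) is the safe path: fence unless opted out. */
    if (!(flags & NT_COPY_F_WEAK_ORDER))
        _mm_sfence();
#else
    memcpy(dst, src, len);
#endif
}
```

A caller copying a burst of packets could pass the weak flag for each copy and issue a single _mm_sfence() before handing the buffers to another core, while a naive caller passing 0 would still get memcpy()-like ordering.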
> Want to avoid the naive user just doing s/memcpy/rte_memcpy_nt/ and expect
> everything to work.