[RFC] random: use per lcore state
Mattias Rönnblom
hofors at lysator.liu.se
Mon Sep 11 18:53:29 CEST 2023
On 2023-09-11 18:06, Stephen Hemminger wrote:
> On Fri, 8 Sep 2023 09:04:29 +0200
> Mattias Rönnblom <hofors at lysator.liu.se> wrote:
>
>>> Also, right now the array is sized at 129 entries to allow for the
>>> maximum number of lcores. When the maximum is increased to 512 or
>>> 1024 the problem will get worse.
>>
>> Using TLS will penalize every thread in the process, not only EAL
>> threads and registered non-EAL threads, and worse: not only threads
>> that are using the API in question.
>>
>> Every thread will carry the TLS memory around, increasing the process
>> memory footprint.
>>
>> Thread creation will be slower, since TLS memory is allocated *and
>> initialized*, lazy user code-level initialization or not.
>>
>> On my particular Linux x86_64 system, pthread creation overhead looks
>> something like:
>>
>> 8 us w/o any user code-level use of TLS
>> 11 us w/ 16 kB of TLS
>> 314 us w/ 2 MB of TLS.
>
> Agree that TLS does cause potentially more pages to get allocated on
> thread creation, but that argument doesn't make sense here.
Sure. I was talking about the general concept of replacing per-lcore
static arrays with TLS.
I think the general applicability of the TLS pattern is relevant here,
because it doesn't make sense to have an ad-hoc, opportunistic way of
implementing essentially the same thing across the DPDK code base.
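
To make the comparison concrete, the two patterns look roughly like
this. It is only a sketch: struct rand_state, MAX_LCORE and the accessor
names are made up for illustration and are not the actual DPDK
definitions. The 129-entry size is taken from the quoted text above;
treating the last entry as a shared slot for threads without an lcore id
is an assumption.

```c
#include <stdint.h>

/* Hypothetical stand-in for the configured maximum number of lcores. */
#define MAX_LCORE 128

/* Hypothetical per-thread generator state; the real layout differs. */
struct rand_state {
	uint64_t z[5];
};

/*
 * Pattern A: static array indexed by lcore id. The memory is paid for
 * once per process, regardless of how many threads exist. The extra
 * entry (129 in total) is assumed to be shared by threads that have no
 * lcore id.
 */
static struct rand_state array_states[MAX_LCORE + 1];

static struct rand_state *
state_from_array(unsigned int lcore_id)
{
	if (lcore_id >= MAX_LCORE)
		lcore_id = MAX_LCORE; /* shared fallback slot */

	return &array_states[lcore_id];
}

/*
 * Pattern B: thread-local storage. Every thread in the process gets
 * its own copy, whether or not it ever uses the state.
 */
static _Thread_local struct rand_state tls_state;

static struct rand_state *
state_from_tls(void)
{
	return &tls_state;
}
```

With pattern A the caller needs a valid lcore id to find its slot; with
pattern B no id is needed, but the cost is carried by every thread.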
> The rand
> state is small, and DPDK applications should not be creating threads
> after startup. Thread creation is an expensive set of system calls.
I agree, and I would add that non-EAL threads will likely be few in
number, and should all be registered on creation, to ensure they can
call DPDK APIs which require an lcore id.
That said, if an application does create threads, DPDK shouldn't make
thread creation orders of magnitude slower.
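
For anyone who wants to get a feel for this on their own system, a
minimal sketch of such a measurement could look like the following. It
is not necessarily how the figures quoted above were obtained: it times
pthread_create() plus pthread_join(), and TLS_BYTES, tls_blob and
avg_thread_create_us() are names invented for this example. Rebuild with
a different -DTLS_BYTES=... to compare TLS sizes.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <pthread.h>
#include <time.h>

/* Hypothetical knob: amount of TLS carried by every thread. */
#ifndef TLS_BYTES
#define TLS_BYTES (16 * 1024)
#endif

/* Zero-initialized TLS block; each new thread gets its own copy. */
static _Thread_local volatile char tls_blob[TLS_BYTES];

static void *
thread_main(void *arg)
{
	(void)arg;
	tls_blob[0] = 1; /* touch the block so it is actually referenced */
	return NULL;
}

/*
 * Average wall-clock time, in microseconds, to create and join one
 * thread, taken over the given number of iterations.
 */
static double
avg_thread_create_us(int iterations)
{
	struct timespec start, end;
	int i;

	clock_gettime(CLOCK_MONOTONIC, &start);

	for (i = 0; i < iterations; i++) {
		pthread_t thread;
		int rc;

		rc = pthread_create(&thread, NULL, thread_main, NULL);
		assert(rc == 0);

		rc = pthread_join(thread, NULL);
		assert(rc == 0);
	}

	clock_gettime(CLOCK_MONOTONIC, &end);

	double ns = (double)(end.tv_sec - start.tv_sec) * 1e9 +
		(double)(end.tv_nsec - start.tv_nsec);

	return ns / 1e3 / iterations;
}
```

Calling avg_thread_create_us(1000) from main() and printing the result
gives one data point per build. Absolute numbers will vary with the C
library, kernel and hardware.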