[RFC] random: use per lcore state

Mattias Rönnblom hofors at lysator.liu.se
Mon Sep 11 18:53:29 CEST 2023


On 2023-09-11 18:06, Stephen Hemminger wrote:
> On Fri, 8 Sep 2023 09:04:29 +0200
> Mattias Rönnblom <hofors at lysator.liu.se> wrote:
> 
>>> Also, right now the array is sized at 129 entries to allow for the
>>> maximum number of lcores. When the maximum is increased to 512 or
>>> 1024 the problem will get worse.
>>
>> Using TLS will penalize every thread in the process, not only EAL
>> threads and registered non-EAL threads, and worse: not only threads
>> that are using the API in question.
>>
>> Every thread will carry the TLS memory around, increasing the process
>> memory footprint.
>>
>> Thread creation will be slower, since TLS memory is allocated *and
>> initialized*, lazy user code-level initialization or not.
>>
>> On my particular Linux x86_64 system, pthread creation overhead looks
>> something like:
>>
>> 8 us w/o any user code-level use of TLS
>> 11 us w/ 16 kB of TLS
>> 314 us w/ 2 MB of TLS.
> 
> Agree that TLS does cause potentially more pages to get allocated on
> thread creation, but that argument doesn't make sense here.

Sure. I was talking about the general concept of replacing per-lcore 
static arrays with TLS.

I consider the general applicability of the TLS pattern relevant here, 
because it doesn't make sense to have an ad hoc, opportunistic way of 
implementing essentially the same thing across the DPDK code base.
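
To make the two patterns concrete, here is a minimal sketch of both, 
side by side. It is not the actual rte_random.c code: MAX_LCORE, 
CACHE_LINE, struct rand_state and the helper names are all made up for 
illustration (MAX_LCORE stands in for RTE_MAX_LCORE), and the GCC/Clang 
__thread and aligned attribute extensions are assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_LCORE 128   /* hypothetical; stands in for RTE_MAX_LCORE */
#define CACHE_LINE 64   /* assumed cache line size */

/* Hypothetical PRNG state, padded to a cache line to avoid false sharing. */
struct rand_state {
	uint64_t z1, z2, z3, z4, z5;
} __attribute__((aligned(CACHE_LINE)));

static_assert(sizeof(struct rand_state) == CACHE_LINE,
	      "state is expected to occupy exactly one cache line");

/*
 * Pattern A: static per-lcore array. One extra slot is shared by all
 * threads lacking an lcore id, giving 129 entries for 128 lcores.
 */
static struct rand_state lcore_states[MAX_LCORE + 1];

/* Pattern B: TLS. Every thread in the process gets its own copy. */
static __thread struct rand_state tls_state;

/* Look up the state for an lcore id; out-of-range ids use the last slot. */
static inline struct rand_state *state_by_lcore(int lcore_id)
{
	unsigned int idx = (lcore_id >= 0 && lcore_id < MAX_LCORE) ?
		(unsigned int)lcore_id : MAX_LCORE;

	return &lcore_states[idx];
}

/* Look up the calling thread's state; no lcore id is needed. */
static inline struct rand_state *state_by_tls(void)
{
	return &tls_state;
}

/* Process-wide footprint of the static array, in bytes. */
static inline size_t static_array_bytes(void)
{
	return sizeof(lcore_states);
}
```

With these made-up sizes, pattern A costs a fixed 129 * 64 = 8256 bytes 
per process, growing with the maximum lcore count. Pattern B costs 64 
bytes per thread, but for every thread in the process, whether or not it 
ever calls the API.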

> The rand
> state is small, and DPDK applications should not be creating threads
> after startup. Thread creation is an expensive set of system calls.

I agree, and I would add that non-EAL threads will likely be few in 
number, and should all be registered on creation, to ensure they can 
call DPDK APIs which require an lcore id.

That said, if an application does create threads, DPDK shouldn't make 
thread creation orders of magnitude slower.
