[PATCH v8 00/18] Simplify running with high-numbered CPUs
Bruce Richardson
bruce.richardson at intel.com
Tue Oct 7 17:30:41 CEST 2025
On Mon, Oct 06, 2025 at 09:41:45AM +0100, Bruce Richardson wrote:
> On Mon, Oct 06, 2025 at 09:42:20AM +0200, Morten Brørup wrote:
> > Bruce,
> >
> > cpu_set_t is a fixed size array, with the size CPU_SETSIZE.
> > This is part of libc.
> >
> > However, the kernel may be built with support for a different number of CPUs.
> > This means that e.g. sched_getaffinity() will fail with EINVAL if cpusetsize is smaller than the number of CPUs the kernel was built for.
> > For more information, refer to the description about "Dynamically sized CPU sets" here:
> > https://linux.die.net/man/3/cpu_alloc_size
> >
> > With this series, consider ensuring that all DPDK functions taking a cpu set parameter also take a cpu set size parameter, to align with kernel APIs.
> > E.g.:
> > int sched_getaffinity(pid_t pid, size_t cpusetsize, cpu_set_t *mask), and
> > void CPU_ZERO_S(size_t setsize, cpu_set_t *set),
> > instead of:
> > void CPU_ZERO(cpu_set_t *set).
> >
>
> Ok, that seems a feasible change.
>
> > <feature creep>
> > Consider not using fixed size cpu sets (and CPU_SETSIZE) at all, but dynamically allocated cpu sets.
> > </feature creep>
> >
>
> Will have to see what that involves. If it's fairly easy, I'll give it a
> go, but we are rather late in this release cycle, so if it's taking too
> long I'll have to give this bit a miss.
>
Looking at these requests: while they initially seem easy, there is
actually enough work involved that I don't think I can feasibly implement
them and still get this series into this release. For example:

* the cpuset macros aren't present on Windows, so we need to add
  equivalents of them to rte_os.h.
  - do we do as we did for RTE_CPU_AND or RTE_CPU_FILL and produce RTE_*
    versions?
  - if so, can we then take some liberties, like making the size parameter
    the bitsize rather than the bytesize - which would make use far
    easier? [With the current CPU_*_S macros in the code, I find myself
    always having to track two values each time: bitsize for range
    checking, and bytesize just to pass to the libc macros]
* the argparse library doesn't have the option to take additional
  parameters such as a cpuset size when parsing arguments - it's either a
  cpuset value or not. [I've prototyped working around that by adding a new
  API specifying the expected cpuset size]
* even if all the arg parsing in EAL is set up to handle variably sized cpu
  sets, the actual lcore_config data structure would need to be set up for
  dynamically allocated cpusets to actually handle them.

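To illustrate the two-value bookkeeping mentioned in the first bullet, here
is a minimal sketch assuming Linux with glibc (the CPU_ALLOC and CPU_*_S
macros from sched.h). The helper names set_cpu_range() and demo_count() are
hypothetical, made up for illustration, and are not part of DPDK or of this
patch series.

```c
#define _GNU_SOURCE /* glibc needs this for CPU_ALLOC and the CPU_*_S macros */
#include <sched.h>
#include <stddef.h>

/*
 * Hypothetical helper: set CPUs first..last (inclusive) in a dynamically
 * sized set. Note the two sizes: nbits is needed for range checking, while
 * nbytes exists only to be handed to the libc macros.
 */
static int
set_cpu_range(cpu_set_t *set, size_t nbits, size_t nbytes,
		unsigned int first, unsigned int last)
{
	if (first > last || last >= nbits)
		return -1;
	for (unsigned int cpu = first; cpu <= last; cpu++)
		CPU_SET_S(cpu, nbytes, set);
	return 0;
}

/*
 * Hypothetical demo: allocate a set able to hold max_cpus CPUs, set a range
 * in it, and return how many CPUs are set, or -1 on error.
 */
static int
demo_count(unsigned int max_cpus, unsigned int first, unsigned int last)
{
	size_t nbits = max_cpus;                  /* for range checks */
	size_t nbytes = CPU_ALLOC_SIZE(max_cpus); /* for the libc macros */
	cpu_set_t *set = CPU_ALLOC(max_cpus);
	int count;

	if (set == NULL)
		return -1;
	CPU_ZERO_S(nbytes, set);
	if (set_cpu_range(set, nbits, nbytes, first, last) < 0)
		count = -1;
	else
		count = CPU_COUNT_S(nbytes, set);
	CPU_FREE(set);
	return count;
}
```

With a set sized for 2048 CPUs, demo_count(2048, 1500, 1503) sets four CPUs
above the usual fixed CPU_SETSIZE limit, while a range ending at CPU 2048 is
rejected by the bitsize check.
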
Given all that, the major blocker is actually the first one - deciding
about putting the macros/inlines for this in rte_os.h. If we are going to
make the variable-sized functions available in DPDK, I'd first like to have
a bit of thought and discussion about how. As well as possibly changing
macro parameters, we could even go further and create a wrapper struct for
cpusets which actually tracks the length - thereby possibly making things
simpler to use and less error prone.

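As one possible shape for such a wrapper, here is a rough sketch under
stated assumptions: the struct and every cpuset_sketch_* name below are
hypothetical and do not exist in DPDK. It stores the bits in plain 64-bit
words rather than in a cpu_set_t, so that it does not depend on the libc
macros that are missing on Windows; a real design might well wrap cpu_set_t
instead.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical wrapper: the set carries its own length, counted in bits. */
struct cpuset_sketch {
	size_t nbits;    /* capacity, in CPUs */
	uint64_t *words; /* bit storage, 64 CPUs per word */
};

static struct cpuset_sketch *
cpuset_sketch_alloc(size_t nbits)
{
	struct cpuset_sketch *cs = malloc(sizeof(*cs));
	size_t nwords = (nbits + 63) / 64;

	if (cs == NULL)
		return NULL;
	/* calloc zeroes the storage, so the set starts out empty */
	cs->words = calloc(nwords > 0 ? nwords : 1, sizeof(uint64_t));
	if (cs->words == NULL) {
		free(cs);
		return NULL;
	}
	cs->nbits = nbits;
	return cs;
}

static void
cpuset_sketch_free(struct cpuset_sketch *cs)
{
	if (cs != NULL) {
		free(cs->words);
		free(cs);
	}
}

/* Range checking lives inside the wrapper; callers pass only the CPU id. */
static int
cpuset_sketch_set(struct cpuset_sketch *cs, size_t cpu)
{
	if (cpu >= cs->nbits)
		return -1;
	cs->words[cpu / 64] |= UINT64_C(1) << (cpu % 64);
	return 0;
}

static bool
cpuset_sketch_isset(const struct cpuset_sketch *cs, size_t cpu)
{
	if (cpu >= cs->nbits)
		return false;
	return (cs->words[cpu / 64] >> (cpu % 64)) & 1;
}

static size_t
cpuset_sketch_count(const struct cpuset_sketch *cs)
{
	size_t n = 0;

	for (size_t i = 0; i < (cs->nbits + 63) / 64; i++)
		for (uint64_t w = cs->words[i]; w != 0; w &= w - 1)
			n++; /* each iteration clears the lowest set bit */
	return n;
}
```

Because the length travels with the set, a caller never has to pass a
separate bitsize or bytesize alongside it, which is the simplification the
paragraph above is hinting at.
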
/Bruce