[PATCH v2 1/6] eal: add static per-lcore memory allocation facility
Morten Brørup
mb at smartsharesystems.com
Mon Oct 14 09:56:06 CEST 2024
> From: Jerin Jacob [mailto:jerinjacobk at gmail.com]
> Sent: Wednesday, 18 September 2024 12.12
>
> On Thu, Sep 12, 2024 at 8:52 PM Jerin Jacob <jerinjacobk at gmail.com>
> wrote:
> >
> > On Thu, Sep 12, 2024 at 7:11 PM Morten Brørup
> <mb at smartsharesystems.com> wrote:
> > >
> > > > From: Jerin Jacob [mailto:jerinjacobk at gmail.com]
> > > > Sent: Thursday, 12 September 2024 15.17
> > > >
> > > > On Thu, Sep 12, 2024 at 2:40 PM Morten Brørup
> <mb at smartsharesystems.com>
> > > > wrote:
> > > > >
> > > > > > +#define LCORE_BUFFER_SIZE (RTE_MAX_LCORE_VAR *
> RTE_MAX_LCORE)
> > > > >
> > > > > Considering hugepages...
> > > > >
> > > > > Lcore variables may be allocated before DPDK's memory allocator
> > > > (rte_malloc()) is ready, so rte_malloc() cannot be used for lcore
> variables.
> > > > >
> > > > > And lcore variables are not usable (shared) for DPDK multi-
> process, so the
> > > > lcore_buffer could be allocated through the O/S APIs as anonymous
> hugepages,
> > > > instead of using rte_malloc().
> > > > >
> > > > > The alternative, using rte_malloc(), would disallow allocating
> lcore
> > > > variables before DPDK's memory allocator has been initialized,
> which I think
> > > > is too late.
> > > >
> > > > I thought it is not. A lot of the subsystems are initialized
> after the
> > > > memory subsystem is initialized.
> > > > [1] example given in documentation. I thought, RTE_INIT needs to
> > > > replaced if the subsystem called after memory initialized (which
> is
> > > > the case for most of the libraries)
> > >
> > > The list of RTE_INIT functions are called before main(). It is not
> very useful.
> > >
> > > Yes, it would be good to replace (or supplement) RTE_INIT_PRIO by
> something similar, which calls the list of "INIT" functions at the
> appropriate time during EAL initialization.
> > >
> > > DPDK should then use this "INIT" list for all its initialization,
> so the init function of new features (such as this, and trace) can be
> inserted at the correct location in the list.
> > >
> > > > Trace library had a similar situation. It is managed like [2]
> > >
> > > Yes, if we insist on using rte_malloc() for lcore variables, the
> alternative is to prohibit establishing lcore variables in functions
> called through RTE_INIT.
> >
> > I was not insisting on using ONLY rte_malloc(). Since rte_malloc() can
> > be called before rte_eal_init() (it will return NULL). Alloc routine can
> > check first rte_malloc() is available if not switch over glibc.
>
>
> @Mattias Rönnblom This comment is not addressed in v7. Could you check?
Mattias, following up on Jerin's suggestion:
When an lcore variable is allocated and the buffer holding lcore variables is out of space (or was never allocated), a new buffer is allocated.
Here's the twist I think Jerin is asking for:
You could check if rte_malloc() is available, and use that (instead of the heap) when allocating a new buffer holding lcore variables.
This check can be performed (aggressively) when allocating a new lcore variable, or (conservatively) only when allocating a new buffer.
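
A minimal sketch of the conservative variant (check only when a new buffer is needed). It is deliberately written without DPDK headers so it stands alone: hugepage_alloc() and hugepage_allocator_ready are hypothetical stand-ins for rte_malloc() and for "the memory subsystem has been initialized", and struct lcore_buffer is an illustrative type, not the one from the patch. It relies on the behavior Jerin describes, i.e. the hugepage allocator returning NULL when called too early.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical flag: stands in for "DPDK memory subsystem is initialized". */
static bool hugepage_allocator_ready = false;

/* Hypothetical stand-in for rte_malloc(): yields NULL until the allocator is ready. */
static void *hugepage_alloc(size_t size)
{
	return hugepage_allocator_ready ? malloc(size) : NULL;
}

enum buffer_source { BUFFER_NONE, BUFFER_HEAP, BUFFER_HUGEPAGE };

/* Illustrative descriptor; remembers the source so the buffer can be freed correctly. */
struct lcore_buffer {
	void *base;
	size_t size;
	enum buffer_source source;
};

/* Allocate a new buffer: prefer hugepage memory, fall back to the heap. */
static struct lcore_buffer lcore_buffer_alloc(size_t size)
{
	struct lcore_buffer buf = { NULL, 0, BUFFER_NONE };

	assert(size > 0);

	buf.base = hugepage_alloc(size);
	if (buf.base != NULL) {
		buf.source = BUFFER_HUGEPAGE;
	} else {
		buf.base = malloc(size);
		if (buf.base != NULL)
			buf.source = BUFFER_HEAP;
	}
	if (buf.base != NULL)
		buf.size = size;

	return buf;
}
```

Buffers allocated before the memory subsystem is up stay on the heap; only buffers allocated later land in hugepage memory. The aggressive variant would run the same check on every lcore variable allocation and possibly abandon a partly used heap buffer.
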
Now, if using hugepages, the value of RTE_MAX_LCORE_VAR (the maximum size of one lcore variable instance) becomes more important.
Let's consider systems with 2 MB hugepages:
If the system supports two lcores (RTE_MAX_LCORE is 2), the current RTE_MAX_LCORE_VAR default of 1 MB is a perfect match; the buffer will use 2 MB of RAM, i.e. exactly one 2 MB hugepage.
If the system supports 128 lcores, the current RTE_MAX_LCORE_VAR default of 1 MB will use 128 MB of RAM.
If we scale it back, so the buffer only uses one 2 MB hugepage, RTE_MAX_LCORE_VAR will have to be 2 MB / 128 lcores = 16 KB.
16 KB might be too small. E.g. a mempool cache uses 2 * 512 * sizeof(void *) = 8 KB (with 8-byte pointers) + a few bytes for the information about the cache. So I can easily point at one example that already takes up half of a 16 KB limit.
So, as you already asked, what is a reasonable default minimum value of RTE_MAX_LCORE_VAR?
Maybe we should just stick with your initial suggestion (1 MB) and see how it goes.
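
The sizing arithmetic above can be checked with a small standalone sketch. The helper names are made up for illustration; only the formula (RTE_MAX_LCORE_VAR * RTE_MAX_LCORE) comes from the patch.

```c
#include <assert.h>
#include <stddef.h>

#define KB ((size_t)1024)
#define MB (1024 * KB)

/* Total buffer size: per-lcore maximum multiplied by the number of lcores. */
static size_t lcore_buffer_size(size_t max_lcore_var, size_t max_lcore)
{
	return max_lcore_var * max_lcore;
}

/* Largest per-lcore maximum that keeps the whole buffer within one hugepage. */
static size_t max_lcore_var_for_hugepage(size_t hugepage_size, size_t max_lcore)
{
	assert(max_lcore > 0);
	return hugepage_size / max_lcore;
}
```

With these helpers, lcore_buffer_size(1 * MB, 2) is 2 MB, lcore_buffer_size(1 * MB, 128) is 128 MB, and max_lcore_var_for_hugepage(2 * MB, 128) is 16 KB, matching the numbers above.
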
<roadmap>
At the recent DPDK Summit, we discussed memory consumption in one of the workshops.
One of the possible means for reducing memory consumption is making RTE_MAX_LCORE dynamic, so an application using only a few cores will scale its per-lcore tables to the actual number of lcores, instead of scaling to some hardcoded maximum.
With this in mind, I'm less worried about the RTE_MAX_LCORE multiplier.
</roadmap>