[dpdk-dev] [PATCH v4 1/4] mempool: add event callbacks
Olivier Matz
olivier.matz at 6wind.com
Fri Oct 15 14:12:06 CEST 2021
Hi Dmitry,
On Wed, Oct 13, 2021 at 02:01:28PM +0300, Dmitry Kozlyuk wrote:
> Data path performance can benefit if the PMD knows which memory it will
> need to handle in advance, before the first mbuf is sent to the PMD.
> It is impractical, however, to consider all allocated memory for this
> purpose. Most often mbuf memory comes from mempools that can come and
> go. PMD can enumerate existing mempools on device start, but it also
> needs to track creation and destruction of mempools after the forwarding
> starts but before an mbuf from the new mempool is sent to the device.
>
> Add an API to register callback for mempool life cycle events:
> * rte_mempool_event_callback_register()
> * rte_mempool_event_callback_unregister()
> Currently tracked events are:
> * RTE_MEMPOOL_EVENT_READY (after populating a mempool)
> * RTE_MEMPOOL_EVENT_DESTROY (before freeing a mempool)
> Provide a unit test for the new API.
> The new API is internal, because it is primarily demanded by PMDs that
> may need to deal with any mempools and do not control their creation,
> while an application, on the other hand, knows which mempools it creates
> and doesn't care about internal mempools PMDs might create.
>
> Signed-off-by: Dmitry Kozlyuk <dkozlyuk at nvidia.com>
> Acked-by: Matan Azrad <matan at nvidia.com>
> ---
> app/test/test_mempool.c | 209 ++++++++++++++++++++++++++++++++++++++
> lib/mempool/rte_mempool.c | 137 +++++++++++++++++++++++++
> lib/mempool/rte_mempool.h | 61 +++++++++++
> lib/mempool/version.map | 8 ++
> 4 files changed, 415 insertions(+)
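For readers following the thread, here is a rough usage sketch of the proposed
(internal) API from a PMD's point of view. The driver structure, function names
and log messages below are made up purely for illustration:

#include <rte_common.h>
#include <rte_log.h>
#include <rte_mempool.h>

/* Hypothetical per-device driver state. */
struct my_pmd_priv;

static void
my_pmd_mempool_event_cb(enum rte_mempool_event event,
                        struct rte_mempool *mp, void *user_data)
{
        struct my_pmd_priv *priv = user_data;

        RTE_SET_USED(priv);
        if (event == RTE_MEMPOOL_EVENT_READY)
                RTE_LOG(DEBUG, USER1, "map memory of %s\n", mp->name);
        else if (event == RTE_MEMPOOL_EVENT_DESTROY)
                RTE_LOG(DEBUG, USER1, "unmap memory of %s\n", mp->name);
}

static int
my_pmd_dev_start(struct my_pmd_priv *priv)
{
        /* The callback only covers mempools populated after this point;
         * already existing mempools still have to be walked explicitly. */
        return rte_mempool_event_callback_register(my_pmd_mempool_event_cb,
                                                   priv);
}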
(...)
> --- a/lib/mempool/rte_mempool.c
> +++ b/lib/mempool/rte_mempool.c
> @@ -42,6 +42,18 @@ static struct rte_tailq_elem rte_mempool_tailq = {
> };
> EAL_REGISTER_TAILQ(rte_mempool_tailq)
>
> +TAILQ_HEAD(mempool_callback_list, rte_tailq_entry);
> +
> +static struct rte_tailq_elem callback_tailq = {
> + .name = "RTE_MEMPOOL_CALLBACK",
> +};
> +EAL_REGISTER_TAILQ(callback_tailq)
> +
> +/* Invoke all registered mempool event callbacks. */
> +static void
> +mempool_event_callback_invoke(enum rte_mempool_event event,
> + struct rte_mempool *mp);
> +
> #define CACHE_FLUSHTHRESH_MULTIPLIER 1.5
> #define CALC_CACHE_FLUSHTHRESH(c) \
> ((typeof(c))((c) * CACHE_FLUSHTHRESH_MULTIPLIER))
> @@ -360,6 +372,10 @@ rte_mempool_populate_iova(struct rte_mempool *mp, char *vaddr,
> STAILQ_INSERT_TAIL(&mp->mem_list, memhdr, next);
> mp->nb_mem_chunks++;
>
> + /* Report the mempool as ready only when fully populated. */
> + if (mp->populated_size >= mp->size)
> + mempool_event_callback_invoke(RTE_MEMPOOL_EVENT_READY, mp);
> +
One small comment here. I think it does not happen today, but in the
future, something that could happen is:
- create empty mempool
- populate mempool
- use mempool
- populate mempool with more objects
- use mempool
I've seen one such usage here: https://www.youtube.com/watch?v=SzQFn9tm4Sw
In that case, a POPULATE event would be needed rather than the single
READY event proposed here.
Enhancing the documentation to better explain when the callback is
invoked looks enough to me for the moment.
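To make the scenario above concrete, here is a rough sketch using today's
API (names and sizes are illustrative, error handling omitted). With the
patch as-is, READY fires only after the second populate, although objects
from the first chunk can already be allocated before that:

#include <rte_memzone.h>
#include <rte_mempool.h>

static void
populate_in_two_steps(void)
{
        const struct rte_memzone *mz0, *mz1;
        struct rte_mempool *mp;

        /* Empty pool of 64 objects, 256 bytes each, no per-lcore cache. */
        mp = rte_mempool_create_empty("two_step_pool", 64, 256, 0, 0,
                                      SOCKET_ID_ANY, 0);
        if (mp == NULL)
                return;
        rte_mempool_set_ops_byname(mp, "ring_mp_mc", NULL);

        /* First chunk: deliberately too small to hold all 64 objects. */
        mz0 = rte_memzone_reserve("two_step_chunk0", 8 << 10, SOCKET_ID_ANY, 0);
        rte_mempool_populate_iova(mp, mz0->addr, mz0->iova, mz0->len,
                                  NULL, NULL);
        /* Objects from chunk 0 can already be allocated and used here,
         * but the proposed READY event has not fired yet. */

        /* A second, larger chunk completes the pool: READY fires only now. */
        mz1 = rte_memzone_reserve("two_step_chunk1", 64 << 10, SOCKET_ID_ANY, 0);
        rte_mempool_populate_iova(mp, mz1->addr, mz1->iova, mz1->len,
                                  NULL, NULL);
}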
> rte_mempool_trace_populate_iova(mp, vaddr, iova, len, free_cb, opaque);
> return i;
>
> @@ -722,6 +738,7 @@ rte_mempool_free(struct rte_mempool *mp)
> }
> rte_mcfg_tailq_write_unlock();
>
> + mempool_event_callback_invoke(RTE_MEMPOOL_EVENT_DESTROY, mp);
> rte_mempool_trace_free(mp);
> rte_mempool_free_memchunks(mp);
> rte_mempool_ops_free(mp);
> @@ -1343,3 +1360,123 @@ void rte_mempool_walk(void (*func)(struct rte_mempool *, void *),
>
> rte_mcfg_mempool_read_unlock();
> }
> +
> +struct mempool_callback {
> + rte_mempool_event_callback *func;
> + void *user_data;
> +};
> +
> +static void
> +mempool_event_callback_invoke(enum rte_mempool_event event,
> + struct rte_mempool *mp)
> +{
> + struct mempool_callback_list *list;
> + struct rte_tailq_entry *te;
> + void *tmp_te;
> +
> + rte_mcfg_tailq_read_lock();
> + list = RTE_TAILQ_CAST(callback_tailq.head, mempool_callback_list);
> + RTE_TAILQ_FOREACH_SAFE(te, list, next, tmp_te) {
> + struct mempool_callback *cb = te->data;
> + rte_mcfg_tailq_read_unlock();
> + cb->func(event, mp, cb->user_data);
> + rte_mcfg_tailq_read_lock();
I think it is dangerous to unlock the list before invoking the callback.
During that time, another thread can remove the next mempool callback, and
the next iteration will access a freed element, causing undefined
behavior.
Is it a problem to keep the lock held during the callback invocation?
I see that you have a test for this, and that you wrote a comment in the
documentation:
* rte_mempool_event_callback_register() may be called from within the callback,
* but the callbacks registered this way will not be invoked for the same event.
* rte_mempool_event_callback_unregister() may only be safely called
* to remove the running callback.
But is there a use-case for this?
If not, I'd tend to say that we can simply document that creating, freeing
or listing mempools, as well as registering or unregistering callbacks,
is not allowed from within a callback.
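For reference, here is a sketch of what I have in mind: keep the read lock
held for the whole walk, under the assumption that callbacks do not touch
the callback list or the mempool list themselves:

static void
mempool_event_callback_invoke(enum rte_mempool_event event,
                              struct rte_mempool *mp)
{
        struct mempool_callback_list *list;
        struct rte_tailq_entry *te;

        /* Hold the lock across the whole walk so that no entry can be
         * freed behind our back. Callbacks must then not register or
         * unregister callbacks, nor create or free mempools. */
        rte_mcfg_tailq_read_lock();
        list = RTE_TAILQ_CAST(callback_tailq.head, mempool_callback_list);
        TAILQ_FOREACH(te, list, next) {
                struct mempool_callback *cb = te->data;

                cb->func(event, mp, cb->user_data);
        }
        rte_mcfg_tailq_read_unlock();
}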
> + }
> + rte_mcfg_tailq_read_unlock();
> +}
> +
> +int
> +rte_mempool_event_callback_register(rte_mempool_event_callback *func,
> + void *user_data)
> +{
> + struct mempool_callback_list *list;
> + struct rte_tailq_entry *te = NULL;
> + struct mempool_callback *cb;
> + void *tmp_te;
> + int ret;
> +
> + if (func == NULL) {
> + rte_errno = EINVAL;
> + return -rte_errno;
> + }
> +
> + rte_mcfg_mempool_read_lock();
> + rte_mcfg_tailq_write_lock();
> +
> + list = RTE_TAILQ_CAST(callback_tailq.head, mempool_callback_list);
> + RTE_TAILQ_FOREACH_SAFE(te, list, next, tmp_te) {
> + struct mempool_callback *cb =
> + (struct mempool_callback *)te->data;
> + if (cb->func == func && cb->user_data == user_data) {
> + ret = -EEXIST;
> + goto exit;
> + }
> + }
> +
> + te = rte_zmalloc("MEMPOOL_TAILQ_ENTRY", sizeof(*te), 0);
> + if (te == NULL) {
> + RTE_LOG(ERR, MEMPOOL,
> + "Cannot allocate event callback tailq entry!\n");
> + ret = -ENOMEM;
> + goto exit;
> + }
> +
> + cb = rte_malloc("MEMPOOL_EVENT_CALLBACK", sizeof(*cb), 0);
> + if (cb == NULL) {
> + RTE_LOG(ERR, MEMPOOL,
> + "Cannot allocate event callback!\n");
> + rte_free(te);
> + ret = -ENOMEM;
> + goto exit;
> + }
> +
> + cb->func = func;
> + cb->user_data = user_data;
> + te->data = cb;
> + TAILQ_INSERT_TAIL(list, te, next);
> + ret = 0;
> +
> +exit:
> + rte_mcfg_tailq_write_unlock();
> + rte_mcfg_mempool_read_unlock();
> + rte_errno = -ret;
> + return ret;
> +}
> +
> +int
> +rte_mempool_event_callback_unregister(rte_mempool_event_callback *func,
> + void *user_data)
> +{
> + struct mempool_callback_list *list;
> + struct rte_tailq_entry *te = NULL;
> + struct mempool_callback *cb;
> + int ret;
> +
> + if (rte_eal_process_type() != RTE_PROC_PRIMARY) {
> + rte_errno = EPERM;
> + return -1;
> + }
The doc comment of the register function says:
 * Callbacks will be invoked in the process that creates the mempool.
So registration is allowed from either a primary or a secondary process.
Shouldn't a secondary process be able to unregister the callback it has
registered?
> +
> + rte_mcfg_mempool_read_lock();
> + rte_mcfg_tailq_write_lock();
I don't understand why two locks are taken here.
After looking at the code, I think the locking model is already incorrect
in the current mempool code:
- rte_mcfg_tailq_write_lock() is used in create and free to protect
  access to the mempool tailq;
- rte_mcfg_mempool_write_lock() is used in create(), to protect against
  concurrent creation (with the same name, for instance), but I doubt it
  is absolutely needed, because memzone_reserve is already protected;
- rte_mcfg_mempool_read_lock() is used in the dump functions, but to me
  it should be rte_mcfg_tailq_read_lock() instead.
Currently, doing a dump and a free concurrently can cause a crash,
because they do not take the same lock.
In your case, I suggest using only one lock to protect the callback
list. I think it can be rte_mcfg_tailq_*_lock(); see the sketch below.
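Something like this for the unregister path (error handling trimmed,
structures as in your patch):

static int
mempool_event_callback_unregister_sketch(rte_mempool_event_callback *func,
                void *user_data)
{
        struct mempool_callback_list *list;
        struct rte_tailq_entry *te;
        struct mempool_callback *cb = NULL;

        /* The tailq lock alone protects the callback list. */
        rte_mcfg_tailq_write_lock();
        list = RTE_TAILQ_CAST(callback_tailq.head, mempool_callback_list);
        TAILQ_FOREACH(te, list, next) {
                cb = te->data;
                if (cb->func == func && cb->user_data == user_data)
                        break;
        }
        if (te != NULL) {
                TAILQ_REMOVE(list, te, next);
                rte_free(te);
                rte_free(cb);
        }
        rte_mcfg_tailq_write_unlock();

        return te != NULL ? 0 : -ENOENT;
}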