[PATCH v5 3/7] mbuf: record mbuf operations history

Thomas Monjalon thomas at monjalon.net
Tue Oct 14 14:03:43 CEST 2025


14/10/2025 11:59, Morten Brørup:
> From: Thomas Monjalon [mailto:thomas at monjalon.net]
> > --- /dev/null
> > +++ b/lib/mbuf/mbuf_history.c
> > @@ -0,0 +1,227 @@
> > +/* SPDX-License-Identifier: BSD-3-Clause
> > + * Copyright(c) 2024 NVIDIA Corporation & Affiliates
> 
> Suggest: 2024 -> 2025
> Also in other files.

No, it is the year when the code was first written.
It was used internally before being pushed upstream.


[...]
> > +#define HISTORY_LAST_MASK (RTE_BIT64(RTE_MBUF_HISTORY_BITS) - 1)
> 
> Various places in the code have something like:
> +	last_op = history & HISTORY_LAST_MASK;
> +	RTE_ASSERT(last_op < RTE_MBUF_HISTORY_OP_MAX);

There is only one such assert.
The other asserts validate that function parameters are of the right size.

> Suggest replacing those by adding a static_assert here instead:
> static_assert(RTE_MBUF_HISTORY_OP_MAX == HISTORY_LAST_MASK + 1, "Op size mismatch")

It is just checking that RTE_MBUF_HISTORY_BITS and RTE_MBUF_HISTORY_OP_MAX are in sync.
Honestly I don't see a real benefit, but I will add
RTE_BUILD_BUG_ON(RTE_MBUF_HISTORY_OP_MAX > HISTORY_LAST_MASK + 1);
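
Concretely, something like this (a sketch only; the exact placement in mbuf_history.c may differ):

#define HISTORY_LAST_MASK (RTE_BIT64(RTE_MBUF_HISTORY_BITS) - 1)

/* Fail the build if more operations are defined than the
 * per-operation bit field can encode.
 */
RTE_BUILD_BUG_ON(RTE_MBUF_HISTORY_OP_MAX > HISTORY_LAST_MASK + 1);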


[...]
> > +static void
> > +mbuf_history_count_stats_and_print(struct rte_mempool *mp __rte_unused,
> > +		void *opaque, void *obj, unsigned obj_idx __rte_unused)
> 
> Fix:
> unsigned -> unsigned int

It follows the definition of the callback type:
typedef void (rte_mempool_obj_cb_t)(struct rte_mempool *mp,
        void *opaque, void *obj, unsigned obj_idx);

> 
> > +{
> > +	struct count_and_print_ctx *ctx = (struct count_and_print_ctx *)opaque;
> > +	struct rte_mbuf *m = (struct rte_mbuf *)obj;
> > +	uint64_t history, last_op;
> 
> Suggest using the enum type for operation variables:
> uint64_t history;
> enum rte_mbuf_history_op last_op;
> 
> Also elsewhere in the code.

Yes, it would be better to use the right type.


[...]
> > @@ -667,6 +672,7 @@ rte_mbuf_raw_free(struct rte_mbuf *m)
> >  {
> >  	__rte_mbuf_raw_sanity_check(m);
> >  	rte_mempool_put(m->pool, m);
> > +	rte_mbuf_history_mark(m, RTE_MBUF_HISTORY_OP_LIB_FREE);
> 
> Fix: For improved race protection, mark the mbuf before actually freeing it, like this:
>  	__rte_mbuf_raw_sanity_check(m);
> +	rte_mbuf_history_mark(m, RTE_MBUF_HISTORY_OP_LIB_FREE);
>  	rte_mempool_put(m->pool, m);

The race concerns only a debugging mark.
The benefit of marking after is that the last mark comes from the upper level,
so we can easily distinguish between a free initiated by the app and one initiated by the PMD.
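
For reference, this is the ordering kept in the patch (the comment is added here only to capture that rationale, it is not in the patch):

static __rte_always_inline void
rte_mbuf_raw_free(struct rte_mbuf *m)
{
	__rte_mbuf_raw_sanity_check(m);
	rte_mempool_put(m->pool, m);
	/* Mark after the actual free: the last mark in the history then
	 * comes from the upper level (app or PMD), which tells who
	 * initiated the free. The race only affects this debugging mark.
	 */
	rte_mbuf_history_mark(m, RTE_MBUF_HISTORY_OP_LIB_FREE);
}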


[...]
> > @@ -701,6 +707,7 @@ rte_mbuf_raw_free_bulk(struct rte_mempool *mp, struct rte_mbuf **mbufs, unsigned
> >  		RTE_ASSERT(m != NULL);
> >  		RTE_ASSERT(m->pool == mp);
> >  		__rte_mbuf_raw_sanity_check(m);
> > +		rte_mbuf_history_mark(mbufs[idx], RTE_MBUF_HISTORY_OP_LIB_FREE);
> >  	}
> 
> Fix: The loop is normally omitted, so use rte_mbuf_history_mark_bulk() here instead of rte_mbuf_history_mark() inside the loop.
> It also makes the code easier to read.

OK
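
A sketch of how it could look in the next version (keeping the mark after the mempool put, as for the single-mbuf free above; not the final patch):

static __rte_always_inline void
rte_mbuf_raw_free_bulk(struct rte_mempool *mp, struct rte_mbuf **mbufs, unsigned int count)
{
	unsigned int idx;

	for (idx = 0; idx < count; idx++) {
		const struct rte_mbuf *m = mbufs[idx];

		RTE_ASSERT(m != NULL);
		RTE_ASSERT(m->pool == mp);
		__rte_mbuf_raw_sanity_check(m);
	}
	rte_mempool_put_bulk(mp, (void **)mbufs, count);
	/* One call for the whole array instead of a per-mbuf mark inside the loop. */
	rte_mbuf_history_mark_bulk(mbufs, count, RTE_MBUF_HISTORY_OP_LIB_FREE);
}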


[...]
> > +enum rte_mbuf_history_op {
> > +	RTE_MBUF_HISTORY_OP_NEVER     =  0, /**< Initial state - never allocated */
> > +	RTE_MBUF_HISTORY_OP_LIB_FREE  =  1, /**< Freed back to pool */
> > +	RTE_MBUF_HISTORY_OP_PMD_FREE  =  2, /**< Freed by PMD */
> > +	RTE_MBUF_HISTORY_OP_APP_FREE  =  3, /**< Freed by application */
> > +	RTE_MBUF_HISTORY_OP_LIB_ALLOC =  4, /**< Allocation in mbuf library */
> > +	RTE_MBUF_HISTORY_OP_PMD_ALLOC =  5, /**< Allocated by PMD for Rx */
> > +	RTE_MBUF_HISTORY_OP_APP_ALLOC =  6, /**< Allocated by application */
> > +	RTE_MBUF_HISTORY_OP_RX        =  7, /**< Received */
> > +	RTE_MBUF_HISTORY_OP_TX        =  8, /**< Sent */
> > +	RTE_MBUF_HISTORY_OP_PREP_TX   =  9, /**< Being prepared before Tx */
> > +	RTE_MBUF_HISTORY_OP_BUSY_TX   = 10, /**< Returned due to Tx busy */
> 
> Suggest:
> RTE_MBUF_HISTORY_OP_PREP_TX -> RTE_MBUF_HISTORY_OP_TX_PREP
> RTE_MBUF_HISTORY_OP_BUSY_TX -> RTE_MBUF_HISTORY_OP_TX_BUSY

OK

> > +	RTE_MBUF_HISTORY_OP_ENQUEUE   = 11, /**< Enqueued for processing */
> > +	RTE_MBUF_HISTORY_OP_DEQUEUE   = 12, /**< Dequeued for processing */
> > +	/*                              13,      reserved for future */
> > +	RTE_MBUF_HISTORY_OP_USR2      = 14, /**< Application-defined event 2 */
> > +	RTE_MBUF_HISTORY_OP_USR1      = 15, /**< Application-defined event 1 */
> > +	RTE_MBUF_HISTORY_OP_MAX       = 16, /**< Maximum number of operation types */
> > +};
> 
> Suggest adding:
> static_assert(RTE_MBUF_HISTORY_OP_MAX == 1 << RTE_MBUF_HISTORY_BITS, "Enum vs bitsize mismatch");

Done with RTE_BUILD_BUG_ON(RTE_MBUF_HISTORY_OP_MAX > HISTORY_LAST_MASK + 1);
in the next version.

There is no need to test for equality.
Being more tolerant makes it easier to play with tuning.


[...]
> > +/**
> > + * Mark an mbuf with a history event.
> > + *
> > + * @warning
> > + * @b EXPERIMENTAL: this API may change without prior notice.
> > + *
> > + * @param m
> > + *   Pointer to the mbuf.
> > + * @param op
> > + *   The operation to record.
> > + */
> > +static inline void rte_mbuf_history_mark(struct rte_mbuf *m, uint32_t op)
> 
> Fix:
> uint32_t op -> enum rte_mbuf_history_op op

Yes
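
So the declaration will become something like (sketch):

static inline void rte_mbuf_history_mark(struct rte_mbuf *m, enum rte_mbuf_history_op op)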


[...]
> > +/**
> > + * Mark multiple mbufs with a history event.
> > + *
> > + * @warning
> > + * @b EXPERIMENTAL: this API may change without prior notice.
> > + *
> > + * @param mbufs
> > + *   Array of mbuf pointers.
> > + * @param n
> > + *   Number of mbufs to mark.
> > + * @param op
> > + *   The operation to record.
> > + */
> > +static inline void rte_mbuf_history_mark_bulk(struct rte_mbuf * const *mbufs,
> > +		uint32_t n, uint32_t op)
> 
> Fix:
> uint32_t n -> unsigned int count (for consistency)
> uint32_t op -> enum rte_mbuf_history_op op

Yes
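
So the declaration will become something like (sketch):

static inline void rte_mbuf_history_mark_bulk(struct rte_mbuf * const *mbufs,
		unsigned int count, enum rte_mbuf_history_op op)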


[...]
> With fixes (suggestions optional),
> Acked-by: Morten Brørup <mb at smartsharesystems.com>

Thanks for the very detailed review.



