[PATCH v3] net/intel: optimize for fast-free hint

Loftus, Ciara ciara.loftus at intel.com
Thu May 28 15:23:00 CEST 2026


> Subject: [PATCH v3] net/intel: optimize for fast-free hint
> 

snip

> diff --git a/drivers/net/intel/common/tx_scalar.h
> b/drivers/net/intel/common/tx_scalar.h
> index 9fcd2e4733..d27df34dfa 100644
> --- a/drivers/net/intel/common/tx_scalar.h
> +++ b/drivers/net/intel/common/tx_scalar.h
> @@ -197,16 +197,64 @@ ci_tx_xmit_cleanup(struct ci_tx_queue *txq)
>  	const uint16_t rs_idx = (last_desc_cleaned == nb_tx_desc - 1) ?
>  			0 :
>  			(last_desc_cleaned + 1) >> txq->log2_rs_thresh;
> -	uint16_t desc_to_clean_to = (rs_idx << txq->log2_rs_thresh) + (txq-
> >tx_rs_thresh - 1);
> +	const uint16_t dd_idx = txq->rs_last_id[rs_idx];
> +	const uint16_t first_to_clean = rs_idx << txq->log2_rs_thresh;
> 
> -	/* Check if descriptor is done  */
> -	if ((txd[txq->rs_last_id[rs_idx]].cmd_type_offset_bsz &
> -			rte_cpu_to_le_64(CI_TXD_QW1_DTYPE_M)) !=
> -
> 	rte_cpu_to_le_64(CI_TX_DESC_DTYPE_DESC_DONE))
> +	/* Check if descriptor is done - all drivers use 0xF as done value in bits
> 3:0 */
> +	if ((txd[dd_idx].cmd_type_offset_bsz &
> rte_cpu_to_le_64(CI_TXD_QW1_DTYPE_M)) !=
> +			rte_cpu_to_le_64(CI_TX_DESC_DTYPE_DESC_DONE))
> +		/* Descriptor not yet processed by hardware */
>  		return -1;
> 
> +	/* DD bit is set, descriptors are done. Now free the mbufs. */
> +	/* Note: nb_tx_desc is guaranteed to be a multiple of tx_rs_thresh,
> +	 * validated during queue setup. This means cleanup never wraps
> around
> +	 * the ring within a single burst (e.g., ring=256, rs_thresh=32 gives
> +	 * bursts of 0-31, 32-63, ..., 224-255).
> +	 */
> +	const uint16_t nb_to_clean = txq->tx_rs_thresh;
> +	struct ci_tx_entry *sw_ring = txq->sw_ring;
> +
> +	/* fast_free_mp is NULL only when the fast free is disabled*/
> +	if (txq->fast_free_mp != NULL) {
> +		/* FAST_FREE path: mbufs are already reset, just return to
> pool */
> +		struct rte_mbuf *free[CI_TX_MAX_FREE_BUF_SZ];
> +		uint16_t nb_free = 0;
> +
> +		/* Get cached mempool pointer, or cache it on first use */
> +		struct rte_mempool *mp =
> +			likely(txq->fast_free_mp != (void *)UINTPTR_MAX) ?
> +			txq->fast_free_mp :
> +			(txq->fast_free_mp = sw_ring[dd_idx].mbuf->pool);

Is the mbuf you are dereferencing here guaranteed to be non NULL?

Other than that concern:

Acked-by: Ciara Loftus <ciara.loftus at intel.com>

> +
> +		/* Pack non-NULL mbufs in-place at start of sw_ring range.
> +		 * No modulo needed in loop since we're guaranteed not to
> wrap.
> +		 */
> +		for (uint16_t i = 0; i < nb_to_clean; i++) {



More information about the dev mailing list