[PATCH] event/eth_tx: prefetch mbuf headers

Naga Harish K, S V s.v.naga.harish.k at intel.com
Thu Jul 10 06:34:35 CEST 2025



> -----Original Message-----
> From: Mattias Rönnblom <hofors at lysator.liu.se>
> Sent: Monday, July 7, 2025 5:27 PM
> To: Naga Harish K, S V <s.v.naga.harish.k at intel.com>; Mattias Rönnblom
> <mattias.ronnblom at ericsson.com>; dev at dpdk.org
> Cc: Jerin Jacob <jerinj at marvell.com>; Peter Nilsson
> <peter.j.nilsson at ericsson.com>
> Subject: Re: [PATCH] event/eth_tx: prefetch mbuf headers
> 
> On 2025-07-07 11:00, Naga Harish K, S V wrote:
> >
> >
> >> -----Original Message-----
> >> From: Mattias Rönnblom <hofors at lysator.liu.se>
> >> Sent: Thursday, July 3, 2025 1:50 AM
> >> To: Naga Harish K, S V <s.v.naga.harish.k at intel.com>; Mattias
> >> Rönnblom <mattias.ronnblom at ericsson.com>; dev at dpdk.org
> >> Cc: Jerin Jacob <jerinj at marvell.com>; Peter Nilsson
> >> <peter.j.nilsson at ericsson.com>
> >> Subject: Re: [PATCH] event/eth_tx: prefetch mbuf headers
> >>
> >> On 2025-05-27 12:55, Naga Harish K, S V wrote:
> >>>
> >>>
> >>>> -----Original Message-----
> >>>> From: Mattias Rönnblom <mattias.ronnblom at ericsson.com>
> >>>> Sent: Friday, March 28, 2025 11:14 AM
> >>>> To: dev at dpdk.org
> >>>> Cc: Mattias Rönnblom <hofors at lysator.liu.se>; Naga Harish K, S V
> >>>> <s.v.naga.harish.k at intel.com>; Jerin Jacob <jerinj at marvell.com>;
> >>>> Mattias Rönnblom <mattias.ronnblom at ericsson.com>; Peter Nilsson
> >>>> <peter.j.nilsson at ericsson.com>
> >>>> Subject: [PATCH] event/eth_tx: prefetch mbuf headers
> >>>>
> >>>> Prefetch mbuf headers, resulting in ~10% throughput improvement
> >>>> when the Ethernet RX and TX Adapters are hosted on the same core
> >>>> (likely ~2x in case a dedicated TX core is used).
> >>>>
> >>>> Signed-off-by: Mattias Rönnblom <mattias.ronnblom at ericsson.com>
> >>>> Tested-by: Peter Nilsson <peter.j.nilsson at ericsson.com>

Acked-by: Naga Harish K S V <s.v.naga.harish.k at intel.com>

> >>>> ---
> >>>>    lib/eventdev/rte_event_eth_tx_adapter.c | 20 ++++++++++++++++++++
> >>>>    1 file changed, 20 insertions(+)
> >>>>
> >>>> diff --git a/lib/eventdev/rte_event_eth_tx_adapter.c b/lib/eventdev/rte_event_eth_tx_adapter.c
> >>>> index 67fff8b7d6..d740ae00f9 100644
> >>>> --- a/lib/eventdev/rte_event_eth_tx_adapter.c
> >>>> +++ b/lib/eventdev/rte_event_eth_tx_adapter.c
> >>>> @@ -598,6 +598,12 @@ txa_process_event_vector(struct txa_service_data *txa,
> >>>>    	return nb_tx;
> >>>>    }
> >>>>
> >>>> +static inline void
> >>>> +txa_prefetch_mbuf(struct rte_mbuf *mbuf) {
> >>>> +	rte_mbuf_prefetch_part1(mbuf);
> >>>> +}
> >>>> +
> >>>>    static void
> >>>>    txa_service_tx(struct txa_service_data *txa, struct rte_event *ev,
> >>>>    	uint32_t n)
> >>>> @@ -608,6 +614,20 @@ txa_service_tx(struct txa_service_data *txa, struct rte_event *ev,
> >>>>
> >>>>    	stats = &txa->stats;
> >>>>
> >>>> +	for (i = 0; i < n; i++) {
> >>>> +		struct rte_event *event = &ev[i];
> >>>> +
> >>>> +		if (unlikely(event->event_type & RTE_EVENT_TYPE_VECTOR))
> >>>
> >>>
> >>> This gives a branch prediction advantage to non-vector events. Is
> >>> that the intention?
> >>>
> >>
> >> Yes.
> >
> > I think all event-types need to be equally weighted. My ask was to remove
> > the "unlikely" for vector events.
> >
> 
> This is not possible. One branch will always be cheaper. If you leave out
> unlikely()/likely(), you leave all control to compiler heuristics.
> In this case, I think the resulting object code will be identical (on GCC).
> 
> RTE_EVENT_TYPE_VECTOR will result in fewer events, and thus the per-event
> overhead is less of an issue. So if you weigh the importance of vector and
> non-vector use cases equally, you should optimize for the non-vector case.
> 

Fine, agreed.
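
To make the point above concrete, here is a minimal standalone sketch (not
the adapter code) showing that a branch hint only steers code layout and
leaves the result unchanged. sketch_unlikely(), SKETCH_TYPE_VECTOR and
struct sketch_event are hypothetical stand-ins for DPDK's unlikely(),
RTE_EVENT_TYPE_VECTOR and struct rte_event, and the flag value is made up.
It assumes GCC or Clang for __builtin_expect().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for DPDK's unlikely() macro (GCC/Clang builtin). */
#define sketch_unlikely(x) __builtin_expect(!!(x), 0)

/* Hypothetical flag value, chosen only for this sketch. */
#define SKETCH_TYPE_VECTOR 0x8u

/* Hypothetical, simplified event: a type and the packets it carries. */
struct sketch_event {
	uint32_t type;
	uint32_t nb_elem; /* only meaningful for vector events */
};

/*
 * Count the packets carried by n events. The hint tells the compiler to
 * lay out the non-vector path as the fall-through; the returned value is
 * the same with or without it.
 */
static uint64_t
sketch_count_packets(const struct sketch_event *ev, size_t n)
{
	uint64_t total = 0;
	size_t i;

	assert(ev != NULL || n == 0);

	for (i = 0; i < n; i++) {
		if (sketch_unlikely(ev[i].type & SKETCH_TYPE_VECTOR))
			total += ev[i].nb_elem; /* one branch, many packets */
		else
			total += 1;             /* one branch per packet */
	}

	return total;
}
```

A vector event carrying 32 packets pays the branch cost once, while 32
single-mbuf events pay it 32 times, which is the reason for favoring the
non-vector path here.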

> >>
> >>>> {
> >>>> +			struct rte_event_vector *vec = event->vec;
> >>>> +			struct rte_mbuf **mbufs = vec->mbufs;
> >>>> +			uint32_t k;
> >>>> +
> >>>> +			for (k = 0; k < vec->nb_elem; k++)
> >>>> +				txa_prefetch_mbuf(mbufs[k]);
> >>>> +		} else
> >>>> +			txa_prefetch_mbuf(event->mbuf);
> >>>> +	}
> >>>> +
> >>>>    	nb_tx = 0;
> >>>>    	for (i = 0; i < n; i++) {
> >>>>    		uint16_t port;
> >>>> --
> >>>> 2.43.0
> >>>
> >
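
For archive readers, the pattern the patch adds to txa_service_tx() can be
sketched standalone as a prefetch pass followed by a processing pass.
This is only an illustration under stated assumptions: struct sketch_pkt,
sketch_prefetch() and sketch_sum_len() are hypothetical names, and
__builtin_prefetch() (GCC/Clang) stands in for rte_mbuf_prefetch_part1().
Whether the prefetch pays off depends on the workload and the platform.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified packet header used only in this sketch. */
struct sketch_pkt {
	uint16_t port;
	uint32_t len;
};

/* Hint that the cache line at p will be read soon (GCC/Clang builtin). */
static inline void
sketch_prefetch(const void *p)
{
	__builtin_prefetch(p, 0, 3);
}

/*
 * Pass 1 issues a prefetch for every header without touching it.
 * Pass 2 then reads the headers, ideally finding them already in cache.
 */
static uint64_t
sketch_sum_len(struct sketch_pkt *const *pkts, size_t n)
{
	uint64_t total = 0;
	size_t i;

	assert(pkts != NULL || n == 0);

	for (i = 0; i < n; i++)
		sketch_prefetch(pkts[i]);

	for (i = 0; i < n; i++)
		total += pkts[i]->len;

	return total;
}
```

The prefetch is only a hint, so the function returns the same sum with or
without the first loop; only the memory access timing may differ.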


