[PATCH] event/eth_tx: prefetch mbuf headers

Mattias Rönnblom hofors at lysator.liu.se
Mon Jul 7 13:57:25 CEST 2025


On 2025-07-07 11:00, Naga Harish K, S V wrote:
> 
> 
>> -----Original Message-----
>> From: Mattias Rönnblom <hofors at lysator.liu.se>
>> Sent: Thursday, July 3, 2025 1:50 AM
>> To: Naga Harish K, S V <s.v.naga.harish.k at intel.com>; Mattias Rönnblom
>> <mattias.ronnblom at ericsson.com>; dev at dpdk.org
>> Cc: Jerin Jacob <jerinj at marvell.com>; Peter Nilsson
>> <peter.j.nilsson at ericsson.com>
>> Subject: Re: [PATCH] event/eth_tx: prefetch mbuf headers
>>
>> On 2025-05-27 12:55, Naga Harish K, S V wrote:
>>>
>>>
>>>> -----Original Message-----
>>>> From: Mattias Rönnblom <mattias.ronnblom at ericsson.com>
>>>> Sent: Friday, March 28, 2025 11:14 AM
>>>> To: dev at dpdk.org
>>>> Cc: Mattias Rönnblom <hofors at lysator.liu.se>; Naga Harish K, S V
>>>> <s.v.naga.harish.k at intel.com>; Jerin Jacob <jerinj at marvell.com>;
>>>> Mattias Rönnblom <mattias.ronnblom at ericsson.com>; Peter Nilsson
>>>> <peter.j.nilsson at ericsson.com>
>>>> Subject: [PATCH] event/eth_tx: prefetch mbuf headers
>>>>
>>>> Prefetch mbuf headers, resulting in ~10% throughput improvement when
>>>> the Ethernet RX and TX Adapters are hosted on the same core (likely
>>>> ~2x in case a dedicated TX core is used).
>>>>
>>>> Signed-off-by: Mattias Rönnblom <mattias.ronnblom at ericsson.com>
>>>> Tested-by: Peter Nilsson <peter.j.nilsson at ericsson.com>
>>>> ---
>>>>    lib/eventdev/rte_event_eth_tx_adapter.c | 20 ++++++++++++++++++++
>>>>    1 file changed, 20 insertions(+)
>>>>
>>>> diff --git a/lib/eventdev/rte_event_eth_tx_adapter.c b/lib/eventdev/rte_event_eth_tx_adapter.c
>>>> index 67fff8b7d6..d740ae00f9 100644
>>>> --- a/lib/eventdev/rte_event_eth_tx_adapter.c
>>>> +++ b/lib/eventdev/rte_event_eth_tx_adapter.c
>>>> @@ -598,6 +598,12 @@ txa_process_event_vector(struct txa_service_data *txa,
>>>>    	return nb_tx;
>>>>    }
>>>>
>>>> +static inline void
>>>> +txa_prefetch_mbuf(struct rte_mbuf *mbuf)
>>>> +{
>>>> +	rte_mbuf_prefetch_part1(mbuf);
>>>> +}
>>>> +
>>>>    static void
>>>>    txa_service_tx(struct txa_service_data *txa, struct rte_event *ev,
>>>>    	uint32_t n)
>>>> @@ -608,6 +614,20 @@ txa_service_tx(struct txa_service_data *txa, struct rte_event *ev,
>>>>
>>>>    	stats = &txa->stats;
>>>>
>>>> +	for (i = 0; i < n; i++) {
>>>> +		struct rte_event *event = &ev[i];
>>>> +
>>>> +		if (unlikely(event->event_type & RTE_EVENT_TYPE_VECTOR))
>>>
>>>
>>> This gives a branch prediction advantage to non-vector events. Is that the
>> intention?
>>>
>>
>> Yes.
> 
> I think all event types should be weighted equally. My ask was to remove the "unlikely" for vector events.
> 

Treating both branches equally is not possible; one branch will always be 
cheaper than the other. If you leave out unlikely()/likely(), you hand 
all control to compiler heuristics. In this case, I think the resulting 
object code will be identical (on GCC).

RTE_EVENT_TYPE_VECTOR means fewer events for the same number of 
packets, so the per-event overhead matters less there. So even if you 
weigh the importance of vector and non-vector use cases equally, you 
should optimize for the non-vector case.

>>
>>>> {
>>>> +			struct rte_event_vector *vec = event->vec;
>>>> +			struct rte_mbuf **mbufs = vec->mbufs;
>>>> +			uint32_t k;
>>>> +
>>>> +			for (k = 0; k < vec->nb_elem; k++)
>>>> +				txa_prefetch_mbuf(mbufs[k]);
>>>> +		} else
>>>> +			txa_prefetch_mbuf(event->mbuf);
>>>> +	}
>>>> +
>>>>    	nb_tx = 0;
>>>>    	for (i = 0; i < n; i++) {
>>>>    		uint16_t port;
>>>> --
>>>> 2.43.0
>>>
> 


