[RFC PATCH] eventdev: ensure 16-byte alignment for events

Mattias Rönnblom hofors at lysator.liu.se
Fri Oct 6 14:15:00 CEST 2023


On 2023-10-05 13:51, Bruce Richardson wrote:
> The event structure in DPDK is 16-bytes in size, and events are
> regularly passed as parameters directly rather than being passed as
> pointers.

When are events passed by-value, rather than by-reference? There are no 
such examples in the public eventdev API.

> To help compiler optimize correctly, we can explicitly request
> 16-byte alignment for events, which means that we should be able
> to do aligned vector loads/stores (e.g. with SSE or Neon) when working
> with those events.
> 

That change is both helping and sabotaging the optimizer's work. Now 
every stack allocation of a struct rte_event needs to be 16-byte 
aligned - in DPDK code, and in the application.
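
To make the alignment point concrete, here is a minimal sketch. The two 
structs are hypothetical 16-byte stand-ins for struct rte_event (not the 
real definition), and the GCC/Clang attribute is assumed to be what 
__rte_aligned(16) expands to. The helper names are made up for 
illustration.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct rte_event: two 64-bit words. */
struct ev_natural {
	uint64_t event;
	uint64_t u64;
};

/* Same layout, with the alignment the patch requests. The attribute is
 * an assumption about what __rte_aligned(16) expands to. */
struct ev_aligned16 {
	uint64_t event;
	uint64_t u64;
} __attribute__((aligned(16)));

/* The size is unchanged by the attribute; only the alignment differs. */
static_assert(sizeof(struct ev_natural) == 16, "unexpected size");
static_assert(sizeof(struct ev_aligned16) == 16, "unexpected size");

/* Alignment requirement of the plain struct (that of uint64_t). */
static inline size_t ev_natural_align(void)
{
	return alignof(struct ev_natural);
}

/* Alignment requirement of the attributed struct. */
static inline size_t ev_aligned16_align(void)
{
	return alignof(struct ev_aligned16);
}

/* Returns nonzero if p sits on a 16-byte boundary. */
static inline int is_16b_aligned(const void *p)
{
	return ((uintptr_t)p & 15u) == 0;
}

/* Every automatic variable of the attributed type must be placed on a
 * 16-byte boundary, so the compiler has to arrange the frame for it. */
static inline int stack_event_is_16b_aligned(void)
{
	struct ev_aligned16 ev = { 0, 0 };

	return is_16b_aligned(&ev);
}
```

The plain struct only inherits the alignment of its widest member, so the 
compiler is free to place it at any 8-byte boundary on common 64-bit 
targets. The attributed struct removes that freedom for every local 
variable and array of events.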

The effect this change has on an eventdev app using DSW is a ~3 
cycle/event performance degradation on an AMD Zen 3 system, and a ~4 
cycle/event performance degradation on a Skylake-generation Intel CPU.

What scenarios do you have in mind, where this change would improve the 
generated code? Something where there are no unaligned loads available 
in the ISA, or they are much slower than their aligned counterparts?

When I looked into the same issue for the DPDK IP checksumming routines, 
there were basically no such cases. At least none that I could find.
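
As an illustration of the alternative, a 16-byte event can be moved 
without any alignment guarantee by going through memcpy(). This is only a 
sketch: struct ev_raw is a hypothetical stand-in for struct rte_event, 
and the helper name is made up. My assumption is that mainstream 
compilers turn the fixed-size memcpy() into a single unaligned vector 
load on ISAs that have one, which is worth confirming in the generated 
code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for struct rte_event: two 64-bit words. */
struct ev_raw {
	uint64_t event;
	uint64_t u64;
};

/* Load an event from an address with no alignment guarantee. The
 * fixed-size memcpy() avoids undefined behavior from a misaligned
 * pointer dereference and leaves instruction selection to the
 * compiler. */
static inline struct ev_raw ev_load_unaligned(const void *src)
{
	struct ev_raw ev;

	memcpy(&ev, src, sizeof(ev));
	return ev;
}
```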

> Signed-off-by: Bruce Richardson <bruce.richardson at intel.com>
> ---
>   lib/eventdev/rte_eventdev.h | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/lib/eventdev/rte_eventdev.h b/lib/eventdev/rte_eventdev.h
> index 2ba8a7b090..bb0d59b059 100644
> --- a/lib/eventdev/rte_eventdev.h
> +++ b/lib/eventdev/rte_eventdev.h
> @@ -1344,7 +1344,7 @@ struct rte_event {
>   		struct rte_event_vector *vec;
>   		/**< Event vector pointer. */
>   	};
> -};
> +} __rte_aligned(16);
>   
>   /* Ethdev Rx adapter capability bitmap flags */
>   #define RTE_EVENT_ETH_RX_ADAPTER_CAP_INTERNAL_PORT	0x1

