[dpdk-dev] [PATCH] ring: fix unaligned memory access on aarch32

Morten Brørup mb at smartsharesystems.com
Sat Nov 4 17:54:46 CET 2023


> From: Honnappa Nagarahalli [mailto:Honnappa.Nagarahalli at arm.com]
> Sent: Saturday, 4 November 2023 17.32
> 
> > From: Morten Brørup <mb at smartsharesystems.com>
> > Sent: Friday, November 3, 2023 7:04 PM
> >
> > I have long wondered why the ring functions for enqueue/dequeue of
> > 64-bit objects support unaligned addresses, and now I have finally
> > found the patch introducing it.
> >
> > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Phil Yang
> > > Sent: Monday, 9 March 2020 18.20
> > >
> > > The 32-bit arm machine doesn't support unaligned memory access. It
> > > will cause a bus error on aarch32 with the custom element size ring.
> > >
> > > Thread 1 "test" received signal SIGBUS, Bus error.
> > > __rte_ring_enqueue_elems_64 (n=1, obj_table=0xf5edfe41, prod_head=0, \
> > > r=0xf5edfb80) at /build/dpdk/build/include/rte_ring_elem.h:177
> > > 177                             ring[idx++] = obj[i++];
> >
> > Which test is this? Why is it using an unaligned array of 64-bit
> > objects? (Notice that obj_table=0xf5edfe41.)
> Can't recollect which test it is. I am guessing one of the unit test
> cases. We might have to reinvestigate, not sure why the obj_table is
> unaligned.

Thank you for picking this up, Honnappa.

> 
> >
> > Nobody in their right mind would use an unaligned array of 64-bit
> > objects. You can only create such an array if you force the compiler
> > to prevent automatic alignment! And all the functions in your
> > application using this array would also need to support unaligned
> > addressing of these objects.
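
To illustrate the point quoted above, here is a minimal sketch of what it takes to end up with an unaligned array of 64-bit objects. The struct names and the two helper functions are hypothetical, made up for this example, and the packed attribute is GCC/Clang syntax. Without the attribute, the compiler pads the struct so the array is naturally aligned.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical application struct: packing removes the padding after
 * 'tag', so 'objs' starts at byte offset 1 and is unaligned. */
struct __attribute__((__packed__)) packed_objs {
    uint8_t tag;
    uint64_t objs[4];
};

/* Same members without packing: the compiler inserts padding after
 * 'tag' so that 'objs' gets its natural alignment. */
struct natural_objs {
    uint8_t tag;
    uint64_t objs[4];
};

/* Byte offset of the array in the packed struct. */
size_t packed_offset(void)
{
    return offsetof(struct packed_objs, objs);
}

/* Byte offset of the array in the naturally laid out struct. */
size_t natural_offset(void)
{
    return offsetof(struct natural_objs, objs);
}
```

Only an array like packed_objs.objs could produce an odd obj_table address such as the one in the backtrace.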
> >
> > This seems extremely exotic, and not something any real application
> > would do!
> >
> > I would like to revert this patch for performance reasons.
> Can you provide more details? Platform, test, how much is the
> regression?

I haven't measured a regression, but I suspect there is some performance cost on low-end CPUs. Maybe it is purely academic.

Or maybe not purely academic... I just tested on Godbolt, and the compiler generates different code for these two functions:

uint64_t fa(void *p)
{
    return *(uint64_t *)p;
}

uint64_t fu(void *p)
{
    return *(unaligned_uint64_t *)p;
}

The generated assembly:

fa:
        ldrd    r0, [r0]
        bx      lr

fu:
        mov     r3, r0
        ldr     r0, [r0]  @ unaligned
        ldr     r1, [r3, #4]      @ unaligned
        bx      lr
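
For anyone who wants to reproduce the comparison, here is a self-contained sketch. It assumes unaligned_uint64_t is a uint64_t with 1-byte alignment; the typedef is written out by hand with the GCC/Clang attribute rather than taken from the DPDK headers, and loads_agree() is a hypothetical helper added only to exercise both functions.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Assumption: hand-written 1-byte-aligned 64-bit type (GCC/Clang
 * attribute syntax), not pulled from the DPDK headers. */
typedef uint64_t unaligned_uint64_t __attribute__((__aligned__(1)));

/* Aligned load: p must be 8-byte aligned, so the compiler is free to
 * use a single 64-bit load instruction. */
uint64_t fa(void *p)
{
    return *(uint64_t *)p;
}

/* Unaligned load: p may have any alignment, so the compiler must emit
 * code that is safe for unaligned addresses. */
uint64_t fu(void *p)
{
    return *(unaligned_uint64_t *)p;
}

/* Hypothetical helper: stores v at an aligned and at an odd address,
 * and returns 1 if fa() and fu() both read it back, 0 otherwise
 * (including when the allocation fails). */
int loads_agree(uint64_t v)
{
    unsigned char *buf = malloc(2 * sizeof(v) + 1);
    int ok;

    if (buf == NULL)
        return 0;
    memcpy(buf, &v, sizeof(v));                 /* aligned, offset 0 */
    memcpy(buf + sizeof(v) + 1, &v, sizeof(v)); /* odd address, offset 9 */
    ok = fa(buf) == v && fu(buf + sizeof(v) + 1) == v;
    free(buf);
    return ok;
}
```

Both functions return the same value; the difference is only in the instructions the compiler is allowed to use for the load.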

> 
> >
> > >
> > > Fixes: cc4b218790f6 ("ring: support configurable element size")
> > >
> > > Signed-off-by: Phil Yang <phil.yang at arm.com>
> > > Reviewed-by: Ruifeng Wang <ruifeng.wang at arm.com>
> > > Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli at arm.com>
> > > ---
> > >  lib/librte_ring/rte_ring_elem.h | 4 ++--
> > >  1 file changed, 2 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/lib/librte_ring/rte_ring_elem.h b/lib/librte_ring/rte_ring_elem.h
> > > index 3976757..663addc 100644
> > > --- a/lib/librte_ring/rte_ring_elem.h
> > > +++ b/lib/librte_ring/rte_ring_elem.h
> > > @@ -160,7 +160,7 @@ __rte_ring_enqueue_elems_64(struct rte_ring *r, uint32_t prod_head,
> > >  	const uint32_t size = r->size;
> > >  	uint32_t idx = prod_head & r->mask;
> > >  	uint64_t *ring = (uint64_t *)&r[1];
> > > -	const uint64_t *obj = (const uint64_t *)obj_table;
> > > +	const unaligned_uint64_t *obj = (const unaligned_uint64_t *)obj_table;
> > >  	if (likely(idx + n < size)) {
> > >  		for (i = 0; i < (n & ~0x3); i += 4, idx += 4) {
> > >  			ring[idx] = obj[i];
> > > @@ -294,7 +294,7 @@ __rte_ring_dequeue_elems_64(struct rte_ring *r, uint32_t prod_head,
> > >  	const uint32_t size = r->size;
> > >  	uint32_t idx = prod_head & r->mask;
> > >  	uint64_t *ring = (uint64_t *)&r[1];
> > > -	uint64_t *obj = (uint64_t *)obj_table;
> > > +	unaligned_uint64_t *obj = (unaligned_uint64_t *)obj_table;
> > >  	if (likely(idx + n < size)) {
> > >  		for (i = 0; i < (n & ~0x3); i += 4, idx += 4) {
> > >  			obj[i] = ring[idx];
> > > --
> > > 2.7.4
> > >
> >
> > References:
> > https://git.dpdk.org/dpdk/commit/lib/librte_ring/rte_ring_elem.h?id=3ba51478a3ab3132c33effc8b132641233275b36
> > https://patchwork.dpdk.org/project/dpdk/patch/1583774395-10233-1-git-send-email-phil.yang@arm.com/
