[dpdk-dev] [PATCH] ring: fix unaligned memory access on aarch32

Ruifeng Wang Ruifeng.Wang at arm.com
Mon Nov 13 07:39:44 CET 2023


On 2023/11/10 9:18 PM, Morten Brørup wrote:
> Dear Ruifeng,

Hi Morten,

> +CC: all CPU architecture maintainers,
> 
> I'm trying to figure out the requirements for supporting unaligned memory access in DPDK (specifically the ring library), and need your expert feedback.
> 
> The #define RTE_ARCH_STRICT_ALIGN - which is undocumented, but probably means that CPU memory access must be aligned - is only set by "generic_aarch32".
> 
> So we will assume that all other CPU architectures supported by DPDK can access unaligned memory.
> 
> As discussed in this thread, "generic_aarch32" has special requirements for performing 64-bit load/store at unaligned addresses.

Yes, 64-bit load/store at unaligned addresses causes alignment fault.

> 
> Now comes the big question: Can "generic_aarch32" perform 32-bit load/store at unaligned addresses without similar special requirements? Then the ring library already supports unaligned 32-bit objects, and doesn't need to be fixed in this regard.

Yes, 32-bit load/store support unaligned data accesses to normal memory 
(not device memory). This is documented in Arm Architecture Reference 
Manual.

Thanks,
Ruifeng
> 
> 
> Med venlig hilsen / Kind regards,
> -Morten Brørup
> 
> 
>> From: Morten Brørup [mailto:mb at smartsharesystems.com]
>> Sent: Friday, 10 November 2023 11.44
>>
>>> From: Konstantin Ananyev [mailto:konstantin.ananyev at huawei.com]
>>> Sent: Friday, 10 November 2023 10.45
>>>
>>>> From: Morten Brørup <mb at smartsharesystems.com>
>>>> Sent: Friday, November 10, 2023 9:34 AM
>>>>
>>>> +CC Gavin, reviewed the test case
>>>>
>>>>> From: Ruifeng Wang [mailto:Ruifeng.Wang at arm.com]
>>>>> Sent: Friday, 10 November 2023 09.40
>>>>>
>>>>> On 2023/11/4 8:04 AM, Morten Brørup wrote:
>>>>>> I have for a long time now wondered why the ring functions for
>>>>> enqueue/dequeue of 64-bit objects supports unaligned addresses,
>> and
>>> now
>>>>> I finally found the patch introducing it.
>>>>>>
>>>>>>> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Phil Yang
>>>>>>> Sent: Monday, 9 March 2020 18.20
>>>>>>>
>>>>>>> The 32-bit arm machine doesn't support unaligned memory
>> access.
>>> It
>>>>>>> will cause a bus error on aarch32 with the custom element size
>>> ring.
>>>>>>>
>>>>>>> Thread 1 "test" received signal SIGBUS, Bus error.
>>>>>>> __rte_ring_enqueue_elems_64 (n=1, obj_table=0xf5edfe41,
>>> prod_head=0,
>>>>> \
>>>>>>> r=0xf5edfb80) at /build/dpdk/build/include/rte_ring_elem.h:177
>>>>>>> 177                             ring[idx++] = obj[i++];
>>>>>>
>>>>>> Which test is this? Why is it using an unaligned array of 64-
>> bit
>>>>> objects? (Notice that obj_table=0xf5edfe41.)
>>>>>
>>>>> The test case is:
>>>>>
>>>
>> https://elixir.bootlin.com/dpdk/latest/source/app/test/test_ring.c#L112
>>>>> 1
>>>>> This case deliberately use unaligned objects.
>>>>
>>>> Thank you, Ruifeng.
>>>>
>>>> Honnappa, I see you signed off on the patch introducing the test
>> for
>>> unaligned objects:
>>>>
>>>
>> http://git.dpdk.org/dpdk/commit/app/test/test_ring.c?id=a9fe152363e283d
>>> 4c590ab8e8d51396f2ffab9ff
>>>>
>>>> What was the rationale behind testing support for unaligned object
>>> pointers? Did any applications/customers use unaligned object
>>>> pointers, or is it a purely academic test case?
>>>>
>>>>>
>>>>>>
>>>>>> Nobody in their right mind would use an unaligned array of 64-
>> bit
>>>>> objects. You can only create such an array if you force the
>>> compiler to
>>>>> prevent automatic alignment! And all the functions in your
>>> application
>>>>> using this array would also need to support unaligned addressing
>> of
>>>>> these objects.
>>>
>>> It could be just one elem, not an array.
>>> And the user can use 'packed' struct or so...
>>> Agree, not a common case, but probably still possible.
>>
>> Very good point, Konstantin. A single unaligned object in a packed
>> structure is not as exotic as an unaligned array of objects (which I
>> consider completely unrealistic).
>>
>> If anyone is using an architecture which doesn't support unaligned
>> accesses, I would expect them to completely avoid using unaligned
>> objects. But perhaps you are right; however unlikely, it might happen.
>>
>> If we think this is a real use case, should we add support for
>> unaligned 32 bit objects?
>> (128 bit objects already support unaligned access; they are type casted
>> to void pointer and accessed using memcpy().)
>>
>> And how about the stack library, should it support unaligned objects
>> too?
>>
>>>
>>>>>>
>>>>>> This seems extremely exotic, and not something any real
>>> application
>>>>> would do!
>>>>>>
>>>>>> I would like to revert this patch for performance reasons.
>>>
>>> Morten, could you probably explain first the problems you encountered
>>> with this patch?
>>> You mention about 'performance reasons', so did you notice any real
>>> slowdown?
>>
>> Please check my reply to the same question here:
>> http://inbox.dpdk.org/dev/98CBD80474FA8B44BF855DF32C47DC35E9EFD3@smarts
>> erver.smartshare.dk/
>>
>>>
>>>>
>>>> I could add an RTE_ASSERT() to verify that the pointer is aligned,
>>> for debugging purposes.
>>>>
>>>>>>
>>>>>>>
>>>>>>> Fixes: cc4b218790f6 ("ring: support configurable element
>> size")
>>>>>>>
>>>>>>> Signed-off-by: Phil Yang <phil.yang at arm.com>
>>>>>>> Reviewed-by: Ruifeng Wang <ruifeng.wang at arm.com>
>>>>>>> Reviewed-by: Honnappa Nagarahalli
>> <honnappa.nagarahalli at arm.com>
>>>>>>> ---
>>>>>>>    lib/librte_ring/rte_ring_elem.h | 4 ++--
>>>>>>>    1 file changed, 2 insertions(+), 2 deletions(-)
>>>>>>>
>>>>>>> diff --git a/lib/librte_ring/rte_ring_elem.h
>>>>>>> b/lib/librte_ring/rte_ring_elem.h
>>>>>>> index 3976757..663addc 100644
>>>>>>> --- a/lib/librte_ring/rte_ring_elem.h
>>>>>>> +++ b/lib/librte_ring/rte_ring_elem.h
>>>>>>> @@ -160,7 +160,7 @@ __rte_ring_enqueue_elems_64(struct
>> rte_ring
>>> *r,
>>>>>>> uint32_t prod_head,
>>>>>>>    	const uint32_t size = r->size;
>>>>>>>    	uint32_t idx = prod_head & r->mask;
>>>>>>>    	uint64_t *ring = (uint64_t *)&r[1];
>>>>>>> -	const uint64_t *obj = (const uint64_t *)obj_table;
>>>>>>> +	const unaligned_uint64_t *obj = (const unaligned_uint64_t
>>>>>>> *)obj_table;
>>>>>>>    	if (likely(idx + n < size)) {
>>>>>>>    		for (i = 0; i < (n & ~0x3); i += 4, idx += 4) {
>>>>>>>    			ring[idx] = obj[i];
>>>>>>> @@ -294,7 +294,7 @@ __rte_ring_dequeue_elems_64(struct
>> rte_ring
>>> *r,
>>>>>>> uint32_t prod_head,
>>>>>>>    	const uint32_t size = r->size;
>>>>>>>    	uint32_t idx = prod_head & r->mask;
>>>>>>>    	uint64_t *ring = (uint64_t *)&r[1];
>>>>>>> -	uint64_t *obj = (uint64_t *)obj_table;
>>>>>>> +	unaligned_uint64_t *obj = (unaligned_uint64_t *)obj_table;
>>>>>>>    	if (likely(idx + n < size)) {
>>>>>>>    		for (i = 0; i < (n & ~0x3); i += 4, idx += 4) {
>>>>>>>    			obj[i] = ring[idx];
>>>>>>> --
>>>>>>> 2.7.4
>>>>>>>
>>>>>>
>>>>>> References:
>>>>>>
>>>>>
>>>
>> https://git.dpdk.org/dpdk/commit/lib/librte_ring/rte_ring_elem.h?id=3ba
>>>>> 51478a3ab3132c33effc8b132641233275b36
>>>>>> https://patchwork.dpdk.org/project/dpdk/patch/1583774395-10233-
>> 1-
>>> git-
>>>>> send-email-phil.yang at arm.com/
>>>>>>


More information about the dev mailing list