[dpdk-dev] [PATCH] memory: fix alignment in eal_get_virtual_area()

Burakov, Anatoly anatoly.burakov at intel.com
Tue Jul 17 11:25:26 CEST 2018


On 17-Jul-18 10:22 AM, Xu, Qian Q wrote:
> 
> 
>> -----Original Message-----
>> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Burakov, Anatoly
>> Sent: Monday, July 16, 2018 10:01 PM
>> To: Stojaczyk, DariuszX <dariuszx.stojaczyk at intel.com>; dev at dpdk.org
>> Cc: stable at dpdk.org
>> Subject: Re: [dpdk-dev] [PATCH] memory: fix alignment in eal_get_virtual_area()
>>
>> On 16-Jul-18 2:29 PM, Stojaczyk, DariuszX wrote:
>>>
>>>> -----Original Message-----
>>>> From: Burakov, Anatoly
>>>> Sent: Monday, July 16, 2018 2:58 PM
>>>> To: Stojaczyk, DariuszX <dariuszx.stojaczyk at intel.com>; dev at dpdk.org
>>>> Cc: stable at dpdk.org
>>>> Subject: Re: [PATCH] memory: fix alignment in eal_get_virtual_area()
>>>>
>>>> On 13-Jun-18 8:08 PM, Dariusz Stojaczyk wrote:
>>>>> Although the alignment mechanism works as intended, the `no_align`
>>>>> bool flag was set incorrectly. We were aligning buffers that didn't
>>>>> need extra alignment, and weren't aligning ones that really needed
>>>>> it.
>>>>>
>>>>> Fixes: b7cc54187ea4 ("mem: move virtual area function in common
>>>>> directory")
>>>>> Cc: anatoly.burakov at intel.com
>>>>> Cc: stable at dpdk.org
>>>>>
>>>>> Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk at intel.com>
>>>>> ---
>>>>>     lib/librte_eal/common/eal_common_memory.c | 2 +-
>>>>>     1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/lib/librte_eal/common/eal_common_memory.c
>>>> b/lib/librte_eal/common/eal_common_memory.c
>>>>> index 4f0688f..a7c89f0 100644
>>>>> --- a/lib/librte_eal/common/eal_common_memory.c
>>>>> +++ b/lib/librte_eal/common/eal_common_memory.c
>>>>> @@ -70,7 +70,7 @@ eal_get_virtual_area(void *requested_addr, size_t
>> *size,
>>>>>     	 * system page size is the same as requested page size.
>>>>>     	 */
>>>>>     	no_align = (requested_addr != NULL &&
>>>>> -		((uintptr_t)requested_addr & (page_sz - 1)) == 0) ||
>>>>> +		((uintptr_t)requested_addr & (page_sz - 1))) ||
>>>>>     		page_sz == system_page_sz;
>>>>>
>>>>>     	do {
>>>>>
>>>>
>>>> This patch is wrong - no alignment should be performed if address is
>>>> already alighed, e.g. if requested_addr & (page_sz - 1) == 0. The
>>>> original code was correct.
>>>
>>> If we provide an aligned address with ADDR_IS_HINT flag and OS decides not
>> to use it, we end up with potentially unaligned address that needs to be
>> manually aligned and that's what this patch does. If the requested address
>> wasn't aligned to the provided page_sz, why would we bother aligning it
>> manually?
>>
>> no_align is a flag that indicates whether we should or shouldn't align the
>> resulting end address - it is not meant to align requested address.
>>
>> If requested_addr was NULL, no_align will be set to "false" (we don't know what
>> we get, so we must reserve additional space for alignment purposes).
>>
>> However, it will be set to "true" if page size is equal to system size (the OS will
>> return pointer that is already aligned to system page size, so we don't need to
>> align the result and thus don't need to reserve additional space for alignment).
>>
>> If requested address wasn't null, again we don't need alignment if system page
>> size is equal to requested page size, as any resulting address will be already
>> page-aligned (hence no_align set to "true").
>>
>> If requested address wasn't already page-aligned and page size is not equal to
>> system page size, then we set "no_align" to false, because we will need to align
>> the resulting address.
>>
>> The crucial part to understand is that the logic here is inverted - "if requested
>> address isn't NULL, and if the requested address is already aligned (i.e. (addr &
>> pgsz-1) == 0), then we *don't* need to align the address". So, if the requested
>> address is *not* aligned, "no_align" must be set to false - because we *will*
>> need to align the address.
>>
>> As an added bonus, we have regression testing identifying this patch as cause for
>> numerous regressions :)
> 
> Yes, we have met many mulit-process related issues(hang, block) due to the patches,
> so that RC1's quality is impacted by this patch seriously.
> How about current fix plan? It's a little urgent. Thx.

Hi Qian,

I've sent a patch to fix this:

http://patches.dpdk.org/project/dpdk/list/?series=607

It was already tested by Lei, but you're welcome to pile on :)

-- 
Thanks,
Anatoly


More information about the dev mailing list