[dpdk-dev] Memory allocated using rte_zmalloc() has non-zeros
Andrew Rybchenko
arybchenko at solarflare.com
Thu Jul 19 18:44:43 CEST 2018
On 19.07.2018 12:48, Burakov, Anatoly wrote:
> On 19-Jul-18 10:01 AM, Burakov, Anatoly wrote:
>> On 18-Jul-18 9:58 PM, Stephen Hemminger wrote:
>>> On Wed, 18 Jul 2018 22:52:12 +0300
>>> Andrew Rybchenko <arybchenko at solarflare.com> wrote:
>>>
>>>> On 18.07.2018 20:18, Burakov, Anatoly wrote:
>>>>> On 18-Jul-18 4:20 PM, Andrew Rybchenko wrote:
>>>>>> Hi Anatoly,
>>>>>>
>>>>>> I'm investigating an issue which comes down to the fact that memory
>>>>>> allocated using rte_zmalloc() contains non-zero bytes.
>>>>>>
>>>>>> If I add memset just after allocation, everything is perfect and
>>>>>> works fine.
>>>>>>
>>>>>> I've found out that memset was removed from rte_zmalloc_socket()
>>>>>> some
>>>>>> time ago:
>>>>>> >>>
>>>>>> commit b78c9175118f7d61022ddc5c62ce54a1bd73cea5
>>>>>> Author: Sergio Gonzalez Monroy <sergio.gonzalez.monroy at intel.com>
>>>>>> Date: Tue Jul 5 12:01:16 2016 +0100
>>>>>>
>>>>>> mem: do not zero out memory on zmalloc
>>>>>>
>>>>>> Zeroing out memory on rte_zmalloc_socket is not required
>>>>>> anymore
>>>>>> since all
>>>>>> allocated memory is already zeroed.
>>>>>>
>>>>>> Signed-off-by: Sergio Gonzalez Monroy
>>>>>> <sergio.gonzalez.monroy at intel.com>
>>>>>> <<<
>>>>>>
>>>>>> but maybe something has changed since then that makes the above
>>>>>> statement false.
>>>>>>
>>>>>> I observe the problem when memory is reallocated. I.e. I configure 7
>>>>>> queues,
>>>>>> start, stop, reconfigure to 3 queues, start. Memory is allocated on
>>>>>> start and
>>>>>> freed on stop; since we have fewer queues on the second start, it is
>>>>>> allocated
>>>>>> in a different way and reuses previously allocated/freed memory.
>>>>>>
>>>>>> Do you have any ideas what could be wrong?
>>>>>>
>>>>>> Andrew.
>>>>>>
>>>>>
>>>>> Hi Andrew,
>>>>>
>>>>> I will look into it first thing tomorrow. In general, we memset(0) on
>>>>> free, and the kernel gives us zeroed-out pages initially, so the most
>>>>> likely point of failure is that I'm not overwriting some malloc headers
>>>>> correctly on free.
>>>>
>>>> OK, at least now I know how it is supposed to work in theory.
>>>>
>>>> The following region was allocated (the second number below is the
>>>> pointer plus the size):
>>>> ALLOC 0x7fffa3264080-0x7fffa32640b8
>>>>
>>>> The non-zeroed address is 16 bytes before it:
>>>> (gdb) p/x ((uint64_t *)0x7fffa3264070)[0]
>>>> $4 = 0x4000000002
>>>> (gdb) p/x ((uint64_t *)0x7fffa3264070)[1]
>>>> $5 = 0x80
>>>>
>>>> then freed
>>>> FREE 0x7fffa3264080-0x7fffa32640b8
>>>>
>>>> but the values above (as shown by gdb) are still the same.
>>>> Then it is allocated as part of a bigger memory chunk
>>>> ALLOC 0x7fffa3245b80-0x7fffa3265fd8
>>>> which should contain zeros, but the values above are still the same.
>>>>
>>>> Interestingly, it looks like it was the first block freed on the
>>>> port stop. I'm not 100% sure, since I've put printouts in my allocation
>>>> wrapper, not in EAL.
>>>>
>>>> Many thanks,
>>>> Andrew.
>>>
>>> memset here is what is supposed to clear the data.
>>>
>>> struct malloc_elem *
>>> malloc_elem_free(struct malloc_elem *elem)
>>> {
>>>     void *ptr;
>>>     size_t data_len;
>>>
>>>     ptr = RTE_PTR_ADD(elem, MALLOC_ELEM_HEADER_LEN + elem->pad);
>>>     data_len = elem->size - elem->pad - MALLOC_ELEM_OVERHEAD;
>>>
>>>     elem = malloc_elem_join_adjacent_free(elem);
>>>
>>>     malloc_elem_free_list_insert(elem);
>>>
>>>     elem->pad = 0;
>>>
>>>     /* decrease heap's count of allocated elements */
>>>     elem->heap->alloc_count--;
>>>
>>>     memset(ptr, 0, data_len);
>>>
>>> Maybe data_len is not correct, either because of a bug or because your
>>> application clobbered the malloc-reserved regions in the element.
>>>
>>> More likely, gcc is incorrectly optimizing this away.
>>>
>>> https://wiki.sei.cmu.edu/confluence/display/c/MSC06-C.+Beware+of+compiler+optimizations
>>>
>>> https://www.cryptologie.net/article/419/zeroing-memory-compiler-optimizations-and-memset_s/
>>>
>>>
>>
>> I tend to be very wary of blaming the compiler without exhausting any
>> other possibilities :) It used to work before without issues, so
>> presumably whatever is happening, our memset works correctly.
>>
>> Andrew, you write:
>>
>> <snip>
>> ALLOC 0x7fffa3264080-0x7fffa32640b8
>> The non-zeroed address is 16 bytes before it:
>> <snip>
>>
>> Of course the memory *before* your pointer would not be zero - it is
>> preceded by a 64-byte malloc header, so what you're seeing is the
>> malloc header data (which doesn't go away if you free it - it will go
>> away only if it is merged with an adjacent free malloc element). So,
>> i'm failing to see which problem you're describing, given that all
>> memory regions that are supposedly not free lie outside of your
>> malloc-allocated memory.
I tried to highlight that the non-zeroed bytes belong to the malloc header
of the previously allocated memory region. Later that header becomes part
of an allocated region itself (a significantly bigger one, so merges
happened):
>>>
then it is allocated as the part of bigger memory chunk
ALLOC 0x7fffa3245b80-0x7fffa3265fd8
which should contain zeros, but above values are still the same.
<<<
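To make the sequence above concrete, here is a toy model in C. It is only
a sketch and not DPDK code: the flat arena, the fixed 64-byte header
(HDR_LEN) and the toy_* helpers are all assumptions made for illustration,
and the real malloc_elem layout (pad, trailer, free lists) is more
involved. It shows how a free that clears only the user data leaves header
bytes behind, so a later, larger allocation spanning the same addresses
would not be all zeros unless the header is cleared as well.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed header size; the thread mentions a 64-byte malloc header. */
#define HDR_LEN 64u

/* Static storage starts zeroed, standing in for fresh pages from the kernel. */
static uint8_t arena[1024];

/* Hypothetical: write a non-zero header at 'off', return the user-data offset. */
static size_t toy_alloc(size_t off, size_t data_len)
{
    assert(off + HDR_LEN + data_len <= sizeof(arena));
    memset(arena + off, 0xAB, HDR_LEN);   /* stand-in for header metadata */
    return off + HDR_LEN;
}

/* Free that clears only the user data; the header bytes stay in place. */
static void toy_free_data_only(size_t data_off, size_t data_len)
{
    memset(arena + data_off, 0, data_len);
}

/* Free that clears the header together with the user data. */
static void toy_free_with_header(size_t data_off, size_t data_len)
{
    assert(data_off >= HDR_LEN);
    memset(arena + data_off - HDR_LEN, 0, HDR_LEN + data_len);
}

/* Count the non-zero bytes in arena[off .. off + len). */
static size_t toy_count_nonzero(size_t off, size_t len)
{
    size_t n = 0;

    for (size_t i = 0; i < len; i++)
        n += (arena[off + i] != 0);
    return n;
}
```

For example, after toy_alloc(256, 56) followed by toy_free_data_only(),
toy_count_nonzero() over the whole arena reports 64 stale bytes;
toy_free_with_header() on the same element brings that down to 0.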
>> However, after careful analysis, I can see that there is one
>> possibility where memory is not zeroed on free - if the original
>> malloc element was padded, and there aren't any more adjacent free
>> elements, then newly allocated memory may contain the old pad header.
>> I'll submit a patch for you to try shortly.
>>
>
> Patch:
>
> http://patches.dpdk.org/patch/43196/
Yes, the patch fixes the problem I've observed. At least it passes the
simple test which I used for debugging.
I'll run more automated tests tonight.
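For reference, a check of this kind can be as small as the following.
This is a hypothetical helper, not the actual test and not part of DPDK;
first_nonzero() is a made-up name. It scans a buffer and reports where the
first non-zero byte is, which is enough to catch the problem right after
an rte_zmalloc() call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Return the offset of the first non-zero byte in buf[0 .. len),
 * or len if every byte is zero.
 */
static size_t first_nonzero(const void *buf, size_t len)
{
    const uint8_t *p = buf;

    assert(buf != NULL || len == 0);
    for (size_t i = 0; i < len; i++) {
        if (p[i] != 0)
            return i;
    }
    return len;
}
```

Given a pointer p returned by rte_zmalloc() for n bytes,
first_nonzero(p, n) == n means the region is fully zeroed; any smaller
value is the offset of the first stale byte.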
Many thanks,
Andrew.