[dpdk-dev] [PATCH v2 00/41] Memory Hotplug for DPDK

Burakov, Anatoly anatoly.burakov at intel.com
Thu Mar 22 10:24:34 CET 2018


On 22-Mar-18 5:09 AM, Shreyansh Jain wrote:
> Hello Anatoly,
> 
>> -----Original Message-----
>> From: Burakov, Anatoly [mailto:anatoly.burakov at intel.com]
>> Sent: Wednesday, March 21, 2018 8:18 PM
>> To: Shreyansh Jain <shreyansh.jain at nxp.com>
>> Cc: dev at dpdk.org; Hemant Agrawal <hemant.agrawal at nxp.com>
>> Subject: Re: [dpdk-dev] [PATCH v2 00/41] Memory Hotplug for DPDK
>>
> 
> [...]
> 
>>>>
>>>
>>> While working on the issue reported in [1], I have found another issue
>>> for which I might need your help.
>>>
>>> [1] http://dpdk.org/ml/archives/dev/2018-March/093202.html
>>>
>>> For [1], I bypassed the problem by changing the mempool_add_elem code
>>> for the time being - it now allows non-contiguous (not explicitly
>>> demanded contiguous) allocations to go through
>>> rte_mempool_populate_iova. With that, I was able to get DPAA2 working.
>>>
>>> Problem is:
>>> 1. When I am working with 1GB pages, I/O is working fine.
>>> 2. When using 2MB pages (1024 num), initialization fails somewhere
>>> after the VFIO layer.
>>>
>>> All with IOVA=VA mode.
>>>
>>> Some logs:
>>>
>>> This is the output of the virtual memory layout demanded by DPDK:
>>>
>>> --->8---
>>> EAL: Ask a virtual area of 0x2e000 bytes
>>> EAL: Virtual area found at 0xffffb6561000 (size = 0x2e000)
>>> EAL: Setting up physically contiguous memory...
>>> EAL: Ask a virtual area of 0x59000 bytes
>>> EAL: Virtual area found at 0xffffb6508000 (size = 0x59000)
>>> EAL: Memseg list allocated: 0x800kB at socket 0
>>> EAL: Ask a virtual area of 0x400000000 bytes
>>> EAL: Virtual area found at 0xfffbb6400000 (size = 0x400000000)
>>> EAL: Ask a virtual area of 0x59000 bytes
>>> EAL: Virtual area found at 0xfffbb62af000 (size = 0x59000)
>>> EAL: Memseg list allocated: 0x800kB at socket 0
>>> EAL: Ask a virtual area of 0x400000000 bytes
>>> EAL: Virtual area found at 0xfff7b6200000 (size = 0x400000000)
>>> EAL: Ask a virtual area of 0x59000 bytes
>>> EAL: Virtual area found at 0xfff7b6056000 (size = 0x59000)
>>> EAL: Memseg list allocated: 0x800kB at socket 0
>>> EAL: Ask a virtual area of 0x400000000 bytes
>>> EAL: Virtual area found at 0xfff3b6000000 (size = 0x400000000)
>>> EAL: Ask a virtual area of 0x59000 bytes
>>> EAL: Virtual area found at 0xfff3b5dfd000 (size = 0x59000)
>>> EAL: Memseg list allocated: 0x800kB at socket 0
>>> EAL: Ask a virtual area of 0x400000000 bytes
>>> EAL: Virtual area found at 0xffefb5c00000 (size = 0x400000000)
>>> --->8---
>>>
>>> Then, somehow VFIO mapping is able to find only a single page to map
>>>
>>> --->8---
>>> EAL: Device (dpci.1) abstracted from VFIO
>>> EAL: -->Initial SHM Virtual ADDR FFFBB6400000
>>> EAL: -----> DMA size 0x200000
>>> EAL: Total 1 segments found.
>>> --->8---
>>>
>>> Then these logs appear, probably when the DPAA2 code requests memory.
>>> I am not sure why it repeats the same '...expanded by 10MB' message.
>>>
>>> --->8---
>>> EAL: Calling mem event callback vfio_mem_event_clbEAL: request: mp_malloc_sync
>>> EAL: Heap on socket 0 was expanded by 10MB
>>> EAL: Calling mem event callback vfio_mem_event_clbEAL: request: mp_malloc_sync
>>> EAL: Heap on socket 0 was expanded by 10MB
>>> EAL: Calling mem event callback vfio_mem_event_clbEAL: request: mp_malloc_sync
>>> EAL: Heap on socket 0 was expanded by 10MB
>>> EAL: Calling mem event callback vfio_mem_event_clbEAL: request: mp_malloc_sync
>>> EAL: Heap on socket 0 was expanded by 10MB
>>> EAL: Calling mem event callback vfio_mem_event_clbEAL: request: mp_malloc_sync
>>> EAL: Heap on socket 0 was expanded by 10MB
>>> EAL: Calling mem event callback vfio_mem_event_clbEAL: request: mp_malloc_sync
>>> EAL: Heap on socket 0 was expanded by 10MB
>>> EAL: Calling mem event callback vfio_mem_event_clbEAL: request: mp_malloc_sync
>>> EAL: Heap on socket 0 was expanded by 2MB
>>> EAL: Calling mem event callback vfio_mem_event_clbEAL: request: mp_malloc_sync
>>> EAL: Heap on socket 0 was expanded by 10MB
>>> EAL: Calling mem event callback vfio_mem_event_clbEAL: request: mp_malloc_sync
>>> EAL: Heap on socket 0 was expanded by 10MB
>>> LPM or EM none selected, default LPM on
>>> Initializing port 0 ...
>>> --->8---
>>>
>>> l3fwd is stuck at this point. What I observe is that DPAA2 driver has
>>> gone ahead to register the queues (queue_setup) with hardware and the
>>> memory has either overrun (smaller than requested size mapped) or the
>>> addresses are corrupt (that is, not dma-able). (I get SMMU faults,
>>> indicating one of these cases)
>>>
>>> There is some change from you in the fslmc/fslmc_vfio.c file
>>> (rte_fslmc_vfio_dmamap()). Ideally, that code should have walked over
>>> all the available pages for mapping but that didn't happen and only a
>>> single virtual area got dma-mapped.
>>>
>>> --->8---
>>> EAL: Device (dpci.1) abstracted from VFIO
>>> EAL: -->Initial SHM Virtual ADDR FFFBB6400000
>>> EAL: -----> DMA size 0x200000
>>> EAL: Total 1 segments found.
>>> --->8---
>>>
>>> I am looking into this, but if any hint comes to your mind, it might
>>> help.
>>>
>>> Regards,
>>> Shreyansh
>>>
>>
>> Hi Shreyansh,
>>
>> Thanks for the feedback.
>>
>> The "heap on socket 0 was expanded by 10MB" has to do with
>> synchronization requests in primary/secondary processes. I can see
>> you're allocating LPM tables - that's most likely what these allocations
>> are about (it's hotplugging memory).
> 
> I get that, but why does the same message appear multiple times without
> any change in the expansion? Further, I don't have multiple processes -
> in fact, I'm working with a single datapath thread.
> Anyway, I will look through the code for this.
> 

Hi Shreyansh,

I misspoke - this has nothing to do with multiprocess. The "request:
mp_malloc_sync" message does, but it's an attempt to notify other
processes of the allocation - if there are no other processes, nothing
happens.

However, multiple heap expansions do correspond to multiple allocations. 
If you allocate an LPM table that takes up 10M of hugepage memory - you 
expand heap by 10M. If you do it multiple times (e.g. per-NIC?), you do 
multiple heap expansions. This message will be triggered on every heap 
expansion.
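
To illustrate the point above, here is a minimal stand-alone sketch of why each allocation can produce its own "expanded by" message. It is not the DPDK malloc code: toy_heap, toy_alloc() and the 2MB TOY_PAGE_SZ are hypothetical names and values made up for this example, and the sketch assumes a heap that simply grows by the request rounded up to whole pages whenever the request does not fit.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical page size for this sketch: 2MB hugepages. */
#define TOY_PAGE_SZ ((size_t)2 << 20)

/* Toy heap: tracks byte counts only, it is not a real allocator. */
struct toy_heap {
	size_t total;            /* bytes backing the heap */
	size_t free_bytes;       /* bytes not yet handed out */
	unsigned int expansions; /* number of times the heap grew */
};

/* Round a length up to a whole number of pages. */
static size_t toy_round_up(size_t len)
{
	return (len + TOY_PAGE_SZ - 1) / TOY_PAGE_SZ * TOY_PAGE_SZ;
}

/*
 * Hand out len bytes. If the request does not fit into the free space,
 * grow the heap by the request rounded up to pages and log it.
 * Returns the number of bytes the heap grew by (0 if it did not grow).
 */
static size_t toy_alloc(struct toy_heap *h, size_t len)
{
	size_t grown = 0;

	if (h->free_bytes < len) {
		grown = toy_round_up(len);
		h->total += grown;
		h->free_bytes += grown;
		h->expansions++;
		printf("Heap was expanded by %zuMB\n", grown >> 20);
	}
	h->free_bytes -= len;
	return grown;
}
```

Starting from an empty heap, two 10M requests in a row each trigger their own 10M expansion, so the same message is printed twice. A small request that fits into leftover space prints nothing.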

>>
>> I think I might have an idea what is going on. I am assuming that you
>> are starting up your DPDK application without any -m or --socket-mem
>> flags, which means you are starting with an empty heap.
> 
> Yes, no specific --socket-mem passed as argument.
> 
>>
>> During initialization, certain DPDK features (such as service cores,
>> PMDs) allocate memory. Most likely you have essentially started up with
>> a single 2M page, which is what you see in the fslmc logs: this page
>> gets mapped for VFIO.
> 
> Agree.
> 
>>
>> Then, you allocate a bunch of LPM tables, which trigger more memory
>> allocation, and trigger memory allocation callbacks registered through
>> rte_mem_event_register_callback(). One of these callbacks is a VFIO
>> callback, which is registered in eal_vfio.c:rte_vfio_enable(). However,
>> since fslmc bus has its own VFIO implementation that is independent of
>> what happens in EAL VFIO code, what probably happens is that the fslmc
>> bus misses the necessary messages from the memory hotplug to map
>> additional resources for DMA.
> 
> Makes sense
> 
>>
>> Try adding a rte_mem_event_register_callback() somewhere in fslmc init
>> so that it calls necessary map function.
>> eal_vfio.c:vfio_mem_event_callback() should provide a good template on
>> how to approach creating such a callback. Let me know if this works!
> 
> OK. I will give this a try and update you.
> 
>>
>> (as a side note, how can we extend VFIO to move this stuff back into EAL
>> and expose it as an API?)
> 
> The problem is that the FSLMC VFIO driver is slightly different from the
> generic VFIO layer, in the sense that a device in a VFIO container is
> actually another level of container. Anyway, I will have a look at how
> much generalization is possible. Or else, I will work with the
> vfio_mem_event_callback() as suggested above.

This can wait :) The callback is probably the proper way to do it right now.
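
For reference, the callback pattern could be sketched roughly as below. This is a stand-alone model, not EAL or fslmc code: every toy_* and bus_* name is hypothetical, the callback signature is an assumption made for the example, and the byte counter merely stands in for whatever DMA map/unmap call the bus would actually make.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical event types, modelled on alloc/free notifications. */
enum toy_mem_event { TOY_MEM_EVENT_ALLOC, TOY_MEM_EVENT_FREE };

/* Hypothetical callback type: event kind, start address, length. */
typedef void (*toy_mem_event_cb)(enum toy_mem_event ev,
		const void *addr, size_t len);

#define TOY_MAX_CALLBACKS 8
static toy_mem_event_cb toy_callbacks[TOY_MAX_CALLBACKS];
static unsigned int toy_nb_callbacks;

/* Register a callback; returns 0 on success, -1 on bad input or full table. */
static int toy_mem_event_register(toy_mem_event_cb cb)
{
	if (cb == NULL || toy_nb_callbacks == TOY_MAX_CALLBACKS)
		return -1;
	toy_callbacks[toy_nb_callbacks++] = cb;
	return 0;
}

/* Called by the (toy) memory subsystem on every hotplug event. */
static void toy_mem_event_notify(enum toy_mem_event ev,
		const void *addr, size_t len)
{
	unsigned int i;

	for (i = 0; i < toy_nb_callbacks; i++)
		toy_callbacks[i](ev, addr, len);
}

/* Bus-side state: how many bytes the bus currently has DMA-mapped. */
static size_t bus_mapped_bytes;

/* Bus callback: map newly added memory, unmap memory that went away. */
static void bus_mem_event_cb(enum toy_mem_event ev,
		const void *addr, size_t len)
{
	if (ev == TOY_MEM_EVENT_ALLOC)
		bus_mapped_bytes += len; /* stand-in for a DMA map call */
	else
		bus_mapped_bytes -= len; /* stand-in for a DMA unmap call */
	printf("bus: %s %zu bytes at %p\n",
		ev == TOY_MEM_EVENT_ALLOC ? "map" : "unmap", len, (void *)addr);
}
```

The bus would call toy_mem_event_register(bus_mem_event_cb) once during its init. After that, every toy_mem_event_notify() reaches the bus, so hotplugged memory is no longer missed.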

> 
> Thanks for suggestions.
> 
>>
>> --
>> Thanks,
>> Anatoly


-- 
Thanks,
Anatoly

