[dpdk-dev] [PATCH 00/16] Support externally allocated memory in DPDK

Shahaf Shuler shahafs at mellanox.com
Thu Sep 13 09:44:15 CEST 2018


Hi Anatoly,

First, thanks for the patchset; it is a great enhancement.

See my question below.

Tuesday, September 4, 2018 4:12 PM, Anatoly Burakov:
> Subject: [dpdk-dev] [PATCH 00/16] Support externally allocated memory in
> DPDK
> 
> This is a proposal to enable using externally allocated memory in DPDK.
> 
> In a nutshell, here is what is being done here:
> 
> - Index internal malloc heaps by NUMA node index, rather than NUMA
>   node itself (external heaps will have IDs in order of creation)
> - Add identifier string to malloc heap, to uniquely identify it
>   - Each new heap will receive a unique socket ID that will be used by
>     allocator to decide from which heap (internal or external) to
>     allocate requested amount of memory
> - Allow creating named heaps and add/remove memory to/from those
>   heaps
> - Allocate memseg lists at runtime, to keep track of IOVA addresses
>   of externally allocated memory
>   - If IOVA addresses aren't provided, use RTE_BAD_IOVA
> - Allow malloc and memzones to allocate from external heaps
> - Allow other data structures to allocate from external heaps
> 
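
The indexing scheme in the list above can be sketched as a small standalone C model. This is not DPDK code: the struct names, the function names, the table size, and the choice to start external socket IDs just above the highest NUMA node are all assumptions made for illustration. It only shows internal heaps being stored by node index while each named external heap receives a fresh socket ID in order of creation.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_HEAPS 8   /* hypothetical table size */
#define NAME_LEN  32  /* hypothetical name limit */

struct heap {
    char name[NAME_LEN]; /* unique identifier; empty string marks a free slot */
    int  socket_id;      /* ID the allocator uses to pick this heap */
    int  external;       /* nonzero for heaps backed by external memory */
};

struct heap_table {
    struct heap heaps[MAX_HEAPS];
    int next_ext_socket; /* next unused socket ID for external heaps */
};

/* Internal heaps: slot i serves the i-th detected NUMA node, so sparse
 * node numbers (e.g. 0 and 2) still occupy consecutive slots. */
int table_init(struct heap_table *t, const int *numa_nodes, int n)
{
    int max_node = -1;

    assert(t != NULL && numa_nodes != NULL);
    if (n < 0 || n > MAX_HEAPS)
        return -1;
    memset(t, 0, sizeof(*t));
    for (int i = 0; i < n; i++) {
        snprintf(t->heaps[i].name, NAME_LEN, "socket_%d", numa_nodes[i]);
        t->heaps[i].socket_id = numa_nodes[i];
        if (numa_nodes[i] > max_node)
            max_node = numa_nodes[i];
    }
    /* Assumption: external IDs start right above the real node numbers. */
    t->next_ext_socket = max_node + 1;
    return 0;
}

/* Create a named external heap; returns its new socket ID, or -1. */
int heap_create(struct heap_table *t, const char *name)
{
    int slot = -1;

    assert(t != NULL && name != NULL);
    if (name[0] == '\0' || strlen(name) >= NAME_LEN)
        return -1;
    for (int i = 0; i < MAX_HEAPS; i++) {
        if (t->heaps[i].name[0] == '\0') {
            if (slot < 0)
                slot = i;          /* remember the first free slot */
        } else if (strcmp(t->heaps[i].name, name) == 0) {
            return -1;             /* names must be unique */
        }
    }
    if (slot < 0)
        return -1;                 /* table full */
    strcpy(t->heaps[slot].name, name);
    t->heaps[slot].socket_id = t->next_ext_socket++;
    t->heaps[slot].external = 1;
    return t->heaps[slot].socket_id;
}

/* Map a socket ID to a heap index; -1 if no heap has that ID. */
int heap_index_for_socket(const struct heap_table *t, int socket_id)
{
    assert(t != NULL);
    for (int i = 0; i < MAX_HEAPS; i++)
        if (t->heaps[i].name[0] != '\0' &&
            t->heaps[i].socket_id == socket_id)
            return i;
    return -1;
}
```

In this model an allocation request carries only a socket ID, and the allocator resolves it to a heap index, so internal and external heaps share one lookup path.
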
> The responsibility to ensure memory is accessible before using it is on the
> shoulders of the user - there is no checking done with regards to validity of
> the memory (nor could there be...).

That makes sense. However, who should be in charge of mapping this memory for DMA access?
The user, or the PMD internally, either when it encounters the first packet or while traversing the existing mempools?
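
To make the two options concrete, here is a rough standalone C sketch. It is only a model, not mlx5 or DPDK code: the region cache, its size, and every function name are hypothetical, and the "registration" is just a counter standing in for the real (expensive) mapping call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_REGIONS 4   /* hypothetical cache size */

struct dma_region {
    uintptr_t start;
    size_t    len;
};

struct dma_cache {
    struct dma_region regions[MAX_REGIONS];
    int n_regions;
    int n_registrations; /* counts the expensive mapping operations */
};

/* Return the index of the mapped region containing addr, or -1. */
int cache_lookup(const struct dma_cache *c, uintptr_t addr)
{
    assert(c != NULL);
    for (int i = 0; i < c->n_regions; i++) {
        const struct dma_region *r = &c->regions[i];
        if (addr >= r->start && addr - r->start < r->len)
            return i;
    }
    return -1;
}

/* Option 1: the user maps the whole region up front. */
int dma_map_explicit(struct dma_cache *c, uintptr_t start, size_t len)
{
    assert(c != NULL);
    if (len == 0 || c->n_regions == MAX_REGIONS)
        return -1;
    c->regions[c->n_regions].start = start;
    c->regions[c->n_regions].len = len;
    c->n_regions++;
    c->n_registrations++;
    return 0;
}

/* Option 2: the driver maps lazily when a buffer is first seen.
 * The caller supplies the enclosing region, which a real driver
 * would have to discover on its own. */
int dma_map_on_first_use(struct dma_cache *c, uintptr_t buf_addr,
                         const struct dma_region *enclosing)
{
    assert(c != NULL && enclosing != NULL);
    if (cache_lookup(c, buf_addr) >= 0)
        return 0;                       /* already mapped: fast path */
    if (buf_addr < enclosing->start ||
        buf_addr - enclosing->start >= enclosing->len)
        return -1;                      /* buffer is not in this region */
    return dma_map_explicit(c, enclosing->start, enclosing->len);
}
```

With option 2 the first packet from a new region pays the mapping cost on the data path, and later packets hit the cache. With option 1 the cost moves to setup time, but the application must know which devices will touch the memory.
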

> 
> The general approach is to create heap and add memory into it. For any other
> process wishing to use the same memory, said memory must first be
> attached (otherwise some things will not work).
> 
> A design decision was made to make multiprocess synchronization a manual
> process. Due to underlying issues with attaching to fbarrays in secondary
> processes, this design was deemed to be better because we don't want to
> fail to create external heap in the primary because something in the
> secondary has failed when in fact we may not even have wanted this memory
> to be accessible in the secondary in the first place.
> 
> Using external memory in multiprocess is *hard*, because not only memory
> space needs to be preallocated, but it also needs to be attached in each
> process to allow other processes to access the page table. The attach API call
> may or may not succeed, depending on memory layout, for reasons similar to
> other multiprocess failures. This is treated as a "known issue" for this release.
> 
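
The manual attach step described above can be illustrated with a small standalone C model. It is not the DPDK API: the structs and functions are hypothetical, and a single "busy" address range per process stands in for the real memory layout. It shows two properties from the text: a secondary must attach before it can use the memory, and an attach may fail because of that process's layout without affecting the primary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Shared description of one external memory chunk, as published
 * by the process that added it to the heap. */
struct ext_chunk {
    uintptr_t va;
    size_t    len;
};

/* Per-process view. busy_va/busy_len describe one address range the
 * process already uses (hypothetical stand-in for its real layout). */
struct process {
    uintptr_t busy_va;
    size_t    busy_len; /* 0 means nothing is mapped */
    int       attached; /* nonzero once the chunk is usable here */
};

/* Half-open ranges [a, a+alen) and [b, b+blen) intersect? */
int ranges_overlap(uintptr_t a, size_t alen, uintptr_t b, size_t blen)
{
    if (alen == 0 || blen == 0)
        return 0;
    return a < b + blen && b < a + alen;
}

/* Attach the chunk at the same virtual address in this process.
 * Fails if that address range is already occupied locally. */
int proc_attach(struct process *p, const struct ext_chunk *c)
{
    assert(p != NULL && c != NULL);
    if (ranges_overlap(p->busy_va, p->busy_len, c->va, c->len))
        return -1;
    p->attached = 1;
    return 0;
}

/* A process may touch the external memory only after attaching. */
int proc_can_use(const struct process *p)
{
    assert(p != NULL);
    return p->attached;
}
```

Because each process attaches on its own, a failure in one secondary leaves the primary and the other secondaries untouched, which matches the reason given for keeping synchronization manual.
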
> RFC -> v1 changes:
> - Removed the "named heaps" API, allocate using fake socket ID instead
> - Added multiprocess support
> - Everything is now thread-safe
> - Numerous bugfixes and API improvements
> 
> Anatoly Burakov (16):
>   mem: add length to memseg list
>   mem: allow memseg lists to be marked as external
>   malloc: index heaps using heap ID rather than NUMA node
>   mem: do not check for invalid socket ID
>   flow_classify: do not check for invalid socket ID
>   pipeline: do not check for invalid socket ID
>   sched: do not check for invalid socket ID
>   malloc: add name to malloc heaps
>   malloc: add function to query socket ID of named heap
>   malloc: allow creating malloc heaps
>   malloc: allow destroying heaps
>   malloc: allow adding memory to named heaps
>   malloc: allow removing memory from named heaps
>   malloc: allow attaching to external memory chunks
>   malloc: allow detaching from external memory
>   test: add unit tests for external memory support
> 
>  config/common_base                            |   1 +
>  config/rte_config.h                           |   1 +
>  drivers/bus/fslmc/fslmc_vfio.c                |   7 +-
>  drivers/bus/pci/linux/pci.c                   |   2 +-
>  drivers/net/mlx4/mlx4_mr.c                    |   3 +
>  drivers/net/mlx5/mlx5.c                       |   5 +-
>  drivers/net/mlx5/mlx5_mr.c                    |   3 +
>  drivers/net/virtio/virtio_user/vhost_kernel.c |   5 +-
>  lib/librte_eal/bsdapp/eal/eal.c               |   3 +
>  lib/librte_eal/bsdapp/eal/eal_memory.c        |   9 +-
>  lib/librte_eal/common/eal_common_memory.c     |   9 +-
>  lib/librte_eal/common/eal_common_memzone.c    |   8 +-
>  .../common/include/rte_eal_memconfig.h        |   6 +-
>  lib/librte_eal/common/include/rte_malloc.h    | 181 +++++++++
>  .../common/include/rte_malloc_heap.h          |   3 +
>  lib/librte_eal/common/include/rte_memory.h    |   9 +
>  lib/librte_eal/common/malloc_heap.c           | 287 +++++++++++--
>  lib/librte_eal/common/malloc_heap.h           |  17 +
>  lib/librte_eal/common/rte_malloc.c            | 383 ++++++++++++++++-
>  lib/librte_eal/linuxapp/eal/eal.c             |   3 +
>  lib/librte_eal/linuxapp/eal/eal_memalloc.c    |  12 +-
>  lib/librte_eal/linuxapp/eal/eal_memory.c      |   4 +-
>  lib/librte_eal/linuxapp/eal/eal_vfio.c        |  17 +-
>  lib/librte_eal/rte_eal_version.map            |   7 +
>  lib/librte_flow_classify/rte_flow_classify.c  |   3 +-
>  lib/librte_mempool/rte_mempool.c              |  31 +-
>  lib/librte_pipeline/rte_pipeline.c            |   3 +-
>  lib/librte_sched/rte_sched.c                  |   2 +-
>  test/test/Makefile                            |   1 +
>  test/test/autotest_data.py                    |  14 +-
>  test/test/meson.build                         |   1 +
>  test/test/test_external_mem.c                 | 384 ++++++++++++++++++
>  test/test/test_malloc.c                       |   3 +
>  test/test/test_memzone.c                      |   3 +
>  34 files changed, 1346 insertions(+), 84 deletions(-)
>  create mode 100644 test/test/test_external_mem.c
> 
> --
> 2.17.1

