[dpdk-dev] [PATCH v3 00/68] Memory Hotplug for DPDK

santosh santosh.shukla at caviumnetworks.com
Thu Apr 5 20:59:57 CEST 2018


Hi Anatoly,

On Wednesday 04 April 2018 04:51 AM, Anatoly Burakov wrote:
> This patchset introduces dynamic memory allocation for DPDK (aka memory
> hotplug). Based upon RFC submitted in December [1].
>
> Dependencies (to be applied in specified order):
> - IPC asynchronous request API patch [2]
> - Function to return number of sockets [3]
> - EAL IOVA fix [4]
>
> Deprecation notices relevant to this patchset:
> - General outline of memory hotplug changes [5]
> - EAL NUMA node count changes [6]
>
> The vast majority of changes are in the EAL and malloc; the external API
> disruption is minimal. A new set of APIs is added for contiguous memory
> allocation in rte_memzone, and a few APIs are added to rte_memory due to
> the switch to memseg lists as opposed to memsegs. Every other API change is
> internal to EAL, and all of the memory allocation/freeing is handled
> through rte_malloc, with no externally visible API changes.
>
> Quick outline of all changes done as part of this patchset:
>
>  * Malloc heap adjusted to handle holes in address space
>  * Single memseg list replaced by multiple memseg lists
>  * VA space for hugepages is preallocated in advance
>  * Pages are now allocated/freed as needed on rte_malloc/rte_free
>  * Added contiguous memory allocation APIs for rte_memzone
>  * Added convenience API calls to walk over memsegs
>  * Integrated Pawel Wodkowski's patch for registering/unregistering memory
>    with VFIO [7]
>  * Callbacks for registering memory allocations
>  * Callbacks for allowing/disallowing allocations above specified limit
>  * Multiprocess support done via DPDK IPC introduced in 18.02
>
> The biggest difference is that a "memseg" now represents a single page (as
> opposed to a big contiguous block of pages). As a consequence, memzones and
> malloc elements are no longer guaranteed to be physically contiguous unless
> the user asks for it at reserve time. To preserve functionality that
> depended on the previous behavior, a legacy memory option is also provided;
> however, it is expected (or perhaps vainly hoped) to be a temporary solution.
>
> Why multiple memseg lists instead of one? Since a memseg is now a single
> page, the list of memsegs gets quite big, and we need to locate pages
> somehow when we allocate and free them. We could of course just walk one
> list and allocate a single contiguous chunk of VA space for all memsegs,
> but this implementation uses separate lists instead in order to speed up
> many operations on memsegs.
>
> For v3, the following limitations are present:
> - VFIO support is only smoke-tested (but is expected to work); VFIO support
>   with secondary processes is not tested. Work is ongoing to validate VFIO
>   for all use cases
> - FSLMC bus VFIO code is not yet integrated, work is in progress
>
> For testing, it is recommended to use the GitHub repository [8], as it will
> have all of the dependencies already integrated.
>
> v3:
>     - Lots of compile fixes
>     - Fixes for multiprocess synchronization
>     - Introduced support for sPAPR IOMMU, courtesy of Gowrishankar @ IBM
>     - Fixes for mempool size calculation
>     - Added convenience memseg walk() APIs
>     - Added alloc validation callback
>
> v2:
>     - Fixed deadlock at init
>     - Reverted rte_panic changes at init; this is now handled inside IPC

Tested-by: Santosh Shukla <Santosh.Shukla at caviumnetworks.com>


