[dpdk-dev] [RFC v2 00/23] Dynamic memory allocation for DPDK

Walker, Benjamin benjamin.walker at intel.com
Thu Dec 21 22:38:30 CET 2017


On Tue, 2017-12-19 at 11:14 +0000, Anatoly Burakov wrote:
> 

> Quick outline of all changes done as part of this patchset:
> 
>  * Malloc heap adjusted to handle holes in address space
>  * Single memseg list replaced by multiple expandable memseg lists
>  * VA space for hugepages is preallocated in advance
>  * Added dynamic alloc/free for pages, happening as needed on malloc/free

SPDK will need some way to register for a notification when pages are allocated
or freed. For storage, the number of requests per second is (relative to
networking) fairly small (hundreds of thousands per second in a traditional
block storage stack, or a few million per second with SPDK). Given that, we can
afford to do a dynamic lookup from va to pa/iova on each request in order to
greatly simplify our APIs (users can just pass pointers around instead of
mbufs). DPDK has a way to look up the pa from a given va, but it does so by
scanning /proc/self/pagemap and is very slow. SPDK instead handles this by
implementing a lookup table of va to pa/iova which we populate by scanning
through the DPDK memory segments at startup, so the lookup in our table is
sufficiently fast for storage use cases. If the list of memory segments changes,
we need to know about it in order to update our map.
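
To make the request concrete, here is a minimal sketch of the kind of table and
update hook I have in mind. It is not SPDK's actual implementation and not an
existing DPDK API: the names (va_map, on_mem_event, map_translate), the
2 MiB page size, the fixed table size, and the assumption that each
notification describes a physically contiguous, page-aligned range are all
hypothetical and only there for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT 21u                         /* assumption: 2 MiB hugepages */
#define PAGE_SIZE  ((uint64_t)1 << PAGE_SHIFT)
#define MAP_SLOTS  1024u                       /* power of two; sketch only */
#define PA_INVALID UINT64_MAX

enum slot_state { SLOT_EMPTY = 0, SLOT_USED, SLOT_DEAD };
enum mem_event  { MEM_EVENT_ALLOC, MEM_EVENT_FREE };   /* hypothetical */

struct map_entry { uint64_t vfn; uint64_t pa; enum slot_state state; };
struct va_map    { struct map_entry slots[MAP_SLOTS]; };

/* Linear-probing lookup keyed by virtual page number. */
static struct map_entry *map_find(struct va_map *m, uint64_t vfn)
{
    assert(m != NULL);
    for (uint32_t i = 0; i < MAP_SLOTS; i++) {
        struct map_entry *e = &m->slots[(vfn + i) & (MAP_SLOTS - 1)];
        if (e->state == SLOT_EMPTY)
            return NULL;                       /* end of probe chain */
        if (e->state == SLOT_USED && e->vfn == vfn)
            return e;
    }
    return NULL;
}

/* Insert or overwrite the translation for one page. */
static int map_set_page(struct va_map *m, uint64_t vfn, uint64_t pa)
{
    struct map_entry *e = map_find(m, vfn);
    for (uint32_t i = 0; e == NULL && i < MAP_SLOTS; i++) {
        struct map_entry *c = &m->slots[(vfn + i) & (MAP_SLOTS - 1)];
        if (c->state != SLOT_USED)
            e = c;                             /* reuse empty or dead slot */
    }
    if (e == NULL)
        return -1;                             /* table full */
    e->vfn = vfn;
    e->pa = pa;
    e->state = SLOT_USED;
    return 0;
}

/* What a (hypothetical) alloc/free notification would call into. */
int on_mem_event(struct va_map *m, enum mem_event ev,
                 uint64_t va, uint64_t pa, uint64_t len)
{
    if (((va | pa | len) & (PAGE_SIZE - 1)) != 0)
        return -1;                             /* require page alignment */
    for (uint64_t off = 0; off < len; off += PAGE_SIZE) {
        uint64_t vfn = (va + off) >> PAGE_SHIFT;
        if (ev == MEM_EVENT_ALLOC) {
            if (map_set_page(m, vfn, pa + off) != 0)
                return -1;
        } else {
            struct map_entry *e = map_find(m, vfn);
            if (e != NULL)
                e->state = SLOT_DEAD;          /* keep probe chains intact */
        }
    }
    return 0;
}

/* Per-request translation: one table probe instead of a pagemap read. */
uint64_t map_translate(struct va_map *m, const void *buf)
{
    uint64_t va = (uint64_t)(uintptr_t)buf;
    struct map_entry *e = map_find(m, va >> PAGE_SHIFT);
    return e != NULL ? e->pa + (va & (PAGE_SIZE - 1)) : PA_INVALID;
}
```

With something like this, the per-request path only ever calls map_translate(),
and the table stays correct as long as every page allocation and free reaches
on_mem_event(). Without a notification, the only option is to rescan all
memory segments, which is what we would like to avoid.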

Having the map also enables a number of other nice things. For instance, we
allow users to register memory that wasn't allocated through DPDK and use it for
DMA operations. We keep that va to pa/iova mapping in the same map. I appreciate
you adding APIs to dynamically register this type of memory with the IOMMU on
our behalf. That allows us to eliminate a nasty hack where we were looking up
the vfio file descriptor through sysfs in order to send the registration ioctl.

>  * Added contiguous memory allocation API's for rte_malloc and rte_memzone
>  * Integrated Pawel Wodkowski's patch [1] for registering/unregistering memory
>    with VFIO
> 

