[dpdk-dev] [PATCH v6 02/21] mem: allow memseg lists to be marked as external

Yongseok Koh yskoh at mellanox.com
Sat Sep 29 02:09:46 CEST 2018


On Thu, Sep 27, 2018 at 11:40:59AM +0100, Anatoly Burakov wrote:
> When we allocate and use DPDK memory, we need to be able to
> differentiate between DPDK hugepage segments and segments that
> were made part of DPDK but are externally allocated. Add such
> a property to memseg lists.
> 
> This breaks the ABI, so bump the EAL library ABI version and
> document the change in release notes. This also breaks a few
> internal assumptions about memory contiguousness, so adjust
> malloc code in a few places.
> 
> All current calls for memseg walk functions were adjusted to
> ignore external segments where it made sense.
> 
> Mempools is a special case, because we may be asked to allocate
> a mempool on a specific socket, and we need to ignore all page
> sizes on other heaps or other sockets. Previously, this
> assumption of knowing all page sizes was not a problem, but it
> will be now, so we have to match socket ID with page size when
> calculating minimum page size for a mempool.
> 
> Signed-off-by: Anatoly Burakov <anatoly.burakov at intel.com>
> Acked-by: Andrew Rybchenko <arybchenko at solarflare.com>
> ---
> 
> Notes:
>     v3:
>     - Add comment to explain the process of picking up minimum
>       page sizes for mempool
>     
>     v2:
>     - Add documentation changes and ABI break
>     
>     v1:
>     - Adjust all calls to memseg walk functions to ignore external
>       segments where it made sense to do so
> 
>  doc/guides/rel_notes/deprecation.rst          | 15 --------
>  doc/guides/rel_notes/release_18_11.rst        | 13 ++++++-
>  drivers/bus/fslmc/fslmc_vfio.c                |  7 ++--
>  drivers/net/mlx4/mlx4_mr.c                    |  3 ++
>  drivers/net/mlx5/mlx5.c                       |  5 ++-
>  drivers/net/mlx5/mlx5_mr.c                    |  3 ++
>  drivers/net/virtio/virtio_user/vhost_kernel.c |  5 ++-
>  lib/librte_eal/bsdapp/eal/Makefile            |  2 +-
>  lib/librte_eal/bsdapp/eal/eal.c               |  3 ++
>  lib/librte_eal/bsdapp/eal/eal_memory.c        |  7 ++--
>  lib/librte_eal/common/eal_common_memory.c     |  3 ++
>  .../common/include/rte_eal_memconfig.h        |  1 +
>  lib/librte_eal/common/include/rte_memory.h    |  9 +++++
>  lib/librte_eal/common/malloc_elem.c           | 10 ++++--
>  lib/librte_eal/common/malloc_heap.c           |  9 +++--
>  lib/librte_eal/common/rte_malloc.c            |  2 +-
>  lib/librte_eal/linuxapp/eal/Makefile          |  2 +-
>  lib/librte_eal/linuxapp/eal/eal.c             | 10 +++++-
>  lib/librte_eal/linuxapp/eal/eal_memalloc.c    |  9 +++++
>  lib/librte_eal/linuxapp/eal/eal_vfio.c        | 17 ++++++---
>  lib/librte_eal/meson.build                    |  2 +-
>  lib/librte_mempool/rte_mempool.c              | 35 ++++++++++++++-----
>  test/test/test_malloc.c                       |  3 ++
>  test/test/test_memzone.c                      |  3 ++
>  24 files changed, 134 insertions(+), 44 deletions(-)
> 
> diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
> index 138335dfb..d2aec64d1 100644
> --- a/doc/guides/rel_notes/deprecation.rst
> +++ b/doc/guides/rel_notes/deprecation.rst
> @@ -11,21 +11,6 @@ API and ABI deprecation notices are to be posted here.
>  Deprecation Notices
>  -------------------
>  
> -* eal: certain structures will change in EAL on account of upcoming external
> -  memory support. Aside from internal changes leading to an ABI break, the
> -  following externally visible changes will also be implemented:
> -
> -  - ``rte_memseg_list`` will change to include a boolean flag indicating
> -    whether a particular memseg list is externally allocated. This will have
> -    implications for any users of memseg-walk-related functions, as they will
> -    now have to skip externally allocated segments in most cases if the intent
> -    is to only iterate over internal DPDK memory.
> -  - ``socket_id`` parameter across the entire DPDK will gain additional meaning,
> -    as some socket ID's will now be representing externally allocated memory. No
> -    changes will be required for existing code as backwards compatibility will
> -    be kept, and those who do not use this feature will not see these extra
> -    socket ID's.
> -
>  * eal: both declaring and identifying devices will be streamlined in v18.11.
>    New functions will appear to query a specific port from buses, classes of
>    device and device drivers. Device declaration will be made coherent with the
> diff --git a/doc/guides/rel_notes/release_18_11.rst b/doc/guides/rel_notes/release_18_11.rst
> index bc9b74ec4..5fc71e208 100644
> --- a/doc/guides/rel_notes/release_18_11.rst
> +++ b/doc/guides/rel_notes/release_18_11.rst
> @@ -91,6 +91,13 @@ API Changes
>    flag the MAC can be properly configured in any case. This is particularly
>    important for bonding.
>  
> +* eal: The following API changes were made in 18.11:
> +
> +  - ``rte_memseg_list`` structure now has an additional flag indicating whether
> +    the memseg list is externally allocated. This will have implications for any
> +    users of memseg-walk-related functions, as they will now have to skip
> +    externally allocated segments in most cases if the intent is to only iterate
> +    over internal DPDK memory.
>  
>  ABI Changes
>  -----------
> @@ -107,6 +114,10 @@ ABI Changes
>     =========================================================
>  
>  
> +* eal: EAL library ABI version was changed due to previously announced work on
> +       supporting external memory in DPDK. Structure ``rte_memseg_list`` now has
> +       a new flag indicating whether the memseg list refers to external memory.
> +
>  Removed Items
>  -------------
>  
> @@ -152,7 +163,7 @@ The libraries prepended with a plus sign were incremented in this version.
>       librte_compressdev.so.1
>       librte_cryptodev.so.5
>       librte_distributor.so.1
> -     librte_eal.so.8
> +   + librte_eal.so.9
>       librte_ethdev.so.10
>       librte_eventdev.so.4
>       librte_flow_classify.so.1
> diff --git a/drivers/bus/fslmc/fslmc_vfio.c b/drivers/bus/fslmc/fslmc_vfio.c
> index 4c2cd2a87..2e9244fb7 100644
> --- a/drivers/bus/fslmc/fslmc_vfio.c
> +++ b/drivers/bus/fslmc/fslmc_vfio.c
> @@ -317,12 +317,15 @@ fslmc_unmap_dma(uint64_t vaddr, uint64_t iovaddr __rte_unused, size_t len)
>  }
>  
>  static int
> -fslmc_dmamap_seg(const struct rte_memseg_list *msl __rte_unused,
> -		 const struct rte_memseg *ms, void *arg)
> +fslmc_dmamap_seg(const struct rte_memseg_list *msl, const struct rte_memseg *ms,
> +		void *arg)
>  {
>  	int *n_segs = arg;
>  	int ret;
>  
> +	if (msl->external)
> +		return 0;
> +
>  	ret = fslmc_map_dma(ms->addr_64, ms->iova, ms->len);
>  	if (ret)
>  		DPAA2_BUS_ERR("Unable to VFIO map (addr=%p, len=%zu)",
> diff --git a/drivers/net/mlx4/mlx4_mr.c b/drivers/net/mlx4/mlx4_mr.c
> index d23d3c613..9f5d790b6 100644
> --- a/drivers/net/mlx4/mlx4_mr.c
> +++ b/drivers/net/mlx4/mlx4_mr.c
> @@ -496,6 +496,9 @@ mr_find_contig_memsegs_cb(const struct rte_memseg_list *msl,
>  {
>  	struct mr_find_contig_memsegs_data *data = arg;
>  
> +	if (msl->external)
> +		return 0;
> +

Because a memory free event is available for external memory, the current design
of mlx4/mlx5 memory management can accommodate the new external memory support.
So please remove this check so that the PMD can traverse external memory as well.
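
To illustrate the behaviour I am asking for, here is a minimal, self-contained
sketch. The sketch_* structs and the callback name are stand-ins made up for
this mail, not the real rte_memseg/rte_memseg_list definitions, and the lookup
is simplified compared to the actual mr_find_contig_memsegs_cb(). The point is
only that the callback does not look at the external flag, so an address inside
an externally allocated segment is found like any other.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for illustration only; NOT the real DPDK definitions. */
struct sketch_memseg_list {
	unsigned int external; /* mirrors the flag added by this patch */
};

struct sketch_memseg {
	uint64_t addr_64; /* start address of the segment */
	size_t len;       /* length of the segment in bytes */
};

struct sketch_find_data {
	uint64_t addr;  /* in: address to look up */
	uint64_t start; /* out: start of the matching segment */
	uint64_t end;   /* out: end of the matching segment */
};

/*
 * Walk callback without the msl->external check: any segment containing
 * data->addr matches, whether its list is external or not.
 * Returns 1 to stop walking, 0 to continue.
 */
static int
sketch_find_memseg_cb(const struct sketch_memseg_list *msl,
		      const struct sketch_memseg *ms, void *arg)
{
	struct sketch_find_data *data = arg;

	(void)msl; /* external lists are deliberately not skipped */
	if (data->addr < ms->addr_64 || data->addr >= ms->addr_64 + ms->len)
		return 0;
	/* Found, save it and stop walking. */
	data->start = ms->addr_64;
	data->end = ms->addr_64 + ms->len;
	return 1;
}
```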

>  	if (data->addr < ms->addr_64 || data->addr >= ms->addr_64 + len)
>  		return 0;
>  	/* Found, save it and stop walking. */
> diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
> index 30d4e70a7..c90e1d8ce 100644
> --- a/drivers/net/mlx5/mlx5.c
> +++ b/drivers/net/mlx5/mlx5.c
> @@ -568,11 +568,14 @@ static struct rte_pci_driver mlx5_driver;
>  static void *uar_base;
>  
>  static int
> -find_lower_va_bound(const struct rte_memseg_list *msl __rte_unused,
> +find_lower_va_bound(const struct rte_memseg_list *msl,
>  		const struct rte_memseg *ms, void *arg)
>  {
>  	void **addr = arg;
>  
> +	if (msl->external)
> +		return 0;
> +

This one is fine, but can you please remove the blank line after the check?
That's a rule set by former maintainers. :-)
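
Roughly the layout I have in mind, as a self-contained sketch. The sketch_*
types and the function name are stand-ins made up for this mail, not the real
DPDK structs. The else branch is cut off in the quoted hunk, so keeping the
lowest address there is an assumption based on the function name rather than a
copy of the driver code.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for illustration only; NOT the real DPDK definitions. */
struct sketch_memseg_list {
	unsigned int external;
};

struct sketch_memseg {
	void *addr;
};

static int
sketch_find_lower_va_bound(const struct sketch_memseg_list *msl,
			   const struct sketch_memseg *ms, void *arg)
{
	void **addr = arg;

	if (msl->external)
		return 0;
	if (*addr == NULL)
		*addr = ms->addr;
	else if ((uintptr_t)ms->addr < (uintptr_t)*addr)
		*addr = ms->addr; /* assumed: keep the lowest address seen */
	return 0;
}
```

The early return is followed directly by the next statement, with no blank
line in between.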

>  	if (*addr == NULL)
>  		*addr = ms->addr;
>  	else
> diff --git a/drivers/net/mlx5/mlx5_mr.c b/drivers/net/mlx5/mlx5_mr.c
> index 1d1bcb5fe..fd4345f9c 100644
> --- a/drivers/net/mlx5/mlx5_mr.c
> +++ b/drivers/net/mlx5/mlx5_mr.c
> @@ -486,6 +486,9 @@ mr_find_contig_memsegs_cb(const struct rte_memseg_list *msl,
>  {
>  	struct mr_find_contig_memsegs_data *data = arg;
>  
> +	if (msl->external)
> +		return 0;
> +

As mentioned above for mlx4, please remove this check as well.

With those two changes in mlx4_mr.c and mlx5_mr.c removed, for the whole patch:

Acked-by: Yongseok Koh <yskoh at mellanox.com>

Thanks
Yongseok

