[dpdk-dev] [PATCH v6 02/21] mem: allow memseg lists to be marked as external
Yongseok Koh
yskoh at mellanox.com
Sat Sep 29 02:09:46 CEST 2018
On Thu, Sep 27, 2018 at 11:40:59AM +0100, Anatoly Burakov wrote:
> When we allocate and use DPDK memory, we need to be able to
> differentiate between DPDK hugepage segments and segments that
> were made part of DPDK but are externally allocated. Add such
> a property to memseg lists.
>
> This breaks the ABI, so bump the EAL library ABI version and
> document the change in release notes. This also breaks a few
> internal assumptions about memory contiguousness, so adjust
> malloc code in a few places.
>
> All current calls for memseg walk functions were adjusted to
> ignore external segments where it made sense.
>
> Mempools is a special case, because we may be asked to allocate
> a mempool on a specific socket, and we need to ignore all page
> sizes on other heaps or other sockets. Previously, this
> assumption of knowing all page sizes was not a problem, but it
> will be now, so we have to match socket ID with page size when
> calculating minimum page size for a mempool.
>
> Signed-off-by: Anatoly Burakov <anatoly.burakov at intel.com>
> Acked-by: Andrew Rybchenko <arybchenko at solarflare.com>
> ---
>
> Notes:
> v3:
> - Add comment to explain the process of picking up minimum
> page sizes for mempool
>
> v2:
> - Add documentation changes and ABI break
>
> v1:
> - Adjust all calls to memseg walk functions to ignore external
> segments where it made sense to do so
>
> doc/guides/rel_notes/deprecation.rst | 15 --------
> doc/guides/rel_notes/release_18_11.rst | 13 ++++++-
> drivers/bus/fslmc/fslmc_vfio.c | 7 ++--
> drivers/net/mlx4/mlx4_mr.c | 3 ++
> drivers/net/mlx5/mlx5.c | 5 ++-
> drivers/net/mlx5/mlx5_mr.c | 3 ++
> drivers/net/virtio/virtio_user/vhost_kernel.c | 5 ++-
> lib/librte_eal/bsdapp/eal/Makefile | 2 +-
> lib/librte_eal/bsdapp/eal/eal.c | 3 ++
> lib/librte_eal/bsdapp/eal/eal_memory.c | 7 ++--
> lib/librte_eal/common/eal_common_memory.c | 3 ++
> .../common/include/rte_eal_memconfig.h | 1 +
> lib/librte_eal/common/include/rte_memory.h | 9 +++++
> lib/librte_eal/common/malloc_elem.c | 10 ++++--
> lib/librte_eal/common/malloc_heap.c | 9 +++--
> lib/librte_eal/common/rte_malloc.c | 2 +-
> lib/librte_eal/linuxapp/eal/Makefile | 2 +-
> lib/librte_eal/linuxapp/eal/eal.c | 10 +++++-
> lib/librte_eal/linuxapp/eal/eal_memalloc.c | 9 +++++
> lib/librte_eal/linuxapp/eal/eal_vfio.c | 17 ++++++---
> lib/librte_eal/meson.build | 2 +-
> lib/librte_mempool/rte_mempool.c | 35 ++++++++++++++-----
> test/test/test_malloc.c | 3 ++
> test/test/test_memzone.c | 3 ++
> 24 files changed, 134 insertions(+), 44 deletions(-)
>
> diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
> index 138335dfb..d2aec64d1 100644
> --- a/doc/guides/rel_notes/deprecation.rst
> +++ b/doc/guides/rel_notes/deprecation.rst
> @@ -11,21 +11,6 @@ API and ABI deprecation notices are to be posted here.
> Deprecation Notices
> -------------------
>
> -* eal: certain structures will change in EAL on account of upcoming external
> - memory support. Aside from internal changes leading to an ABI break, the
> - following externally visible changes will also be implemented:
> -
> - - ``rte_memseg_list`` will change to include a boolean flag indicating
> - whether a particular memseg list is externally allocated. This will have
> - implications for any users of memseg-walk-related functions, as they will
> - now have to skip externally allocated segments in most cases if the intent
> - is to only iterate over internal DPDK memory.
> - - ``socket_id`` parameter across the entire DPDK will gain additional meaning,
> - as some socket ID's will now be representing externally allocated memory. No
> - changes will be required for existing code as backwards compatibility will
> - be kept, and those who do not use this feature will not see these extra
> - socket ID's.
> -
> * eal: both declaring and identifying devices will be streamlined in v18.11.
> New functions will appear to query a specific port from buses, classes of
> device and device drivers. Device declaration will be made coherent with the
> diff --git a/doc/guides/rel_notes/release_18_11.rst b/doc/guides/rel_notes/release_18_11.rst
> index bc9b74ec4..5fc71e208 100644
> --- a/doc/guides/rel_notes/release_18_11.rst
> +++ b/doc/guides/rel_notes/release_18_11.rst
> @@ -91,6 +91,13 @@ API Changes
> flag the MAC can be properly configured in any case. This is particularly
> important for bonding.
>
> +* eal: The following API changes were made in 18.11:
> +
> + - ``rte_memseg_list`` structure now has an additional flag indicating whether
> + the memseg list is externally allocated. This will have implications for any
> + users of memseg-walk-related functions, as they will now have to skip
> + externally allocated segments in most cases if the intent is to only iterate
> + over internal DPDK memory.
>
> ABI Changes
> -----------
> @@ -107,6 +114,10 @@ ABI Changes
> =========================================================
>
>
> +* eal: EAL library ABI version was changed due to previously announced work on
> + supporting external memory in DPDK. Structure ``rte_memseg_list`` now has
> + a new flag indicating whether the memseg list refers to external memory.
> +
> Removed Items
> -------------
>
> @@ -152,7 +163,7 @@ The libraries prepended with a plus sign were incremented in this version.
> librte_compressdev.so.1
> librte_cryptodev.so.5
> librte_distributor.so.1
> - librte_eal.so.8
> + + librte_eal.so.9
> librte_ethdev.so.10
> librte_eventdev.so.4
> librte_flow_classify.so.1
> diff --git a/drivers/bus/fslmc/fslmc_vfio.c b/drivers/bus/fslmc/fslmc_vfio.c
> index 4c2cd2a87..2e9244fb7 100644
> --- a/drivers/bus/fslmc/fslmc_vfio.c
> +++ b/drivers/bus/fslmc/fslmc_vfio.c
> @@ -317,12 +317,15 @@ fslmc_unmap_dma(uint64_t vaddr, uint64_t iovaddr __rte_unused, size_t len)
> }
>
> static int
> -fslmc_dmamap_seg(const struct rte_memseg_list *msl __rte_unused,
> - const struct rte_memseg *ms, void *arg)
> +fslmc_dmamap_seg(const struct rte_memseg_list *msl, const struct rte_memseg *ms,
> + void *arg)
> {
> int *n_segs = arg;
> int ret;
>
> + if (msl->external)
> + return 0;
> +
> ret = fslmc_map_dma(ms->addr_64, ms->iova, ms->len);
> if (ret)
> DPAA2_BUS_ERR("Unable to VFIO map (addr=%p, len=%zu)",
> diff --git a/drivers/net/mlx4/mlx4_mr.c b/drivers/net/mlx4/mlx4_mr.c
> index d23d3c613..9f5d790b6 100644
> --- a/drivers/net/mlx4/mlx4_mr.c
> +++ b/drivers/net/mlx4/mlx4_mr.c
> @@ -496,6 +496,9 @@ mr_find_contig_memsegs_cb(const struct rte_memseg_list *msl,
> {
> struct mr_find_contig_memsegs_data *data = arg;
>
> + if (msl->external)
> + return 0;
> +
Because a memory free event is available for external memory, the current
design of mlx4/mlx5 memory management can accommodate the new external memory
support. So please remove this check so that the PMD can traverse external
memory as well.
> if (data->addr < ms->addr_64 || data->addr >= ms->addr_64 + len)
> return 0;
> /* Found, save it and stop walking. */
> diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
> index 30d4e70a7..c90e1d8ce 100644
> --- a/drivers/net/mlx5/mlx5.c
> +++ b/drivers/net/mlx5/mlx5.c
> @@ -568,11 +568,14 @@ static struct rte_pci_driver mlx5_driver;
> static void *uar_base;
>
> static int
> -find_lower_va_bound(const struct rte_memseg_list *msl __rte_unused,
> +find_lower_va_bound(const struct rte_memseg_list *msl,
> const struct rte_memseg *ms, void *arg)
> {
> void **addr = arg;
>
> + if (msl->external)
> + return 0;
> +
This one is fine, but can you please remove the blank line?
That's a rule set by the former maintainers. :-)
> if (*addr == NULL)
> *addr = ms->addr;
> else
> diff --git a/drivers/net/mlx5/mlx5_mr.c b/drivers/net/mlx5/mlx5_mr.c
> index 1d1bcb5fe..fd4345f9c 100644
> --- a/drivers/net/mlx5/mlx5_mr.c
> +++ b/drivers/net/mlx5/mlx5_mr.c
> @@ -486,6 +486,9 @@ mr_find_contig_memsegs_cb(const struct rte_memseg_list *msl,
> {
> struct mr_find_contig_memsegs_data *data = arg;
>
> + if (msl->external)
> + return 0;
> +
As I mentioned for mlx4_mr.c, please remove this check as well.
If those two changes in mlx4_mr.c and mlx5_mr.c are removed, then for the whole patch,
Acked-by: Yongseok Koh <yskoh at mellanox.com>
Thanks
Yongseok
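
For illustration, the distinction drawn in this review can be sketched in plain C: some walk callbacks should skip external segments (as find_lower_va_bound does), while a lookup callback such as mr_find_contig_memsegs_cb can visit every segment. This is only a sketch under stated assumptions. The types `memseg_list` and `memseg`, the helper `memseg_walk`, and both callbacks are hypothetical stand-ins written for this example; they are not the DPDK definitions, and only the `external`, `addr_64` and `len` fields are taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for rte_memseg_list / rte_memseg. Only the
 * fields mentioned in the patch are modelled here. */
struct memseg_list {
	unsigned int external; /* nonzero: memory was not allocated by DPDK */
};

struct memseg {
	uint64_t addr_64;              /* start address of the segment */
	size_t len;                    /* length of the segment in bytes */
	const struct memseg_list *msl; /* owning list (sketch only) */
};

typedef int (*memseg_walk_t)(const struct memseg_list *msl,
		const struct memseg *ms, void *arg);

/* Simplified walk: call cb for each segment, stop on a nonzero return. */
static int
memseg_walk(const struct memseg *segs, size_t n, memseg_walk_t cb, void *arg)
{
	size_t i;

	assert(cb != NULL);
	for (i = 0; i < n; i++) {
		int ret = cb(segs[i].msl, &segs[i], arg);

		if (ret != 0)
			return ret;
	}
	return 0;
}

/* Skips external segments, in the spirit of find_lower_va_bound.
 * A value of 0 in *lowest means "nothing found yet" (sketch assumption). */
static int
lowest_internal_addr_cb(const struct memseg_list *msl,
		const struct memseg *ms, void *arg)
{
	uint64_t *lowest = arg;

	if (msl->external)
		return 0;
	if (*lowest == 0 || ms->addr_64 < *lowest)
		*lowest = ms->addr_64;
	return 0;
}

struct find_data {
	uint64_t addr;              /* address to look up */
	const struct memseg *found; /* segment containing addr, if any */
};

/* Visits external segments too, as requested for the MR lookup. */
static int
find_seg_cb(const struct memseg_list *msl, const struct memseg *ms, void *arg)
{
	struct find_data *data = arg;

	(void)msl; /* no external check on purpose */
	if (data->addr < ms->addr_64 || data->addr >= ms->addr_64 + ms->len)
		return 0;
	/* Found, save it and stop walking. */
	data->found = ms;
	return 1;
}
```

With one external and one internal segment, the first callback would report only the internal address, while the second would still locate an address that falls inside the external segment.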