[PATCH V3] lib: set/get max memzone segments
David Marchand
david.marchand at redhat.com
Thu May 4 09:27:12 CEST 2023
Hello Ophir,
Good thing someone is looking into this.
Thanks.
I have a few comments.
This commitlog is a bit compact.
Separating it with some empty lines would help digest it.
On Wed, May 3, 2023 at 9:27 AM Ophir Munk <ophirmu at nvidia.com> wrote:
>
> In current DPDK the RTE_MAX_MEMZONE definition is unconditionally hard
> coded as 2560.
> For applications requiring different values of this
> parameter – it is more convenient to set the max value via an rte API -
> rather than changing the dpdk source code per application. In many
> organizations, the possibility to compile a private DPDK library for a
> particular application does not exist at all. With this option there is
> no need to recompile DPDK and it allows using an in-box packaged DPDK.
> An example usage for updating the RTE_MAX_MEMZONE would be of an
> application that uses the DPDK mempool library which is based on DPDK
> memzone library. The application may need to create a number of
> steering tables, each of which will require its own mempool allocation.
> This commit is not about how to optimize the application usage of
> mempool nor about how to improve the mempool implementation based on
> memzone. It is about how to make the max memzone definition - run-time
> customized.
The code was using a rather short name "RTE_MAX_MEMZONE".
But I prefer we spell this as "max memzones count" (or a better
wording), in the descriptions/comments.
> This commit adds an API which must be called before rte_eal_init():
> rte_memzone_max_set(int max). If not called, the default memzone
Afaics, this prototype got unaligned with the patch content, as a
size_t is now taken as input.
You can simply mention rte_memzone_max_set().
> (RTE_MAX_MEMZONE) is used. There is also an API to query the effective
After the patch RTE_MAX_MEMZONE does not exist anymore.
> max memzone: rte_memzone_max_get().
>
> Signed-off-by: Ophir Munk <ophirmu at nvidia.com>
A global comment on the patch:
rte_calloc provides what you want in all cases below: an array of objects.
I prefer rte_calloc(count, sizeof elem) to rte_zmalloc(count * sizeof elem).
This also avoids a temporary variable to compute the total size of
such an array.
> ---
> app/test/test_func_reentrancy.c | 2 +-
> app/test/test_malloc_perf.c | 2 +-
> app/test/test_memzone.c | 43 ++++++++++++++++++++++++-------------
> config/rte_config.h | 1 -
> drivers/net/qede/base/bcm_osal.c | 30 ++++++++++++++++++++------
> drivers/net/qede/base/bcm_osal.h | 3 +++
> drivers/net/qede/qede_main.c | 7 ++++++
> lib/eal/common/eal_common_memzone.c | 28 +++++++++++++++++++++---
> lib/eal/include/rte_memzone.h | 20 +++++++++++++++++
> lib/eal/version.map | 4 ++++
> 10 files changed, 112 insertions(+), 28 deletions(-)
>
> diff --git a/app/test/test_func_reentrancy.c b/app/test/test_func_reentrancy.c
> index d1ed5d4..ae9de6f 100644
> --- a/app/test/test_func_reentrancy.c
> +++ b/app/test/test_func_reentrancy.c
> @@ -51,7 +51,7 @@ typedef void (*case_clean_t)(unsigned lcore_id);
> #define MEMPOOL_ELT_SIZE (sizeof(uint32_t))
> #define MEMPOOL_SIZE (4)
>
> -#define MAX_LCORES (RTE_MAX_MEMZONE / (MAX_ITER_MULTI * 4U))
> +#define MAX_LCORES (rte_memzone_max_get() / (MAX_ITER_MULTI * 4U))
>
> static uint32_t obj_count;
> static uint32_t synchro;
> diff --git a/app/test/test_malloc_perf.c b/app/test/test_malloc_perf.c
> index ccec43a..9bd1662 100644
> --- a/app/test/test_malloc_perf.c
> +++ b/app/test/test_malloc_perf.c
> @@ -165,7 +165,7 @@ test_malloc_perf(void)
> return -1;
>
> if (test_alloc_perf("rte_memzone_reserve", memzone_alloc, memzone_free,
> - NULL, memset_us_gb, RTE_MAX_MEMZONE - 1) < 0)
> + NULL, memset_us_gb, rte_memzone_max_get() - 1) < 0)
> return -1;
>
> return 0;
> diff --git a/app/test/test_memzone.c b/app/test/test_memzone.c
> index c9255e5..36b9790 100644
> --- a/app/test/test_memzone.c
> +++ b/app/test/test_memzone.c
> @@ -871,9 +871,18 @@ test_memzone_bounded(void)
> static int
> test_memzone_free(void)
> {
> - const struct rte_memzone *mz[RTE_MAX_MEMZONE + 1];
> + size_t mz_size;
> + const struct rte_memzone **mz;
> int i;
> char name[20];
> + int rc = -1;
> +
> + mz_size = (rte_memzone_max_get() + 1) * sizeof(struct rte_memzone *);
> + mz = rte_zmalloc("memzone_test", mz_size, 0);
> + if (!mz) {
> + printf("Fail allocating memzone test array\n");
> + return rc;
> + }
>
> mz[0] = rte_memzone_reserve(TEST_MEMZONE_NAME("tempzone0"), 2000,
> SOCKET_ID_ANY, 0);
> @@ -881,42 +890,42 @@ test_memzone_free(void)
> SOCKET_ID_ANY, 0);
>
> if (mz[0] > mz[1])
> - return -1;
> + goto exit_test;
> if (!rte_memzone_lookup(TEST_MEMZONE_NAME("tempzone0")))
> - return -1;
> + goto exit_test;
> if (!rte_memzone_lookup(TEST_MEMZONE_NAME("tempzone1")))
> - return -1;
> + goto exit_test;
>
> if (rte_memzone_free(mz[0])) {
> printf("Fail memzone free - tempzone0\n");
> - return -1;
> + goto exit_test;
> }
> if (rte_memzone_lookup(TEST_MEMZONE_NAME("tempzone0"))) {
> printf("Found previously free memzone - tempzone0\n");
> - return -1;
> + goto exit_test;
> }
> mz[2] = rte_memzone_reserve(TEST_MEMZONE_NAME("tempzone2"), 2000,
> SOCKET_ID_ANY, 0);
>
> if (mz[2] > mz[1]) {
> printf("tempzone2 should have gotten the free entry from tempzone0\n");
> - return -1;
> + goto exit_test;
> }
> if (rte_memzone_free(mz[2])) {
> printf("Fail memzone free - tempzone2\n");
> - return -1;
> + goto exit_test;
> }
> if (rte_memzone_lookup(TEST_MEMZONE_NAME("tempzone2"))) {
> printf("Found previously free memzone - tempzone2\n");
> - return -1;
> + goto exit_test;
> }
> if (rte_memzone_free(mz[1])) {
> printf("Fail memzone free - tempzone1\n");
> - return -1;
> + goto exit_test;
> }
> if (rte_memzone_lookup(TEST_MEMZONE_NAME("tempzone1"))) {
> printf("Found previously free memzone - tempzone1\n");
> - return -1;
> + goto exit_test;
> }
>
> i = 0;
> @@ -928,7 +937,7 @@ test_memzone_free(void)
>
> if (rte_memzone_free(mz[0])) {
> printf("Fail memzone free - tempzone0\n");
> - return -1;
> + goto exit_test;
> }
> mz[0] = rte_memzone_reserve(TEST_MEMZONE_NAME("tempzone0new"), 0,
> SOCKET_ID_ANY, 0);
> @@ -936,17 +945,21 @@ test_memzone_free(void)
> if (mz[0] == NULL) {
> printf("Fail to create memzone - tempzone0new - when MAX memzones were "
> "created and one was free\n");
> - return -1;
> + goto exit_test;
> }
>
> for (i = i - 2; i >= 0; i--) {
> if (rte_memzone_free(mz[i])) {
> printf("Fail memzone free - tempzone%d\n", i);
> - return -1;
> + goto exit_test;
> }
> }
>
> - return 0;
> + rc = 0;
> +
> +exit_test:
> + rte_free(mz);
> + return rc;
> }
>
> static int test_memzones_left;
> diff --git a/config/rte_config.h b/config/rte_config.h
> index 7b8c85e..400e44e 100644
> --- a/config/rte_config.h
> +++ b/config/rte_config.h
> @@ -34,7 +34,6 @@
> #define RTE_MAX_MEM_MB_PER_LIST 32768
> #define RTE_MAX_MEMSEG_PER_TYPE 32768
> #define RTE_MAX_MEM_MB_PER_TYPE 65536
> -#define RTE_MAX_MEMZONE 2560
> #define RTE_MAX_TAILQ 32
> #define RTE_LOG_DP_LEVEL RTE_LOG_INFO
> #define RTE_MAX_VFIO_CONTAINERS 64
> diff --git a/drivers/net/qede/base/bcm_osal.c b/drivers/net/qede/base/bcm_osal.c
> index 2c59397..0458768 100644
> --- a/drivers/net/qede/base/bcm_osal.c
> +++ b/drivers/net/qede/base/bcm_osal.c
> @@ -47,10 +47,26 @@ void osal_poll_mode_dpc(osal_int_ptr_t hwfn_cookie)
> }
>
> /* Array of memzone pointers */
> -static const struct rte_memzone *ecore_mz_mapping[RTE_MAX_MEMZONE];
> +static const struct rte_memzone **ecore_mz_mapping;
> /* Counter to track current memzone allocated */
> static uint16_t ecore_mz_count;
>
> +int ecore_mz_mapping_alloc(void)
> +{
> + ecore_mz_mapping = rte_zmalloc("ecore_mz_map",
> + rte_memzone_max_get() * sizeof(struct rte_memzone *), 0);
I think we should allocate only for the first call and we are missing
some refcount.
> +
> + if (!ecore_mz_mapping)
> + return -ENOMEM;
> +
> + return 0;
> +}
> +
> +void ecore_mz_mapping_free(void)
> +{
> + rte_free(ecore_mz_mapping);
Won't we release this array while another qed device is still in use?
> +}
> +
> unsigned long qede_log2_align(unsigned long n)
> {
> unsigned long ret = n ? 1 : 0;
> @@ -132,9 +148,9 @@ void *osal_dma_alloc_coherent(struct ecore_dev *p_dev,
> uint32_t core_id = rte_lcore_id();
> unsigned int socket_id;
>
> - if (ecore_mz_count >= RTE_MAX_MEMZONE) {
> - DP_ERR(p_dev, "Memzone allocation count exceeds %u\n",
> - RTE_MAX_MEMZONE);
> + if (ecore_mz_count >= rte_memzone_max_get()) {
> + DP_ERR(p_dev, "Memzone allocation count exceeds %zu\n",
> + rte_memzone_max_get());
> *phys = 0;
> return OSAL_NULL;
> }
> @@ -171,9 +187,9 @@ void *osal_dma_alloc_coherent_aligned(struct ecore_dev *p_dev,
> uint32_t core_id = rte_lcore_id();
> unsigned int socket_id;
>
> - if (ecore_mz_count >= RTE_MAX_MEMZONE) {
> - DP_ERR(p_dev, "Memzone allocation count exceeds %u\n",
> - RTE_MAX_MEMZONE);
> + if (ecore_mz_count >= rte_memzone_max_get()) {
> + DP_ERR(p_dev, "Memzone allocation count exceeds %zu\n",
> + rte_memzone_max_get());
> *phys = 0;
> return OSAL_NULL;
> }
> diff --git a/drivers/net/qede/base/bcm_osal.h b/drivers/net/qede/base/bcm_osal.h
> index 67e7f75..97e261d 100644
> --- a/drivers/net/qede/base/bcm_osal.h
> +++ b/drivers/net/qede/base/bcm_osal.h
> @@ -477,4 +477,7 @@ enum dbg_status qed_dbg_alloc_user_data(struct ecore_hwfn *p_hwfn,
> qed_dbg_alloc_user_data(p_hwfn, user_data_ptr)
> #define OSAL_DB_REC_OCCURRED(p_hwfn) nothing
>
> +int ecore_mz_mapping_alloc(void);
> +void ecore_mz_mapping_free(void);
> +
> #endif /* __BCM_OSAL_H */
> diff --git a/drivers/net/qede/qede_main.c b/drivers/net/qede/qede_main.c
> index 0303903..fd63262 100644
> --- a/drivers/net/qede/qede_main.c
> +++ b/drivers/net/qede/qede_main.c
> @@ -72,6 +72,12 @@ qed_probe(struct ecore_dev *edev, struct rte_pci_device *pci_dev,
> hw_prepare_params.allow_mdump = false;
> hw_prepare_params.b_en_pacing = false;
> hw_prepare_params.epoch = OSAL_GET_EPOCH(ECORE_LEADING_HWFN(edev));
> + rc = ecore_mz_mapping_alloc();
> + if (rc) {
> + DP_ERR(edev, "mem zones array allocation failed\n");
> + return rc;
> + }
> +
> rc = ecore_hw_prepare(edev, &hw_prepare_params);
> if (rc) {
> DP_ERR(edev, "hw prepare failed\n");
> @@ -722,6 +728,7 @@ static void qed_remove(struct ecore_dev *edev)
> return;
>
> ecore_hw_remove(edev);
> + ecore_mz_mapping_free();
> }
>
> static int qed_send_drv_state(struct ecore_dev *edev, bool active)
> diff --git a/lib/eal/common/eal_common_memzone.c b/lib/eal/common/eal_common_memzone.c
> index a9cd91f..f94330b 100644
> --- a/lib/eal/common/eal_common_memzone.c
> +++ b/lib/eal/common/eal_common_memzone.c
> @@ -22,6 +22,10 @@
> #include "eal_private.h"
> #include "eal_memcfg.h"
>
> +#define RTE_DEFAULT_MAX_MEMZONE 2560
No need for a RTE_ prefix for a local macro and ...
> +
> +static size_t memzone_max = RTE_DEFAULT_MAX_MEMZONE;
... in the end, I see no need for the RTE_DEFAULT_MAX_MEMZONE macro at
all, it is only used as the default init value, here.
> +
> static inline const struct rte_memzone *
> memzone_lookup_thread_unsafe(const char *name)
> {
> @@ -81,8 +85,9 @@ memzone_reserve_aligned_thread_unsafe(const char *name, size_t len,
> /* no more room in config */
> if (arr->count >= arr->len) {
> RTE_LOG(ERR, EAL,
> - "%s(): Number of requested memzone segments exceeds RTE_MAX_MEMZONE\n",
> - __func__);
> + "%s(): Number of requested memzone segments exceeds max "
> + "memzone segments (%d >= %d)\n",
Won't we always display this log for the case when arr->count == arr->len ?
It is then pointless to display both arr->count and arr->len (which
should be formatted as %u).
I would simply log:
RTE_LOG(ERR, EAL,
"%s(): Number of requested memzone segments exceeds maximum %u\n",
__func__, arr->len);
> + __func__, arr->count, arr->len);
> rte_errno = ENOSPC;
> return NULL;
> }
> @@ -396,7 +401,7 @@ rte_eal_memzone_init(void)
>
> if (rte_eal_process_type() == RTE_PROC_PRIMARY &&
> rte_fbarray_init(&mcfg->memzones, "memzone",
> - RTE_MAX_MEMZONE, sizeof(struct rte_memzone))) {
> + rte_memzone_max_get(), sizeof(struct rte_memzone))) {
> RTE_LOG(ERR, EAL, "Cannot allocate memzone list\n");
> ret = -1;
> } else if (rte_eal_process_type() == RTE_PROC_SECONDARY &&
> @@ -430,3 +435,20 @@ void rte_memzone_walk(void (*func)(const struct rte_memzone *, void *),
> }
> rte_rwlock_read_unlock(&mcfg->mlock);
> }
> +
> +int
> +rte_memzone_max_set(size_t max)
> +{
> + /* Setting max memzone must occur befaore calling rte_eal_init() */
before*
This comment would be better placed in the API description than in the
implementation.
> + if (eal_get_internal_configuration()->init_complete > 0)
> + return -1;
> +
> + memzone_max = max;
> + return 0;
> +}
> +
> +size_t
> +rte_memzone_max_get(void)
> +{
> + return memzone_max;
> +}
> diff --git a/lib/eal/include/rte_memzone.h b/lib/eal/include/rte_memzone.h
> index 5302caa..3ff7c73 100644
> --- a/lib/eal/include/rte_memzone.h
> +++ b/lib/eal/include/rte_memzone.h
> @@ -305,6 +305,26 @@ void rte_memzone_dump(FILE *f);
> void rte_memzone_walk(void (*func)(const struct rte_memzone *, void *arg),
> void *arg);
>
> +/**
* @warning
* @b EXPERIMENTAL: this API may change without prior notice
> + * Set max memzone value
> + *
> + * @param max
> + * Value of max memzone allocations
I'd rather describe as:
"Maximum number of memzones"
Please also mention that this function can only be called prior to
rte_eal_init().
> + * @return
> + * 0 on success, -1 otherwise
> + */
> +__rte_experimental
> +int rte_memzone_max_set(size_t max);
> +
> +/**
* @warning
* @b EXPERIMENTAL: this API may change without prior notice
> + * Get max memzone value
Get the maximum number of memzones.
And we can remind the developer, here, that this value won't change
after rte_eal_init.
> + *
> + * @return
> + * Value of max memzone allocations
> + */
> +__rte_experimental
> +size_t rte_memzone_max_get(void);
> +
> #ifdef __cplusplus
> }
> #endif
> diff --git a/lib/eal/version.map b/lib/eal/version.map
> index 51a820d..b52a83c 100644
> --- a/lib/eal/version.map
> +++ b/lib/eal/version.map
> @@ -430,6 +430,10 @@ EXPERIMENTAL {
> rte_thread_create_control;
> rte_thread_set_name;
> __rte_eal_trace_generic_blob;
> +
> + # added in 23.07
> + rte_memzone_max_set;
> + rte_memzone_max_get;
> };
>
> INTERNAL {
> --
> 2.8.4
>
--
David Marchand
More information about the dev
mailing list