[dpdk-dev] [PATCH] mem: fix undefined behavior in NUMA code

Ilya Maximets i.maximets at samsung.com
Wed Aug 29 15:02:07 CEST 2018


Hi.
Thanks for the fix.
Comments inline.

Best regards, Ilya Maximets.

On 29.08.2018 15:21, Anatoly Burakov wrote:
> When NUMA-aware hugepages config option is set, we rely on
> libnuma to tell the kernel to allocate hugepages on a specific
> NUMA node. However, we allocate node mask before we check if
> NUMA is available in the first place, which, according to
> the manpage [1], causes undefined behaviour.
> 
> Fix by only using nodemask when we have NUMA available.
> 
> [1] https://linux.die.net/man/3/numa_alloc_onnode
> 
> Bugzilla ID: 20
> 
> Fixes: 1b72605d2416 ("mem: balanced allocation of hugepages")
> Cc: i.maximets at samsung.com
> Cc: stable at dpdk.org
> 
> Signed-off-by: Anatoly Burakov <anatoly.burakov at intel.com>
> ---
>  lib/librte_eal/linuxapp/eal/eal_memory.c | 28 ++++++++++++++----------
>  1 file changed, 16 insertions(+), 12 deletions(-)
> 
> diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c b/lib/librte_eal/linuxapp/eal/eal_memory.c
> index dbf19499e..4976eeacd 100644
> --- a/lib/librte_eal/linuxapp/eal/eal_memory.c
> +++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
> @@ -263,7 +263,7 @@ map_all_hugepages(struct hugepage_file *hugepg_tbl, struct hugepage_info *hpi,
>  	int node_id = -1;
>  	int essential_prev = 0;
>  	int oldpolicy;
> -	struct bitmask *oldmask = numa_allocate_nodemask();
> +	struct bitmask *oldmask = NULL;
>  	bool have_numa = true;
>  	unsigned long maxnode = 0;
>  
> @@ -275,6 +275,7 @@ map_all_hugepages(struct hugepage_file *hugepg_tbl, struct hugepage_info *hpi,
>  
>  	if (have_numa) {
>  		RTE_LOG(DEBUG, EAL, "Trying to obtain current memory policy.\n");
> +		oldmask = numa_allocate_nodemask();
>  		if (get_mempolicy(&oldpolicy, oldmask->maskp,
>  				  oldmask->size + 1, 0, 0) < 0) {
>  			RTE_LOG(ERR, EAL,
> @@ -390,19 +391,22 @@ map_all_hugepages(struct hugepage_file *hugepg_tbl, struct hugepage_info *hpi,
>  
>  out:
>  #ifdef RTE_EAL_NUMA_AWARE_HUGEPAGES
> -	if (maxnode) {
> -		RTE_LOG(DEBUG, EAL,
> -			"Restoring previous memory policy: %d\n", oldpolicy);
> -		if (oldpolicy == MPOL_DEFAULT) {
> -			numa_set_localalloc();
> -		} else if (set_mempolicy(oldpolicy, oldmask->maskp,
> -					 oldmask->size + 1) < 0) {
> -			RTE_LOG(ERR, EAL, "Failed to restore mempolicy: %s\n",
> -				strerror(errno));
> -			numa_set_localalloc();
> +	if (have_numa) {
> +		if (maxnode) {
> +			RTE_LOG(DEBUG, EAL,
> +				"Restoring previous memory policy: %d\n",
> +					oldpolicy);
> +			if (oldpolicy == MPOL_DEFAULT) {
> +				numa_set_localalloc();
> +			} else if (set_mempolicy(oldpolicy, oldmask->maskp,
> +						 oldmask->size + 1) < 0) {
> +				RTE_LOG(ERR, EAL, "Failed to restore mempolicy: %s\n",
> +					strerror(errno));
> +				numa_set_localalloc();
> +			}
>  		}
> +		numa_free_cpumask(oldmask);
>  	}
> -	numa_free_cpumask(oldmask);

The original intent was to avoid ugly nested 'if's as much as possible.
'maxnode' is only set to a non-zero value in the NUMA case, so there is
no need to check 'have_numa'. 'numa_free_cpumask' has 'free' semantics:
it checks its argument, so it is safe to call it with NULL.
If you want to be fully compliant with the man page, you may use a less
invasive change like this:

---
diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c b/lib/librte_eal/linuxapp/eal/eal_memory.c
index dbf19499e..d0b9f3a2f 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memory.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
@@ -390,7 +390,7 @@ map_all_hugepages(struct hugepage_file *hugepg_tbl, struct hugepage_info *hpi,
 
 out:
 #ifdef RTE_EAL_NUMA_AWARE_HUGEPAGES
-       if (maxnode) {
+       if (have_numa && maxnode) {
                RTE_LOG(DEBUG, EAL,
                        "Restoring previous memory policy: %d\n", oldpolicy);
                if (oldpolicy == MPOL_DEFAULT) {
@@ -402,7 +402,8 @@ map_all_hugepages(struct hugepage_file *hugepg_tbl, struct hugepage_info *hpi,
                        numa_set_localalloc();
                }
        }
-       numa_free_cpumask(oldmask);
+       if (oldmask)
+               numa_free_cpumask(oldmask);
 #endif
        return i;
 }
---

But still, checking both 'have_numa && maxnode', IMHO, is unnecessary.
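
To illustrate the reasoning, here is a minimal stand-alone sketch of the
cleanup path. All names in it ('struct mask', 'mask_alloc', 'mask_free',
'cleanup_restores_policy') are hypothetical stand-ins, not libnuma or EAL
code; it only models the control flow under the assumption that 'maxnode'
can become non-zero only when NUMA is available.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for libnuma's struct bitmask. */
struct mask {
	unsigned long size;
	unsigned long *maskp;
};

/* Stand-in allocator; the real code calls numa_allocate_nodemask(). */
static struct mask *mask_alloc(unsigned long nbits)
{
	const unsigned long bits_per_word = 8 * sizeof(unsigned long);
	struct mask *m = malloc(sizeof(*m));

	if (m == NULL)
		return NULL;
	m->size = nbits;
	m->maskp = calloc((nbits + bits_per_word - 1) / bits_per_word,
			  sizeof(unsigned long));
	if (m->maskp == NULL) {
		free(m);
		return NULL;
	}
	return m;
}

/* free()-like semantics: a NULL argument is a no-op. */
static void mask_free(struct mask *m)
{
	if (m == NULL)
		return;
	free(m->maskp);
	free(m);
}

/*
 * Models the function's tail. 'highest_node' stands in for whatever makes
 * 'maxnode' non-zero in the real code. Returns true if the
 * "restore previous policy" branch would run.
 */
static bool cleanup_restores_policy(bool have_numa, unsigned long highest_node)
{
	struct mask *oldmask = NULL;
	unsigned long maxnode = 0;
	bool restored = false;

	if (have_numa) {
		oldmask = mask_alloc(64);
		assert(oldmask != NULL);
		maxnode = highest_node; /* only ever set on the NUMA path */
	}

	/* out: */
	if (maxnode)            /* non-zero implies have_numa and oldmask != NULL */
		restored = true;
	mask_free(oldmask);     /* safe even when oldmask is still NULL */
	return restored;
}
```

In this model, 'cleanup_restores_policy(false, 2)' never enters the restore
branch and frees nothing, which is the case the extra 'have_numa' check
would guard against.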

As this change is cosmetic (the issue doesn't produce any real bug),
I'd like to avoid changing the functional code into something less readable.
It would also complicate the 'git blame' process.

What do you think?

>  #endif
>  	return i;
>  }
> 

