Hugepage migration

Stephen Hemminger stephen at networkplumber.org
Tue May 30 03:35:14 CEST 2023


On Sun, 28 May 2023 23:07:40 +0300
Baruch Even <baruch at weka.io> wrote:

> Hi,
> 
> We found an issue with newer kernels (5.13+) that are found on newer OSes
> (Ubuntu22, Rocky9, Ubuntu20 with kernel 5.15) where a 2M page that was
> allocated for DPDK was migrated (moved into another physical page) when a
> 1G page was allocated.
> 
> From our reading of the kernel commits this started with commit
> ae37c7ff79f1f030e28ec76c46ee032f8fd07607
>     mm: make alloc_contig_range handle in-use hugetlb pages
> 
> This caused what looked like memory corruptions to us and cases where the
> rings were moved from their physical location and communication was no
> longer possible.
> 
> I wanted to ask if anyone else hit this issue and what mitigations are
> available?
> 
> We are currently looking at using a kernel driver to pin the pages but I
> expect that this issue will affect others and that a more general approach
> is needed.
> 
> Thanks,
> Baruch
> 

The fix might be as simple as asking the kernel to lock the mapping by
passing MAP_LOCKED to mmap().

diff --git a/lib/eal/linux/eal_hugepage_info.c b/lib/eal/linux/eal_hugepage_info.c
index 581d9dfc91eb..989c69387233 100644
--- a/lib/eal/linux/eal_hugepage_info.c
+++ b/lib/eal/linux/eal_hugepage_info.c
@@ -48,7 +48,8 @@ map_shared_memory(const char *filename, const size_t mem_size, int flags)
 		return NULL;
 	}
 	retval = mmap(NULL, mem_size, PROT_READ | PROT_WRITE,
-			MAP_SHARED, fd, 0);
+			MAP_SHARED_VALIDATE | MAP_LOCKED, fd, 0);
+
 	close(fd);
 	return retval == MAP_FAILED ? NULL : retval;
 }
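
For comparison, here is a minimal sketch of the other way to request
locking: call mlock() after mmap(). Per mmap(2), MAP_LOCKED does not make
mmap() fail if the range cannot be populated, while mlock() returns an
error. The helper name map_and_lock() is hypothetical and is not part of
DPDK. It uses an anonymous private mapping instead of the hugetlbfs file
from the patch, so that it can be built and run on its own.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Hypothetical helper (not DPDK code): map len bytes of anonymous memory
 * and lock the range with mlock(). Returns NULL with errno set if either
 * the mapping or the lock fails.
 */
static void *
map_and_lock(size_t len)
{
	void *addr;

	assert(len > 0);

	addr = mmap(NULL, len, PROT_READ | PROT_WRITE,
			MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (addr == MAP_FAILED)
		return NULL;

	/* Unlike MAP_LOCKED, a failure to lock is reported to the caller. */
	if (mlock(addr, len) != 0) {
		int saved = errno;

		munmap(addr, len);
		errno = saved;
		return NULL;
	}
	return addr;
}
```

A caller would check for NULL and inspect errno. ENOMEM, EPERM, or EAGAIN
usually means RLIMIT_MEMLOCK is too low for the requested size. Whether
locking by itself stops the hugetlb migration reported above is not
established in this thread and would need testing on an affected kernel.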

