[dpdk-users] Slow DPDK startup with many 1G hugepages
Imre Pinter
imre.pinter at ericsson.com
Thu Jun 1 09:55:20 CEST 2017
Hi,
We are experiencing slow startup times in DPDK-OVS when backing memory with 1G hugepages instead of 2M hugepages.
Currently we map 2M hugepages as the memory backend for DPDK OVS. In the future we would like to allocate this memory from the 1G hugepage pool. In our current deployments we have a significant amount of 1G hugepages allocated for VMs (min. 54G) and only 2G of memory on 2M hugepages.
Typical setup for 2M hugepages:
GRUB:
hugepagesz=2M hugepages=1024 hugepagesz=1G hugepages=54 default_hugepagesz=1G
$ grep hugetlbfs /proc/mounts
nodev /mnt/huge_ovs_2M hugetlbfs rw,relatime,pagesize=2M 0 0
nodev /mnt/huge_qemu_1G hugetlbfs rw,relatime,pagesize=1G 0 0
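For reference, the two mounts above can be created as follows (a sketch; the mount-point names are from our setup, adjust as needed):

```shell
# Create the mount points and mount hugetlbfs with an explicit page size.
# Without the pagesize option, hugetlbfs uses the default hugepage size.
mkdir -p /mnt/huge_ovs_2M /mnt/huge_qemu_1G
mount -t hugetlbfs -o pagesize=2M nodev /mnt/huge_ovs_2M
mount -t hugetlbfs -o pagesize=1G nodev /mnt/huge_qemu_1G
```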
Typical setup for 1G hugepages:
GRUB:
hugepagesz=1G hugepages=56 default_hugepagesz=1G
$ grep hugetlbfs /proc/mounts
nodev /mnt/huge_qemu_1G hugetlbfs rw,relatime,pagesize=1G 0 0
DPDK OVS startup times based on the ovs-vswitchd.log logs:
* 2M (2G memory allocated) - startup time ~3 sec:
2017-05-03T08:13:50.177Z|00009|dpdk|INFO|EAL ARGS: ovs-vswitchd -c 0x1 --huge-dir /mnt/huge_ovs_2M --socket-mem 1024,1024
2017-05-03T08:13:50.708Z|00010|ofproto_dpif|INFO|netdev at ovs-netdev: Datapath supports recirculation
* 1G (56G memory allocated) - startup time ~13 sec:
2017-05-03T08:09:22.114Z|00009|dpdk|INFO|EAL ARGS: ovs-vswitchd -c 0x1 --huge-dir /mnt/huge_qemu_1G --socket-mem 1024,1024
2017-05-03T08:09:32.706Z|00010|ofproto_dpif|INFO|netdev at ovs-netdev: Datapath supports recirculation
I used DPDK 16.11 for OVS and testpmd and tested on Ubuntu 14.04 with kernel 3.13.0-117-generic and 4.4.0-78-generic.
We had a discussion with Mark Gray (from Intel), and he came up with the following items:
· The ~10 sec time difference is present with testpmd as well.
· They believe it is kernel overhead (mmap is slow, perhaps because it is zeroing the pages). The following instrumented snippet from eal_memory.c produces the timing printout during EAL startup:
uint64_t start = rte_rdtsc();

/* map the segment, and populate page tables,
 * the kernel fills this segment with zeros */
virtaddr = mmap(vma_addr, hugepage_sz, PROT_READ | PROT_WRITE,
		MAP_SHARED | MAP_POPULATE, fd, 0);
if (virtaddr == MAP_FAILED) {
	RTE_LOG(DEBUG, EAL, "%s(): mmap failed: %s\n", __func__,
			strerror(errno));
	close(fd);
	return i;
}

if (orig) {
	hugepg_tbl[i].orig_va = virtaddr;
	printf("Original mapping of page %u took: %"PRIu64" ticks, %"PRIu64" ms\n",
		i, rte_rdtsc() - start,
		(rte_rdtsc() - start) * 1000 /
		rte_get_timer_hz());
}
A solution could be to mount 1G hugepages on two separate directories: 2G for OVS and the remainder for the VMs. However, the NUMA location of these hugepages would be non-deterministic, since mount cannot handle NUMA-related parameters when mounting hugetlbfs, and fstab performs the mounts during boot.
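One possible workaround for the NUMA side of the problem: instead of relying only on the GRUB command line, reserve 1G pages per NUMA node at runtime via sysfs (a sketch; node numbers and page counts below are examples, run as root early in boot before memory fragments):

```shell
# Reserve 1G hugepages explicitly per NUMA node.
# hugepages-1048576kB is the 1G pool; node0/node1 are example nodes.
echo 2  > /sys/devices/system/node/node0/hugepages/hugepages-1048576kB/nr_hugepages
echo 54 > /sys/devices/system/node/node1/hugepages/hugepages-1048576kB/nr_hugepages
```

This makes the per-node page counts deterministic, though it does not by itself shorten the mmap time; the EAL would still populate whatever it maps.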
Do you have a solution on how to use 1G hugepages for VMs and have reasonable DPDK EAL startup time?
Thanks,
Imre