[dpdk-dev] A (possible) problem with `--no-huge` option

Ilya Matveychikov matvejchikov at gmail.com
Fri Jun 9 14:08:24 CEST 2017


Hi Olivier,

Your patch solves the problem for me.

Thank you.

> On Jun 9, 2017, at 12:27 PM, Olivier Matz <olivier.matz at 6wind.com> wrote:
> 
> Hi Ilya,
> 
> On Sun, 14 May 2017 14:34:14 +0400, Ilya Matveychikov <matvejchikov at gmail.com> wrote:
>> Hi guys,
>> 
>> I have a problem running DPDK with the `--no-huge` option. The problem appears to have been introduced by commit cdc242f260e766bd95a658b5e0686a62ec04f5b0, and this is the change that affects me:
>> 
>> +	if ((page & 0x7fffffffffffffULL) == 0)
>> +		return RTE_BAD_PHYS_ADDR;
>> +
>> 
>> I was trying to create a memory pool using rte_pktmbuf_pool_create(). I dug into the issue and found that in my case the "page" value is 0x0080000000000000, which means that the page is not present and is "soft-dirty" (according to the kernel's documentation):
>> 
>>   * Bits 0-54  page frame number (PFN) if present
>>   * Bits 0-4   swap type if swapped
>>   * Bits 5-54  swap offset if swapped
>>   * Bit  55    pte is soft-dirty (see Documentation/vm/soft-dirty.txt)
>>   * Bit  56    page exclusively mapped (since 4.2)
>>   * Bits 57-60 zero
>>   * Bit  61    page is file-page or shared-anon (since 3.5)
>>   * Bit  62    page swapped
>>   * Bit  63    page present
>> 
>> So, before the change mentioned above, everything "worked" fine and such pages were not rejected. But now the check causes rte_mempool_populate_default() to fail with -EINVAL...
>> Can anyone familiar with memory pool allocation help with the issue?
>> 
>> Thanks in advance,
>> Ilya Matveychikov.
>> 
> 
> I can reproduce the issue:
> 
>  make config T=x86_64-native-linuxapp-gcc
>  make -j32 EXTRA_CFLAGS="-O0 -g"
>  mkdir -p /mnt/huge
>  mount -t hugetlbfs nodev /mnt/huge
>  echo 256 > /sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages
> 
>  # ok
>  ./build/app/testpmd -l 2,4 --log-level 8 --vdev=eth_null0 -- --no-numa --total-num-mbufs=4096 -i --port-topology=chained
> 
>  # fail
>  ./build/app/testpmd --no-huge -l 2,4 --log-level 8 --vdev=eth_null0 -- --no-numa --total-num-mbufs=4096 -i --port-topology=chained
> 
> 
> I confirm that rte_mem_virt2phy() returns RTE_BAD_PHYS_ADDR,
> which makes rte_mempool_populate_virt() fail.
> 
> Reverting cdc242f260e7 ("eal/linux: support running as unprivileged user")
> fixes the problem. Actually, it makes rte_mem_virt2phy() return 0 instead
> of RTE_BAD_PHYS_ADDR, which is seen as a valid address.
> 
> I think querying the physical address when using --no-huge does not make
> sense because the memory is not locked, and could be swapped.
> 
> Another strange thing: when using --no-huge, the physical address returned
> when allocating a memzone is the virtual address.
> 
> I see several solutions to fix the issue:
> 
> 1/ Always set physical addresses to RTE_BAD_PHYS_ADDR when started
>   with --no-huge. We consider that the physical address is invalid
>   in that case and must not be used.
> 
>   This impacts rte_mem_virt2phy() and memzone_reserve*() functions.
> 
>   In rte_mempool_populate_virt(), don't expect a physical address
>   if the application is started with --no-huge.
> 
> 2/ Change rte_mem_virt2phy() to return the virtual address when the
>   physical address is requested and DPDK is started with --no-huge. This is
>   wrong, but consistent with what is done in memzones today.
> 
>   In rte_mem_virt2phy(), add at the beginning:
> 
>     if (!rte_eal_has_hugepages())
>         return (intptr_t)virtaddr;
> 
> 3/ Lock pages in memory by reverting
>   729f17a932dd ("mem: revert page locking when not using hugepages")
> 
>   This would make the physical address available.
>   As explained in the commit log, this would also break the ability to
>   start DPDK with --no-huge for non-root users.
> 
> 
> I think 1/ is better. I'm sending a patch in reply to this mail.
> Ilya, please let me know if it fixes your issue.
> 
> Regards,
> Olivier

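The difference between options 1/ and 2/ above can be sketched as a small stand-alone model. This is not the actual patch and does not use DPDK headers: the type, the constant and every function name below are hypothetical stand-ins (the constant mimics RTE_BAD_PHYS_ADDR as an all-ones value, which is an assumption of this sketch), and the "looked_up" parameter stands in for whatever the pagemap lookup would return when hugepages are in use.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-ins for DPDK's phys_addr_t and RTE_BAD_PHYS_ADDR (assumed values). */
typedef uint64_t model_phys_addr_t;
#define MODEL_BAD_PHYS_ADDR ((model_phys_addr_t)-1)

/* Option 1/: with --no-huge, never report a physical address. */
static model_phys_addr_t
model_virt2phy_opt1(const void *virtaddr, bool has_hugepages,
		    model_phys_addr_t looked_up)
{
	(void)virtaddr;
	if (!has_hugepages)
		return MODEL_BAD_PHYS_ADDR;
	return looked_up;
}

/* Option 2/: with --no-huge, report the virtual address instead. */
static model_phys_addr_t
model_virt2phy_opt2(const void *virtaddr, bool has_hugepages,
		    model_phys_addr_t looked_up)
{
	if (!has_hugepages)
		return (model_phys_addr_t)(uintptr_t)virtaddr;
	return looked_up;
}

/*
 * Caller side of option 1/: a missing physical address is an error only
 * when hugepages are in use; with --no-huge it is expected and accepted.
 */
static int
model_populate_check(model_phys_addr_t pa, bool has_hugepages)
{
	if (pa == MODEL_BAD_PHYS_ADDR && has_hugepages)
		return -EINVAL;
	return 0;
}

/* Small usage example of the model with --no-huge (has_hugepages = false). */
static void model_self_check(void)
{
	int obj = 0;
	model_phys_addr_t pa = model_virt2phy_opt1(&obj, false, 0x1000);

	assert(pa == MODEL_BAD_PHYS_ADDR);
	assert(model_populate_check(pa, false) == 0);
}
```

In this model, option 1/ needs a matching change on the caller side (the populate check), whereas option 2/ needs none because the returned value never looks invalid, which is also why it hides the fact that the address is not a real physical address.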