[dpdk-dev] upper limit on the size of allocation through rte_malloc in dpdk-1.8.0?
Stefan Puiu
stefan.puiu at gmail.com
Tue Feb 10 17:53:39 CET 2015
Hi and thanks for replying,
On Fri, Feb 6, 2015 at 1:25 PM, Olivier MATZ <olivier.matz at 6wind.com> wrote:
> Hi,
>
> On 02/06/2015 12:00 PM, Bruce Richardson wrote:
>> On Wed, Feb 04, 2015 at 05:24:58PM +0200, Stefan Puiu wrote:
>>> Hi,
>>>
>>> I'm trying to alter an existing program to use the Intel DPDK. I'm
>>> using 1.8.0, compiled by me as a shared library
>>> (CONFIG_RTE_BUILD_COMBINE_LIBS=y and CONFIG_RTE_BUILD_SHARED_LIB=y in
>>> .config) on Ubuntu 12.04. The program needs to allocate large blocks
>>> of memory (between 1 and 4 chunks of 4.5GB, also 1-4 chunks of 2.5
>>> GB). I tried changing my C++ code to use an array allocated using
>>> rte_malloc() instead of the std::vector I was using beforehand, but it
>>> seems the call to rte_malloc() fails. I then made a simple test
>>> program using the DPDK that takes a size to allocate and if that
>>> fails, tries again with sizes of 100MB less, basically the code below.
>>> This is C++ code (well, now that I look it could've been plain C, but
>>> I need C++) compiled with g++-4.6 with '-std=gnu++0x':
>>>
>>> int main(int argc, char **argv)
>>> {
>>>     int ret = rte_eal_init(argc, argv);
>>>     if (ret < 0)
>>>         rte_exit(EXIT_FAILURE, "Invalid EAL arguments\n");
>>>     argc -= ret;
>>>     argv += ret;
>>>
>>>     [... check argc >= 2]
>>>     size_t size = strtoul(argv[1], NULL, 10);
>>>     size_t s = size;
>>>     void *buf = NULL;
>>>
>>>     for (size_t i = 0; i < 30; ++i) {
>>>         printf("Trying to allocate %'zu bytes\n", s);
>>>         buf = rte_malloc("test", s, 0);
>>>         if (!buf)
>>>             printf("Failed!\n");
>>>         else {
>>>             printf("Success!\n");
>>>             rte_free(buf);
>>>             break;
>>>         }
>>>
>>>         s -= 100 * 1024ULL * 1024ULL;
>>>     }
>>>
>>>     return 0;
>>> }
>>>
>>> I'm getting:
>>> Trying to allocate 4,832,038,656 bytes
>>> Failed!
>>> Trying to allocate 4,727,181,056 bytes
>>> Failed!
>>> [...]
>>> Trying to allocate 2,944,601,856 bytes
>>> Success!
>>>
>>> It's not always the same value, but usually somewhere around 3GB
>>> rte_malloc() succeeds. I'm running on a physical (non-VM) NUMA machine
>>> with 2 physical CPUs, each having 64GBs of local memory. The machine
>>> also runs Ubuntu 12.04 server. I've created 16384 hugepages of 2MB:
>>>
>>> echo 16384 > /sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages
>>>
>>> I'm running the basic app like this:
>>>
>>> sudo numactl --membind=0 ~/src/test/dpdk_test/alloc -c 1f -n 4 -w 04:00.0 --socket-mem=16384,0 -- 4832038656
>>>
>>> I'm trying to run only on NUMA node 0 and only allocate memory from
>>> there - that's what the app I'm moving to the DPDK works like (using
>>> numactl --membind=x and --cpunodebind=x).
>>>
>>> Is there an upper limit on the amount of memory rte_malloc() will try
>>> to allocate? I tried both after a reboot and when the machine had been
>>> running for a while with not much success. Am I missing something?
>>> It's a bit weird to be only able to allocate 3GB out of the 32GB
>>> assigned to the app...
>>>
>>> On a related note, what would be a good way to compile the DPDK with
>>> debug info (and preferably -O0)? There's quite a web of .mk files used
>>> and I haven't figured out where the optimization level / debug options
>>> are set.
>>>
>>> Thanks in advance,
>>> Stefan.
>>
>> Does your system support 1G pages? I would recommend using a smaller number of
>> 1G pages vs the huge number of 2MB pages that you are currently using. There
>> may be issues with the allocations failing due to a lack of contiguous blocks
>> of memory due to the 2MB pages being spread across memory.
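[Editor's note: Bruce's suggestion of fewer, larger pages would be set up roughly as follows. This is a hedged sketch, not from the thread; the sysfs paths are the standard Linux hugepage interface, and the page counts and mount point are example values.]

```shell
# 1 GB pages usually must be reserved at boot; add to the kernel
# command line (e.g. via GRUB):
#   default_hugepagesz=1G hugepagesz=1G hugepages=4
#
# On CPUs with the pdpe1gb flag, some kernels also allow runtime
# reservation per NUMA node:
echo 4 > /sys/devices/system/node/node0/hugepages/hugepages-1048576kB/nr_hugepages

# Mount a hugetlbfs instance for the 1 GB page size:
mkdir -p /mnt/huge-1G
mount -t hugetlbfs -o pagesize=1G nodev /mnt/huge-1G
```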
>
> Indeed, rte_malloc() tries to allocate memory which is physically
> contiguous. Using 1G pages instead of 2MB pages will probably help
> as Bruce suggests. Another idea is to use another allocation method.
> It depends on what you want to do with the allocated data (accessed in
> dataplane or not), and when you allocate it (in dataplane or not).
I wanted to use the memory on the dataplane, yes. I'm not allocating
it on startup, but when I receive a certain external command. The
workers won't be processing packets at that point, though - there's
another command for starting Rx.
>
> For instance, if you want to allocate a large zone at init, you can just
> mmap() an anonymous zone in hugetlbfs (your dpdk config needs to keep
> unused huge pages for this usage).
Yep, I think I'm going to use hugetlbfs; I've tried a simple test
program and it successfully allocated the amount I wanted.
Hopefully get_hugepage_region() honors the mempolicy.
Thanks,
Stefan.