[PATCH v2] common/mlx5: Optimize mlx5 mempool get extmem
John Romein
romein at astron.nl
Mon Oct 7 14:47:27 CEST 2024
Dear Stephen,
The problem has not been solved, but I found a workaround. According to
the documentation (https://doc.dpdk.org/guides/prog_guide/gpudev.html,
section 11.3), rte_extmem_register should be invoked with GPU_PAGE_SIZE
as its page-size argument. If GPU_PAGE_SIZE is set to 2 MB instead of
64 kB, registering 72 GB of GPU memory (on a Grace Hopper) takes about
ten seconds instead of hours.
rte_extmem_register(ext_mem.buf_ptr, ext_mem.buf_len, NULL, ext_mem.buf_iova, GPU_PAGE_SIZE);
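To give a feeling for the numbers, here is a minimal sketch of the page-count arithmetic. The helper page_count() is hypothetical (it is not part of DPDK), and it assumes "72 GB" means 72 GiB and that registration cost grows with the number of pages; I have not verified the latter in the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not a DPDK function: number of pages of
 * page_size bytes needed to cover buf_len bytes, rounded up. */
static uint64_t page_count(uint64_t buf_len, uint64_t page_size)
{
    assert(page_size != 0);  /* a zero page size makes no sense */
    return (buf_len + page_size - 1) / page_size;
}
```

Under these assumptions, page_count(72ULL << 30, 64ULL << 10) yields 1179648 pages, while page_count(72ULL << 30, 2ULL << 20) yields 36864 pages, i.e. 32 times fewer items to process.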
Thanks, John Romein
On 05-10-2024 00:16, Stephen Hemminger wrote:
> On Wed, 1 Nov 2023 22:21:16 +0100
> John Romein <romein at astron.nl> wrote:
>
>> Dear Slava,
>>
>> Thank you for looking at the patch. With the original code, I saw that
>> the application spent literally hours in this function during program
>> start up, if tens of gigabytes of GPU memory are registered. This was
>> due to qsort being invoked for every new added item (to keep the list
>> sorted). So I tried to write equivalent code that sorts the list only
>> once, after all items were added. At least for our application, this
>> works well and is /much/ faster, as the complexity decreased from n^2
>> log(n) to n log(n). But I must admit that I have no idea /what/ is
>> being sorted, or why; I only understand this isolated piece of code (or
>> at least I think so). So if you think there are better ways to
>> initialize the list, then I am sure you will be absolutely right. But I
>> will not be able to implement this, as I do not understand the full
>> context of the code.
>>
>> Kind Regards, John
> Looks like the problem remains, but the patch has been sitting around for 11 months.
> Was this resolved?
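
For readers without the patch at hand, the difference described in the quoted message (sorting after every added item versus sorting once at the end) can be sketched as follows. This is an illustration only, not the mlx5 code: the function names and the use of plain uint64_t keys are assumptions made for the example.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Three-way comparison of two uint64_t keys for qsort(). */
static int cmp_u64(const void *a, const void *b)
{
    uint64_t x = *(const uint64_t *)a;
    uint64_t y = *(const uint64_t *)b;
    return (x > y) - (x < y);
}

/* Slow variant: re-sort the list after every appended item,
 * so qsort() is called n times on a growing array. */
static void fill_sort_each(uint64_t *dst, const uint64_t *src, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        dst[i] = src[i];
        qsort(dst, i + 1, sizeof(dst[0]), cmp_u64);
    }
}

/* Fast variant: append all items first, then sort exactly once. */
static void fill_sort_once(uint64_t *dst, const uint64_t *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i];
    qsort(dst, n, sizeof(dst[0]), cmp_u64);
}
```

Both variants leave dst in the same sorted order; only the number of qsort() calls differs (n versus one), which matches the n^2 log(n) versus n log(n) estimate in the quoted message.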