[PATCH v2] common/mlx5: Optimize mlx5 mempool get extmem

John Romein romein at astron.nl
Mon Oct 7 14:47:27 CEST 2024


Dear Stephen,

The problem has not been solved, but I found a workaround. According to 
the documentation (https://doc.dpdk.org/guides/prog_guide/gpudev.html, 
sec 11.3), rte_extmem_register should be invoked with GPU_PAGE_SIZE as 
its page-size argument. If GPU_PAGE_SIZE is set to 2 MB instead of 64 kB, 
registering 72 GB of GPU memory (on a Grace Hopper) takes about 
ten seconds instead of hours.

   rte_extmem_register(ext_mem.buf_ptr, ext_mem.buf_len, NULL, ext_mem.buf_iova, GPU_PAGE_SIZE);
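
As a rough illustration of why the page size matters so much: the number
of pages covering a buffer scales inversely with the page size, which
presumably shrinks the list that the registration code has to build and
sort. The sketch below only does that arithmetic. pages_needed is a
hypothetical helper, not a DPDK function, and it assumes binary units
(72 GiB, 64 KiB, 2 MiB).

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of DPDK): number of pages of size
 * page_size needed to cover len bytes, rounding up.
 */
static uint64_t pages_needed(uint64_t len, uint64_t page_size)
{
    assert(page_size > 0);
    return (len + page_size - 1) / page_size;
}
```

Under those assumptions, pages_needed(72ULL << 30, 64ULL << 10) gives
1179648 pages and pages_needed(72ULL << 30, 2ULL << 20) gives 36864
pages, i.e. 32 times fewer entries.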


Thanks,  John Romein

On 05-10-2024 00:16, Stephen Hemminger wrote:
> On Wed, 1 Nov 2023 22:21:16 +0100
> John Romein<romein at astron.nl> wrote:
>
>> Dear Slava,
>>
>> Thank you for looking at the patch.  With the original code, I saw that
>> the application spent literally hours in this function during program
>> start up, if tens of gigabytes of GPU memory are registered.  This was
>> due to qsort being invoked for every new added item (to keep the list
>> sorted).  So I tried to write equivalent code that sorts the list only
>> once, after all items were added.  At least for our application, this
>> works well and is /much/ faster, as the complexity decreased from n^2
>> log(n) to n log(n).  But I must admit that I have no idea /what/ is
>> being sorted, or why; I only understand this isolated piece of code (or
>> at least I think so).  So if you think there are better ways to
>> initialize the list, then I am sure you will be absolutely right.  But I
>> will not be able to implement this, as I do not understand the full
>> context of the code.
>>
>> Kind Regards,  John
> It looks like the problem remains, but the patch has been sitting around for
> 11 months. Was this resolved?
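
For reference, the difference between the two sorting patterns discussed
in the quoted message can be sketched as follows. This is not the mlx5
code: struct range, range_cmp, build_sort_each and build_sort_once are
hypothetical names chosen for illustration, and the complexity figures
assume that qsort runs in O(n log n), which the C standard does not
guarantee.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical element type; the real driver sorts its own structures. */
struct range {
    uintptr_t start;
    size_t len;
};

/* Order ranges by ascending start address. */
static int range_cmp(const void *a, const void *b)
{
    const struct range *ra = a, *rb = b;
    return (ra->start > rb->start) - (ra->start < rb->start);
}

/* Pattern described as slow: re-sort after every append (n qsort calls). */
static void build_sort_each(struct range *out, const struct range *in, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        out[i] = in[i];
        qsort(out, i + 1, sizeof(*out), range_cmp);
    }
}

/* Pattern used by the patch: append everything, then sort once. */
static void build_sort_once(struct range *out, const struct range *in, size_t n)
{
    assert(n == 0 || (out != NULL && in != NULL));
    if (n == 0)
        return;
    memcpy(out, in, n * sizeof(*out));
    qsort(out, n, sizeof(*out), range_cmp);
}
```

Both functions leave out sorted by start address. With distinct start
addresses the results are identical; qsort is not stable, so entries with
equal keys could end up in a different relative order.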

