[PATCH] net: increase the maximum of RX/TX descriptors

Lukáš Šišmiš sismis at cesnet.cz
Tue Nov 5 22:20:38 CET 2024


On 05. 11. 24 17:50, Morten Brørup wrote:
>> From: Stephen Hemminger [mailto:stephen at networkplumber.org]
>> Sent: Tuesday, 5 November 2024 16.55
>>
>> On Tue, 5 Nov 2024 09:49:39 +0100
>> Morten Brørup <mb at smartsharesystems.com> wrote:
>>
>>>> I suspect AF_PACKET provides an intermediate step which can buffer more
>>>> or spread out the work.
>>> Agree. It's a Linux scheduling issue.
>>>
>>> With DPDK polling, there is no interrupt in the kernel scheduler.
>>> If the CPU core running the DPDK polling thread is running some other
>>> thread when the packets arrive on the hardware, the DPDK polling thread
>>> is NOT scheduled immediately, but has to wait for the kernel scheduler
>>> to switch to this thread instead of the other thread.
>>> Quite a lot of time can pass before this happens - the kernel
>>> scheduler does not know that the DPDK polling thread has urgent work
>>> pending.
>>> And the number of RX descriptors needs to be big enough to absorb all
>>> packets arriving during the scheduling delay.
>>> It is not well described how to *guarantee* that nothing but the DPDK
>>> polling thread runs on a dedicated CPU core.
>>
>> That's why any non-trivial DPDK application needs to run on isolated
>> CPUs.
> Exactly.
> And it is non-trivial and not well described how to do this.
>
> Especially in virtual environments.
> E.g. I ran some scheduling latency tests earlier today, and frequently observed 500-1000 us scheduling latency under vmware vSphere ESXi. This requires a large number of RX descriptors to absorb without packet loss. (Disclaimer: The virtual machine configuration had not been optimized. Tweaking the knobs offered by the hypervisor might improve this.)
>
> The exact same firmware (same kernel, rootfs, libraries, applications etc.) running directly on our purpose-built hardware has scheduling latency very close to the kernel's default "timerslack" (50 us).
>
Thanks for the feedback. I am currently not 100% sure whether I ran my
earlier experiments on isolcpus, or whether it had a massive impact.
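
One way to check this after the fact is to read the kernel's list of isolated CPUs. The snippet below is only a minimal sketch: it assumes a Linux system that exposes /sys/devices/system/cpu/isolated, and the helper names (parse_cpu_list, isolated_cpus) are made up for illustration. An empty result means no CPUs were isolated with isolcpus (or the file is not available).

```python
from pathlib import Path


def parse_cpu_list(text):
    """Parse a kernel CPU list such as '2-5,8' into a set of CPU numbers."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue  # empty file or stray comma
        if "-" in part:
            first, last = part.split("-")
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return cpus


def isolated_cpus(path="/sys/devices/system/cpu/isolated"):
    """Return the set of isolated CPUs, or an empty set if the file is missing."""
    try:
        return parse_cpu_list(Path(path).read_text())
    except OSError:
        return set()


if __name__ == "__main__":
    print(sorted(isolated_cpus()))
```

Comparing this set with the cores the polling threads are pinned to shows whether an experiment really ran on isolated cores.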

Here is a decent guide on latency tuning I found the other day, though
it does not really cover virtual environments.

https://rigtorp.se/low-latency-guide/
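
To put rough numbers on the descriptor sizing point quoted above, here is a back-of-the-envelope sketch. The scheduling latencies (50, 500 and 1000 us) are the ones mentioned in the thread; the 10 Gbit/s link and minimum-size 64-byte frames are my own assumptions for a worst case, and the function names are hypothetical. It also assumes the polling thread drains nothing at all during the delay.

```python
def line_rate_pps(link_bps, frame_bytes):
    """Packets per second at line rate on Ethernet.

    Each frame occupies frame_bytes plus 20 bytes on the wire
    (7 preamble + 1 start-of-frame delimiter + 12 inter-frame gap).
    """
    return link_bps // ((frame_bytes + 20) * 8)


def descriptors_needed(pps, latency_us):
    """RX descriptors needed to absorb all packets arriving during a stall.

    Integer ceiling of pps * latency, with latency given in microseconds.
    """
    return (pps * latency_us + 999_999) // 1_000_000


if __name__ == "__main__":
    # Assumption: 10 Gbit/s link, 64-byte frames (worst case packet rate).
    pps = line_rate_pps(10_000_000_000, 64)
    for latency_us in (50, 500, 1000):
        print(latency_us, "us ->", descriptors_needed(pps, latency_us), "descriptors")
```

Real traffic is rarely all minimum-size frames, so this is an upper bound for the assumed link speed rather than a recommendation for any particular ring size.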


