[RFC v2 0/2] Add high-performance timer facility

Mattias Rönnblom hofors at lysator.liu.se
Sun Oct 6 16:43:37 CEST 2024


On 2024-10-06 15:43, Morten Brørup wrote:
>> From: Mattias Rönnblom [mailto:hofors at lysator.liu.se]
>> Sent: Sunday, 6 October 2024 15.03
>>
>> On 2024-10-03 23:32, Morten Brørup wrote:
>>>> From: Stephen Hemminger [mailto:stephen at networkplumber.org]
>>>> Sent: Thursday, 3 October 2024 20.37
>>>>
>>>> On Wed, 15 Mar 2023 18:03:40 +0100
>>>> Mattias Rönnblom <mattias.ronnblom at ericsson.com> wrote:
>>>>
>>>>> This patchset is an attempt to introduce a high-performance, highly
>>>>> scalable timer facility into DPDK.
>>>>>
>>>>> More specifically, the goals for the htimer library are:
>>>>>
>>>>> * Efficient handling of a handful up to hundreds of thousands of
>>>>>     concurrent timers.
>>>>> * Make adding and canceling timers low-overhead, constant-time
>>>>>     operations.
>>>>> * Provide a service functionally equivalent to that of
>>>>>     <rte_timer.h>. API/ABI backward compatibility is secondary.
>>>>
>>>> Worthwhile goals, and the problem needs to be addressed.
>>>> But this patch never got accepted.
>>>
>>> I think work on it was put on hold due to the requested changes
>>> requiring a significant development effort.
>>> I too look forward to work on this being resumed. ;-)
>>>
>>>>
>>>> Please fix/improve/extend existing rte_timer instead.
>>>
>>> The rte_timer API is too "fat" for use in the fast path with millions
>>> of timers, e.g. TCP flow timers.
>>>
>>> Shoehorning a fast path feature into a slow path API is not going to
>>> cut it. I support having a separate htimer library with its own API for
>>> high volume, high-performance fast path timers.
>>>
>>> When striving for low latency across the internet, timing is
>>> everything. Packet pacing is the "new" hot thing in congestion control
>>> algorithms, and a simple software implementation would require a timer
>>> firing once per packet.
>>>
>>
>> I think DPDK should have two public APIs in the timer area.
> 
> Agree.
> 
>> One is
>> just a bare-bones hierarchical timer wheel API, without callbacks,
>> auto-created per-lcore instances, MT safety or any other of the
>> <rte_timer.h> bells and whistles. It also doesn't make any assumptions
>> about the time source (other than it being monotonic) or resolution.
> 
> The <rte_timer.h> library does not - and is never going to - provide sufficient performance for timer intensive applications, such as packet pacing and fast path TCP/QUIC/whatever congestion control. It is too "fat" for this.
> 
> We need a new library with a new API for that.
> I agree with Mattias' description of the requirements for such a library.
> 
>>
>> The other is a new variant of <rte_timer.h>, using the core HTW library
>> for its implementation (and being public, it may also expose this
>> library in its header files, which may be required for efficient
>> operation). The new <rte_timer.h> would provide the same kind of
>> functionality as the old API, but with some quirks and bugs fixed, plus
>> potentially some new functionality added. For example, it would be
>> useful to allow non-preemption-safe threads to add and remove timers
>> (something rte_timer and its spinlocks don't allow).
> 
> Agree.
> 
> Until that becomes part of DPDK, we will have to stick with what <rte_timer.h> currently offers.
> 
>>
>> I would consider both "fast path APIs".
>>
>> In addition, there should probably also be a time source API.
> 
> A third library, orthogonal to the two other timer libraries.
> But I see why you mention it: It could be somewhat related to the design and implementation of the <rte_timer.h> library.
> But, let's please forget about a time source API for now.
> 
>>
>> Considering the lead time of relatively small contributions like the
>> bitops extensions and the new bitset API (which still aren't in), I
>> can't imagine how long it would take to get in a semi-backward
>> compatible rte_timer with a new implementation, plus a new timer wheel
>> library, into DPDK.
> 
> Well said!
> 
> Instead of aiming for an unreachable target, let's instead take this approach:
> - Provide the new high-performance HTW library as a stand-alone library.
> - Postpone improving the <rte_timer.h> library; it can be done any time in the future, if someone cares to do it. And it can use the HTW library or not, whichever is appropriate.
> 
> Doing both simultaneously would require a substantial effort, and would cause much backpressure from the community (due to the modified <rte_timer.h> API and implementation).
> 
> Although it might be beneficial for the design of the HTW library to consider how an improved <rte_timer.h> would use it, it is not the primary use case of the HTW library, so co-design is not a requirement here.
> 

Postponing rte_timer improvements would also mean postponing most of the 
benefits of the new timer wheel, in my opinion.

In most scenarios, I think you want to have all application modules 
sharing timer wheel instances, preferably without having to agree on a 
proprietary timer API. Here rte_timer shines.

Also, you want to get the HTW library *exactly* right for the rte_timer 
use case. Once it is a public API, fixing any shortcomings you 
accidentally designed in means painful API changes. To be on the 
safe side, you would need to have a new rte_timer implementation ready 
upon submitting an HTW library.

That in turn would require a techboard ACK on the necessity of rte_timer 
API tweaks, otherwise all your work may be wasted.
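
To make the "bare-bones" wheel discussed above a bit more concrete, here 
is a rough sketch of what such an API could look like: no callbacks, no 
per-lcore instances, no MT safety, and the caller supplies the 
(monotonic) time as a plain tick count. Note that this is a single-level 
wheel, not a hierarchical one, and every name in it (tw_*, TW_SLOTS) is 
made up for illustration. None of it is taken from the htimer patchset 
or from DPDK.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TW_SLOTS 256 /* hypothetical size; must be a power of two */

struct tw_timer {
	struct tw_timer *next;
	struct tw_timer **pprev; /* address of the pointer pointing at us */
	uint64_t expiry;         /* absolute expiry time, in ticks */
};

struct tw {
	uint64_t now; /* last tick processed */
	struct tw_timer *slots[TW_SLOTS];
};

static void tw_init(struct tw *tw, uint64_t now)
{
	tw->now = now;
	for (size_t i = 0; i < TW_SLOTS; i++)
		tw->slots[i] = NULL;
}

static void tw_timer_init(struct tw_timer *t)
{
	t->next = NULL;
	t->pprev = NULL; /* NULL means "not armed" */
}

/* Constant time: link the timer at the head of its slot. */
static void tw_add(struct tw *tw, struct tw_timer *t, uint64_t expiry)
{
	assert(t->pprev == NULL); /* must not already be armed */

	if (expiry <= tw->now)
		expiry = tw->now + 1; /* fire on the next tick */

	struct tw_timer **head = &tw->slots[expiry & (TW_SLOTS - 1)];

	t->expiry = expiry;
	t->next = *head;
	t->pprev = head;
	if (t->next != NULL)
		t->next->pprev = &t->next;
	*head = t;
}

/* Constant time: unlink without walking the slot. No-op if not armed. */
static void tw_cancel(struct tw_timer *t)
{
	if (t->pprev == NULL)
		return;
	*t->pprev = t->next;
	if (t->next != NULL)
		t->next->pprev = t->pprev;
	t->next = NULL;
	t->pprev = NULL;
}

/*
 * Advance the wheel to 'now' and return the expired timers as a list
 * linked via 'next'. No callbacks; the caller decides what to do.
 */
static struct tw_timer *tw_advance(struct tw *tw, uint64_t now)
{
	struct tw_timer *expired = NULL;

	while (tw->now < now) {
		tw->now++;
		struct tw_timer **pp = &tw->slots[tw->now & (TW_SLOTS - 1)];
		struct tw_timer *t;

		while ((t = *pp) != NULL) {
			if (t->expiry <= tw->now) {
				tw_cancel(t); /* also updates *pp */
				t->next = expired;
				expired = t;
			} else {
				pp = &t->next; /* due in a later rotation */
			}
		}
	}
	return expired;
}
```

A timer more than TW_SLOTS ticks away stays in its slot and is skipped 
once per rotation until it is due. Avoiding that repeated scan is what a 
hierarchical wheel is for, so treat the above as the degenerate 
one-level case only.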


