[PATCH v3] eventdev: avoid non-burst shortcut for variable-size bursts

Mattias Rönnblom hofors at lysator.liu.se
Fri May 12 16:52:53 CEST 2023


On 2023-05-12 15:56, Morten Brørup wrote:
>> From: Mattias Rönnblom [mailto:mattias.ronnblom at ericsson.com]
>> Sent: Friday, 12 May 2023 15.15
>>
>> On 2023-05-12 13:59, Jerin Jacob wrote:
>>> On Thu, May 11, 2023 at 2:00 PM Mattias Rönnblom
>>> <mattias.ronnblom at ericsson.com> wrote:
>>>>
>>>> Use non-burst event enqueue and dequeue calls from burst enqueue and
>>>> dequeue only when the burst size is compile-time constant (and equal
>>>> to one).
>>>>
>>>> Signed-off-by: Mattias Rönnblom <mattias.ronnblom at ericsson.com>
>>>>
>>>> ---
>>>>
>>>> v3: Actually include the change v2 claimed to contain.
>>>> v2: Wrap builtin call in __extension__, to avoid compiler warnings if
>>>>       application is compiled with -pedantic. (Morten Brørup)
>>>> ---
>>>>    lib/eventdev/rte_eventdev.h | 4 ++--
>>>>    1 file changed, 2 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/lib/eventdev/rte_eventdev.h b/lib/eventdev/rte_eventdev.h
>>>> index a90e23ac8b..a471caeb6d 100644
>>>> --- a/lib/eventdev/rte_eventdev.h
>>>> +++ b/lib/eventdev/rte_eventdev.h
>>>> @@ -1944,7 +1944,7 @@ __rte_event_enqueue_burst(uint8_t dev_id, uint8_t port_id,
>>>>            * Allow zero cost non burst mode routine invocation if application
>>>>            * requests nb_events as const one
>>>>            */
>>>> -       if (nb_events == 1)
>>>> +       if (__extension__(__builtin_constant_p(nb_events)) && nb_events == 1)
>>>
>>> The "why" part is not clear from the commit message. Is this to avoid
>>> the nb_events read if it is a built-in const?
>>
>> The __builtin_constant_p() is introduced to avoid having the compiler
>> generate a conditional branch and two different code paths in case
>> nb_events is a run-time variable.
>>
>> In particular, this matters if nb_events is a run-time variable and varies
>> between 1 and some larger value.
>>
>> I should have mentioned this in the commit message.
>>
>> The result is a very slight performance improvement. It also makes the
>> code better match the comment, imo: zero cost for const-one enqueues, but
>> no impact on non-compile-time-constant-length enqueues.
>>
>> Feel free to ignore.
>>
>>> If so, the check should be the following, right?
>>>
>>> if (__extension__((__builtin_constant_p(nb_events)) && nb_events == 1)
>>> || nb_events  == 1)
> 
> @Mattias: You missed the second part of this comparison, also catching nb_events == 1 with non-constant nb_events.
> 

I didn't comment on that code snippet since it was based on a 
misconception of the intention of my patch.

> @Jerin: Such a change has no effect, compared to the original code.
> 
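
To make that point concrete, here is a small standalone sketch (not DPDK
code). The is_const parameter is a plain boolean standing in for
__builtin_constant_p(nb_events), and all function names are made up for
illustration. It counts the inputs on which two checks disagree:

```c
#include <stdbool.h>
#include <stdint.h>

/* Check used before the patch. is_const is ignored on purpose. */
static bool
check_original(bool is_const, uint16_t nb_events)
{
	(void)is_const;
	return nb_events == 1;
}

/* Check suggested in the review; is_const models the builtin's result. */
static bool
check_suggested(bool is_const, uint16_t nb_events)
{
	return (is_const && nb_events == 1) || nb_events == 1;
}

/* Check introduced by the patch. */
static bool
check_patched(bool is_const, uint16_t nb_events)
{
	return is_const && nb_events == 1;
}

/* Count the (is_const, nb_events) pairs on which checks a and b differ. */
static unsigned int
count_differences(bool (*a)(bool, uint16_t), bool (*b)(bool, uint16_t))
{
	unsigned int diffs = 0;

	for (int c = 0; c <= 1; c++)
		for (uint16_t n = 0; n <= 8; n++)
			if (a(c != 0, n) != b(c != 0, n))
				diffs++;

	return diffs;
}
```

The suggested check never differs from the original one, while the patched
check differs in exactly one case: a non-constant nb_events equal to 1.
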
>>>
>>> At least, that was my original intention in the code.
> 
> @Jerin: Mattias implemented exactly what the comment says.
> 
> Perhaps only the comment should be updated, not the code.
> 
> Is nb_events likely to be non-constant 1, and are there benefits to calling either of the non-burst functions in those cases, vs. the branch cost of this comparison (which Mattias' patch gets rid of)?
> 

I think the main worry would be the cost of branch mispredictions in
case of alternating enqueue sizes (between 1 and some other size).

If there is a performance upside to calling single-event enqueue in a 
scenario where all enqueues are *run-time variable* and 1 (which I find 
unlikely, but well within the realm of possibility), the next 
question would be: OK, but how about for two events? Three? Four? Etc.
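
For illustration, a minimal standalone sketch of the pattern (again not the
actual DPDK code). enqueue_single(), enqueue_burst(), enqueue() and the two
call counters are hypothetical stand-ins, and the events are plain ints.
Whether a literal 1 at the call site is seen as constant inside the inline
function depends on the compiler and optimization level, so treat the
shortcut as an optimization, not as a guarantee:

```c
#include <stdint.h>

/* Hypothetical counters, only here to observe which path was taken. */
static unsigned int single_calls;
static unsigned int burst_calls;

/* Stand-in for a driver's single-event enqueue: always takes one event. */
static uint16_t
enqueue_single(const int *ev)
{
	(void)ev;
	single_calls++;
	return 1;
}

/* Stand-in for a driver's burst enqueue: takes all nb_events events. */
static uint16_t
enqueue_burst(const int *ev, uint16_t nb_events)
{
	(void)ev;
	burst_calls++;
	return nb_events;
}

/*
 * Take the single-event shortcut only when the compiler can prove, at the
 * inlined call site, that nb_events is a constant equal to one. For a
 * run-time variable the condition folds to false, so no branch is emitted
 * and the burst path is always used.
 */
static inline uint16_t
enqueue(const int *ev, const uint16_t nb_events)
{
	if (__extension__(__builtin_constant_p(nb_events)) && nb_events == 1)
		return enqueue_single(ev);
	else
		return enqueue_burst(ev, nb_events);
}
```

A caller doing enqueue(evs, 1) may get the single-event path at no run-time
cost, while a caller passing a run-time variable, alternating between 1 and
some larger value or not, always goes through enqueue_burst() without a
data-dependent branch.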

>>>
>>>
>>>
>>>>                   return (fp_ops->enqueue)(port, ev);
>>>>           else
>>>>                   return fn(port, ev, nb_events);
>>>> @@ -2200,7 +2200,7 @@ rte_event_dequeue_burst(uint8_t dev_id, uint8_t port_id, struct rte_event ev[],
>>>>            * Allow zero cost non burst mode routine invocation if application
>>>>            * requests nb_events as const one
>>>>            */
>>>> -       if (nb_events == 1)
>>>> +       if (__extension__(__builtin_constant_p(nb_events)) && nb_events == 1)
>>>>                   return (fp_ops->dequeue)(port, ev, timeout_ticks);
>>>>           else
>>>>                   return (fp_ops->dequeue_burst)(port, ev, nb_events,
>>>> --
>>>> 2.34.1
>>>>
> 

