[PATCH v3 3/7] eal: add lcore variable performance test

Mattias Rönnblom hofors at lysator.liu.se
Fri Sep 13 08:47:29 CEST 2024


On 2024-09-12 17:11, Jerin Jacob wrote:
> On Thu, Sep 12, 2024 at 6:50 PM Mattias Rönnblom <hofors at lysator.liu.se> wrote:
>>
>> On 2024-09-12 15:09, Jerin Jacob wrote:
>>> On Thu, Sep 12, 2024 at 2:34 PM Mattias Rönnblom
>>> <mattias.ronnblom at ericsson.com> wrote:
>>>>
>>>> Add basic micro benchmark for lcore variables, in an attempt to assure
>>>> that the overhead isn't significantly greater than alternative
>>>> approaches, in scenarios where the benefits aren't expected to show up
>>>> (i.e., when plenty of cache is available compared to the working set
>>>> size of the per-lcore data).
>>>>
>>>> Signed-off-by: Mattias Rönnblom <mattias.ronnblom at ericsson.com>
>>>> ---
>>>>    app/test/meson.build           |   1 +
>>>>    app/test/test_lcore_var_perf.c | 160 +++++++++++++++++++++++++++++++++
>>>>    2 files changed, 161 insertions(+)
>>>>    create mode 100644 app/test/test_lcore_var_perf.c
>>>
>>>
>>>> +static double
>>>> +benchmark_access_method(void (*init_fun)(void), void (*update_fun)(void))
>>>> +{
>>>> +       uint64_t i;
>>>> +       uint64_t start;
>>>> +       uint64_t end;
>>>> +       double latency;
>>>> +
>>>> +       init_fun();
>>>> +
>>>> +       start = rte_get_timer_cycles();
>>>> +
>>>> +       for (i = 0; i < ITERATIONS; i++)
>>>> +               update_fun();
>>>> +
>>>> +       end = rte_get_timer_cycles();
>>>
>>> Use precise variant. rte_rdtsc_precise() or so to be accurate
>>
>> With 1e7 iterations, do you need rte_rdtsc_precise()? I suspect not.
> 
> I was thinking of it another way: with 1e7 iterations, the additional
> barrier in the precise variant will be amortized, and we get more
> _deterministic_ behavior, especially if we print cycles and need to
> catch regressions.

If you time a section of code which takes ~40000000 cycles, it doesn't
matter if you add or remove a few cycles at the beginning and the end.

rte_rdtsc_precise() is both better (more precise, in the sense that it
serializes more) and worse (it's more costly, and thus more
intrusive).

You can use rte_rdtsc_precise(), rte_rdtsc(), or gettimeofday(). It 
doesn't matter.
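
To put rough numbers on the amortization argument, here is a minimal
sketch in C. The helper names and the 100-cycle endpoint overhead are
assumptions made up for illustration; they are not measurements and not
code from the patch. The ~40000000 cycles and 1e7 iterations are the
figures from this thread.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of the patch): worst-case relative
 * error introduced by a fixed number of extra cycles spent in the
 * timer reads at the start and end of a timed section.
 */
static double
timing_relative_error(uint64_t endpoint_overhead_cycles,
		      uint64_t measured_cycles)
{
	assert(measured_cycles > 0);

	return (double)endpoint_overhead_cycles / (double)measured_cycles;
}

/*
 * Hypothetical helper: average cost per iteration in cycles, computed
 * from two timestamps the same way a micro benchmark would.
 */
static double
cycles_per_iteration(uint64_t start, uint64_t end, uint64_t iterations)
{
	assert(iterations > 0);
	assert(end >= start);

	return (double)(end - start) / (double)iterations;
}
```

With an assumed 100 cycles of endpoint overhead on a 40000000-cycle
run, timing_relative_error(100, 40000000) is 2.5e-6, and the reported
per-iteration figure moves from 4.0 to 4.00001 cycles. Under that
assumption the choice of timer read is lost in the noise.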

> Furthermore, you may consider replacing rte_random() in fast path to
> running number or so if it is not deterministic in cycle computation.

rte_rand() is not used in the fast path. I don't understand what you 
mean by "running number".

