[PATCH v3 3/7] eal: add lcore variable performance test
Mattias Rönnblom
hofors at lysator.liu.se
Fri Sep 13 08:47:29 CEST 2024
On 2024-09-12 17:11, Jerin Jacob wrote:
> On Thu, Sep 12, 2024 at 6:50 PM Mattias Rönnblom <hofors at lysator.liu.se> wrote:
>>
>> On 2024-09-12 15:09, Jerin Jacob wrote:
>>> On Thu, Sep 12, 2024 at 2:34 PM Mattias Rönnblom
>>> <mattias.ronnblom at ericsson.com> wrote:
>>>>
>>>> Add basic micro benchmark for lcore variables, in an attempt to assure
>>>> that the overhead isn't significantly greater than alternative
>>>> approaches, in scenarios where the benefits aren't expected to show up
>>>> (i.e., when plenty of cache is available compared to the working set
>>>> size of the per-lcore data).
>>>>
>>>> Signed-off-by: Mattias Rönnblom <mattias.ronnblom at ericsson.com>
>>>> ---
>>>> app/test/meson.build | 1 +
>>>> app/test/test_lcore_var_perf.c | 160 +++++++++++++++++++++++++++++++++
>>>> 2 files changed, 161 insertions(+)
>>>> create mode 100644 app/test/test_lcore_var_perf.c
>>>
>>>
>>>> +static double
>>>> +benchmark_access_method(void (*init_fun)(void), void (*update_fun)(void))
>>>> +{
>>>> +	uint64_t i;
>>>> +	uint64_t start;
>>>> +	uint64_t end;
>>>> +	double latency;
>>>> +
>>>> +	init_fun();
>>>> +
>>>> +	start = rte_get_timer_cycles();
>>>> +
>>>> +	for (i = 0; i < ITERATIONS; i++)
>>>> +		update_fun();
>>>> +
>>>> +	end = rte_get_timer_cycles();
>>>
>>> Use precise variant. rte_rdtsc_precise() or so to be accurate
>>
>> With 1e7 iterations, do you need rte_rdtsc_precise()? I suspect not.
>
> I was thinking of it another way: with 1e7 iterations, the additional
> barrier in the precise variant will be amortized, and we get more
> _deterministic_ behavior, especially if we print cycles and need to
> catch regressions.

If you time a section of code which spends ~40000000 cycles, it doesn't
matter if you add or remove a few cycles at the beginning and the end.

The rte_rdtsc_precise() is both better (more precise in the sense of
more serialization), and worse (because it's more costly, and thus more
intrusive).

You can use rte_rdtsc_precise(), rte_rdtsc(), or gettimeofday(). It
doesn't matter.

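As a rough illustration of the amortization argument (a sketch only, not
part of the patch): the helpers below compute the average cost per
iteration and the error that a fixed timer overhead adds to that average.
The function names and any overhead figure passed in are hypothetical; the
thread only says "a few cycles".

```c
#include <assert.h>
#include <stdint.h>

/* Average cost per iteration, given the cycles measured between the
 * two timer reads. */
static double
per_iteration_cycles(uint64_t measured_cycles, uint64_t iterations)
{
	assert(iterations > 0);
	return (double)measured_cycles / (double)iterations;
}

/* Error in the per-iteration figure caused by a fixed cost at the
 * start and end of the timed section (for example, extra
 * serialization in the timer read). The overhead value is an
 * assumption supplied by the caller, not a measured number. */
static double
per_iteration_error(uint64_t timer_overhead_cycles, uint64_t iterations)
{
	assert(iterations > 0);
	return (double)timer_overhead_cycles / (double)iterations;
}
```

With the figures from this thread, per_iteration_cycles(40000000, 10000000)
gives 4 cycles per update. Assuming a hypothetical 100 cycles of timer
overhead, per_iteration_error(100, 10000000) is on the order of 1e-5
cycles, far below anything the benchmark could resolve.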
> Furthermore, you may consider replacing rte_random() in fast path to
> running number or so if it is not deterministic in cycle computation.

rte_rand() is not used in the fast path. I don't understand what you
mean by "running number".