[dpdk-dev] [RFC] DPDK Trace support

Ray Kinsella mdr at ashroe.eu
Mon Jan 20 13:08:29 CET 2020


+1 - thanks Dave

On 20/01/2020 04:48, Jerin Jacob Kollanukkaran wrote:
>> -----Original Message-----
>> From: dave at barachs.net <dave at barachs.net>
>> Sent: Saturday, January 18, 2020 8:45 PM
>> To: 'Ray Kinsella' <mdr at ashroe.eu>; Jerin Jacob Kollanukkaran
>> <jerinj at marvell.com>; 'dpdk-dev' <dev at dpdk.org>
>> Subject: [EXT] RE: [RFC] [dpdk-dev] DPDK Trace support
>>
>> It would be well worth considering one of the vpp techniques to minimize trace
>> impact:
>>
>> static inline ring_handler_inline (..., int is_traced) {
>>   for (i = 0; i < vector_size; i++)
>>     {
>>       if (is_traced)
>> 	{
>> 	  do_trace_work;
>> 	}
>>       normal_packet_processing;
>>     }
>> }
>>
>> ring_handler (...)
>> {
>>   if (PREDICT_FALSE(global_trace_flag != 0))
>>     return ring_handler_inline (..., 1 /* is_traced */);
>>   else
>>     return ring_handler_inline (..., 0 /* is_traced */);
>> }
>>
>> This reduces the runtime tax to the absolute minimum, but costs space.
>>
>> Please consider it.
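
A minimal, compilable sketch of the two-specialization pattern quoted above may help make the trade-off concrete. Everything here is an assumption made for illustration: the packet array, the `trace_count` counter standing in for real trace work, and the mapping of `PREDICT_FALSE` to `__builtin_expect` are not taken from vpp or DPDK sources.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helpers: PREDICT_FALSE mirrors the macro name used in the
 * pseudocode and is assumed to map to __builtin_expect on GCC/Clang. */
#if defined(__GNUC__) || defined(__clang__)
#define PREDICT_FALSE(x) __builtin_expect(!!(x), 0)
#define ALWAYS_INLINE inline __attribute__((always_inline))
#else
#define PREDICT_FALSE(x) (x)
#define ALWAYS_INLINE inline
#endif

static volatile int global_trace_flag; /* toggled at runtime */
static uint64_t trace_count;           /* stand-in for real trace work */

/* is_traced is a compile-time constant at each call site, so the compiler
 * can emit two loop bodies: one with the trace work, one without it. */
static ALWAYS_INLINE uint64_t
ring_handler_inline(const uint32_t *pkts, size_t n, const int is_traced)
{
    uint64_t sum = 0;

    for (size_t i = 0; i < n; i++) {
        if (is_traced)
            trace_count++;  /* do_trace_work */
        sum += pkts[i];     /* stand-in for normal packet processing */
    }
    return sum;
}

/* The global flag is tested once per burst, not once per packet. */
uint64_t
ring_handler(const uint32_t *pkts, size_t n)
{
    assert(pkts != NULL || n == 0);
    if (PREDICT_FALSE(global_trace_flag != 0))
        return ring_handler_inline(pkts, n, 1 /* is_traced */);
    return ring_handler_inline(pkts, n, 0 /* is_traced */);
}
```

The space cost mentioned above comes from the loop body being emitted twice, once per value of `is_traced`.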
> 
> Thanks Dave for your thoughts.
> 
>>
>> HTH... Dave
>>
>> -----Original Message-----
>> From: Ray Kinsella <mdr at ashroe.eu>
>> Sent: Monday, January 13, 2020 6:00 AM
>> To: Jerin Jacob Kollanukkaran <jerinj at marvell.com>; dpdk-dev
>> <dev at dpdk.org>; dave at barachs.net
>> Subject: Re: [RFC] [dpdk-dev] DPDK Trace support
>>
>> Hi Jerin,
>>
>> Any idea why lttng performance is so poor?
>> I would have naturally gone there to benefit from the existing toolchain.
>>
>> Have you looked at the FD.io logging/tracing infrastructure for inspiration?
>> https://wiki.fd.io/view/VPP/elog
>>
>> Ray K
>>
>> On 13/01/2020 10:40, Jerin Jacob Kollanukkaran wrote:
>>> Hi All,
>>>
>>> I would like to add tracing support for DPDK.
>>> I am planning to add this support in v20.05 release.
>>>
>>> This RFC attempts to get feedback from the community on
>>>
>>> a) Tracing Use cases.
>>> b) Tracing Requirements.
>>> c) Implementation choices.
>>> d) Trace format.
>>>
>>> Use-cases
>>> ---------
>>> - In most cases, the DPDK provider will not have access to the DPDK
>>> customer applications. To debug/analyze the slow path and fast path
>>> DPDK API usage from the field, we need to have integrated trace
>>> support in DPDK.
>>>
>>> - Need a low-overhead fast path multi-core PMD driver
>>> debugging/analysis infrastructure in DPDK to fix the functional and
>>> performance issue(s) of PMDs.
>>>
>>> - Post-trace analysis tools can provide various statistics across the
>>> system, such as cpu_idle(), using the timestamp added in the trace.
>>>
>>>
>>> Requirements:
>>> -------------
>>> - Support for Linux, FreeBSD and Windows OS
>>> - Open trace format
>>> - Multi-platform Open source trace viewer
>>> - Very low overhead trace API for DPDK fast path tracing/debugging.
>>> - Dynamic enable/disable of trace events
>>>
>>>
>>> To enable trace support in DPDK, the following items need to be worked out:
>>>
>>> a) Add the DPDK trace points in the DPDK source code.
>>>
>>> - This includes updating DPDK functions such as
>>> rte_eth_dev_configure(), rte_eth_dev_start() and rte_eth_rx_burst()
>>> to emit the trace.
>>>
>>> b) Choosing suitable serialization-format
>>>
>>> - Common Trace Format (CTF) is an open format and language to describe
>>> trace formats. This enables tool reuse, of which line-textual
>>> (babeltrace) and graphical (Trace Compass) variants already exist.
>>>
>>> CTF should look familiar to C programmers but adds stronger typing.
>>> See CTF - A Flexible, High-performance Binary Trace Format.
>>>
>>> https://diamon.org/ctf/
>>>
>>> c) Writing the on-target serialization code,
>>>
>>> See the section below (Lttng CTF trace emitter vs DPDK specific CTF
>>> trace emitter).
>>>
>>> d) Deciding on and writing the I/O transport mechanics,
>>>
>>> For performance reasons, the trace buffer should be backed by hugepage
>>> memory and written out via file I/O.
>>>
>>> e) Writing the PC-side deserializer/parser,
>>>
>>> Both babeltrace (CLI tool) and Trace Compass (GUI tool) support CTF.
>>> See:
>>> https://lttng.org/viewers/
>>>
>>> f) Writing tools for filtering and presentation.
>>>
>>> See item (e)
>>>
>>>
>>> Lttng CTF trace emitter vs DPDK specific CTF trace emitter
>>> ----------------------------------------------------------
>>>
>>> I have written a performance evaluation application to measure the
>>> overhead of the Lttng CTF emitter (the fast path infrastructure used by
>>> the https://lttng.org/ library to emit the trace):
>>>
>>> https://github.com/jerinjacobk/lttng-overhead
>>> https://github.com/jerinjacobk/lttng-overhead/blob/master/README
>>>
>>> I could improve the performance by 30% by adding a "DPDK"-based
>>> plugin for get_clock() and get_cpu(). Here are the performance
>>> numbers after adding the plugin on x86 and the various arm64 boards
>>> that I have access to:
>>>
>>> On high-end x86, it comes to around 236 cycles/~100ns @ 2.4GHz (see the
>>> last line in the log (ZERO_ARG)). On arm64, it varies from 312 cycles
>>> to 1100 cycles (based on the class of CPU).
>>> In short, based on the "IPC capabilities", the cost would be around
>>> 100ns to 400ns for a single void trace (a trace without any argument).
>>>
>>>
>>> [lttng-overhead-x86] $ sudo ./calibrate/build/app/calibrate -c 0xc0
>>> make[1]: Entering directory '/export/lttng-overhead-x86/calibrate'
>>> make[1]: Leaving directory '/export/lttng-overhead-x86/calibrate'
>>> EAL: Detected 56 lcore(s)
>>> EAL: Detected 2 NUMA nodes
>>> EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
>>> EAL: Selected IOVA mode 'PA'
>>> EAL: Probing VFIO support...
>>> EAL: PCI device 0000:01:00.0 on NUMA socket 0
>>> EAL:   probe driver: 8086:1521 net_e1000_igb
>>> EAL: PCI device 0000:01:00.1 on NUMA socket 0
>>> EAL:   probe driver: 8086:1521 net_e1000_igb
>>> CPU Timer freq is 2600.000000MHz
>>> NOP: cycles=0.194834 ns=0.074936
>>> GET_CLOCK: cycles=47.854658 ns=18.405638
>>> GET_CPU: cycles=30.995892 ns=11.921497
>>> ZERO_ARG: cycles=236.945113 ns=91.132736
>>>
>>>
>>> We will have only 16.75ns per packet to process 59.2 Mpps (40Gbps). So,
>>> IMO, the Lttng CTF emitter may not fit the DPDK fast path purpose due to
>>> the cost associated with generic Lttng features.
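
The per-packet budget argument can be checked with a little arithmetic. The sketch below assumes minimum-size 64-byte Ethernet frames plus 20 bytes of on-wire overhead (preamble, start-of-frame delimiter and inter-frame gap); the message above does not state the frame size, so that is an assumption, and the function names are made up for illustration. Under that assumption 40Gbps works out to roughly 59.5 Mpps and about 16.8ns per packet, close to the figures quoted above, while the measured ZERO_ARG cost of 236.945113 cycles at the 2600MHz timer frequency shown in the log is about 91ns.

```c
#include <assert.h>
#include <stdint.h>

/* Ethernet on-wire overhead per frame: 7 B preamble + 1 B SFD + 12 B IFG. */
#define WIRE_OVERHEAD_BYTES 20u

/* Packets per second at a given line rate; frame_bytes includes the CRC. */
double line_rate_pps(double bits_per_sec, unsigned frame_bytes)
{
    assert(frame_bytes > 0);
    return bits_per_sec / (8.0 * (frame_bytes + WIRE_OVERHEAD_BYTES));
}

/* Time available to process one packet, in nanoseconds. */
double per_packet_budget_ns(double bits_per_sec, unsigned frame_bytes)
{
    return 1e9 / line_rate_pps(bits_per_sec, frame_bytes);
}

/* Convert a cycle count to nanoseconds for a given clock frequency in Hz. */
double cycles_to_ns(double cycles, double hz)
{
    return cycles * 1e9 / hz;
}
```

Comparing `cycles_to_ns(236.945113, 2.6e9)` with `per_packet_budget_ns(40e9, 64)` shows a single argument-less trace costing several times the whole per-packet budget, which is the point being made above.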
>>>
>>> One option could be to have a native CTF emitter in EAL/DPDK to emit
>>> the trace in a hugepage. I think it would cost only a handful of cycles
>>> if we limit the features to the requirements above.
>>>
>>> The upside of using the Lttng CTF emitter:
>>> a) No need to write a new CTF trace emitter (item (c)).
>>>
>>> The downsides of using the Lttng CTF emitter (item (c)):
>>> a) Performance issue (see above).
>>> b) Lack of Windows OS support. It looks like it has basic FreeBSD support.
>>> c) DPDK library dependency on lttng for trace.
>>>
>>> So, it is probably good to have a native CTF emitter in DPDK and reuse
>>> all the open-source trace viewer (babeltrace and Trace Compass) and
>>> format (CTF) infrastructure.
>>> I think it would be the best of both worlds.
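
To give a rough idea of why a stripped-down emitter could cost only a handful of cycles, here is a minimal sketch of a fast path that checks a per-event enable flag and then copies a small header and payload into a preallocated buffer. All names (`trace_buf`, `trace_event_hdr`, `trace_emit`, `trace_enabled`) are hypothetical and not an existing DPDK API. The fixed-size array stands in for hugepage-backed memory, the caller supplies the timestamp, and the header layout is an illustrative stand-in, not a CTF-compliant packet layout, which would also need a metadata description.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* All names below are hypothetical, for illustration only. */
#define TRACE_BUF_SIZE 4096u
#define TRACE_MAX_EVENTS 16u

struct trace_buf {
    uint32_t offset;              /* next free byte in mem[] */
    uint8_t  mem[TRACE_BUF_SIZE]; /* would be hugepage-backed in practice */
};

struct trace_event_hdr {
    uint64_t timestamp;           /* e.g. a cycle counter value */
    uint32_t event_id;
    uint32_t payload_len;
};

/* One enable flag per trace point, so events can be toggled at runtime. */
static volatile bool trace_enabled[TRACE_MAX_EVENTS];

/* Append one event; returns false if it is disabled or does not fit. */
bool trace_emit(struct trace_buf *tb, uint32_t event_id, uint64_t timestamp,
                const void *payload, uint32_t payload_len)
{
    assert(tb != NULL && event_id < TRACE_MAX_EVENTS);
    if (!trace_enabled[event_id])
        return false; /* disabled events cost one load and one branch */

    uint32_t avail = TRACE_BUF_SIZE - tb->offset;
    if (avail < sizeof(struct trace_event_hdr) ||
        payload_len > avail - sizeof(struct trace_event_hdr))
        return false; /* drop the event rather than block the fast path */

    struct trace_event_hdr hdr = { timestamp, event_id, payload_len };
    memcpy(&tb->mem[tb->offset], &hdr, sizeof(hdr));
    if (payload_len != 0)
        memcpy(&tb->mem[tb->offset + sizeof(hdr)], payload, payload_len);
    tb->offset += (uint32_t)sizeof(hdr) + payload_len;
    return true;
}
```

One buffer per lcore would avoid locking on the fast path; flushing the buffer to a file would happen off the fast path.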
>>>
>>> Any thoughts on this subject? Based on the community feedback, I can
>>> work on the patch for v20.05.
>>>
> 

