[dpdk-dev] [RFC] DPDK Trace support

dave at barachs.net dave at barachs.net
Sat Jan 18 16:14:31 CET 2020


It would be well worth considering one of the vpp techniques to minimize trace impact:

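static inline int ring_handler_inline (..., int is_traced)
{
  /* is_traced is a literal constant at each call site, so after inlining
     the compiler can drop the test from the untraced copy of the loop. */
  for (i = 0; i < vector_size; i++)
    {
      if (is_traced)
	{
	  do_trace_work;
	}
      normal_packet_processing;
    }
  return n_processed;
}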

ring_handler (...)
{
  if (PREDICT_FALSE(global_trace_flag != 0))
    return ring_handler_inline (..., 1 /* is_traced */);
  else
    return ring_handler_inline (..., 0 /* is_traced */);
}

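Because is_traced is a constant at each call site, the compiler emits two specialized copies of the loop once the helper is inlined, and the untraced copy contains no trace check at all. This reduces the runtime tax to a single, well-predicted branch per vector, but costs code space.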
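
A minimal compilable sketch of the pattern above, assuming GCC or Clang. The packet array, the summing "processing" and the trace counter are hypothetical stand-ins for real work, and PREDICT_FALSE is defined locally via __builtin_expect rather than taken from the vpp headers:

```c
/* Assumption: GCC/Clang builtins and attributes are available. */
#define PREDICT_FALSE(x) __builtin_expect(!!(x), 0)

static int global_trace_flag;      /* hypothetical runtime trace switch */
static unsigned long trace_count;  /* stands in for real trace work */

/* Forced inline so each call site below gets its own specialized copy;
 * is_traced is a literal there, so the untraced copy has no trace test. */
static inline __attribute__((always_inline)) int
ring_handler_inline(const int *pkts, int n, int is_traced)
{
    int sum = 0;

    for (int i = 0; i < n; i++) {
        if (is_traced)
            trace_count++;         /* do_trace_work */
        sum += pkts[i];            /* normal_packet_processing */
    }
    return sum;
}

/* One unlikely branch per vector selects the traced or untraced copy. */
int ring_handler(const int *pkts, int n)
{
    if (PREDICT_FALSE(global_trace_flag != 0))
        return ring_handler_inline(pkts, n, 1 /* is_traced */);
    else
        return ring_handler_inline(pkts, n, 0 /* is_traced */);
}
```

The space cost is visible in the object code: the loop body is emitted twice, once with the trace work and once without.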

Please consider it.

HTH... Dave

-----Original Message-----
From: Ray Kinsella <mdr at ashroe.eu> 
Sent: Monday, January 13, 2020 6:00 AM
To: Jerin Jacob Kollanukkaran <jerinj at marvell.com>; dpdk-dev <dev at dpdk.org>; dave at barachs.net
Subject: Re: [RFC] [dpdk-dev] DPDK Trace support

Hi Jerin,

Any idea why LTTng performance is so poor?
I would naturally have gone there first, to benefit from the existing toolchain.

Have you looked at the FD.io logging/tracing infrastructure for inspiration?
https://wiki.fd.io/view/VPP/elog

Ray K

On 13/01/2020 10:40, Jerin Jacob Kollanukkaran wrote:
> Hi All,
> 
> I would like to add tracing support for DPDK.
> I am planning to add this support in v20.05 release.
> 
> This RFC attempts to get feedback from the community on:
> 
> a) Tracing use cases.
> b) Tracing requirements.
> c) Implementation choices.
> d) Trace format.
> 
> Use-cases
> ---------
> - In most cases, the DPDK provider will not have access to the DPDK customer's applications.
> To debug/analyze slow-path and fast-path DPDK API usage in the
> field, we need integrated trace support in DPDK.
> 
> - We need a low-overhead, fast-path, multi-core PMD
> debugging/analysis infrastructure in DPDK to fix functional and performance issues in PMDs.
> 
> - Post-trace analysis tools can provide various statistics across the
> system, such as cpu_idle(), using the timestamps added in the trace.
> 
> 
> Requirements:
> -------------
> - Support for Linux, FreeBSD and Windows OS
> - Open trace format
> - Multi-platform Open source trace viewer
> - Very low overhead trace API for DPDK fast path tracing/debugging.
> - Dynamic enable/disable of trace events
> 
> 
> To enable trace support in DPDK, the following items need to be worked out:
> 
> a) Add the DPDK trace points in the DPDK source code.
> 
> - This includes updating DPDK functions such as
> rte_eth_dev_configure(), rte_eth_dev_start() and rte_eth_rx_burst() to emit the trace.
> 
> b) Choosing suitable serialization-format
> 
> - Common Trace Format, CTF, is an open format and language to describe trace formats.
> This enables tool reuse, of which line-textual (babeltrace) and 
> graphical (TraceCompass) variants already exist.
> 
> CTF should look familiar to C programmers but adds stronger typing. 
> See CTF - A Flexible, High-performance Binary Trace Format.
> 
> https://diamon.org/ctf/
> 
> c) Writing the on-target serialization code,
> 
> See the section below (LTTng CTF trace emitter vs DPDK-specific CTF
> trace emitter).
>  
> d) Deciding on and writing the I/O transport mechanics,
> 
> For performance reasons, the trace buffer should be backed by huge pages and written out through file I/O.
> 
> e) Writing the PC-side deserializer/parser,
> 
> Both babeltrace (CLI tool) and Trace Compass (GUI tool) support CTF.
> See: 
> https://lttng.org/viewers/
> 
> f) Writing tools for filtering and presentation.
> 
> See item (e)
> 
> 
> LTTng CTF trace emitter vs DPDK-specific CTF trace emitter
> ----------------------------------------------------------
> 
> I have written a performance evaluation application to measure the
> overhead of the LTTng CTF emitter (the fast-path infrastructure used by
> the https://lttng.org/ library to emit the trace):
> 
> https://github.com/jerinjacobk/lttng-overhead
> https://github.com/jerinjacobk/lttng-overhead/blob/master/README
> 
> I could improve the performance by 30% by adding a "DPDK"-based
> plugin for get_clock() and get_cpu(). Here are the performance
> numbers after adding the plugin, on x86 and the various arm64 boards
> that I have access to:
> 
> On high-end x86, it comes to around 236 cycles/~100ns @ 2.4GHz (see the
> last line in the log, ZERO_ARG). On arm64, it varies from 312 cycles to
> 1100 cycles (based on the class of CPU).
> In short, depending on the "IPC capabilities" of the CPU, the cost would be around
> 100ns to 400ns for a single void trace (a trace without any arguments).
> 
> 
> [lttng-overhead-x86] $ sudo ./calibrate/build/app/calibrate -c 0xc0
> make[1]: Entering directory '/export/lttng-overhead-x86/calibrate'
> make[1]: Leaving directory '/export/lttng-overhead-x86/calibrate'
> EAL: Detected 56 lcore(s)
> EAL: Detected 2 NUMA nodes
> EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
> EAL: Selected IOVA mode 'PA'
> EAL: Probing VFIO support...
> EAL: PCI device 0000:01:00.0 on NUMA socket 0
> EAL:   probe driver: 8086:1521 net_e1000_igb
> EAL: PCI device 0000:01:00.1 on NUMA socket 0
> EAL:   probe driver: 8086:1521 net_e1000_igb
> CPU Timer freq is 2600.000000MHz
> NOP: cycles=0.194834 ns=0.074936
> GET_CLOCK: cycles=47.854658 ns=18.405638
> GET_CPU: cycles=30.995892 ns=11.921497
> ZERO_ARG: cycles=236.945113 ns=91.132736
> 
> 
> We have only 16.75ns per packet to process 59.2 Mpps (40Gbps), so IMO the LTTng
> CTF emitter may not fit the DPDK fast path, due to the cost associated with generic LTTng features.
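
The per-packet budget quoted above can be reproduced with a short calculation. This is a sketch under an assumption the RFC does not state: minimum-size 64-byte Ethernet frames, each carrying 20 bytes of wire overhead (preamble, start-of-frame delimiter and inter-frame gap). The function names are hypothetical. With those inputs, a 40 Gbps link works out to roughly 59.5 Mpps and 16.8 ns per packet, close to the figures in the RFC.

```c
/* Packets per second at line rate for a given frame size.
 * Assumption: 20 bytes of per-frame wire overhead
 * (7B preamble + 1B SFD + 12B inter-frame gap). */
static double line_rate_pps(double link_bps, unsigned frame_bytes)
{
    const unsigned wire_overhead_bytes = 20;

    return link_bps / (8.0 * (frame_bytes + wire_overhead_bytes));
}

/* Time available per packet, in nanoseconds, at that rate. */
static double per_packet_budget_ns(double link_bps, unsigned frame_bytes)
{
    return 1e9 / line_rate_pps(link_bps, frame_bytes);
}

/* Number of per-packet budgets consumed by an operation costing cost_ns. */
static double budget_multiple(double cost_ns, double link_bps,
                              unsigned frame_bytes)
{
    return cost_ns / per_packet_budget_ns(link_bps, frame_bytes);
}
```

Calling budget_multiple(91.13, 40e9, 64) with the ZERO_ARG figure from the log above gives a little over five, which means that a single argument-less trace costs more than five packet times at this rate.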
> 
> One option could be to have a native CTF emitter in EAL/DPDK that emits
> the trace into a hugepage. I think it would cost a handful of cycles if we
> limit the features to the requirements above:
> 
> The upside of using the LTTng CTF emitter:
> a) No need to write a new CTF trace emitter (item (c)).
> 
> The downside of using the LTTng CTF emitter (item (c)):
> a) Performance issue (see above).
> b) Lack of Windows OS support. It appears to have basic FreeBSD support.
> c) DPDK library dependency on LTTng for tracing.
> 
> So it is probably good to have a native CTF emitter in DPDK and reuse all the
> open-source trace viewer (babeltrace and TraceCompass) and format (CTF) infrastructure.
> I think it would be the best of both worlds.
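
To illustrate why a native emitter could cost only a handful of cycles, here is a rough sketch of a fast-path emit routine that appends a fixed header plus optional payload to a preallocated buffer. Everything here is hypothetical: the struct layouts, names and return codes are assumptions made for illustration, the record format is not claimed to be CTF-conformant, and the buffer is ordinary memory supplied by the caller rather than a hugepage.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical per-core trace buffer; hugepage-backed in the RFC's
 * proposal, but any caller-provided memory in this sketch. */
struct trace_buf {
    uint8_t *mem;
    size_t   size;
    size_t   offset;   /* invariant: offset <= size */
    uint64_t dropped;  /* events discarded because the buffer was full */
};

/* Hypothetical fixed-size event header (not a CTF-defined layout). */
struct trace_hdr {
    uint64_t timestamp;
    uint32_t event_id;
    uint32_t payload_len;
};

static volatile int trace_enabled;  /* dynamic enable/disable switch */

/* Returns 1 if written, 0 if tracing is disabled, -1 if the buffer is full. */
static inline int
trace_emit(struct trace_buf *tb, uint32_t event_id, uint64_t timestamp,
           const void *payload, uint32_t payload_len)
{
    size_t need = sizeof(struct trace_hdr) + payload_len;
    struct trace_hdr hdr = { timestamp, event_id, payload_len };

    if (!trace_enabled)
        return 0;
    if (need > tb->size - tb->offset) {
        tb->dropped++;
        return -1;
    }
    memcpy(tb->mem + tb->offset, &hdr, sizeof(hdr));
    if (payload_len != 0)
        memcpy(tb->mem + tb->offset + sizeof(hdr), payload, payload_len);
    tb->offset += need;
    return 1;
}
```

With one buffer per core, the disabled path is a single load and branch, and the enabled path is a bounds check plus two small copies, with no locks or system calls. Taking the timestamp as a parameter keeps the sketch independent of any particular clock source.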
> 
> Any thoughts on this subject? Based on the community feedback, I can work on the patch for v20.05.
> 


