[dpdk-dev] [PATCH v3 33/33] doc: add trace library guide

Thomas Monjalon thomas at monjalon.net
Wed Apr 1 16:57:01 CEST 2020
Previous message: [dpdk-dev] [PATCH v2 1/2] vhost: utilize dpdk dynamic memory allocator
Next message: [dpdk-dev] [PATCH v3 33/33] doc: add trace library guide
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Hi,

I did not review the code.
I want to understand the design first,
and I start with the user documentation.

John, Marko, who is going to review this new documentation?
I think we need some help for syntax and phrasing here. Thanks


> +Trace Library
> +=============
> +
> +DPDK provides a tracing library that gives the ability to add tracepoints
> +in application to get runtime trace/debug information for control and fast
> +APIs with minimum impact on fast path performance. Typical trace overhead is
> +~20 cycles and instrumentation overhead is 1 cycle.

Please could you start this doc with explaining what is a tracepoint?

How is it different of a log?

My main question is to understand why it is not implemented as a CTF output
of rte_log. Reminder we have rte_openlog_stream(FILE *).
Is there a way to adapt/extend this API to specify using CTF formatter?
What are the other gaps?

> +
> +Library mainly caters below mentioned use cases:
> +
> +- The DPDK provider will not have access to the DPDK customer applications.

There is no provider/customer.
DPDK is available on dpdk.org and users are not paying.

> +  Inbuilt tracer support will us enable to debug/analyze the slow path and
> +  fast path DPDK API usage.
> +
> +- Provides a low overhead fast path multi-core PMD driver's debugging/analysis
> +  infrastructure to fix the functional and performance issue(s).
> +
> +- Post trace analysis tools can provide various status across the system such
> +  as cpu_idle() using the timestamp added in the trace.
> +
> +Below sections will provide detailed information about:
> +
> + - Trace a user application
> + - View and analyze the recorded events
> +
> +Trace a user application
> +------------------------
> +
> +This section steps you through a simple example to trace an application.
> +A trace can be achieved using below mentioned steps:
> +
> +Define and register a tracepoint
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +The application can define and register tracepoints either existing C file or
> +create a new file (say xyz_app_trace_point.c). Also, all the tracepoints must be
> +resolved before rte_eal_init i.e. tracepoints must be registered as constructor
> +using RTE_INIT interface.

I think you want to explain the design before,
so usage constraints will be better understood.

> +
> +Following are the MACRO definition exposed by the trace Library to define and
> +register a tracepoint.
> +
> +.. code-block:: c
> +
> + #define RTE_TRACE_POINT_DEFINE(tp)\
> +        uint64_t __attribute__((section("__rte_trace_point"))) __##tp
> +
> + #define RTE_TRACE_POINT_REGISTER(trace, name, level)\
> +       __rte_trace_point_register(&__##trace, RTE_STR(name), RTE_LOG_ ## level,\
> +                                  (void (*)(void)) trace)

Showing the macro definitions does not help to understand
neither the concept of define and register, nor the usage.

> +Example tracepoint definition and registration
> +
> +.. code-block:: c
> +
> + RTE_TRACE_POINT_DEFINE(rte_trace_lib_eal_generic_str); /* Definition */

What is a definition for?
Please explain a very simple use case by stating the goal and the output
of the example clearly.

> +
> + RTE_INIT(eal_trace_init)
> + {
> +     /* Registration */
> +     RTE_TRACE_POINT_REGISTER(rte_trace_lib_eal_generic_str,
> +                              lib.eal.generic.str, INFO);
> + }

Here we just see some function calls without understanding what it does.

> +
> +For more details refer trace API documentation.

Yes the details must be in the API documentation.
This guide should explain the design and goals.

> +Defined tracepoint must be exported into corresponding .map file.
> +
> +.. Note::
> +
> +    A tracepoint is defined like __##tp i.e. __rte_trace_lib_eal_generic_str
> +    for above example. Same must be updated into corresponding .map file.

We don't know what is __##tp. That's so cryptic :-)

> +
> +Define trace function to write events
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +After a successful tracepoint registration, the application must define a
> +trace function which solves three purposes:
> +
> + - Calculates the size of the event.
> + - Generate CTF metadata field string for the event.

The format of the trace is the responsibility of the application?

> + - Emit the event to trace memory.
> +
> +A tracepoint can be classified as either a data path or a slow path tracepoint.
> +So based on that, the application must define tracepoint function using one of
> +the mentioned MACRO
> +
> +.. code-block:: c
> +
> + /* Define tracepoint function for slow path */
> + #define RTE_TRACE_POINT(tp, args, ...)\
> +        __RTE_TRACE_POINT(generic, tp, args, __VA_ARGS__)
> +
> + /* Define tracepoint function for data path */
> + #define RTE_TRACE_POINT_DP(tp, args, ...)\
> +        __RTE_TRACE_POINT(dp, tp, args, __VA_ARGS__)

No need to copy/paste the code (which will be outdated at some point).
I think naming the macros is enough.

> +RTE_TRACE_POINT_DP is compiled out by default and can be enabled using
> +CONFIG_RTE_ENABLE_TRACE_DP configuration parameter. Also application can use
> +``rte_trace_is_dp_enabled`` to get current status of RTE_TRACE_POINT_DP.
> +For more details, refer DPDK Trace API documentation.
> +
> +Example tracepoint function definition
> +
> +.. code-block:: c
> +
> + /* Slow path tracepoint */
> + RTE_TRACE_POINT(
> +        rte_trace_lib_eal_generic_str,
> +        RTE_TRACE_POINT_ARGS(const char *str),
> +        rte_trace_ctf_string(str);
> + )
> +
> + /* Data path tracepoint */
> + RTE_TRACE_POINT_DP(
> +        rte_trace_lib_eal_generic_str,
> +        RTE_TRACE_POINT_ARGS(const char *str),
> +        rte_trace_ctf_string(str);
> + )
> +
> +Emit events to trace memory
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +After trace function definition is ready to emit tracepoints.
> +To emit the event application needs to invoke tracepoint function, as defined
> +in the above steps, at the desired location.
> +
> +Below examples emit tracepoints in ``rte_eth_dev_configure`` to print a test
> +string:
> +
> +.. code-block:: c
> +
> + int
> + rte_eth_dev_configure(uint16_t port_id, uint16_t nb_rx_q, uint16_t nb_tx_q,
> +        const struct rte_eth_conf *dev_conf)
> + {
> +        struct rte_eth_dev *dev;
> +        struct rte_eth_dev_info dev_info;
> +        struct rte_eth_conf orig_conf;
> +        int diag;
> +        int ret;
> +
> +        RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -EINVAL);
> +
> +        dev = &rte_eth_devices[port_id];
> +
> +        RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->dev_configure, -ENOTSUP);
> +
> +        ...
> +
> +        rte_trace_lib_eal_generic_str("tp_test_string");
> +        return ret;
> + }

Please do not copy real code. Hello-world style is enough.
We do not want to be disturbed by the context of the code in such guide.

> +
> +Generate CTF formatted metadata
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +As of now emitted events just specify the debug information written by the
> +application but to view/analyze these events must be formatted into Common Trace
> +Format(CTF) so that any CTF compliant trace analysis tool can view those traces.
> +
> +Trace library exposes below API to write events to CTF formatted metadata file.
> +
> +.. code-block:: c
> +
> + int rte_trace_save(void);
> +
> +Currently library invokes this API implicitly during tear down and metadata file
> +is generated at either ``/root/dpdk-traces/rte-yyyy-mm-dd-[AP]M-hh-mm-ss/`` or
> +at location if user has passed during command line(``say /tmp``) then
> +``/tmp/rte-yyyy-mm-dd-[AP]M-hh-mm-ss/``

The trace is written in a file when the application is closing?
How is it working if application crashes?
How can we debug the application in real-time?

> +
> +For more information, refer :doc:`../linux_gsg/linux_eal_parameters` for trace.
> +
> +View and analyze the recorded events
> +------------------------------------
> +Once ``Trace a user application`` is completed, the user can view/inspect the
> +recorded events.
> +
> +There are many tools you can use to read DPDK traces:
> +
> + - ``babeltrace`` is a command-line utility that converts trace formats; it
> +   supports the format that DPDK trace library produces, CTF, as well as a
> +   basic text output that can be grep ed. The babeltrace command is part of the
> +   opensource ``Babeltrace`` project.
> +
> + - ``Trace Compass`` is a graphical user interface for viewing and analyzing any
> +   type of logs or traces, including DPDK traces.
> +
> +.. Note::
> +
> +   This section assumes that the trace library saved the traces, it recorded
> +   during the previous tutorials, to their specified location.
> +
> +
> +Use the babeltrace command-line tool
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +The simplest way to list all the recorded events of a trace is to pass its path
> +to babeltrace with no options::
> +
> +    babeltrace </path-to-trace-events/rte-yyyy-mm-dd-[AP]M-hh-mm-ss/>
> +
> +``babeltrace`` finds all traces recursively within the given path and prints all
> +their events, merging them in chronological order.
> +
> +You can pipe the output of the babeltrace into a tool like grep(1) for further
> +filtering. Below example grep the events for ``ethdev`` only::
> +
> +    babeltrace /tmp/my-dpdk-trace | grep ethdev
> +
> +You can pipe the output of babeltrace into a tool like wc(1) to count the
> +recorded events. Below example count the number of ``ethdev`` events::
> +
> +    babeltrace /tmp/my-dpdk-trace | grep ethdev | wc --lines
> +
> +Use the tracecompass GUI tool
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +``Tracecompass`` is another tool to view/analyze the DPDK traces which gives
> +a graphical view of events. Like ``babeltrace``, tracecompass also provides
> +an interface to search for a particular event. To use ``tracecompass``, following are
> +the minimum required steps:
> +
> + - Install ``tracecompass`` to the localhost. Variants are available for Linux,
> +   Windows, and OS-X.
> + - Launch ``tracecompass`` which will open a graphical window with trace
> +   management interfaces.
> + - Open a trace using ``File->Open Trace`` option and select metadata file
> +   which is to be viewed/analyzed.
> +
> +For more details, refer `Trace Compass <https://www.eclipse.org/tracecompass/>`_

I don't think we need to explain external tools here. They can evolve.
Giving their names and URL is enough.
Maybe just with a short description (text vs graphical).

> +
> +Core Concepts
> +-------------

I think the concepts must be explained at the beginning.

> +As DPDK trace library is designed to generate traces that uses Common Trace
> +Format(CTF). CTF specification consist of following units to create a trace.
> +
> + - ``Stream`` Sequence of packets.
> + - ``Packet`` Header and one or more events.
> + - ``Event`` Header and payload.
> +
> +For detailed information, refer `Common Trace Format <https://diamon.org/ctf/>`_
> +
> +Channel and trace memory
> +~~~~~~~~~~~~~~~~~~~~~~~~
> +A channel is an object which is responsible for holding the trace memory.
> +The trace library creates the trace memory per thread to enable the lock-less
> +scheme to emit the event. When a DPDK tracer emits an event, it will be recorded
> +to the trace buffers that associated with that thread.
> +
> +Event record mode
> +~~~~~~~~~~~~~~~~~
> +Event record mode is an attribute of trace buffers. Trace library exposes two
> +modes:
> +
> + - ``Overwrite`` This mode enables trace buffers to wrap around when trace buffer memory is full.
> + - ``Discard`` This mode enables trace buffers to discard when trace buffer memory is full.
> +
> +This mode can be enabled/disabled either using eal command line parameters or
> +DPDK trace library API to configure the mode.
> +Refer :doc:`../linux_gsg/linux_eal_parameters` and trace API documentation more
> +details.
> +
> +Metadata
> +~~~~~~~~
> +Metadata defines the layout of event records so that trace analysis tool can
> +read the streams and show into the relevant format.

I don't understand metadata from the explanation above.
Is it useful to understand when using this library?
What is the impact for the user?

> +For more details, refer `Common Trace Format <https://diamon.org/ctf/>`_.


Thanks for the work.
Please let's improve the doc while discussing the design.
Previous message: [dpdk-dev] [PATCH v2 1/2] vhost: utilize dpdk dynamic memory allocator
Next message: [dpdk-dev] [PATCH v3 33/33] doc: add trace library guide
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
More information about the dev mailing list