[PATCH v5 0/4] net/mlx5: introduce Tx datapath tracing
Raslan Darawsheh
rasland at nvidia.com
Thu Jul 6 18:27:41 CEST 2023
Hi,
> -----Original Message-----
> From: Slava Ovsiienko <viacheslavo at nvidia.com>
> Sent: Wednesday, July 5, 2023 6:31 PM
> To: dev at dpdk.org
> Cc: jerinj at marvell.com; Raslan Darawsheh <rasland at nvidia.com>
> Subject: [PATCH v5 0/4] net/mlx5: introduce Tx datapath tracing
>
> The mlx5 driver provides send scheduling at a specific moment in time, and for
> applications relying on it, it would be extremely useful to have extra
> debug information: when and how packets were scheduled, and when the
> actual sending was completed by the NIC hardware (this helps the application
> track internal delay issues).
>
> Because the DPDK Tx datapath API does not provide for any feedback
> from the driver, and the feature appears to be mlx5-specific, it seems
> reasonable to use the existing DPDK datapath tracing capability.
>
> The expected workflow is:
> - compile the application with tracing enabled
> - run the application with EAL parameters configuring tracing in the mlx5
> Tx datapath
> - store the dump file with the gathered tracing information
> - run the analysis script (in Python) to combine related events (packet
> firing and completion) and view the data in human-readable form
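
To make the last step concrete, here is a minimal Python sketch of the pairing idea: each "fire" event is matched with the later "complete" event carrying the same key, and the timestamp difference gives the delay. The event kinds, the tuple layout and the (port, queue, index) key are hypothetical, chosen only for illustration; they are not taken from mlx5_trace.py or from the actual tracepoint definitions.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical event record: (timestamp_ns, kind, key).
# kind is "fire" or "complete"; key identifies a packet, e.g. (port, queue, index).
Key = Tuple[int, int, int]
Event = Tuple[int, str, Key]


def pair_events(events: Iterable[Event]) -> List[Tuple[Key, int]]:
    """Match each "fire" event with the following "complete" event that
    has the same key; return (key, delay_ns) pairs in completion order."""
    pending: Dict[Key, int] = {}
    delays: List[Tuple[Key, int]] = []
    # Process events in timestamp order so a completion follows its firing.
    for ts, kind, key in sorted(events, key=lambda e: e[0]):
        if kind == "fire":
            pending[key] = ts
        elif kind == "complete" and key in pending:
            delays.append((key, ts - pending.pop(key)))
    return delays


if __name__ == "__main__":
    sample = [
        (1000, "fire", (0, 0, 1)),
        (1200, "fire", (0, 0, 2)),
        (1900, "complete", (0, 0, 1)),
        (2500, "complete", (0, 0, 2)),
    ]
    for key, delay in pair_events(sample):
        print(key, delay, "ns")
```

Completions without a matching firing are silently ignored in this sketch; a real tool would probably want to report them.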
>
> Below are detailed instructions on how to gather all the debug data,
> including full timing information, with an mlx5 NIC.
>
>
> 1. Build DPDK application with enabled datapath tracing
>
> The meson option should be specified:
> -Denable_trace_fp=true
>
> The c_args should be specified:
> -DALLOW_EXPERIMENTAL_API
>
> The DPDK configuration examples:
>
> meson configure --buildtype=debug -Denable_trace_fp=true
> -Dc_args='-DRTE_LIBRTE_MLX5_DEBUG -DRTE_ENABLE_ASSERT
> -DALLOW_EXPERIMENTAL_API' build
>
> meson configure --buildtype=debug -Denable_trace_fp=true
> -Dc_args='-DRTE_ENABLE_ASSERT -DALLOW_EXPERIMENTAL_API' build
>
> meson configure --buildtype=release -Denable_trace_fp=true
> -Dc_args='-DRTE_ENABLE_ASSERT -DALLOW_EXPERIMENTAL_API' build
>
> meson configure --buildtype=release -Denable_trace_fp=true
> -Dc_args='-DALLOW_EXPERIMENTAL_API' build
>
>
> 2. Configuring the NIC
>
> If the send completion timings are important, the NIC should be configured
> to provide real-time timestamps: the REAL_TIME_CLOCK_ENABLE NV setting
> should be set to TRUE, for example with the following command (followed by
> a FW/driver reset):
>
> sudo mlxconfig -d /dev/mst/mt4125_pciconf0 s REAL_TIME_CLOCK_ENABLE=1
>
>
> 3. Run DPDK application to gather the traces
>
> EAL parameters controlling the trace capability at runtime:
>
> --trace=pmd.net.mlx5.tx - a regular expression; tracepoints whose names
> match it are enabled. At least "pmd.net.mlx5.tx"
> must be enabled to gather all the events needed
> to analyze the mlx5 Tx datapath and its timings.
> By default all tracepoints are disabled.
>
> --trace-dir=/var/log - directory where the trace is stored
>
> --trace-bufsz=<val>B|<val>K|<val>M - optional, trace data buffer size
> per thread. The default is 1MB.
>
> --trace-mode=overwrite|discard - optional, selects trace data buffer mode.
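
For anyone launching the application from a script, the options above can be assembled as in the following sketch. Only the four option names come from the text above; the helper name trace_eal_args, its defaults, and the "./my_dpdk_app" binary are hypothetical placeholders.

```python
from typing import List, Optional


def trace_eal_args(pattern: str = "pmd.net.mlx5.tx",
                   trace_dir: str = "/var/log",
                   bufsz: Optional[str] = None,
                   mode: Optional[str] = None) -> List[str]:
    """Build the EAL trace options listed above as an argv fragment.

    bufsz is passed through unchanged (e.g. "2M"); mode must be one of
    the two values named in the description.
    """
    if mode is not None and mode not in ("overwrite", "discard"):
        raise ValueError("mode must be 'overwrite' or 'discard'")
    args = [f"--trace={pattern}", f"--trace-dir={trace_dir}"]
    # The two optional parameters are added only when requested.
    if bufsz is not None:
        args.append(f"--trace-bufsz={bufsz}")
    if mode is not None:
        args.append(f"--trace-mode={mode}")
    return args


if __name__ == "__main__":
    # "./my_dpdk_app" is a placeholder for the real application binary.
    argv = ["./my_dpdk_app"] + trace_eal_args(bufsz="2M", mode="overwrite")
    print(" ".join(argv))
```

The resulting list can be handed to subprocess.run() together with the application's other EAL and application arguments.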
>
>
> 4. Installing or Building Babeltrace2 Package
>
> The gathered trace data can be analyzed with the provided Python script.
> To parse the trace data, the script uses the Babeltrace2 library.
> The package should be either installed or built from source code as shown
> below:
>
> git clone https://github.com/efficios/babeltrace.git
> cd babeltrace
> ./bootstrap
> ./configure -help
> ./configure --disable-api-doc --disable-man-pages
> --disable-python-bindings-doc --enable-python-plugins
> --enable-python-bindings
>
> 5. Running the Analyzing Script
>
> The analyzing script is located in the folder ./drivers/net/mlx5/tools. It requires
> Python 3.6 and the Babeltrace2 package, and takes the trace data file as its
> only parameter. For example:
>
> ./mlx5_trace.py /var/log/rte-2023-01-23-AM-11-52-39
>
>
> 6. Interpreting the Script Output Data
>
> All timings are given in nanoseconds.
> The list of Tx (and, in the future, Rx) bursts per port/queue is presented in the
> output.
> Each list element contains the list of built WQEs with specific opcodes, and
> each WQE contains the list of the encompassed packets to send.
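
The nesting described here (burst, then WQEs, then packets) could be modelled roughly as below. This is only an illustration of the hierarchy: the class names, field names and opcode labels are hypothetical and do not reflect the actual output format or internals of mlx5_trace.py.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Packet:
    length: int  # hypothetical field: packet length in bytes


@dataclass
class Wqe:
    opcode: str  # hypothetical opcode label
    packets: List[Packet] = field(default_factory=list)


@dataclass
class Burst:
    port: int
    queue: int
    start_ns: int  # hypothetical: time the burst was issued, in nanoseconds
    done_ns: int   # hypothetical: time the completion was seen, in nanoseconds
    wqes: List[Wqe] = field(default_factory=list)

    def packet_count(self) -> int:
        """Total number of packets across all WQEs of this burst."""
        return sum(len(w.packets) for w in self.wqes)

    def latency_ns(self) -> int:
        """Time from burst start to completion, in nanoseconds."""
        return self.done_ns - self.start_ns


if __name__ == "__main__":
    burst = Burst(port=0, queue=1, start_ns=1000, done_ns=1750, wqes=[
        Wqe("OPCODE_A", [Packet(64), Packet(128)]),
        Wqe("OPCODE_B", [Packet(64)]),
    ])
    print(burst.packet_count(), "packets,", burst.latency_ns(), "ns")
```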
>
> Signed-off-by: Viacheslav Ovsiienko <viacheslavo at nvidia.com>
>
> --
> v2: - comment addressed: "dump_trace" command is replaced with
> "save_trace"
> - Windows build failure addressed, Windows does not support tracing
>
> v3: - tracepoint routines are moved to the net folder, no need to export
> - documentation added
> - testpmd patches moved out from series to the dedicated patches
>
> v4: - Python comments addressed
> - codestyle issues fixed
>
> v5: - traces are moved to the dedicated files, otherwise registration
> header caused wrong code generation for 3rd party files/objects
> and resulted in performance drop
>
> Viacheslav Ovsiienko (4):
> net/mlx5: introduce tracepoints for mlx5 drivers
> net/mlx5: add comprehensive send completion trace
> net/mlx5: add Tx datapath trace analyzing script
> doc: add mlx5 datapath tracing feature description
>
> doc/guides/nics/mlx5.rst | 78 +++++++
> drivers/net/mlx5/linux/mlx5_verbs.c | 8 +-
> drivers/net/mlx5/meson.build | 1 +
> drivers/net/mlx5/mlx5_devx.c | 8 +-
> drivers/net/mlx5/mlx5_rx.h | 19 --
> drivers/net/mlx5/mlx5_rxtx.h | 19 ++
> drivers/net/mlx5/mlx5_trace.c | 25 +++
> drivers/net/mlx5/mlx5_trace.h | 73 +++++++
> drivers/net/mlx5/mlx5_tx.c | 9 +
> drivers/net/mlx5/mlx5_tx.h | 89 +++++++-
> drivers/net/mlx5/tools/mlx5_trace.py | 307 +++++++++++++++++++++++++++
> 11 files changed, 607 insertions(+), 29 deletions(-)
> create mode 100644 drivers/net/mlx5/mlx5_trace.c
> create mode 100644 drivers/net/mlx5/mlx5_trace.h
> create mode 100755 drivers/net/mlx5/tools/mlx5_trace.py
>
> --
> 2.18.1
Applied the first two patches to next-net-mlx.
The script and doc patches will be considered for RC4.
Kindest regards,
Raslan Darawsheh