[PATCH v6 2/4] examples/ptp_tap_relay_sw: add PTP software transparent clock relay
Rajesh Kumar
rajesh3.kumar at intel.com
Thu May 7 12:13:12 CEST 2026
Add a new example application demonstrating a software PTP Transparent
Clock relay between a DPDK-bound physical NIC and a Linux kernel TAP
virtual interface.
The relay uses software timestamps (CLOCK_MONOTONIC) to measure residence
time and accumulates it into the PTP correctionField per IEEE 1588-2019
§10.2, enabling synchronized time distribution via standard linuxptp
(ptp4l) on both sides.
Features:
- Handles L2, VLAN/QinQ, and UDP/IPv4/IPv6 PTP encapsulations
- Supports PTP v2 event messages (Sync, Delay_Req, PDelay_Req, PDelay_Resp)
- Two-pass burst processing: classify then timestamp immediately before TX
- Unmodified Linux kernel and stock DPDK (no kernel patches required)
- Bidirectional relay: PHY ↔ TAP
Includes:
- ptp_tap_relay_sw.c: Main relay logic with burst processing
- ptp_parse.h: Local DPI parser for PTP classification (not a library API)
- Sample app guide with topology, command-line options, and example output
Uses lib/net/rte_ptp.h inline helpers for correctionField manipulation
and header parsing.
Signed-off-by: Rajesh Kumar <rajesh3.kumar at intel.com>
---
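Note for reviewers: the correctionField update described above can be
sketched in plain C as below. This is a minimal illustration and not the
code of rte_ptp_add_correction(). It assumes only the IEEE 1588 common
header layout, in which correctionField is a signed 64-bit big-endian
value at byte offset 8, expressed in nanoseconds multiplied by 2^16. The
sketch_* names and PTP_CORRECTION_OFFSET are hypothetical and do not
appear in this patch.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Assumed byte offset of correctionField in the PTPv2 common header. */
#define PTP_CORRECTION_OFFSET 8

/* Monotonic clock in nanoseconds, same source the relay uses for rx_ts/tx_ts. */
static uint64_t
sketch_now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
}

/* Read the 64-bit big-endian correctionField (scaled ns, i.e. ns * 2^16). */
static int64_t
sketch_read_correction(const uint8_t *ptp_hdr)
{
	uint64_t v = 0;
	int i;

	assert(ptp_hdr != NULL);
	for (i = 0; i < 8; i++)
		v = (v << 8) | ptp_hdr[PTP_CORRECTION_OFFSET + i];
	return (int64_t)v;
}

/* Add a residence time in nanoseconds to correctionField, in place. */
static void
sketch_add_correction(uint8_t *ptp_hdr, int64_t residence_ns)
{
	uint64_t v;
	int i;

	/* Unsigned arithmetic avoids signed-overflow undefined behaviour. */
	v = (uint64_t)sketch_read_correction(ptp_hdr) +
	    (uint64_t)residence_ns * 65536ULL;
	for (i = 7; i >= 0; i--) {
		ptp_hdr[PTP_CORRECTION_OFFSET + i] = (uint8_t)(v & 0xFF);
		v >>= 8;
	}
}
```

In the relay, the value passed as residence_ns would be the difference
between two sketch_now_ns()-style readings taken after RX and just
before TX.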
doc/guides/sample_app_ug/ptp_tap_relay_sw.rst | 212 +++++++++
examples/ptp_tap_relay_sw/Makefile | 41 ++
examples/ptp_tap_relay_sw/meson.build | 13 +
examples/ptp_tap_relay_sw/ptp_parse.h | 211 +++++++++
examples/ptp_tap_relay_sw/ptp_tap_relay_sw.c | 432 ++++++++++++++++++
5 files changed, 909 insertions(+)
create mode 100644 doc/guides/sample_app_ug/ptp_tap_relay_sw.rst
create mode 100644 examples/ptp_tap_relay_sw/Makefile
create mode 100644 examples/ptp_tap_relay_sw/meson.build
create mode 100644 examples/ptp_tap_relay_sw/ptp_parse.h
create mode 100644 examples/ptp_tap_relay_sw/ptp_tap_relay_sw.c
diff --git a/doc/guides/sample_app_ug/ptp_tap_relay_sw.rst b/doc/guides/sample_app_ug/ptp_tap_relay_sw.rst
new file mode 100644
index 0000000000..15727383c1
--- /dev/null
+++ b/doc/guides/sample_app_ug/ptp_tap_relay_sw.rst
@@ -0,0 +1,212 @@
+.. SPDX-License-Identifier: BSD-3-Clause
+ Copyright(c) 2026 Intel Corporation.
+
+PTP Software Relay Sample Application
+======================================
+
+The PTP Software Relay sample application demonstrates how to build a
+minimal PTP Transparent Clock relay between a DPDK-bound physical NIC
+and a kernel TAP interface using **software timestamps only**. It uses
+the PTP definitions from ``rte_ptp.h`` (in ``lib/net/``) together with a
+local packet parser.
+
+The application works with an unmodified Linux kernel and stock DPDK.
+
+For background on PTP see:
+`Precision Time Protocol
+<https://en.wikipedia.org/wiki/Precision_Time_Protocol>`_.
+
+
+Limitations
+-----------
+
+* Tested with L2 PTP (EtherType 0x88F7) on the wire.
+ The local parser also classifies VLAN/QinQ and UDP/IPv4/IPv6.
+* Only PTP v2 messages are processed.
+* Software timestamps have microsecond-class jitter; sub-microsecond
+ precision depends on system load and NIC-to-TAP forwarding latency.
+* The PTP time transmitter must be reachable on the physical NIC's L2 network.
+* Only one physical port and one TAP port are supported.
+
+
+How the Application Works
+-------------------------
+
+Topology
+~~~~~~~~
+
+::
+
+ PTP Time Transmitter Physical NIC TAP (kernel)
+ (ptp4l -H) ──L2── (DPDK vfio-pci) ────── dtap0
+ │ │
+ ptp_tap_relay_sw ptp4l -S
+ (correctionField += (SW timestamps,
+ residence time) adjusts CLOCK_REALTIME)
+
+The relay sits between a DPDK-owned physical NIC and a kernel TAP
+virtual interface. ``ptp4l`` runs on the TAP interface in software
+timestamp mode (``-S``) as a PTP time receiver.
+
+Packet Flow
+~~~~~~~~~~~
+
+1. The physical NIC receives PTP (and non-PTP) packets via DPDK RX.
+2. A software RX timestamp is recorded using
+ ``clock_gettime(CLOCK_MONOTONIC)``.
+3. Each packet is parsed to locate the PTP header.
+4. For PTP **event** messages (Sync, Delay_Req, PDelay_Req, PDelay_Resp),
+ a TX software timestamp is taken just before transmission.
+5. The residence time (``tx_ts − rx_ts``) is added to the PTP
+ ``correctionField`` via ``rte_ptp_add_correction()`` — standard
+ IEEE 1588-2019 Transparent Clock behaviour (§10.2).
+6. Packets are forwarded bidirectionally:
+
+ * PHY → TAP (network → ptp4l)
+ * TAP → PHY (ptp4l → network)
+
+A two-pass design is used: first all packets are classified and PTP
+header pointers saved, then a single TX timestamp is taken immediately
+before applying corrections and calling ``rte_eth_tx_burst()``.
+This minimises the gap between the measured timestamp and the actual
+wire egress.
+
+
+Compiling the Application
+-------------------------
+
+To compile the sample application see :doc:`compiling`.
+
+The application is located in the ``ptp_tap_relay_sw`` sub-directory.
+
+.. note::
+
+ The application uses ``rte_ptp.h`` from ``lib/net/`` (built by default)
+ and a local ``ptp_parse.h`` header for packet classification.
+
+
+Running the Application
+-----------------------
+
+Prerequisites
+~~~~~~~~~~~~~
+
+* A PTP-capable physical NIC bound to DPDK (e.g. via ``vfio-pci``).
+* ``linuxptp`` (``ptp4l``) installed on the system.
+* A PTP time transmitter reachable on the same L2 network.
+
+Start the relay
+~~~~~~~~~~~~~~~~
+
+.. code-block:: console
+
+ ./<build_dir>/examples/dpdk-ptp_tap_relay_sw \
+ -l 18-19 -a 0000:cc:00.1 --vdev=net_tap0,iface=dtap0 -- \
+ -p 0 -t 1 -T 10
+
+Command-line Options
+~~~~~~~~~~~~~~~~~~~~
+
+* ``-p PORT`` — Physical NIC port ID (default: 0).
+* ``-t PORT`` — TAP port ID (default: 1).
+* ``-T SECS`` — Statistics print interval in seconds (default: 10).
+
+Start PTP time transmitter
+~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+On a separate terminal or remote host, start ``ptp4l`` as time
+transmitter with hardware timestamps on the physical NIC:
+
+.. code-block:: console
+
+ ptp4l -i <iface> -m -2 -H --serverOnly=1 \
+ --logSyncInterval=-4 --logMinDelayReqInterval=-4
+
+Start PTP time receiver
+~~~~~~~~~~~~~~~~~~~~~~~
+
+On the TAP interface, start ``ptp4l`` in software timestamp mode:
+
+.. code-block:: console
+
+ ptp4l -i dtap0 -m -2 -s -S \
+ --delay_filter=moving_median --delay_filter_length=10
+
+The time receiver will enter UNCALIBRATED state for approximately 60
+seconds while the PI servo estimates the frequency offset, then step
+the clock and enter time-receiver (synchronized) state.
+Steady-state RMS offset of 500–1000 ns is typical on a lightly loaded
+system with a hardware-timestamped time transmitter.
+
+Example Output
+~~~~~~~~~~~~~~
+
+Relay statistics printed every ``-T`` seconds:
+
+::
+
+ [PTP-SW] === Statistics ===
+ [PTP-SW] PHY RX total: 5646
+ [PTP-SW] PHY RX PTP: 5598
+ [PTP-SW] TAP TX: 5646
+ [PTP-SW] TAP RX total: 1800
+ [PTP-SW] TAP RX PTP: 1788
+ [PTP-SW] PHY TX: 1800
+ [PTP-SW] Corrections: 3635
+
+Time receiver ``ptp4l`` output after convergence:
+
+::
+
+ ptp4l[451534.520]: rms 630 max 1166 freq -44365 +/- 100 delay 37668 +/- 71
+ ptp4l[451539.525]: rms 602 max 1177 freq -44339 +/- 119 delay 37517 +/- 43
+ ptp4l[451544.530]: rms 535 max 1194 freq -44345 +/- 103 delay 37410 +/- 81
+
+
+Code Explanation
+----------------
+
+The following sections explain the main components of the application.
+
+Relay Burst Function
+~~~~~~~~~~~~~~~~~~~~
+
+The core relay logic is in ``relay_burst()``, which handles one direction
+(PHY→TAP or TAP→PHY) per call:
+
+**Pass 1 — Classify:**
+
+For each received packet, ``ptp_hdr_find()`` locates the PTP header
+(if present). For event messages, the header pointer is saved for the
+second pass.
+
+**Pass 2 — Timestamp and correct:**
+
+A single software TX timestamp is taken via
+``clock_gettime(CLOCK_MONOTONIC)``. The residence time
+(``tx_ts − rx_ts``) is added to each saved PTP header's
+``correctionField`` using ``rte_ptp_add_correction()``.
+The burst is then transmitted with ``rte_eth_tx_burst()``.
+
+Main Loop
+~~~~~~~~~
+
+The ``relay_loop()`` function polls both directions in a tight loop:
+
+.. code-block:: c
+
+ while (!force_quit) {
+ relay_burst(phy_port, tap_port, ...); /* PHY → TAP */
+ relay_burst(tap_port, phy_port, ...); /* TAP → PHY */
+ }
+
+Statistics are printed at the interval specified by ``-T``.
+
+Timestamp Source
+~~~~~~~~~~~~~~~~
+
+``CLOCK_MONOTONIC`` is used rather than ``CLOCK_REALTIME`` because
+the PTP time receiver's servo continuously adjusts ``CLOCK_REALTIME``.
+A clock step would corrupt any residence time measured across it.
+``CLOCK_MONOTONIC`` is never stepped, although on Linux it is still
+frequency-slewed. It is portable across Linux and FreeBSD.
diff --git a/examples/ptp_tap_relay_sw/Makefile b/examples/ptp_tap_relay_sw/Makefile
new file mode 100644
index 0000000000..fd178f46ae
--- /dev/null
+++ b/examples/ptp_tap_relay_sw/Makefile
@@ -0,0 +1,41 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2026 Intel Corporation
+
+# binary name
+APP = dpdk-ptp_tap_relay_sw
+
+# all source are stored in SRCS-y
+SRCS-y := ptp_tap_relay_sw.c
+
+PKGCONF ?= pkg-config
+
+# Build using pkg-config variables if possible
+ifneq ($(shell $(PKGCONF) --exists libdpdk && echo 0),0)
+$(error "no installation of DPDK found")
+endif
+
+all: shared
+.PHONY: shared static
+shared: build/$(APP)-shared
+ ln -sf $(APP)-shared build/$(APP)
+static: build/$(APP)-static
+ ln -sf $(APP)-static build/$(APP)
+
+PC_FILE := $(shell $(PKGCONF) --path libdpdk 2>/dev/null)
+CFLAGS += -O3 $(shell $(PKGCONF) --cflags libdpdk)
+LDFLAGS_SHARED = $(shell $(PKGCONF) --libs libdpdk)
+LDFLAGS_STATIC = $(shell $(PKGCONF) --static --libs libdpdk)
+
+build/$(APP)-shared: $(SRCS-y) Makefile $(PC_FILE) | build
+ $(CC) $(CFLAGS) $(SRCS-y) -o $@ $(LDFLAGS) $(LDFLAGS_SHARED)
+
+build/$(APP)-static: $(SRCS-y) Makefile $(PC_FILE) | build
+ $(CC) $(CFLAGS) $(SRCS-y) -o $@ $(LDFLAGS) $(LDFLAGS_STATIC)
+
+build:
+ @mkdir -p $@
+
+.PHONY: clean
+clean:
+ rm -f build/$(APP) build/$(APP)-static build/$(APP)-shared
+ test -d build && rmdir -p build || true
diff --git a/examples/ptp_tap_relay_sw/meson.build b/examples/ptp_tap_relay_sw/meson.build
new file mode 100644
index 0000000000..34a4d86439
--- /dev/null
+++ b/examples/ptp_tap_relay_sw/meson.build
@@ -0,0 +1,13 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2026 Intel Corporation
+
+# meson file, for building this example as part of a main DPDK build.
+#
+# To build this example as a standalone application with an already-installed
+# DPDK instance, use 'make'
+
+sources = files(
+ 'ptp_tap_relay_sw.c',
+)
+deps += ['net']
+cflags += no_shadow_cflag
diff --git a/examples/ptp_tap_relay_sw/ptp_parse.h b/examples/ptp_tap_relay_sw/ptp_parse.h
new file mode 100644
index 0000000000..db0dcfe5c1
--- /dev/null
+++ b/examples/ptp_tap_relay_sw/ptp_parse.h
@@ -0,0 +1,211 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2026 Intel Corporation
+ *
+ * PTP packet parser — locates PTP headers through L2, VLAN, and UDP
+ * encapsulations. This is a DPI helper for use within example
+ * applications; it does not belong in the core library.
+ */
+
+#ifndef _PTP_PARSE_H_
+#define _PTP_PARSE_H_
+
+#include <rte_mbuf.h>
+#include <rte_ether.h>
+#include <rte_ip.h>
+#include <rte_udp.h>
+#include <rte_ptp.h>
+
+/** Not a PTP packet. */
+#define PTP_MSGTYPE_INVALID (-1)
+
+/**
+ * Locate the PTP header within a packet.
+ *
+ * Handles L2 (EtherType 0x88F7), VLAN-tagged L2 (single/double,
+ * TPIDs 0x8100/0x88A8), PTP over UDP/IPv4, PTP over UDP/IPv6,
+ * and VLAN-tagged UDP variants.
+ *
+ * @param m
+ * Pointer to the mbuf.
+ * @return
+ * Pointer to the PTP header, or NULL if not a PTP packet.
+ */
+static inline struct rte_ptp_hdr *
+ptp_hdr_find(const struct rte_mbuf *m)
+{
+ const struct rte_ether_hdr *eth;
+ uint16_t ether_type;
+ uint32_t offset;
+
+ if (rte_pktmbuf_data_len(m) < sizeof(struct rte_ether_hdr))
+ return NULL;
+
+ eth = rte_pktmbuf_mtod(m, const struct rte_ether_hdr *);
+ ether_type = rte_be_to_cpu_16(eth->ether_type);
+ offset = sizeof(struct rte_ether_hdr);
+
+ /* Strip VLAN / QinQ tags */
+ if (ether_type == RTE_ETHER_TYPE_VLAN ||
+ ether_type == RTE_ETHER_TYPE_QINQ) {
+ if (rte_pktmbuf_data_len(m) < offset + sizeof(struct rte_vlan_hdr))
+ return NULL;
+ const struct rte_vlan_hdr *vlan =
+ rte_pktmbuf_mtod_offset(m,
+ const struct rte_vlan_hdr *, offset);
+ ether_type = rte_be_to_cpu_16(vlan->eth_proto);
+ offset += sizeof(struct rte_vlan_hdr);
+
+ /* Second tag (QinQ inner or stacked VLAN) */
+ if (ether_type == RTE_ETHER_TYPE_VLAN ||
+ ether_type == RTE_ETHER_TYPE_QINQ) {
+ if (rte_pktmbuf_data_len(m) <
+ offset + sizeof(struct rte_vlan_hdr))
+ return NULL;
+ vlan = rte_pktmbuf_mtod_offset(m,
+ const struct rte_vlan_hdr *, offset);
+ ether_type = rte_be_to_cpu_16(vlan->eth_proto);
+ offset += sizeof(struct rte_vlan_hdr);
+ }
+ }
+
+ /* L2 PTP: EtherType 0x88F7 */
+ if (ether_type == RTE_ETHER_TYPE_1588) {
+ if (rte_pktmbuf_data_len(m) < offset + sizeof(struct rte_ptp_hdr))
+ return NULL;
+ return rte_pktmbuf_mtod_offset(m,
+ struct rte_ptp_hdr *, offset);
+ }
+
+ /* PTP over UDP/IPv4 */
+ if (ether_type == RTE_ETHER_TYPE_IPV4) {
+ const struct rte_ipv4_hdr *iph;
+ uint16_t ihl;
+
+ if (rte_pktmbuf_data_len(m) < offset + sizeof(struct rte_ipv4_hdr))
+ return NULL;
+
+ iph = rte_pktmbuf_mtod_offset(m,
+ const struct rte_ipv4_hdr *, offset);
+ if (iph->next_proto_id != IPPROTO_UDP)
+ return NULL;
+
+ ihl = (iph->version_ihl & 0x0F) * 4;
+ if (ihl < 20)
+ return NULL;
+ offset += ihl;
+
+ if (rte_pktmbuf_data_len(m) < offset + sizeof(struct rte_udp_hdr))
+ return NULL;
+
+ const struct rte_udp_hdr *udp =
+ rte_pktmbuf_mtod_offset(m,
+ const struct rte_udp_hdr *, offset);
+ uint16_t dst_port = rte_be_to_cpu_16(udp->dst_port);
+
+ if (dst_port != RTE_PTP_EVENT_PORT &&
+ dst_port != RTE_PTP_GENERAL_PORT)
+ return NULL;
+
+ offset += sizeof(struct rte_udp_hdr);
+ if (rte_pktmbuf_data_len(m) < offset + sizeof(struct rte_ptp_hdr))
+ return NULL;
+
+ return rte_pktmbuf_mtod_offset(m,
+ struct rte_ptp_hdr *, offset);
+ }
+
+ /* PTP over UDP/IPv6 */
+ if (ether_type == RTE_ETHER_TYPE_IPV6) {
+ const struct rte_ipv6_hdr *ip6h;
+
+ if (rte_pktmbuf_data_len(m) <
+ offset + sizeof(struct rte_ipv6_hdr))
+ return NULL;
+
+ ip6h = rte_pktmbuf_mtod_offset(m,
+ const struct rte_ipv6_hdr *, offset);
+ if (ip6h->proto != IPPROTO_UDP)
+ return NULL;
+
+ offset += sizeof(struct rte_ipv6_hdr);
+
+ if (rte_pktmbuf_data_len(m) < offset + sizeof(struct rte_udp_hdr))
+ return NULL;
+
+ const struct rte_udp_hdr *udp =
+ rte_pktmbuf_mtod_offset(m,
+ const struct rte_udp_hdr *, offset);
+ uint16_t dst_port = rte_be_to_cpu_16(udp->dst_port);
+
+ if (dst_port != RTE_PTP_EVENT_PORT &&
+ dst_port != RTE_PTP_GENERAL_PORT)
+ return NULL;
+
+ offset += sizeof(struct rte_udp_hdr);
+ if (rte_pktmbuf_data_len(m) < offset + sizeof(struct rte_ptp_hdr))
+ return NULL;
+
+ return rte_pktmbuf_mtod_offset(m,
+ struct rte_ptp_hdr *, offset);
+ }
+
+ return NULL;
+}
+
+/**
+ * Classify a packet as PTP and return the message type.
+ *
+ * @param m
+ * Pointer to the mbuf to classify.
+ * @return
+ * PTP message type (0x0-0xF) on success, PTP_MSGTYPE_INVALID (-1)
+ * if the packet is not PTP.
+ */
+static inline int
+ptp_classify(const struct rte_mbuf *m)
+{
+ struct rte_ptp_hdr *hdr = ptp_hdr_find(m);
+
+ if (hdr == NULL)
+ return PTP_MSGTYPE_INVALID;
+
+ return rte_ptp_msg_type(hdr);
+}
+
+/** PTP message type name table. */
+static const char * const ptp_msg_names[] = {
+ [RTE_PTP_MSGTYPE_SYNC] = "Sync",
+ [RTE_PTP_MSGTYPE_DELAY_REQ] = "Delay_Req",
+ [RTE_PTP_MSGTYPE_PDELAY_REQ] = "PDelay_Req",
+ [RTE_PTP_MSGTYPE_PDELAY_RESP] = "PDelay_Resp",
+ [0x4] = "Reserved_4",
+ [0x5] = "Reserved_5",
+ [0x6] = "Reserved_6",
+ [0x7] = "Reserved_7",
+ [RTE_PTP_MSGTYPE_FOLLOW_UP] = "Follow_Up",
+ [RTE_PTP_MSGTYPE_DELAY_RESP] = "Delay_Resp",
+ [RTE_PTP_MSGTYPE_PDELAY_RESP_FU] = "PDelay_Resp_Follow_Up",
+ [RTE_PTP_MSGTYPE_ANNOUNCE] = "Announce",
+ [RTE_PTP_MSGTYPE_SIGNALING] = "Signaling",
+ [RTE_PTP_MSGTYPE_MANAGEMENT] = "Management",
+ [0xE] = "Reserved_E",
+ [0xF] = "Reserved_F",
+};
+
+/**
+ * Get a human-readable name for a PTP message type.
+ *
+ * @param msg_type
+ * PTP message type (0x0-0xF or PTP_MSGTYPE_INVALID).
+ * @return
+ * Static string with the message type name.
+ */
+static inline const char *
+ptp_msg_type_str(int msg_type)
+{
+ if (msg_type < 0 || msg_type > 0xF)
+ return "Not_PTP";
+ return ptp_msg_names[msg_type];
+}
+
+#endif /* _PTP_PARSE_H_ */
diff --git a/examples/ptp_tap_relay_sw/ptp_tap_relay_sw.c b/examples/ptp_tap_relay_sw/ptp_tap_relay_sw.c
new file mode 100644
index 0000000000..998df2ac3b
--- /dev/null
+++ b/examples/ptp_tap_relay_sw/ptp_tap_relay_sw.c
@@ -0,0 +1,432 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2026 Intel Corporation
+ */
+
+/*
+ * PTP Software Relay
+ *
+ * A minimal PTP relay between a DPDK-bound physical NIC and a kernel
+ * TAP interface using software timestamps only.
+ *
+ * How it works:
+ * 1. Physical NIC receives PTP (and non-PTP) packets via DPDK RX.
+ * 2. For PTP event messages (Sync, Delay_Req, PDelay_Req, PDelay_Resp)
+ * the relay records an RX software timestamp (clock_gettime).
+ * 3. Just before TX on the other side it records a TX software timestamp.
+ * 4. The relay residence time (tx_ts − rx_ts) is added to the PTP
+ * correctionField via rte_ptp_add_correction() — standard
+ * Transparent Clock behaviour (IEEE 1588-2019 §10.2).
+ * 5. Packets are forwarded bi-directionally:
+ * PHY → TAP (network → ptp4l)
+ * TAP → PHY (ptp4l → network)
+ *
+ * ptp4l runs in software-timestamping mode on the TAP interface:
+ *
+ * ptp4l -i dtap0 -m -s -S # -S = software timestamps
+ *
+ * Topology:
+ *
+ * Time Transmitter (remote) ──L2── Physical NIC (DPDK)
+ * │
+ * PTP SW Relay ← correctionField update
+ * │
+ * TAP (kernel) ── ptp4l -S (time receiver)
+ *
+ * Usage:
+ * dpdk-ptp_tap_relay_sw -l 0-1 --vdev=net_tap0,iface=dtap0 -- \
+ * -p 0 -t 1
+ *
+ * Parameters:
+ * -p PORT Physical NIC port ID (default: 0)
+ * -t PORT TAP port ID (default: 1)
+ * -T SECS Stats print interval in seconds (default: 10)
+ */
+
+#include <stdlib.h>
+#include <string.h>
+#include <inttypes.h>
+#include <stdbool.h>
+#include <signal.h>
+#include <getopt.h>
+#include <time.h>
+
+#include <rte_eal.h>
+#include <rte_ethdev.h>
+#include <rte_mbuf.h>
+#include <rte_cycles.h>
+#include <rte_lcore.h>
+
+#include "ptp_parse.h"
+
+/* Ring sizes */
+#define RX_RING_SIZE 1024
+#define TX_RING_SIZE 1024
+
+/* Mempool */
+#define NUM_MBUFS 8191
+#define MBUF_CACHE 250
+#define BURST_SIZE 32
+
+#define NSEC_PER_SEC 1000000000ULL
+
+/* Logging helpers */
+#define LOG_INFO(fmt, ...) \
+ fprintf(stdout, "[PTP-SW] " fmt "\n", ##__VA_ARGS__)
+#define LOG_ERR(fmt, ...) \
+ fprintf(stderr, "[PTP-SW ERROR] " fmt "\n", ##__VA_ARGS__)
+
+static volatile bool force_quit;
+
+/* Port IDs */
+static uint16_t phy_port;
+static uint16_t tap_port = 1;
+static unsigned int stats_interval = 10; /* seconds */
+
+/* Statistics */
+static struct {
+ uint64_t phy_rx; /* total packets from PHY */
+ uint64_t phy_rx_ptp; /* PTP packets from PHY */
+ uint64_t tap_tx; /* packets forwarded to TAP */
+ uint64_t tap_rx; /* total packets from TAP */
+ uint64_t tap_rx_ptp; /* PTP packets from TAP */
+ uint64_t phy_tx; /* packets forwarded to PHY */
+ uint64_t corrections; /* correctionField updates */
+} stats;
+
+static void
+signal_handler(int signum)
+{
+ if (signum == SIGINT || signum == SIGTERM) {
+ /* Only set the flag here: stdio is not async-signal-safe. */
+ force_quit = true;
+ }
+}
+
+/* Helpers */
+
+/* Read monotonic clock in nanoseconds (for residence time). */
+static inline uint64_t
+sw_timestamp_ns(void)
+{
+ struct timespec ts;
+
+ clock_gettime(CLOCK_MONOTONIC, &ts);
+ return (uint64_t)ts.tv_sec * NSEC_PER_SEC + (uint64_t)ts.tv_nsec;
+}
+
+/* Port Init */
+
+static int
+port_init(uint16_t port, struct rte_mempool *mp)
+{
+ struct rte_eth_conf port_conf;
+ struct rte_eth_dev_info dev_info;
+ uint16_t nb_rxd = RX_RING_SIZE;
+ uint16_t nb_txd = TX_RING_SIZE;
+ int ret;
+
+ memset(&port_conf, 0, sizeof(port_conf));
+
+ ret = rte_eth_dev_info_get(port, &dev_info);
+ if (ret != 0) {
+ LOG_ERR("rte_eth_dev_info_get(port %u) failed: %d", port, ret);
+ return ret;
+ }
+
+ if (dev_info.tx_offload_capa & RTE_ETH_TX_OFFLOAD_MBUF_FAST_FREE)
+ port_conf.txmode.offloads |=
+ RTE_ETH_TX_OFFLOAD_MBUF_FAST_FREE;
+
+ ret = rte_eth_dev_configure(port, 1, 1, &port_conf);
+ if (ret != 0)
+ return ret;
+
+ ret = rte_eth_dev_adjust_nb_rx_tx_desc(port, &nb_rxd, &nb_txd);
+ if (ret != 0)
+ return ret;
+
+ ret = rte_eth_rx_queue_setup(port, 0, nb_rxd,
+ rte_eth_dev_socket_id(port), NULL, mp);
+ if (ret < 0)
+ return ret;
+
+ ret = rte_eth_tx_queue_setup(port, 0, nb_txd,
+ rte_eth_dev_socket_id(port), NULL);
+ if (ret < 0)
+ return ret;
+
+ ret = rte_eth_dev_start(port);
+ if (ret < 0)
+ return ret;
+
+ ret = rte_eth_promiscuous_enable(port);
+ if (ret != 0) {
+ LOG_ERR("Failed to enable promiscuous on port %u: %s",
+ port, rte_strerror(-ret));
+ return ret;
+ }
+
+ return 0;
+}
+
+/* Relay one direction */
+
+/*
+ * Forward packets from src_port to dst_port.
+ * For PTP event messages, record SW timestamps around the
+ * relay path and add the residence time to the correctionField.
+ *
+ * This implements a Transparent Clock (IEEE 1588-2019 §10.2):
+ * correctionField += (t_egress − t_ingress)
+ *
+ * Note: a single rx_ts / tx_ts pair is used for the entire burst.
+ * At typical PTP rates (logSyncInterval >= -4, i.e. <= 16 pkt/s)
+ * a burst rarely holds more than one PTP event message, so the error
+ * is small. In larger bursts every packet is under-corrected, since it
+ * arrived before rx_ts and leaves after tx_ts; early arrivals more so.
+ */
+static void
+relay_burst(uint16_t src_port, uint16_t dst_port,
+ uint64_t *rx_cnt, uint64_t *rx_ptp_cnt,
+ uint64_t *tx_cnt, uint64_t *corr_cnt)
+{
+ struct rte_mbuf *bufs[BURST_SIZE];
+ struct rte_ptp_hdr *ptp_hdrs[BURST_SIZE];
+ uint64_t rx_ts;
+ uint16_t nb_rx, nb_tx, i;
+
+ nb_rx = rte_eth_rx_burst(src_port, 0, bufs, BURST_SIZE);
+ if (nb_rx == 0)
+ return;
+
+ /* Record a single RX software timestamp for the whole burst.
+ * All packets in one burst arrived at essentially the same instant
+ * from rte_eth_rx_burst()'s perspective.
+ */
+ rx_ts = sw_timestamp_ns();
+
+ *rx_cnt += nb_rx;
+
+ /*
+ * Pass 1: Parse each packet once and remember PTP event headers.
+ * This avoids taking the TX timestamp too early — we want it as
+ * close to the actual rte_eth_tx_burst() call as possible.
+ */
+ memset(ptp_hdrs, 0, sizeof(ptp_hdrs[0]) * nb_rx);
+ for (i = 0; i < nb_rx; i++) {
+ struct rte_ptp_hdr *hdr = ptp_hdr_find(bufs[i]);
+
+ if (hdr == NULL)
+ continue;
+
+ (*rx_ptp_cnt)++;
+
+ /* Only event messages carry timestamps that need correction */
+ if (!rte_ptp_is_event(rte_ptp_msg_type(hdr)))
+ continue;
+
+ ptp_hdrs[i] = hdr;
+ }
+
+ /*
+ * Pass 2: Take a single TX timestamp right before transmission.
+ * This minimises the gap between the measured tx_ts and the
+ * actual kernel write inside rte_eth_tx_burst(), giving the
+ * most accurate residence time we can achieve with SW timestamps.
+ *
+ * residence_time = tx_ts − rx_ts
+ *
+ * Remaining untracked delays:
+ * - Pre-RX: NIC DMA → rx_burst return (~1-5 µs, unavoidable)
+ * - Post-TX: tx_ts → kernel TAP write (~1-2 µs)
+ * Both are symmetric for Sync and Delay_Req so they largely
+ * cancel in the ptp4l offset calculation.
+ */
+ uint64_t tx_ts = sw_timestamp_ns();
+ int64_t residence_ns = (int64_t)(tx_ts - rx_ts);
+
+ for (i = 0; i < nb_rx; i++) {
+ if (ptp_hdrs[i] == NULL)
+ continue;
+ rte_ptp_add_correction(ptp_hdrs[i], residence_ns);
+ (*corr_cnt)++;
+ }
+
+ /* Forward the burst */
+ nb_tx = rte_eth_tx_burst(dst_port, 0, bufs, nb_rx);
+ *tx_cnt += nb_tx;
+
+ /* Free any unsent packets */
+ for (i = nb_tx; i < nb_rx; i++)
+ rte_pktmbuf_free(bufs[i]);
+}
+
+/* Print statistics */
+
+static void
+print_stats(void)
+{
+ LOG_INFO("=== Statistics ===");
+ LOG_INFO(" PHY RX total: %"PRIu64, stats.phy_rx);
+ LOG_INFO(" PHY RX PTP: %"PRIu64, stats.phy_rx_ptp);
+ LOG_INFO(" TAP TX: %"PRIu64, stats.tap_tx);
+ LOG_INFO(" TAP RX total: %"PRIu64, stats.tap_rx);
+ LOG_INFO(" TAP RX PTP: %"PRIu64, stats.tap_rx_ptp);
+ LOG_INFO(" PHY TX: %"PRIu64, stats.phy_tx);
+ LOG_INFO(" Corrections: %"PRIu64, stats.corrections);
+}
+
+/* Main relay loop */
+
+static int
+relay_loop(__rte_unused void *arg)
+{
+ uint64_t last_stats = rte_rdtsc();
+ uint64_t stats_tsc = rte_get_tsc_hz() * stats_interval;
+
+ LOG_INFO("Relay loop started on lcore %u", rte_lcore_id());
+ LOG_INFO(" PHY port %u <--> TAP port %u", phy_port, tap_port);
+ LOG_INFO(" Correction field updates: enabled for event messages");
+
+ while (!force_quit) {
+ /* PHY → TAP */
+ relay_burst(phy_port, tap_port,
+ &stats.phy_rx, &stats.phy_rx_ptp,
+ &stats.tap_tx, &stats.corrections);
+
+ /* TAP → PHY */
+ relay_burst(tap_port, phy_port,
+ &stats.tap_rx, &stats.tap_rx_ptp,
+ &stats.phy_tx, &stats.corrections);
+
+ /* Periodic stats */
+ if (rte_rdtsc() - last_stats > stats_tsc) {
+ print_stats();
+ last_stats = rte_rdtsc();
+ }
+ }
+
+ print_stats();
+ return 0;
+}
+
+/* Argument parsing */
+
+static void
+usage(const char *prog)
+{
+ fprintf(stderr,
+ "Usage: %s [EAL options] -- [options]\n"
+ " -p PORT Physical NIC port ID (default: 0)\n"
+ " -t PORT TAP port ID (default: 1)\n"
+ " -T SECS Stats interval in seconds (default: 10)\n"
+ "\n"
+ "Example:\n"
+ " %s -l 0-1 --vdev=net_tap0,iface=dtap0 -- -p 0 -t 1\n"
+ "\n"
+ "Then run ptp4l with software timestamps:\n"
+ " ptp4l -i dtap0 -m -s -S\n",
+ prog, prog);
+}
+
+static int
+parse_args(int argc, char **argv)
+{
+ int opt;
+
+ while ((opt = getopt(argc, argv, "p:t:T:h")) != -1) {
+ switch (opt) {
+ case 'p':
+ phy_port = (uint16_t)atoi(optarg);
+ break;
+ case 't':
+ tap_port = (uint16_t)atoi(optarg);
+ break;
+ case 'T':
+ stats_interval = (unsigned int)atoi(optarg);
+ break;
+ case 'h':
+ default:
+ usage(argv[0]);
+ return -1;
+ }
+ }
+
+ return 0;
+}
+
+/* Main */
+
+int
+main(int argc, char **argv)
+{
+ struct rte_mempool *mp;
+ uint16_t nb_ports;
+ int ret;
+
+ /* EAL init */
+ ret = rte_eal_init(argc, argv);
+ if (ret < 0)
+ rte_exit(EXIT_FAILURE, "EAL init failed\n");
+ argc -= ret;
+ argv += ret;
+
+ /* App args */
+ ret = parse_args(argc, argv);
+ if (ret < 0)
+ rte_exit(EXIT_FAILURE, "Invalid arguments\n");
+
+ signal(SIGINT, signal_handler);
+ signal(SIGTERM, signal_handler);
+
+ nb_ports = rte_eth_dev_count_avail();
+ if (nb_ports < 2)
+ rte_exit(EXIT_FAILURE,
+ "Need at least 2 ports (PHY + TAP).\n"
+ "Use --vdev=net_tap0,iface=dtap0\n");
+
+ if (!rte_eth_dev_is_valid_port(phy_port))
+ rte_exit(EXIT_FAILURE, "Invalid PHY port %u\n", phy_port);
+ if (!rte_eth_dev_is_valid_port(tap_port))
+ rte_exit(EXIT_FAILURE, "Invalid TAP port %u\n", tap_port);
+
+ mp = rte_pktmbuf_pool_create("MBUF_POOL", NUM_MBUFS * nb_ports,
+ MBUF_CACHE, 0,
+ RTE_MBUF_DEFAULT_BUF_SIZE,
+ rte_socket_id());
+ if (mp == NULL)
+ rte_exit(EXIT_FAILURE, "Cannot create mbuf pool\n");
+
+ LOG_INFO("Initializing PHY port %u...", phy_port);
+ ret = port_init(phy_port, mp);
+ if (ret != 0)
+ rte_exit(EXIT_FAILURE, "Cannot init PHY port %u (%d)\n",
+ phy_port, ret);
+
+ LOG_INFO("Initializing TAP port %u...", tap_port);
+ ret = port_init(tap_port, mp);
+ if (ret != 0)
+ rte_exit(EXIT_FAILURE, "Cannot init TAP port %u (%d)\n",
+ tap_port, ret);
+
+ LOG_INFO("PTP Software Relay ready");
+ LOG_INFO(" PHY port: %u", phy_port);
+ LOG_INFO(" TAP port: %u", tap_port);
+ LOG_INFO(" Stats every: %u seconds", stats_interval);
+ LOG_INFO(" Correction: Transparent Clock (SW timestamps)");
+ LOG_INFO("");
+ LOG_INFO("Run ptp4l: ptp4l -i dtap0 -m -s -S");
+
+ /* Run relay on main lcore */
+ relay_loop(NULL);
+
+ /* Cleanup */
+ LOG_INFO("Stopping ports...");
+ rte_eth_dev_stop(phy_port);
+ rte_eth_dev_stop(tap_port);
+ rte_eth_dev_close(phy_port);
+ rte_eth_dev_close(tap_port);
+ rte_eal_cleanup();
+
+ return 0;
+}
--
2.53.0