[dpdk-dev] [PATCH v4 1/2] mbuf: introduce accurate packet	Txscheduling
    Morten Brørup 
    mb at smartsharesystems.com
       
    Wed Jul  8 17:27:19 CEST 2020
    
    
  
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Slava Ovsiienko
> Sent: Wednesday, July 8, 2020 4:54 PM
> 
> Hi, Morten
> 
> Thank you for the comments. Please, see below.
> 
> > -----Original Message-----
> > From: Morten Brørup <mb at smartsharesystems.com>
> > Sent: Wednesday, July 8, 2020 17:16
> > To: Slava Ovsiienko <viacheslavo at mellanox.com>; dev at dpdk.org
> > Cc: Matan Azrad <matan at mellanox.com>; Raslan Darawsheh
> > <rasland at mellanox.com>; olivier.matz at 6wind.com;
> > bernard.iremonger at intel.com; thomas at mellanox.net
> > Subject: RE: [dpdk-dev] [PATCH v4 1/2] mbuf: introduce accurate
> packet
> > Txscheduling
> >
> > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Viacheslav
> > > Ovsiienko
> > > Sent: Tuesday, July 7, 2020 4:57 PM
> > >
> > > There is the requirement on some networks for precise traffic
> timing
> > > management. The ability to send (and, generally speaking, receive)
> the
> > > packets at the very precisely specified moment of time provides the
> > > opportunity to support the connections with Time Division
> Multiplexing
> > > using the contemporary general purpose NIC without involving an
> > > auxiliary hardware. For example, the supporting of O-RAN Fronthaul
> > > interface is one of the promising features for potentially usage of
> > > the precise time management for the egress packets.
> > >
> > > The main objective of this RFC is to specify the way how
> applications
> > > can provide the moment of time at what the packet transmission must
> be
> > > started and to describe in preliminary the supporting this feature
> > > from
> > > mlx5 PMD side.
> > >
> > > The new dynamic timestamp field is proposed, it provides some
> timing
> > > information, the units and time references (initial phase) are not
> > > explicitly defined but are maintained always the same for a given
> port.
> > > Some devices allow to query rte_eth_read_clock() that will return
> the
> > > current device timestamp. The dynamic timestamp flag tells whether
> the
> > > field contains actual timestamp value. For the packets being sent
> this
> > > value can be used by PMD to schedule packet sending.
> > >
> > > After PKT_RX_TIMESTAMP flag and fixed timestamp field deprecation
> and
> > > obsoleting, these dynamic flag and field will be used to manage the
> > > timestamps on receiving datapath as well.
> > >
> > > When PMD sees the "rte_dynfield_timestamp" set on the packet being
> > > sent it tries to synchronize the time of packet appearing on the
> wire
> > > with the specified packet timestamp. If the specified one is in the
> > > past it should be ignored, if one is in the distant future it
> should
> > > be capped with some reasonable value (in range of seconds). These
> > > specific cases ("too late" and "distant future") can be optionally
> > > reported via device xstats to assist applications to detect the
> > > time-related problems.
> > >
> > > There is no any packet reordering according timestamps is supposed,
> > > neither within packet burst, nor between packets, it is an entirely
> > > application responsibility to generate packets and its timestamps
> in
> > > desired order. The timestamps can be put only in the first packet
> in
> > > the burst providing the entire burst scheduling.
> > >
> > > PMD reports the ability to synchronize packet sending on timestamp
> > > with new offload flag:
> > >
> > > This is palliative and is going to be replaced with new eth_dev API
> > > about reporting/managing the supported dynamic flags and its
> related
> > > features. This API would break ABI compatibility and can't be
> > > introduced at the moment, so is postponed to 20.11.
> > >
> > > For testing purposes it is proposed to update testpmd "txonly"
> > > forwarding mode routine. With this update testpmd application
> > > generates the packets and sets the dynamic timestamps according to
> > > specified time pattern if it sees the "rte_dynfield_timestamp" is
> registered.
> > >
> > > The new testpmd command is proposed to configure sending pattern:
> > >
> > > set tx_times <burst_gap>,<intra_gap>
> > >
> > > <intra_gap> - the delay between the packets within the burst
> > >               specified in the device clock units. The number
> > >               of packets in the burst is defined by txburst
> parameter
> > >
> > > <burst_gap> - the delay between the bursts in the device clock
> units
> > >
> > > As the result the bursts of packet will be transmitted with
> specific
> > > delays between the packets within the burst and specific delay
> between
> > > the bursts. The rte_eth_get_clock is supposed to be engaged to get
> the
> > > current device clock value and provide the reference for the
> > > timestamps.
> > >
> > > Signed-off-by: Viacheslav Ovsiienko <viacheslavo at mellanox.com>
> > > ---
> > >  v1->v4:
> > >     - dedicated dynamic Tx timestamp flag instead of shared with Rx
> >
> > The detailed description above should be updated to reflect that it
> is now
> > two flags.
> OK
> 
> >
> > >     - Doxygen-style comment
> > >     - comments update
> > >
> > > ---
> > >  lib/librte_ethdev/rte_ethdev.c |  1 +
> lib/librte_ethdev/rte_ethdev.h
> > > |  4 ++++  lib/librte_mbuf/rte_mbuf_dyn.h | 31
> > > +++++++++++++++++++++++++++++++
> > >  3 files changed, 36 insertions(+)
> > >
> > > diff --git a/lib/librte_ethdev/rte_ethdev.c
> > > b/lib/librte_ethdev/rte_ethdev.c index 8e10a6f..02157d5 100644
> > > --- a/lib/librte_ethdev/rte_ethdev.c
> > > +++ b/lib/librte_ethdev/rte_ethdev.c
> > > @@ -162,6 +162,7 @@ struct rte_eth_xstats_name_off {
> > >  	RTE_TX_OFFLOAD_BIT2STR(UDP_TNL_TSO),
> > >  	RTE_TX_OFFLOAD_BIT2STR(IP_TNL_TSO),
> > >  	RTE_TX_OFFLOAD_BIT2STR(OUTER_UDP_CKSUM),
> > > +	RTE_TX_OFFLOAD_BIT2STR(SEND_ON_TIMESTAMP),
> > >  };
> > >
> > >  #undef RTE_TX_OFFLOAD_BIT2STR
> > > diff --git a/lib/librte_ethdev/rte_ethdev.h
> > > b/lib/librte_ethdev/rte_ethdev.h index a49242b..6f6454c 100644
> > > --- a/lib/librte_ethdev/rte_ethdev.h
> > > +++ b/lib/librte_ethdev/rte_ethdev.h
> > > @@ -1178,6 +1178,10 @@ struct rte_eth_conf {
> > >  /** Device supports outer UDP checksum */  #define
> > > DEV_TX_OFFLOAD_OUTER_UDP_CKSUM  0x00100000
> > >
> > > +/** Device supports send on timestamp */ #define
> > > +DEV_TX_OFFLOAD_SEND_ON_TIMESTAMP 0x00200000
> > > +
> > > +
> > >  #define RTE_ETH_DEV_CAPA_RUNTIME_RX_QUEUE_SETUP 0x00000001
> > /**<
> > > Device supports Rx queue setup after device started*/  #define
> > > RTE_ETH_DEV_CAPA_RUNTIME_TX_QUEUE_SETUP 0x00000002 diff --git
> > > a/lib/librte_mbuf/rte_mbuf_dyn.h b/lib/librte_mbuf/rte_mbuf_dyn.h
> > > index 96c3631..7e9f7d2 100644
> > > --- a/lib/librte_mbuf/rte_mbuf_dyn.h
> > > +++ b/lib/librte_mbuf/rte_mbuf_dyn.h
> > > @@ -250,4 +250,35 @@ int rte_mbuf_dynflag_lookup(const char *name,
> > > #define RTE_MBUF_DYNFIELD_METADATA_NAME
> > "rte_flow_dynfield_metadata"
> > >  #define RTE_MBUF_DYNFLAG_METADATA_NAME
> > "rte_flow_dynflag_metadata"
> > >
> > > +/**
> > > + * The timestamp dynamic field provides some timing information,
> the
> > > + * units and time references (initial phase) are not explicitly
> > > defined
> > > + * but are maintained always the same for a given port. Some
> devices
> > > allow4
> > > + * to query rte_eth_read_clock() that will return the current
> device
> > > + * timestamp. The dynamic Tx timestamp flag tells whether the
> field
> > > contains
> > > + * actual timestamp value. For the packets being sent this value
> can
> > > be
> > > + * used by PMD to schedule packet sending.
> > > + *
> > > + * After PKT_RX_TIMESTAMP flag and fixed timestamp field
> deprecation
> > > + * and obsoleting, the dedicated Rx timestamp flag is supposed to
> be
> > > + * introduced and the shared dynamic timestamp field will be used
> > > + * to handle the timestamps on receiving datapath as well.
> > > + */
> > > +#define RTE_MBUF_DYNFIELD_TIMESTAMP_NAME
> > "rte_dynfield_timestamp"
> >
> > The description above should not say anything about the dynamic TX
> > timestamp flag.
> It does not. Or do you mean RX?
> Not sure, field and flag are tightly coupled,
> it is nice to mention this relation for better understanding.
> And mentioning the RX explains why it is not like this:
> RTE_MBUF_DYNFIELD_[TX]_TIMESTAMP_NAME
Sorry. I misunderstood its purpose!
It's the name of the field, and the field will not only be used for RX, but in the future also for RX.
(I thought it was the name of the RX flag, reserved for future use.)
> 
> >
> > Please elaborate "some timing information", e.g. add "... about when
> the
> > packet was received".
> 
> Sorry, I do not follow,  currently the dynamic field is not
> "about when the packet was received". Now it is introduced for Tx
> only and just the opportunity to be shared with Rx one in coming
> releases
> is mentioned. "Some" means - not specified (herein) exactly.
> And it is elaborated what Is not specified and how it is supposed
> to use Tx timestamp.
It should be described when it is valid, and how it is being used, e.g. by adding a reference to the "rte_dynflag_tx_timestamp" flag.
> >
> > > +
> > > +/**
> > > + * When PMD sees the RTE_MBUF_DYNFLAG_TX_TIMESTAMP_NAME flag
> > set on
> > > the
> > > + * packet being sent it tries to synchronize the time of packet
> > > appearing
> > > + * on the wire with the specified packet timestamp. If the
> specified
> > > one
> > > + * is in the past it should be ignored, if one is in the distant
> > > future
> > > + * it should be capped with some reasonable value (in range of
> > > seconds).
> > > + *
> > > + * There is no any packet reordering according to timestamps is
> > > supposed,
> > > + * neither for packet within the burst, nor for the whole bursts,
> it
> > > is
> > > + * an entirely application responsibility to generate packets and
> its
> > > + * timestamps in desired order. The timestamps might be put only
> in
> > > + * the first packet in the burst providing the entire burst
> > > scheduling.
> > > + */
> > > +#define RTE_MBUF_DYNFLAG_TX_TIMESTAMP_NAME
> > "rte_dynflag_tx_timestamp"
> > > +
> > >  #endif
> > > --
> > > 1.8.3.1
> > >
> >
> > It may be worth adding some documentation about how the clocks of the
> > NICs are out of sync with the clock of the CPU, and are all drifting
> relatively.
> >
> > And those clocks are also out of sync with the actual time (NTP
> clock).
> 
> IMO, It is out of scope of this very generic patch.  As for mlx NICs -
> the internal device
> clock might be (or might be not) synchronized with PTP, it can provide
> timestamps
> in real nanoseconds in various formats or just some free running
> counter.
Cool!
> On some systems the NIC and CPU might share the same clock source (for
> their PLL inputs
> for example) and there will be no any drifts. As we can see - it is a
> wide and interesting
> opic to discuss, but, IMO,  the comment in header file might be not the
> most relevant
> place to do. As for mlx5 devices clock specifics - it will be
> documented in PMD chapter.
> 
> OK, will add few generic words, the few ones - in order not to make
> comment wordy, just
> point the direction for further thinking.
I agree - we don't want cookbooks in the header files. Only enough description to avoid the worst misunderstandings.
> 
> >
> > Preferably, some sort of cookbook for handling this should be
> provided.
> > PCAP could be used as an example.
> >
> testpmd example is included in series, mlx5 PMD patch is prepared and
> coming soon.
Great.
And I suppose that the more detailed cookbook/example - regarding offset and drift of various clocks - is probably more relevant for the RX side (for various PCAP applications), and thus completely unrelated to this patch.
> 
> With best regards, Slava
    
    
More information about the dev
mailing list