[dpdk-dev] [PATCH] app/testpmd: fix txonly mode timestamp intitialization

Slava Ovsiienko viacheslavo at mellanox.com
Wed Jul 29 10:08:14 CEST 2020


Hi, Phil

Very nice comment, I found it useful, thank you. I'm in the process of moving the mlx5 PMD to C11 atomic primitives.
But don't you think it is overkill for the testpmd case and would make the code wordy and less readable?
How does testpmd start the forwarding? Let's have a glance at launch_packet_forwarding():

- initialize forwarding with port_fwd_begin() - this is tx_only_begin() for txonly mode, and it runs in the master core context
- launch the forwarding cores with rte_eal_remote_launch() - the forwarding itself happens in the forwarding core context
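
The launch order above can be sketched in plain C. This is only an illustration under stated assumptions, not the testpmd code: begin_hook(), worker_main(), launch_forwarding(), shared_cfg, seen[] and NB_WORKERS are hypothetical names, and POSIX threads stand in for the EAL lcores. In the sketch pthread_create() is what orders the setup write before the worker starts; that is a property of the analogy, not a statement about rte_eal_remote_launch() internals.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

#define NB_WORKERS 4 /* hypothetical number of forwarding threads */

static uint64_t shared_cfg;       /* written only in begin_hook() */
static uint64_t seen[NB_WORKERS]; /* value observed by each worker */

/* Stand-in for port_fwd_begin(): runs before any worker exists. */
static void
begin_hook(uint64_t value)
{
	shared_cfg = value;
}

/* Stand-in for the forwarding loop: only reads the shared value. */
static void *
worker_main(void *arg)
{
	uint64_t *slot = arg;

	*slot = shared_cfg;
	return NULL;
}

/* Stand-in for launch_packet_forwarding(): setup first, then launch. */
static int
launch_forwarding(uint64_t value)
{
	pthread_t tid[NB_WORKERS];
	int i, started = 0;

	begin_hook(value);
	for (i = 0; i < NB_WORKERS; i++) {
		if (pthread_create(&tid[i], NULL, worker_main, &seen[i]) != 0)
			break;
		started++;
	}
	/* Wait for the workers so the caller may inspect seen[]. */
	for (i = 0; i < started; i++)
		pthread_join(tid[i], NULL);
	return started;
}
```

Calling launch_forwarding(42) leaves 42 in every seen[] slot whose thread was started, because the write in begin_hook() is complete before the first worker is created.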

Let's reconsider:

 - tx_only_begin() is called once in the master core context. Forwarding is not launched yet, so there is
 NO concurrent access to the variables; we can set them in any way, and NO atomic, volatile, etc. is needed at all.
 We should do the setup and commit all our memory writes - rte_wmb() seems to be the most natural way
 to do it once for all preceding writes. Performance can be neglected here - it is not a datapath.

 - pkt_burst_prepare() is executed in the forwarding core context and performs only READ access
 to the global variables. Those are stable while forwarding lasts and can be considered constants.
 No atomics, barriers, etc. are needed there either. I agree that the volatile is redundant ("over-reassurance"), let's
 remove it. But atomic access is not needed and would make the code more complicated.
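
The two steps above can be sketched in plain C as follows. It is a minimal illustration under stated assumptions: setup_begin(), worker_prepare(), init_req, init_done, initial_ts and skew are hypothetical stand-ins for tx_only_begin(), pkt_burst_prepare(), timestamp_init_req, timestamp_idone, timestamp_initial[] and timestamp_qskew; _Thread_local replaces RTE_PER_LCORE; and the rte_wmb() is only marked by a comment, since the sketch does not link against DPDK.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the testpmd variables discussed above. */
static uint32_t init_req;   /* bumped once per "start", before workers run */
static uint64_t initial_ts; /* written during setup, read-only afterwards */
static _Thread_local uint32_t init_done; /* last request seen by this thread */
static _Thread_local uint64_t skew;      /* per-thread timestamp offset */

/* Setup step: runs alone, before any worker is launched. */
static void
setup_begin(uint64_t now)
{
	initial_ts = now;
	init_req++;
	/* In testpmd an rte_wmb() here would commit the writes above. */
}

/* Worker step: reads the globals, re-initializes once per request. */
static uint64_t
worker_prepare(uint64_t phase)
{
	if (init_req != init_done) {
		skew = initial_ts + phase;
		init_done = init_req;
	}
	return skew;
}
```

After setup_begin(100), the first worker_prepare(5) computes the skew and later calls reuse it; a second setup_begin() changes init_req, so the next worker_prepare() recomputes the skew instead of reusing the stale thread-local value.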

What do you think? Do you insist on implementing the atomic, barriered access?

With best regards, 
Slava

> -----Original Message-----
> From: Phil Yang <Phil.Yang at arm.com>
> Sent: Tuesday, July 28, 2020 19:24
> To: Slava Ovsiienko <viacheslavo at mellanox.com>
> Cc: Matan Azrad <matan at mellanox.com>; Raslan Darawsheh
> <rasland at mellanox.com>; Thomas Monjalon <thomas at monjalon.net>;
> ferruh.yigit at intel.com; Honnappa Nagarahalli
> <Honnappa.Nagarahalli at arm.com>; nd <nd at arm.com>; dev at dpdk.org; nd
> <nd at arm.com>
> Subject: RE: [dpdk-dev] [PATCH] app/testpmd: fix txonly mode timestamp
> intitialization
> 
> > -----Original Message-----
> > From: dev <dev-bounces at dpdk.org> On Behalf Of Viacheslav Ovsiienko
> > Sent: Monday, July 27, 2020 11:27 PM
> > To: dev at dpdk.org
> > Cc: matan at mellanox.com; rasland at mellanox.com;
> thomas at monjalon.net;
> > ferruh.yigit at intel.com
> > Subject: [dpdk-dev] [PATCH] app/testpmd: fix txonly mode timestamp
> > intitialization
> >
> > The testpmd application forwards data in multiple threads.
> > In the txonly mode the Tx timestamps must be initialized on per thread
> > basis to provide phase shift for the packet burst being sent. This per
> > thread initialization was performed on zero value of the variable in
> > thread local storage and happened only once after testpmd forwarding
> > start. Executing "start" and "stop" commands did not cause thread
> > local variables zeroing and wrong timestamp values were used.
> 
> I think it is too heavy to use rte_wmb() to guarantee the visibility of
> the 'timestamp_init_req' update to subsequent read operations.
> We can use C11 atomics with explicit memory ordering instead of rte_wmb()
> to achieve the same goal.
> 
> >
> > Fixes: 4940344dab1d ("app/testpmd: add Tx scheduling command")
> >
> > Signed-off-by: Viacheslav Ovsiienko <viacheslavo at mellanox.com>
> > ---
> >  app/test-pmd/txonly.c | 11 ++++++++++-
> >  1 file changed, 10 insertions(+), 1 deletion(-)
> >
> > diff --git a/app/test-pmd/txonly.c b/app/test-pmd/txonly.c
> > index 97f4a45..415431d 100644
> > --- a/app/test-pmd/txonly.c
> > +++ b/app/test-pmd/txonly.c
> > @@ -55,9 +55,13 @@
> >  static struct rte_udp_hdr pkt_udp_hdr; /**< UDP header of tx packets. */
> >  RTE_DEFINE_PER_LCORE(uint64_t, timestamp_qskew);
> >  					/**< Timestamp offset per queue */
> > +RTE_DEFINE_PER_LCORE(uint32_t, timestamp_idone); /**< Timestamp init done. */
> > +
> >  static uint64_t timestamp_mask; /**< Timestamp dynamic flag mask */
> >  static int32_t timestamp_off; /**< Timestamp dynamic field offset */
> >  static bool timestamp_enable; /**< Timestamp enable */
> > +static volatile uint32_t timestamp_init_req;
> 
> If we use C11 atomic builtins to access 'timestamp_init_req', the volatile
> keyword becomes unnecessary,
> because they will generate the same instructions.
> 
> > +				 /**< Timestamp initialization request. */
> >  static uint64_t timestamp_initial[RTE_MAX_ETHPORTS];
> >
> >  static void
> > @@ -229,7 +233,8 @@
> >  			rte_be64_t ts;
> >  		} timestamp_mark;
> >
> > -		if (unlikely(!skew)) {
> > +		if (unlikely(timestamp_init_req !=
> 
> if (unlikely(__atomic_load_n(&timestamp_init_req, __ATOMIC_RELAXED) !=
> 
> > +			RTE_PER_LCORE(timestamp_idone))) {
> >  			struct rte_eth_dev *dev = &rte_eth_devices[fs->tx_port];
> >  			unsigned int txqs_n = dev->data->nb_tx_queues;
> >  			uint64_t phase = tx_pkt_times_inter * fs->tx_queue /
> > @@ -241,6 +246,7 @@
> >  			skew = timestamp_initial[fs->tx_port] +
> >  			       tx_pkt_times_inter + phase;
> >  			RTE_PER_LCORE(timestamp_qskew) = skew;
> > +			RTE_PER_LCORE(timestamp_idone) = timestamp_init_req;
> 
> 
> RTE_PER_LCORE(timestamp_idone) =
> __atomic_load_n(&timestamp_init_req, __ATOMIC_RELAXED);
> 
> 
> >  		}
> >  		timestamp_mark.pkt_idx = rte_cpu_to_be_16(idx);
> >  		timestamp_mark.queue_idx = rte_cpu_to_be_16(fs->tx_queue);
> > @@ -426,6 +432,9 @@
> >  			   timestamp_mask &&
> >  			   timestamp_off >= 0 &&
> >  			   !rte_eth_read_clock(pi, &timestamp_initial[pi]);
> > +	if (timestamp_enable)
> > +		timestamp_init_req++;
> 
> 
> __atomic_add_fetch(&timestamp_init_req, 1, __ATOMIC_ACQ_REL);
> 
> 
> > +	rte_wmb();
> 
> We can remove it now.
> 
> 
> Thanks,
> Phil

