[dpdk-dev] [PATCH] event/octeontx2: use wfe while waiting for head

Gavin Hu (Arm Technology China) Gavin.Hu at arm.com
Fri Oct 25 18:34:29 CEST 2019


Hi Pavan,

> -----Original Message-----
> From: Pavan Nikhilesh Bhagavatula <pbhagavatula at marvell.com>
> Sent: Friday, October 25, 2019 12:26 PM
> To: Gavin Hu (Arm Technology China) <Gavin.Hu at arm.com>;
> jerinj at marvell.com
> Cc: dev at dpdk.org; nd <nd at arm.com>
> Subject: RE: [dpdk-dev] [PATCH] event/octeontx2: use wfe while waiting for
> head
> 
> Hi Gavin,
> 
> >-----Original Message-----
> >From: dev <dev-bounces at dpdk.org> On Behalf Of Gavin Hu (Arm
> >Technology China)
> >Sent: Thursday, October 24, 2019 9:23 PM
> >To: Pavan Nikhilesh Bhagavatula <pbhagavatula at marvell.com>; Jerin
> >Jacob Kollanukkaran <jerinj at marvell.com>
> >Cc: dev at dpdk.org; nd <nd at arm.com>
> >Subject: Re: [dpdk-dev] [PATCH] event/octeontx2: use wfe while
> >waiting for head
> >
> >Hi Pavan,
> >
> >> -----Original Message-----
> >> From: pbhagavatula at marvell.com <pbhagavatula at marvell.com>
> >> Sent: Thursday, October 24, 2019 12:13 AM
> >> To: Gavin Hu (Arm Technology China) <Gavin.Hu at arm.com>;
> >> jerinj at marvell.com; Pavan Nikhilesh <pbhagavatula at marvell.com>
> >> Cc: dev at dpdk.org
> >> Subject: [dpdk-dev] [PATCH] event/octeontx2: use wfe while waiting for head
> >>
> >> From: Pavan Nikhilesh <pbhagavatula at marvell.com>
> >>
> >> Use wfe to save power while waiting for tag to become head.
> >>
> >> SSO signals EVENTI to allow cores to exit from wfe when they
> >> are waiting for specific operations in which one of them is
> >> setting HEAD bit in GWS_TAG.
> >>
> >> Signed-off-by: Pavan Nikhilesh <pbhagavatula at marvell.com>
> >> ---
> >>  drivers/event/octeontx2/otx2_worker.h | 30 ++++++++++++++++++++++++---
> >>  1 file changed, 27 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/drivers/event/octeontx2/otx2_worker.h
> >> b/drivers/event/octeontx2/otx2_worker.h
> >> index 4e971f27c..7a55caca5 100644
> >> --- a/drivers/event/octeontx2/otx2_worker.h
> >> +++ b/drivers/event/octeontx2/otx2_worker.h
> >> @@ -226,10 +226,34 @@ otx2_ssogws_swtag_wait(struct otx2_ssogws *ws)
> >>  }
> >>
> >>  static __rte_always_inline void
> >> -otx2_ssogws_head_wait(struct otx2_ssogws *ws, const uint8_t wait_flag)
> >> +otx2_ssogws_head_wait(struct otx2_ssogws *ws)
> >>  {
> >> -	while (wait_flag && !(otx2_read64(ws->tag_op) & BIT_ULL(35)))
> >> +#ifdef RTE_ARCH_ARM64
> >> +	uint64_t tag;
> >> +
> >> +	asm volatile (
> >> +			"	ldr %[tag], [%[tag_op]]		\n"
> >"ldxr" should be used, exclusive-load is required to "monitor" the
> >location, then a write to the location will cause clear of the exclusive
> >monitor, thus a wake up event is generated implicitly.
> 
> As I have mentioned in the commit log:
> "SSO signals EVENTI to allow cores to exit from wfe when they
> are waiting for specific operations in which one of them is
> setting HEAD bit in GWS_TAG."
If you have other expected wake-up sources, that is OK. Just curious: is this signal sent explicitly to exit WFE?
Also, which has the shorter latency: the implicit event (clearing of the exclusive monitor) or the explicit signal?
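
To make the ldxr suggestion concrete, here is a minimal sketch of the monitored wait I had in mind. It is an illustration only, not the driver's code: head_wait_monitored is a made-up name, and it assumes the polled location is memory on which exclusive loads are supported, which I cannot confirm for the SSO register space. On non-arm64 builds it falls back to plain polling.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HEAD_BIT 35 /* HEAD bit position in GWS_TAG, as used in the patch */

/*
 * Hypothetical helper (not part of the driver): block until bit 35 of
 * *tag_op is set and return the value that was read.
 */
static inline uint64_t
head_wait_monitored(const volatile uint64_t *tag_op)
{
	uint64_t tag;

	assert(tag_op != NULL);
#if defined(__aarch64__)
	/*
	 * ldxr arms the exclusive monitor for the location, so a later
	 * write to it clears the monitor and generates the wake-up event
	 * for wfe. sevl makes the first wfe fall through immediately.
	 */
	__asm__ volatile (
		"	ldxr %[tag], [%[tag_op]]	\n"
		"	tbnz %[tag], 35, done%=		\n"
		"	sevl				\n"
		"rty%=:	wfe				\n"
		"	ldxr %[tag], [%[tag_op]]	\n"
		"	tbz %[tag], 35, rty%=		\n"
		"done%=:				\n"
		: [tag] "=&r" (tag)
		: [tag_op] "r" (tag_op)
		: "memory");
#else
	/* Portable fallback: plain busy polling, no power saving. */
	do {
		tag = *tag_op;
	} while (!(tag & (1ULL << HEAD_BIT)));
#endif
	return tag;
}
```

If EVENTI from SSO already wakes the core, the plain ldr in your patch is enough and this variant is not needed.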
/Gavin
> 
> The address need not be tracked by the global monitor.
> 
> >You can find more explanation here:
> >https://urldefense.proofpoint.com/v2/url?u=http-
> >3A__inbox.dpdk.org_dev_AM0PR08MB5363F9D1BA158B66B803EA068F
> >6B0-
> >40AM0PR08MB5363.eurprd08.prod.outlook.com_&d=DwIFAg&c=nKjW
> >ec2b6R0mOyPaz7xtfQ&r=1cjuAHrGh745jHNmj2fD85sUMIJ2IPIDsIJzo6F
> >N6Z0&m=JMzT-4V2megNsFYxaO0V2wE0-
> >GlK9UPUvE1K0pPA9aQ&s=JajU2VklhV_jFE0WKAZ076KjjWymIC-
> >iTiJXU0Vwxr4&e=
> >/Gavin
> >> +			"	tbnz %[tag], 35, done%=		\n"
> >> +			"	sevl				\n"
> >> +			"rty%=:	wfe				\n"
> >> +			"	ldr %[tag], [%[tag_op]]		\n"
> >> +			"	tbz %[tag], 35, rty%=		\n"
> >> +			"done%=:				\n"
> >> +			: [tag] "=&r" (tag)
> >> +			: [tag_op] "r" (ws->tag_op)
> >> +			);
> >> +#else
> >> +	/* Wait for the HEAD to be set */
> >> +	while (!(otx2_read64(ws->tag_op) & BIT_ULL(35)))
> >>  		;
> >> +#endif
> >> +}
> >> +
> >> +static __rte_always_inline void
> >> +otx2_ssogws_order(struct otx2_ssogws *ws, const uint8_t wait_flag)
> >> +{
> >> +	if (wait_flag)
> >> +		otx2_ssogws_head_wait(ws);
> >>
> >>  	rte_cio_wmb();
> >What ordering does this barrier try to keep?  If there is a write then wait
> >for kind of response, should this barrier move before
> >otx2_ssogws_head_wait?
> 
> The barrier is used to flush out write buffer to LLC (octeontx2 point of
> coherence) so that NIX Tx picks up all the modifications done to the packet.
Looking at the otx2_ssogws_event_tx function, it seems that only the header has been written at the point of rte_cio_wmb.
Should the barrier be delayed until after the whole packet is written, just before the submission?
If NIX does not fall within the SMP configuration, should it be rte_io_wmb instead?
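
The ordering I am asking about can be sketched as below: all packet stores first, then one store barrier, then the submission. Everything here is hypothetical and for illustration only. sketch_txq, sketch_submit and sketch_wmb are made-up names, the "doorbell" is just a variable, and sketch_wmb is a stand-in chosen for this example, not the DPDK definition of rte_cio_wmb or rte_io_wmb.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in store barrier for this sketch only (an assumption). */
#if defined(__aarch64__)
#define sketch_wmb() __asm__ volatile("dmb oshst" ::: "memory")
#else
#define sketch_wmb() __atomic_thread_fence(__ATOMIC_RELEASE)
#endif

/* Hypothetical Tx queue: a buffer the device reads plus a doorbell. */
struct sketch_txq {
	uint8_t buf[64];
	volatile uint64_t doorbell;
};

/* Returns 0 on success, -1 if header plus payload do not fit. */
static inline int
sketch_submit(struct sketch_txq *q, const void *hdr, size_t hdr_len,
	      const void *payload, size_t pay_len)
{
	assert(q != NULL);
	if (hdr_len > sizeof(q->buf) || pay_len > sizeof(q->buf) - hdr_len)
		return -1;

	/* 1. Write the whole packet: header first, then payload. */
	memcpy(q->buf, hdr, hdr_len);
	memcpy(q->buf + hdr_len, payload, pay_len);

	/* 2. Make all of the stores above visible before submitting. */
	sketch_wmb();

	/* 3. Submit: the device may start reading the buffer from here. */
	q->doorbell = hdr_len + pay_len;
	return 0;
}
```

If the barrier is only meant to cover modifications made before otx2_ssogws_event_tx is called, then its current position is fine and this reordering does not apply.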
/Gavin
> >>  }
> >> @@ -258,7 +282,7 @@ otx2_ssogws_event_tx(struct otx2_ssogws *ws, struct rte_event ev[],
> >>
> >>  	/* Perform header writes before barrier for TSO */
> >>  	otx2_nix_xmit_prepare_tso(m, flags);
> >> -	otx2_ssogws_head_wait(ws, !ev->sched_type);
> >> +	otx2_ssogws_order(ws, !ev->sched_type);
> >>  	otx2_ssogws_prepare_pkt(txq, m, cmd, flags);
> >>
> >>  	if (flags & NIX_TX_MULTI_SEG_F) {
> >> --
> >> 2.17.1


