[dpdk-dev] [PATCH v10 0/9] Add PMD power mgmt

McDaniel, Timothy timothy.mcdaniel at intel.com
Wed Oct 28 17:54:41 CET 2020


> -----Original Message-----
> From: Ma, Liang J <liang.j.ma at intel.com>
> Sent: Wednesday, October 28, 2020 11:48 AM
> To: Jerin Jacob <jerinjacobk at gmail.com>
> Cc: Ananyev, Konstantin <konstantin.ananyev at intel.com>; Thomas Monjalon
> <thomas at monjalon.net>; dpdk-dev <dev at dpdk.org>; Ruifeng Wang (Arm
> Technology China) <ruifeng.wang at arm.com>; Wang, Haiyue
> <haiyue.wang at intel.com>; Richardson, Bruce <bruce.richardson at intel.com>;
> Hunt, David <david.hunt at intel.com>; Neil Horman <nhorman at tuxdriver.com>;
> McDaniel, Timothy <timothy.mcdaniel at intel.com>; Eads, Gage
> <gage.eads at intel.com>; Marcin Wojtas <mw at semihalf.com>; Guy Tzalik
> <gtzalik at amazon.com>; Ajit Khaparde <ajit.khaparde at broadcom.com>;
> Harman Kalra <hkalra at marvell.com>; John Daley <johndale at cisco.com>; Wei
> Hu (Xavier) <xavier.huwei at huawei.com>; Ziyang Xuan
> <xuanziyang2 at huawei.com>; matan at nvidia.com; Yong Wang
> <yongwang at vmware.com>; david.marchand at redhat.com
> Subject: Re: [PATCH v10 0/9] Add PMD power mgmt
> 
> On 28 Oct 21:27, Jerin Jacob wrote:
> > On Wed, Oct 28, 2020 at 9:19 PM Ananyev, Konstantin
> > <konstantin.ananyev at intel.com> wrote:
> > > > > > > > 28/10/2020 14:49, Jerin Jacob:
> > > > > > > > > On Wed, Oct 28, 2020 at 7:05 PM Liang, Ma
> <liang.j.ma at intel.com> wrote:
> > > > > > > > > >
> > > > > > > > > > Hi Thomas,
> > > > > > > > > >   I think I addressed all of the questions in relation to V9. I don't
> think I can solve the issue of a generic API on my own. From the
> > > > > > > > Community Call last week Jerin also said that a generic was
> investigated but that a single solution wasn't feasible.
> > > > > > > > >
> > > > > > > > > I think, From the architecture point of view, the specific
> > > > > > > > > > functionality of UMONITOR may not be abstracted.
> > > > > > > > > But from the ethdev callback point of view, Can it be abstracted in
> > > > > > > > > such a way that packet notification available through
> > > > > > > > > checking interrupt status register or ring descriptor location, etc
> by
> > > > > > > > > the driver. Use that callback as a notification mechanism rather
> > > > > > > > > than defining a memory-based scheme that UMONITOR expects?
> or similar
> > > > > > > > > thoughts on abstraction.
> > > > > > >
> > > > > > > I think there is probably some sort of misunderstanding.
> > > > > > > > This API is not about providing async notification when the next packet arrives.
> > > > > > > > It is about putting a core to sleep until some event (or timeout) happens.
> > > > > > > > From my perspective the closest analogy is cond_timedwait().
> > > > > > > > So we need the PMD to tell us the address of the condition variable
> > > > > > > > we should sleep on.
> > > > > > >
> > > > > > > > I agree with Jerin.
> > > > > > > > The ethdev API is the blocking problem.
> > > > > > > > First problem: it is not well explained in doxygen.
> > > > > > > > Second problem: it is probably not generic enough (if we understand
> it well)
> > > > > > >
> > > > > > > It is an address to sleep(/wakeup) on, plus expected value.
> > > > > > > > Honestly, I can't think of anything more generic than that.
> > > > > > > If you guys have something particular in mind - please share.
> > > > > >
> > > > > > Current PMD callback:
> > > > > > > typedef int (*eth_get_wake_addr_t)(void *rxq,
> > > > > > >         volatile void **tail_desc_addr, uint64_t *expected,
> > > > > > >         uint64_t *mask, uint8_t *data_sz);
> > > > > >
> > > > > > Can we make it as
> > > > > > typedef void (*core_sleep_t)(void *rxq)
> > > > > >
> > > > > > if we do such abstraction and "move the polling on memory by
> HW/CPU"
> > > > > > to the driver using a helper function then
> > > > > > I can think of abstracting in some way in all PMDs.
> > > > >
> > > > > Ok I see, thanks for explanation.
> > > > > > From my perspective, the main disadvantage of such an approach
> > > > > > is that it can't be extended easily.
> > > > > > If/when we have the ability for a core to sleep/wake up on multiple events
> > > > > > (multiple addresses), we will have to rework that API again.
> > > >
> > > > I think, we can enumerate the policies and pass the associated
> > > > structures as input to the driver.
> > >
> > > What I am trying to say: with that API we will not be able to wait
> > > for events from multiple devices (HW queues).
> > > I.E. something like that:
> > >
> > > get_wake_addr(port=X, ..., &addr[0], ...);
> > > get_wake_addr(port=Y,..., &addr[1],...);
> > > wait_on_multi(addr, 2);
> > >
> > > wouldn't be possible.
> >
> > I see. But the current implementation dictates the only queue bound to
> > a core. Right?
> The current implementation only supports a 1:1 queue/core mapping because of
> a limitation of umwait/umonitor, which cannot work with multiple address
> ranges. However, other schemes such as PAUSE/frequency scaling have no such
> limitation.
> The proposed API itself doesn't impose a 1:1 queue/core mapping.
> >
> >
> > >
> > > >
> > > >
> > > > >
> > > > > >
> > > > > > Note: core_sleep_t can take some more arguments such as
> enumerated
> > > > > > policy if something more needs to be pushed to the driver.
> > > > > >
> > > > > > Thoughts?
> > > > > >
> > > > > > >
> > > > > > > >
> > > > > > > > > > This API is experimental and other vendor support can be added
> as needed. If there are any other open issue let me know?
> > > > > > > >
> > > > > > > > Being experimental is not an excuse to throw something
> > > > > > > > which is not satisfying.
> > > > > > > >
> > > > > > > >
> > > > > > >

It would be nice if the low-level definitions of the UMWAIT and UMONITOR instructions were split out
into their own inline functions or macros, so that any PMD could use them without being tied to ethdev or
any of the other logic associated with this patch set. This would be similar to rte_wmb, and so on.
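
To make the suggestion concrete, here is a rough sketch of what such standalone wrappers might look like. It is only an illustration under stated assumptions, not the code from this patch set: all of the waitpkg_* names are hypothetical, the instructions are emitted as raw opcode bytes so that no special compiler flag is needed, and a GCC- or Clang-style compiler is assumed. The raw wrappers must only be called after waitpkg_supported() returns 1, since UMONITOR/UMWAIT raise #UD on CPUs without WAITPKG.

```c
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>	/* GCC/Clang helper for the CPUID instruction */
#define WAITPKG_X86 1
#endif

/* All waitpkg_* names are hypothetical, chosen for this sketch only. */

/* Return 1 if the CPU reports WAITPKG: CPUID.(EAX=7,ECX=0):ECX bit 5. */
static inline int waitpkg_supported(void)
{
#ifdef WAITPKG_X86
	unsigned int eax, ebx, ecx, edx;

	if (!__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx))
		return 0;
	return (ecx >> 5) & 1;
#else
	return 0;
#endif
}

/* Read the timestamp counter; returns 0 on non-x86 builds. */
static inline uint64_t waitpkg_rdtsc(void)
{
#ifdef WAITPKG_X86
	uint32_t lo, hi;

	__asm__ __volatile__("rdtsc" : "=a"(lo), "=d"(hi));
	return ((uint64_t)hi << 32) | lo;
#else
	return 0;
#endif
}

/* UMONITOR: arm address monitoring on addr (address passed in rdi/edi). */
static inline void waitpkg_umonitor(volatile void *addr)
{
#ifdef WAITPKG_X86
	__asm__ __volatile__(".byte 0xf3, 0x0f, 0xae, 0xf7"
			     : : "D"(addr) : "memory");
#else
	(void)addr;
#endif
}

/*
 * UMWAIT: wait until the monitored line is written or the TSC reaches
 * tsc_deadline. state bit 0 selects C0.2 (0) or C0.1 (1).
 */
static inline void waitpkg_umwait(unsigned int state, uint64_t tsc_deadline)
{
#ifdef WAITPKG_X86
	uint32_t lo = (uint32_t)tsc_deadline;
	uint32_t hi = (uint32_t)(tsc_deadline >> 32);

	__asm__ __volatile__(".byte 0xf2, 0x0f, 0xae, 0xf7"
			     : : "D"(state), "a"(lo), "d"(hi)
			     : "cc", "memory");
#else
	(void)state;
	(void)tsc_deadline;
#endif
}

/*
 * Example use: nap while *addr still holds idle_value, for at most
 * timeout_cycles TSC ticks. Falls back to not sleeping at all when
 * WAITPKG is unavailable. Returns the value seen after waking.
 */
static inline uint64_t waitpkg_nap(volatile uint64_t *addr,
				   uint64_t idle_value,
				   uint64_t timeout_cycles)
{
	if (waitpkg_supported()) {
		waitpkg_umonitor(addr);
		/* Re-check after arming so a write that raced is not missed. */
		if (*addr == idle_value)
			waitpkg_umwait(0, waitpkg_rdtsc() + timeout_cycles);
	}
	return *addr;
}
```

With something along these lines, a PMD (or an eventdev driver) could call waitpkg_nap() on its own descriptor or doorbell word directly, and the ethdev-level power management logic could be layered on top of the same helpers.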



