[RFC PATCH 1/5] eventdev: add power monitoring API on event port
    Jerin Jacob 
    jerinjacobk at gmail.com
       
    Wed May  3 10:26:33 CEST 2023
    
    
  
On Wed, May 3, 2023 at 1:44 PM Ferruh Yigit <ferruh.yigit at amd.com> wrote:
>
> On 5/3/2023 8:58 AM, Jerin Jacob wrote:
> > On Tue, May 2, 2023 at 4:49 PM Ferruh Yigit <ferruh.yigit at amd.com> wrote:
> >>
> >> On 4/25/2023 5:09 AM, Jerin Jacob wrote:
> >>> On Mon, Apr 24, 2023 at 9:36 PM Ferruh Yigit <ferruh.yigit at amd.com> wrote:
> >>>>
> >>>> On 4/19/2023 11:15 AM, Jerin Jacob wrote:
> >>>>> On Wed, Apr 19, 2023 at 3:24 PM Sivaprasad Tummala
> >>>>> <sivaprasad.tummala at amd.com> wrote:
> >>>>>>
> >>>>>> A new API to allow power monitoring condition on event port to
> >>>>>> optimize power when no events are arriving on an event port for
> >>>>>> the worker core to process in an eventdev based pipelined application.
> >>>>>>
> >>>>>> Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala at amd.com>
> >>>>>> + *
> >>>>>> + * @param dev_id
> >>>>>> + *   Eventdev id
> >>>>>> + * @param port_id
> >>>>>> + *   Eventdev port id
> >>>>>> + * @param pmc
> >>>>>> + *   The pointer to power-optimized monitoring condition structure.
> >>>>>> + *
> >>>>>> + * @return
> >>>>>> + *   - 0: Success.
> >>>>>> + *   -ENOTSUP: Operation not supported.
> >>>>>> + *   -EINVAL: Invalid parameters.
> >>>>>> + *   -ENODEV: Invalid device ID.
> >>>>>> + */
> >>>>>> +__rte_experimental
> >>>>>> +int
> >>>>>> +rte_event_port_get_monitor_addr(uint8_t dev_id, uint8_t port_id,
> >>>>>> +               struct rte_power_monitor_cond *pmc);
> >>>>>
> >>>>> + eventdev driver maintainers
> >>>>>
> >>>>> I think, we don't need to expose this application due to applications
> >>>>> 1)To make applications to be transparent whether power saving is enabled or not?
> >>>>> 2)Some HW and Arch already supports power managent in driver and in HW
> >>>>> (Not using  CPU architecture directly)
> >>>>>
> >>>>> If so, that will be translated to following,
> >>>>> a) Add rte_event_port_power_saving_ena_dis(uint8_t dev_id, uint8_t
> >>>>> port_id, bool ena) for controlling power saving in slowpath.
> >>>>> b) Create reusable PMD private function based on the CPU architecture
> >>>>> power saving primitive to cover the PMD don't have native power saving
> >>>>> support.
> >>>>> c)Update rte_event_dequeue_burst() burst of PMD callback to use (b).
> >>>>>
> >>>>>
> >>>>
> >>>> Hi Jerin,
> >>>
> >>> Hi Ferruh,
> >>>
> >>>>
> >>>> ethdev approach seems applied here.
> >>>
> >>> Understands that. But none of the NIC HW supports power management at
> >>> HW level like eventdev, so that way
> >>> for what we are doing for ethdev is a correct abstraction for ethdev.
> >>>
> >>
> >> What I understand is there is HW based event manager and SW based ones,
> >> SW based ones can benefit more from CPU power optimizations, for HW
> >> event managers if there is not enough benefit they can just ignore the
> >> feature.
> >>
> >>>>
> >>>> In ethdev, 'rte_event_port_get_monitor_addr()' equivalent is
> >>>> 'rte_eth_get_monitor_addr()'.
> >>>>
> >>>> Although 'rte_eth_get_monitor_addr()' is public API, it is currently
> >>>> only called from Rx/Tx callback functions implemented in the power library.
> >>>> But I assume intention to make it public is to enable users to implement
> >>>> their own callback functions that has custom algorithm for the power
> >>>> management.
> >>>
> >>> If there is a use case for customizing with own callback, we can provide that.
> >>> Provided NULL is valid with default algorithm.
> >>>
> >>>>
> >>>> And probably same is true for the 'rte_event_port_get_monitor_addr()'.
> >>>>
> >>>>
> >>>> Also instead of implementing power features for withing PMDs, isn't it
> >>>> better to have a common eventdev layer for it?
> >>>
> >>> We can have rte_evetdev_pmd_* APIs as non-public APIs.
> >>> My only objection is to NOT introduce _monitor_ APIs at eventdev level,
> >>> Instead, _monitor_ is one way to do it in SW, So we need higher level
> >>> of abstraction.
> >>>
> >>
> >> I see, this seems a trade off between flexibility and usability. If
> >> application has access to _monitor_ APIs, they can be more flexible to
> >> implement their own logic.
> >
> > OK.
> >
> >>
> >> Another option can be application provides the policy with an API and
> >> monitor API used to realize the policy, but for this case it can be
> >> challenge to find and implement correct policies.
> >
> > OK. If we can enumerate the policies, then it will be ideal.
> > On plus side, there will not be any changes in needed in lib/power/
> >
>
> If we are talking about a power framework that user defines policies, I
> expect parsing/defining policies will be in the power library and will
> require changes in the power library anyway.
OK
>
> But as mentioned above it is difficult to define a proper policy, this
> is not really related to eventdev, more a power library issue. We can
> continue to provide flexibility to user in eventdev and discuss the
> policy if a wider forum.
OK.
>
> >
> >>
> >>>>
> >>>> For the PMDs benefit from HW event manager, just not implementing
> >>>> .get_monitor_addr() dev_ops will make them free from power related APIs.
> >>>
> >>> But application fast path code gets diverged by exposing low level primitives.
> >>>
> >>
> >> I am not clear with concern above, but for application that use default
> >> callbacks, 'rte_power_eventdev_pmgmt_port_enable()' needs to be called
> >> to enable this feature, if not called datapath is not impacted.
> >> And if not dequeue callback added at all, custom or default, data path
> >> is not impacted at all.
> >
> > Concerns are around following code[1] when callback is not registered
> > for this use case.
> > In eventdev, we are using _one packet at a time_ for a lot of use case
> > with latency critical workload like L1 processing.
> > On such cases, the following code will add up.
> >
> > [1]
> >   cb = __atomic_load_n((void **)&fp_ops->ev_port.clbk[port_id],
> >     __ATOMIC_RELAXED);
> >   if (unlikely(cb != NULL))
> >        nb_rx = rte_event_dequeue_callbacks(dev_id, port_id, ev, nb_rx, cb);
> >
> > I see two options,
> > 1) Enumerate the power policy and let driver implement through
> > non-public PMD helper functions
> > OR
> > 2)Move the power management callback to driver via non-public PMD
> > helper functions to avoid cost of
> > PMDs where power managment done in HW and to remove above extra check
> > when NO callback is registered[1]
> >
>
> Got it, yes there is an additional check with event callbacks, we can
> add a compiler flag around it as done in ethdev to let it not compiled
> when not needed, will it work?
I would prefer to expose PMD helper function which can be called at
end of the driver dequeue function so that other PMD can reuse as
needed.
This is to avoid compiler flag, cache line occupancy changes in struct
rte_eventdev
struct rte_event_fp_ops in generic code also we may not need
full-fledged generic callbacks scheme for this.
>
    
    
More information about the dev
mailing list