
Hi Jerin, 

> -----Original Message-----
> From: Jerin Jacob <jerinjac...@gmail.com>
> Sent: Wednesday, May 3, 2023 1:57 PM
> To: Yigit, Ferruh <ferruh.yi...@amd.com>
> Cc: Tummala, Sivaprasad <sivaprasad.tumm...@amd.com>;
> david.h...@intel.com; jer...@marvell.com; harry.van.haa...@intel.com;
> dev@dpdk.org; Pavan Nikhilesh <pbhagavat...@marvell.com>; McDaniel, Timothy
> <timothy.mcdan...@intel.com>; Shijith Thotton <sthot...@marvell.com>;
> Hemant Agrawal <hemant.agra...@nxp.com>; Sachin Saxena
> <sachin.sax...@oss.nxp.com>; Mattias Rönnblom
> <mattias.ronnb...@ericsson.com>; Peter Mccarthy
> <peter.mccar...@intel.com>; Liang Ma <lian...@liangbit.com>
> Subject: Re: [RFC PATCH 1/5] eventdev: add power monitoring API on event port
> 
> 
> 
> On Wed, May 3, 2023 at 1:44 PM Ferruh Yigit <ferruh.yi...@amd.com> wrote:
> >
> > On 5/3/2023 8:58 AM, Jerin Jacob wrote:
> > > On Tue, May 2, 2023 at 4:49 PM Ferruh Yigit <ferruh.yi...@amd.com> wrote:
> > >>
> > >> On 4/25/2023 5:09 AM, Jerin Jacob wrote:
> > >>> On Mon, Apr 24, 2023 at 9:36 PM Ferruh Yigit <ferruh.yi...@amd.com>
> wrote:
> > >>>>
> > >>>> On 4/19/2023 11:15 AM, Jerin Jacob wrote:
> > >>>>> On Wed, Apr 19, 2023 at 3:24 PM Sivaprasad Tummala
> > >>>>> <sivaprasad.tumm...@amd.com> wrote:
> > >>>>>>
> > >>>>>> A new API to allow a power monitoring condition on an event port to
> > >>>>>> optimize power when no events are arriving on an event port for
> > >>>>>> the worker core to process in an eventdev-based pipelined
> > >>>>>> application.
> > >>>>>>
> > >>>>>> Signed-off-by: Sivaprasad Tummala <sivaprasad.tumm...@amd.com>
> > >>>>>> + *
> > >>>>>> + * @param dev_id
> > >>>>>> + *   Eventdev id
> > >>>>>> + * @param port_id
> > >>>>>> + *   Eventdev port id
> > >>>>>> + * @param pmc
> > >>>>>> + *   The pointer to power-optimized monitoring condition structure.
> > >>>>>> + *
> > >>>>>> + * @return
> > >>>>>> + *   - 0: Success.
> > >>>>>> + *   -ENOTSUP: Operation not supported.
> > >>>>>> + *   -EINVAL: Invalid parameters.
> > >>>>>> + *   -ENODEV: Invalid device ID.
> > >>>>>> + */
> > >>>>>> +__rte_experimental
> > >>>>>> +int
> > >>>>>> +rte_event_port_get_monitor_addr(uint8_t dev_id, uint8_t port_id,
> > >>>>>> +               struct rte_power_monitor_cond *pmc);
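
To illustrate the intended usage from the application side, here is a minimal
sketch (it assumes only the API above plus the existing rte_power_monitor()
intrinsic; the empty-poll threshold and the ~1 ms timeout are arbitrary example
values, not part of the proposal):

#include <rte_eventdev.h>
#include <rte_power_intrinsics.h>
#include <rte_cycles.h>

#define EMPTY_POLL_THRESHOLD 100

static void
worker_loop(uint8_t dev_id, uint8_t port_id)
{
    struct rte_power_monitor_cond pmc;
    struct rte_event ev;
    unsigned int empty_polls = 0;

    while (1) {
        uint16_t nb = rte_event_dequeue_burst(dev_id, port_id, &ev, 1, 0);

        if (nb == 0) {
            /* Arm the monitor only after a run of empty polls. */
            if (++empty_polls > EMPTY_POLL_THRESHOLD &&
                rte_event_port_get_monitor_addr(dev_id, port_id, &pmc) == 0)
                /* Sleep until the monitored address is written or
                 * roughly 1 ms worth of TSC cycles has elapsed. */
                rte_power_monitor(&pmc, rte_get_tsc_cycles() +
                                        rte_get_tsc_hz() / 1000);
            continue;
        }
        empty_polls = 0;
        /* ... process ev ... */
    }
}
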
> > >>>>>
> > >>>>> + eventdev driver maintainers
> > >>>>>
> > >>>>> I think we don't need to expose this API to applications:
> > >>>>> 1) To keep applications transparent to whether power saving is
> > >>>>> enabled or not.
> > >>>>> 2) Some HW and architectures already support power management in the
> > >>>>> driver and in HW (not using the CPU architecture directly).
> > >>>>>
> > >>>>> If so, that would translate to the following:
> > >>>>> a) Add rte_event_port_power_saving_ena_dis(uint8_t dev_id,
> > >>>>> uint8_t port_id, bool ena) for controlling power saving in the slow path.
> > >>>>> b) Create a reusable PMD-private function based on the CPU
> > >>>>> architecture power saving primitive to cover PMDs that don't have
> > >>>>> native power saving support.
> > >>>>> c) Update the rte_event_dequeue_burst() PMD burst callback to use (b).
> > >>>>>
> > >>>>>
> > >>>>
> > >>>> Hi Jerin,
> > >>>
> > >>> Hi Ferruh,
> > >>>
> > >>>>
> > >>>> The ethdev approach seems to be applied here.
> > >>>
> > >>> Understood. But none of the NIC HW supports power management
> > >>> at the HW level the way eventdev does, so what we are doing for
> > >>> ethdev is the correct abstraction for ethdev.
> > >>>
> > >>
> > >> What I understand is that there are HW-based event managers and SW-based
> > >> ones; the SW-based ones can benefit more from CPU power optimizations,
> > >> and for HW event managers, if there is not enough benefit, they can just
> > >> ignore the feature.
> > >>
> > >>>>
> > >>>> In ethdev, the equivalent of 'rte_event_port_get_monitor_addr()' is
> > >>>> 'rte_eth_get_monitor_addr()'.
> > >>>>
> > >>>> Although 'rte_eth_get_monitor_addr()' is a public API, it is
> > >>>> currently only called from the Rx/Tx callback functions implemented in
> > >>>> the power library.
> > >>>> But I assume the intention of making it public is to enable users to
> > >>>> implement their own callback functions that have a custom algorithm
> > >>>> for power management.
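
For reference, a custom callback on the ethdev side can already be built from
the public pieces. A simplified sketch (using only existing ethdev APIs; the
per-queue empty-poll bookkeeping that real code would need is omitted, and the
~1 ms timeout is just an example value):

#include <rte_ethdev.h>
#include <rte_mbuf.h>
#include <rte_power_intrinsics.h>
#include <rte_cycles.h>

/* Custom Rx callback: when a poll returns no packets, arm the monitor on
 * the queue's descriptor address instead of spinning. */
static uint16_t
custom_rx_pmgmt_cb(uint16_t port_id, uint16_t queue_id,
                   struct rte_mbuf *pkts[] __rte_unused, uint16_t nb_rx,
                   uint16_t max_pkts __rte_unused, void *user_param __rte_unused)
{
    struct rte_power_monitor_cond pmc;

    if (nb_rx == 0 &&
        rte_eth_get_monitor_addr(port_id, queue_id, &pmc) == 0)
        rte_power_monitor(&pmc,
                          rte_get_tsc_cycles() + rte_get_tsc_hz() / 1000);
    return nb_rx;
}

/* Registered once per Rx queue, e.g.:
 *   rte_eth_add_rx_callback(port_id, queue_id, custom_rx_pmgmt_cb, NULL);
 */
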
> > >>>
> > >>> If there is a use case for customizing with an application's own
> > >>> callback, we can provide that, with NULL being valid for the default
> > >>> algorithm.
> > >>>
> > >>>>
> > >>>> And probably the same is true for 'rte_event_port_get_monitor_addr()'.
> > >>>>
> > >>>>
> > >>>> Also, instead of implementing power features within the PMDs,
> > >>>> isn't it better to have a common eventdev layer for it?
> > >>>
> > >>> We can have rte_eventdev_pmd_* APIs as non-public APIs.
> > >>> My only objection is to NOT introduce _monitor_ APIs at the eventdev
> > >>> level. _monitor_ is just one way to do it in SW, so we need a higher
> > >>> level of abstraction.
> > >>>
> > >>
> > >> I see, this seems like a trade-off between flexibility and usability. If
> > >> the application has access to the _monitor_ APIs, it has more flexibility
> > >> to implement its own logic.
> > >
> > > OK.
> > >
> > >>
> > >> Another option could be that the application provides the policy with an
> > >> API and the monitor API is used to realize that policy, but in this case
> > >> it can be a challenge to define and implement correct policies.
> > >
> > > OK. If we can enumerate the policies, that would be ideal.
> > > On the plus side, there would not be any changes needed in lib/power/.
> > >
> >
> > If we are talking about a power framework where the user defines policies,
> > I expect parsing/defining policies will be in the power library and
> > will require changes in the power library anyway.
> 
> OK
> 
> >
> > But as mentioned above, it is difficult to define a proper policy; this
> > is not really related to eventdev, it is more of a power library issue. We
> > can continue to provide flexibility to the user in eventdev and discuss the
> > policy in a wider forum.
> 
> OK.
> 
> >
> > >
> > >>
> > >>>>
> > >>>> For the PMDs that benefit from a HW event manager, simply not
> > >>>> implementing the .get_monitor_addr() dev_ops will keep them free of
> > >>>> power-related APIs.
> > >>>
> > >>> But application fast-path code gets diverged by exposing low-level
> > >>> primitives.
> > >>>
> > >>
> > >> I am not clear on the concern above, but for applications that use the
> > >> default callbacks, 'rte_power_eventdev_pmgmt_port_enable()' needs to be
> > >> called to enable this feature; if it is not called, the datapath is not
> > >> impacted.
> > >> And if no dequeue callback is added at all, custom or default, the data
> > >> path is not impacted at all.
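
For reference, on the ethdev side enabling the built-in callback is a single
slow-path call, and the eventdev enable is expected to be used the same way.
The snippet below uses only the existing ethdev API; the exact signature of the
eventdev counterpart is defined by this series, not assumed here:

#include <rte_lcore.h>
#include <rte_power_pmd_mgmt.h>

/* Existing ethdev analogue of the proposed
 * rte_power_eventdev_pmgmt_port_enable(): attach the default
 * power-management Rx callback for this lcore/queue. */
static int
enable_rxq_power_mgmt(uint16_t port_id, uint16_t queue_id)
{
    return rte_power_ethdev_pmgmt_queue_enable(rte_lcore_id(), port_id,
                                               queue_id,
                                               RTE_POWER_MGMT_TYPE_MONITOR);
}
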
> > >
> > > The concerns are around the following code [1] when no callback is
> > > registered for this use case.
> > > In eventdev, we are using _one packet at a time_ for a lot of use
> > > cases with latency-critical workloads like L1 processing.
> > > In such cases, the cost of the following code will add up.
> > >
> > > [1]
> > >   cb = __atomic_load_n((void **)&fp_ops->ev_port.clbk[port_id],
> > >                        __ATOMIC_RELAXED);
> > >   if (unlikely(cb != NULL))
> > >           nb_rx = rte_event_dequeue_callbacks(dev_id, port_id, ev,
> > >                                               nb_rx, cb);
> > >
> > > I see two options:
> > > 1) Enumerate the power policies and let the driver implement them through
> > > non-public PMD helper functions, OR
> > > 2) Move the power management callback into the driver via non-public PMD
> > > helper functions, to avoid the cost for PMDs where power management is
> > > done in HW, and to remove the above extra check when NO callback is
> > > registered [1].
> > >
> >
> > Got it, yes there is an additional check with event callbacks. We can
> > add a compiler flag around it, as done in ethdev, so that it is not compiled
> > when not needed; will that work?
> 
> I would prefer to expose a PMD helper function which can be called at the end
> of the driver dequeue function, so that other PMDs can reuse it as needed.
> This avoids a compiler flag and cache-line occupancy changes in struct
> rte_eventdev / struct rte_event_fp_ops in generic code; also, we may not need
> a full-fledged generic callback scheme for this.
> 
> >
OK. Will fix this in the v1 patch.
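
As a rough illustration of the direction (the helper name, state handling and
threshold below are placeholders, not the final v1 code), the idea is that a SW
driver calls the helper at the tail of its dequeue path, so a PMD with HW power
management never touches it and the generic dequeue path keeps no extra
callback check:

#include <stdbool.h>
#include <stdint.h>
#include <rte_power_intrinsics.h>
#include <rte_cycles.h>

struct pmd_port_pm_state {
    bool enabled;          /* set/cleared via the slow-path enable API */
    uint32_t empty_polls;  /* consecutive empty dequeues on this port */
};

/* Hypothetical non-public PMD helper: called by a SW driver at the end of
 * its dequeue function with the number of events just dequeued; the driver
 * fills in the monitoring condition for its own ring. */
static inline void
event_pmd_power_wait(struct pmd_port_pm_state *pm,
                     const struct rte_power_monitor_cond *pmc,
                     uint16_t nb_deq)
{
    if (!pm->enabled)
        return;
    if (nb_deq != 0) {
        pm->empty_polls = 0;
        return;
    }
    /* Only sleep after a run of empty polls (threshold illustrative). */
    if (++pm->empty_polls > 128)
        rte_power_monitor(pmc, rte_get_tsc_cycles() +
                               rte_get_tsc_hz() / 1000);
}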
