Re: [dpdk-dev] [PATCH v1 1/7] power_intrinsics: allow monitor checks inversion

Ananyev, Konstantin Thu, 24 Jun 2021 08:26:12 -0700

> >>>>>> I did a quick prototype for this, and i don't think it is going to 
> >>>>>> work.
> >>>>>>
> >>>>>> Callbacks with just "current value" as argument will be pretty limited
> >>>>>> and will only really work for cases where we know what we are 
> >>>>>> expecting.
> >>>>>> However, for cases like event/dlb or net/mlx5, the expected value is 
> >>>>>> (or
> >>>>>> appears to be) dependent upon some internal device data, and is not
> >>>>>> constant like in case of net/ixgbe for example.
> >>>>>>
> >>>>>> This can be fixed by passing an opaque pointer, either by storing it in
> >>>>>> the monitor condition, or by passing it directly to rte_power_monitor 
> >>>>>> at
> >>>>>> invocation time.
> >>>>>>
> >>>>>> The latter doesn't work well because when we call rte_power_monitor 
> >>>>>> from
> >>>>>> inside the rte_power library, we lack the context necessary to get said
> >>>>>> opaque pointer.
> >>>>>>
> >>>>>> The former doesn't work either, because the only place where we can get
> >>>>>> this argument is inside get_monitor_addr, but the opaque pointer must
> >>>>>> persist after we exit that function in order to avoid use-after-free -
> >>>>>> which means that it either has to be statically allocated (which means
> >>>>>> it's not thread-safe for a non-trivial case), or dynamically allocated
> >>>>>> (which a big no-no on a hotpath).
> >>>>>
> >>>>> If I get you right, expected_value (and probably mask) can be variable 
> >>>>> ones.
> >>>>> So for callback approach to work we need to pass all this as parameters
> >>>>> to PMD comparison callback:
> >>>>> int pmc_callback(uint64_t real_val, uint64_t expected_val, uint64_t 
> >>>>> mask)
> >>>>> Correct?
> >>>>
> >>>> If we have both expected value, mask, and current value, then what's the
> >>>> point of the callback? The point of the callback would be to pass just
> >>>> the current value, and let the callback decide what's the expected value
> >>>> and how to compare it.
> >>>
> >>> For me the main point of callback is to hide PMD specific comparison 
> >>> semantics.
> >>> Basically they provide us with some values in struct 
> >>> rte_power_monitor_cond,
> >>> and then it is up to them how to interpret them in their comparison 
> >>> function.
> >>> All we'll do for them: will read the value at address provided.
> >>> I understand that it looks like an overkill, as majority of these 
> >>> comparison functions
> >>> will be like:
> >>> int cmp_callback(uint64_t real_val, uint64_t expected_val, uint64_t mask)
> >>> {
> >>>           return ((real_val & mask) == expected_val);
> >>> }
> >>> Though qsort() and bsearch() work in a similar manner, and everyone seems 
> >>> ok with it.
> >>>
> >>>>
> >>>> So, we can either let callback handle expected values itself by having
> >>>> an opaque callback-specific argument (which means it has to persist
> >>>> between .get_monitor_addr() and rte_power_monitor() calls),
> >>>
> >>> But that's what we doing already - PMD fills rte_power_monitor_cond values
> >>> for us, we store them somewhere and then use them to decide should we go 
> >>> to sleep or not.
> >>> All callback does - moves actual values interpretation back to PMD:
> >>> Right now:
> >>> PMD:      provide PMC values
> >>> POWER: store PMC values somewhere
> >>>                   read the value at address provided in PMC
> >>>                   interpret PMC values and newly read value and make the 
> >>> decision
> >>>
> >>> With callback:
> >>> PMD:      provide PMC values
> >>> POWER: store PMC values somewhere
> >>>                   read the value at address provided in PMC
> >>> PMD:      interpret PMC values and newly read value and make the decision
> >>>
> >>> Or did you mean something different here?
> >>>
> >>>> or we do the
> >>>> comparisons inside rte_power_monitor(), and store the expected/mask
> >>>> values in the monitor condition, and *don't* have any callbacks at all.
> >>>> Are you suggesting an alternative to the above two options?
> >>>
> >>> As I said in my first mail - we can just replace 'inverse' with 'op'.
> >>> That at least will make this API extendable, if someone will need
> >>> something different in future.
> >>>
> >>> Another option is
> >>
> >> Right, so the idea is store the PMD-specific data in the monitor
> >> condition, and leave it to the callback to interpret it.
> >>
> >> The obvious question then is, how many values is enough? Two? Three?
> >> Four? This option doesn't really solve the basic issue, it just kicks
> >> the can down the road. I guess three values should be enough for
> >> everyone (tm) ? :D
> >>
> >> I don't like the 'op' thing because if the goal is to be flexible, it's
> >> unnecessarily limiting *and* makes the API even more complex to use. I
> >> would rather have a number of PMD-specific values and leave it up to the
> >> callback to interpret them, because at least that way we're not limited
> >> to predefined operations on the monitor condition data.
> >
> > Just to make sure we are talking about the same, does what you propose
> > looks like that:
> >
> >   struct rte_power_monitor_cond {
> >          volatile void *addr;  /**< Address to monitor for changes */
> >          uint8_t size;    /**< Data size (in bytes) that will be used to 
> > compare
> >                            *   expected value (`val`) with data read from 
> > the
> >                            *   monitored memory location (`addr`). Can be 
> > 1, 2,
> >                            *   4, or 8. Supplying any other value will 
> > result in
> >                            *   an error.
> >                            */
> >          int (*cmp)(uint64_t real_value, const uint64_t opaque[4]);
> >          uint64_t opaque[4];  /*PMD specific data, used by comparison 
> > call-back below */
> > };
> >
> > And then in rte_power_monitor():
> > ...
> > uint64_t cur_value = __get_umwait_val(pmc->addr, pmc->size);
> > if (pmc->cmp(cur_value, pmc->opaque) != 0) {
> >      /* goto sleep */
> > }
> >
> > ?
> >
> 
> Something like that, yes.
>


Seems reasonable to me.
Thanks
Konstantin

Re: [dpdk-dev] [PATCH v1 1/7] power_intrinsics: allow monitor checks inversion

Reply via email to