> >>>
> >>>> Previously, the semantics of power monitor were such that we were
> >>>> checking current value against the expected value, and if they matched,
> >>>> then the sleep was aborted. This is somewhat inflexible, because it only
> >>>> allowed us to check for a specific value.
> >>>>
> >>>> This commit adds an option to reverse the check, so that we can have
> >>>> monitor sleep aborted if the expected value *doesn't* match what's in
> >>>> memory. This allows us to both implement all currently implemented
> >>>> driver code, as well as support more use cases which don't easily map to
> >>>> previous semantics (such as waiting on writes to AF_XDP counter value).
> >>>>
> >>>> Since the old behavior is the default, no need to adjust existing
> >>>> implementations.
> >>>>
> >>>> Signed-off-by: Anatoly Burakov <anatoly.bura...@intel.com>
> >>>> ---
> >>>>    lib/eal/include/generic/rte_power_intrinsics.h | 4 ++++
> >>>>    lib/eal/x86/rte_power_intrinsics.c             | 5 ++++-
> >>>>    2 files changed, 8 insertions(+), 1 deletion(-)
> >>>>
> >>>> diff --git a/lib/eal/include/generic/rte_power_intrinsics.h 
> >>>> b/lib/eal/include/generic/rte_power_intrinsics.h
> >>>> index dddca3d41c..1006c2edfc 100644
> >>>> --- a/lib/eal/include/generic/rte_power_intrinsics.h
> >>>> +++ b/lib/eal/include/generic/rte_power_intrinsics.h
> >>>> @@ -31,6 +31,10 @@ struct rte_power_monitor_cond {
> >>>>                           *   4, or 8. Supplying any other value will 
> >>>> result in
> >>>>                           *   an error.
> >>>>                           */
> >>>> +     uint8_t invert;  /**< Invert check for expected value (e.g. 
> >>>> instead of
> >>>> +                       *   checking if `val` matches something, check if
> >>>> +                       *   `val` *doesn't* match a particular value)
> >>>> +                       */
> >>>>    };
> >>>>
> >>>>    /**
> >>>> diff --git a/lib/eal/x86/rte_power_intrinsics.c 
> >>>> b/lib/eal/x86/rte_power_intrinsics.c
> >>>> index 39ea9fdecd..5d944e9aa4 100644
> >>>> --- a/lib/eal/x86/rte_power_intrinsics.c
> >>>> +++ b/lib/eal/x86/rte_power_intrinsics.c
> >>>> @@ -117,7 +117,10 @@ rte_power_monitor(const struct 
> >>>> rte_power_monitor_cond *pmc,
> >>>>                 const uint64_t masked = cur_value & pmc->mask;
> >>>>
> >>>>                 /* if the masked value is already matching, abort */
> >>>> -             if (masked == pmc->val)
> >>>> +             if (!pmc->invert && masked == pmc->val)
> >>>> +                     goto end;
> >>>> +             /* same, but for inverse check */
> >>>> +             if (pmc->invert && masked != pmc->val)
> >>>>                         goto end;
> >>>>         }
> >>>>
> >>>
> >>> Hmm..., such approach looks too 'patchy'...
> >>> Can we at least replace 'inver' with something like:
> >>> enum rte_power_monitor_cond_op {
> >>>           _EQ, NEQ,...
> >>> };
> >>> Then at least new comparions ops can be added in future.
> >>> Even better I think would be to just leave to PMD to provide a comparison 
> >>> callback.
> >>> Will make things really simple and generic:
> >>> struct rte_power_monitor_cond {
> >>>        volatile void *addr;
> >>>        int (*cmp)(uint64_t val);
> >>>        uint8_t size;
> >>> };
> >>> And then in rte_power_monitor(...):
> >>> ....
> >>> const uint64_t cur_value = __get_umwait_val(pmc->addr, pmc->size);
> >>> if (pmc->cmp(cur_value) != 0)
> >>>           goto end;
> >>> ....
> >>>
> >>
> >> I like the idea of a callback, but these are supposed to be
> >> intrinsic-like functions, so putting too much into them is contrary to
> >> their goal, and it's going to make the API hard to use in simpler cases
> >> (e.g. when we're explicitly calling rte_power_monitor as opposed to
> >> letting the RX callback do it for us). For example, event/dlb code calls
> >> rte_power_monitor explicitly.
> >
> > Good point, I didn't know that.
> > Would be interesting to see how do they use it.
> 
> To be fair, it should be possible to rewrite their code using a
> callback. Perhaps adding a (void *) parameter for any custom data
> related to the callback (because C doesn't have closures...), but
> otherwise it should be doable, so the question isn't that it's
> impossible to rewrite event/dlb code to use callbacks, it's more of an
> issue with complicating usage of already-not-quite-straightforward API
> even more.
> 
> >
> >>
> >> It's going to be especially "fun" to do these indirect function calls
> >> from inside transactional region on call to multi-monitor.
> >
> > But the callback is not supposed to do any memory reads/writes.
> > Just mask/compare of the provided value with some constant.
> 
> Yeah, but with callbacks we can't really control that, can we? I mean i
> guess a *sane* implementation wouldn't do that, but still, it's
> theoretically possible to perform more complex checks and even touch
> some unrelated data in the process.

Yep, PMD developer can ignore recommendations and do whatever
he wants in the call-back. We can't control it.
If he touches some memory in it - probably there will be more spurious wakeups 
and less power saves.
In principle it is the same with all other PMD dev-ops - we have to trust that 
they are
doing what they have to.  

> 
> >
> >> I'm not
> >> opposed to having a callback here, but maybe others have more thoughts
> >> on this?
> >>
> >> --
> >> Thanks,
> >> Anatoly
> 
> 
> --
> Thanks,
> Anatoly

Reply via email to