> >>> > >>>> Previously, the semantics of power monitor were such that we were > >>>> checking current value against the expected value, and if they matched, > >>>> then the sleep was aborted. This is somewhat inflexible, because it only > >>>> allowed us to check for a specific value. > >>>> > >>>> This commit adds an option to reverse the check, so that we can have > >>>> monitor sleep aborted if the expected value *doesn't* match what's in > >>>> memory. This allows us to both implement all currently implemented > >>>> driver code, as well as support more use cases which don't easily map to > >>>> previous semantics (such as waiting on writes to AF_XDP counter value). > >>>> > >>>> Since the old behavior is the default, no need to adjust existing > >>>> implementations. > >>>> > >>>> Signed-off-by: Anatoly Burakov <anatoly.bura...@intel.com> > >>>> --- > >>>> lib/eal/include/generic/rte_power_intrinsics.h | 4 ++++ > >>>> lib/eal/x86/rte_power_intrinsics.c | 5 ++++- > >>>> 2 files changed, 8 insertions(+), 1 deletion(-) > >>>> > >>>> diff --git a/lib/eal/include/generic/rte_power_intrinsics.h > >>>> b/lib/eal/include/generic/rte_power_intrinsics.h > >>>> index dddca3d41c..1006c2edfc 100644 > >>>> --- a/lib/eal/include/generic/rte_power_intrinsics.h > >>>> +++ b/lib/eal/include/generic/rte_power_intrinsics.h > >>>> @@ -31,6 +31,10 @@ struct rte_power_monitor_cond { > >>>> * 4, or 8. Supplying any other value will > >>>> result in > >>>> * an error. > >>>> */ > >>>> + uint8_t invert; /**< Invert check for expected value (e.g. > >>>> instead of > >>>> + * checking if `val` matches something, check if > >>>> + * `val` *doesn't* match a particular value) > >>>> + */ > >>>> }; > >>>> > >>>> /** > >>>> diff --git a/lib/eal/x86/rte_power_intrinsics.c > >>>> b/lib/eal/x86/rte_power_intrinsics.c > >>>> index 39ea9fdecd..5d944e9aa4 100644 > >>>> --- a/lib/eal/x86/rte_power_intrinsics.c > >>>> +++ b/lib/eal/x86/rte_power_intrinsics.c > >>>> @@ -117,7 +117,10 @@ rte_power_monitor(const struct > >>>> rte_power_monitor_cond *pmc, > >>>> const uint64_t masked = cur_value & pmc->mask; > >>>> > >>>> /* if the masked value is already matching, abort */ > >>>> - if (masked == pmc->val) > >>>> + if (!pmc->invert && masked == pmc->val) > >>>> + goto end; > >>>> + /* same, but for inverse check */ > >>>> + if (pmc->invert && masked != pmc->val) > >>>> goto end; > >>>> } > >>>> > >>> > >>> Hmm..., such approach looks too 'patchy'... > >>> Can we at least replace 'inver' with something like: > >>> enum rte_power_monitor_cond_op { > >>> _EQ, NEQ,... > >>> }; > >>> Then at least new comparions ops can be added in future. > >>> Even better I think would be to just leave to PMD to provide a comparison > >>> callback. > >>> Will make things really simple and generic: > >>> struct rte_power_monitor_cond { > >>> volatile void *addr; > >>> int (*cmp)(uint64_t val); > >>> uint8_t size; > >>> }; > >>> And then in rte_power_monitor(...): > >>> .... > >>> const uint64_t cur_value = __get_umwait_val(pmc->addr, pmc->size); > >>> if (pmc->cmp(cur_value) != 0) > >>> goto end; > >>> .... > >>> > >> > >> I like the idea of a callback, but these are supposed to be > >> intrinsic-like functions, so putting too much into them is contrary to > >> their goal, and it's going to make the API hard to use in simpler cases > >> (e.g. when we're explicitly calling rte_power_monitor as opposed to > >> letting the RX callback do it for us). For example, event/dlb code calls > >> rte_power_monitor explicitly. > > > > Good point, I didn't know that. > > Would be interesting to see how do they use it. > > To be fair, it should be possible to rewrite their code using a > callback. Perhaps adding a (void *) parameter for any custom data > related to the callback (because C doesn't have closures...), but > otherwise it should be doable, so the question isn't that it's > impossible to rewrite event/dlb code to use callbacks, it's more of an > issue with complicating usage of already-not-quite-straightforward API > even more. > > > > >> > >> It's going to be especially "fun" to do these indirect function calls > >> from inside transactional region on call to multi-monitor. > > > > But the callback is not supposed to do any memory reads/writes. > > Just mask/compare of the provided value with some constant. > > Yeah, but with callbacks we can't really control that, can we? I mean i > guess a *sane* implementation wouldn't do that, but still, it's > theoretically possible to perform more complex checks and even touch > some unrelated data in the process.
Yep, PMD developer can ignore recommendations and do whatever he wants in the call-back. We can't control it. If he touches some memory in it - probably there will be more spurious wakeups and less power saves. In principle it is the same with all other PMD dev-ops - we have to trust that they are doing what they have to. > > > > >> I'm not > >> opposed to having a callback here, but maybe others have more thoughts > >> on this? > >> > >> -- > >> Thanks, > >> Anatoly > > > -- > Thanks, > Anatoly