<snip>
>
> On Thu, May 28, 2020 at 9:08 PM Ananyev, Konstantin
> <konstantin.anan...@intel.com> wrote:
> >
> > > > Hi Anatoly,
> > > >
> > > >> Add two new power management intrinsics, and provide an
> > > >> implementation in eal/x86 based on UMONITOR/UMWAIT instructions.
> > > >> The instructions are implemented as raw byte opcodes because
> > > >> there is not yet widespread compiler support for these instructions.
> > > >>
> > > >> The power management instructions provide an
> > > >> architecture-specific function to either wait until a specified
> > > >> TSC timestamp is reached, or optionally wait until either a TSC
> > > >> timestamp is reached or a memory location is written to. The
> > > >> monitor function also provides an optional comparison, to avoid
> > > >> sleeping when the expected write has already happened, and no more
> > > >> writes are expected.
> > > >
> > > > Recently ARM guys introduced new generic API for similar (as I
> > > > understand) purposes: rte_wait_until_equal_(16|32|64).
> > > > Probably would make sense to unite both APIs into something common
> > > > and HW transparent.
> > > > Konstantin
> > >
> > > Hi Konstantin,
> > >
> > > That's not really similar purpose. This is monitoring a cacheline
> > > for writes, not waiting on a specific value.
> >
> > I understand that.
> >
> > > The "expected" value is there
> > > as basically a hack to get around the race condition due to the fact
> > > that by the time you enter monitoring state, the write you're
> > > waiting for may have already happened.
> >
> > AFAIK, current rte_wait_until_equal_* does pretty much the same thing:
> >
> > LDXR memaddr, $reg // an address to monitor for
> > if ($reg != expected_value)
> >     SEVL // arm monitor
> >     do {
> >         WFE // waits for write to that memory address
> >         LDXR memaddr, $reg
> >     } while ($reg != expected_value);
> >
> > Looks pretty similar to what rte_power_monitor() does, except you
> > don't have a loop for checking the new value.
> > Plus rte_power_monitor() provides extra options to the user -
> > timestamp and power save mode to enter.
> > Also I don't know what is the granularity of such events on ARM, is it
> > a cache-line or more/less.
>
> As I understand it, Granularity is per the cache-line.
> ie. Load-exclusive(LDXR) followed by WFE will wait in a low-power state until
> the cache line is written.

Architecture allows for 16B to 2048B space. Typically, implementations use cache-line granularity.
>
> But I see UMONITOR bit different, Where _without_ other core signaling to
> wakeup from wait state, it can wake on TSC expiry. I think, that is the main
> primitive on this feature. Right?
>
> WFE can also wake based on Timer stream events(kind of TSC in x86
> analogy) but it has a configuration
> bit that needs to allow for this scheme in userspace(EL0) or not?
> defined by EL1(Linux kernel).

Timer stream events are not per CPU core. They are system wide streams.

> I am planning to spend time on this after understanding the value addition of
> the feature/usecase[1]
>
> [1] http://mails.dpdk.org/archives/dev/2020-May/168888.html
>
> >
> > Might be ARM people can comment/correct me here.
> > Konstantin
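
For illustration only, the semantics being compared in this thread (return at once if the expected value is already there, otherwise wait for either the write or a deadline) could be modelled in portable C roughly as below. This is a sketch under assumptions, not the patch's implementation: wait_until_equal_or_deadline() and now_ns() are hypothetical names and not DPDK APIs, CLOCK_MONOTONIC stands in for the TSC, and the busy-poll stands in for the low-power wait (WFE on Arm, UMWAIT on x86).

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Monotonic clock in nanoseconds; a stand-in for the TSC in this sketch. */
static uint64_t now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/*
 * Hypothetical helper, NOT a DPDK API.
 * Returns true once *addr is observed equal to 'expected',
 * or false if 'deadline_ns' passes first.
 */
static bool wait_until_equal_or_deadline(_Atomic uint64_t *addr,
                                         uint64_t expected,
                                         uint64_t deadline_ns)
{
    assert(addr != NULL);
    for (;;) {
        /* Check the value first: if the write already happened,
         * return without waiting at all (closes the race). */
        if (atomic_load_explicit(addr, memory_order_acquire) == expected)
            return true;
        /* Timed wakeup: give up once the deadline has passed. */
        if (now_ns() >= deadline_ns)
            return false;
        /* Assumption: real code would enter a low-power wait here
         * (WFE or UMWAIT) instead of polling in a tight loop. */
    }
}
```

Because the value is checked before the deadline, a write that landed before the call is never missed, which is the race the "expected" value is meant to close. A caller would pass something like now_ns() + 1000000 as the deadline to wait for at most about 1 ms.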