On 12-Sep 18:12, Peter Zijlstra wrote:
> On Wed, Sep 12, 2018 at 04:56:19PM +0100, Patrick Bellasi wrote:
> > On 12-Sep 15:49, Peter Zijlstra wrote:
> > > On Tue, Aug 28, 2018 at 02:53:10PM +0100, Patrick Bellasi wrote:
> > > > +/**
> > > > + * uclamp_map: reference counts a utilization "clamp value"
> > > > + * @value: the utilization "clamp value" required
> > > > + * @se_count: the number of scheduling entities requiring the "clamp value"
> > > > + * @se_lock: serialize reference count updates by protecting se_count
> > >
> > > Why do you have a spinlock to serialize a single value? Don't we have
> > > atomics for that?
> >
> > There are some code paths where it's used to protect clamp groups
> > mapping and initialization, e.g.
> >
> >     uclamp_group_get()
> >         spin_lock()
> >         // initialize clamp group (if required) and then...
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This is actually a couple of function calls.
> >         se_count += 1
> >         spin_unlock()
> >
> > Almost all these paths are triggered from user-space and protected
> > by a global uclamp_mutex, except for the fork/exit paths.
> >
> > To serialize those paths I'm using the spinlock above; does it make
> > sense? Can we use the global uclamp_mutex on fork/exit too?
>
> OK, then your comment is misleading; it serializes both fields.

Yes... that definitely needs an update.

> > One additional observation is that, if in the future we want to add a
> > kernel-space API (e.g. a driver asking for a new clamp value), maybe
> > we will need a serialized non-sleeping uclamp_group_get() API?
>
> No idea; but if you want to go all fancy you can replace the whole
> uclamp_map thing with something like:
>
>     struct uclamp_map {
>         union {
>             struct {
>                 unsigned long v : 10;
>                 unsigned long c : BITS_PER_LONG - 10;
>             };
>             atomic_long_t s;
>         };
>     };

That sounds really cool and scary at the same time! :)

The v:10 requires that we never set SCHED_CAPACITY_SCALE > 1024, or that
we use it to track a percentage value (i.e. [0..100]). One of the last
patches introduces percentage values for user-space, but I was
considering that in kernel space we should always track full-scale
utilization values.

The c:(BITS_PER_LONG - 10) restricts the number of concurrently active
SEs refcounting the same clamp value, which on some 32-bit systems is
only ~4 million among tasks and cgroups... maybe still reasonable...

> And use uclamp_map::c == 0 as unused (as per normal refcount
> semantics) and atomic_long_cmpxchg() the whole thing using
> uclamp_map::s.

Yes... that could work for the uclamp_map updates but, as I noted above,
I think I have other calls serialized by that lock... I will look more
closely into what you suggest, thanks!

[...]

> > > What's the purpose of that cacheline align statement?
> >
> > In uclamp_maps, we still need to scan the array when a clamp value is
> > changed from user-space, i.e. the cases reported above.
> > Thus, that alignment is just to ensure that we minimize the number of
> > cache lines used. Does that make sense?
> >
> > Maybe that alignment is implicitly generated by the compiler?
>
> It is not, but if it really is a slow path, we shouldn't care about
> alignment.

Ok, I will remove it.

> > > Note that without that apparently superfluous lock, it would be 8*12 =
> > > 96 bytes, which is 1.5 lines and would indeed suggest you default to
> > > GROUP_COUNT=7 to fill 2 lines.
> >
> > Yes, I will check whether we can count on just the uclamp_mutex.
>
> Well, if we don't care about performance (slow path) then keeping the
> lock is fine; just the comment and alignment are misleading.

Ok.

[...]

Cheers,
Patrick

-- 
#include <best/regards.h>

Patrick Bellasi