Hi Patrick, On Fri, Jul 20, 2018 at 8:11 AM, Patrick Bellasi <patrick.bell...@arm.com> wrote: > Hi Suren, > thanks for the review, all good point... some more comments follow > inline. > > On 19-Jul 16:51, Suren Baghdasaryan wrote: >> On Mon, Jul 16, 2018 at 1:28 AM, Patrick Bellasi >> <patrick.bell...@arm.com> wrote: > > [...] > >> > +/** >> > + * uclamp_group_available: checks if a clamp group is available >> > + * @clamp_id: the utilization clamp index (i.e. min or max clamp) >> > + * @group_id: the group index in the given clamp_id >> > + * >> > + * A clamp group is not free if there is at least one SE which is sing a >> > clamp >> >> Did you mean to say "single clamp"? > > No, it's "...at least one SE which is USING a clamp value..." > >> > + * value mapped on the specified clamp_id. These SEs are reference >> > counted by >> > + * the se_count of a uclamp_map entry. >> > + * >> > + * Return: true if there are no SE's mapped on the specified clamp >> > + * index and group >> > + */ >> > +static inline bool uclamp_group_available(int clamp_id, int group_id) >> > +{ >> > + struct uclamp_map *uc_map = &uclamp_maps[clamp_id][0]; >> > + >> > + return (uc_map[group_id].value == UCLAMP_NONE); >> >> The usage of UCLAMP_NONE is very confusing to me. It was not used at >> all in the patch where it was introduced [1/12], here it's used as a >> clamp value and in uclamp_group_find() it's used as group_id. Please >> clarify the usage. > > Yes, it's meant to represent a "clamp not valid" condition, whatever > it's a "clamp group" or a "clamp value"... perhaps the name can be > improved. > >> I also feel UCLAMP_NONE does not really belong to >> the uclamp_id enum because other elements there are indexes in >> uclamp_maps and this one is a special value. > > Right, it looks a bit misplaced, I agree. I think I tried to set it > using a #define but there was some issues I don't remember now... > Anyway, I'll give it another go... > > >> IMHO if both *group_id* >> and *value* need a special value (-1) to represent >> unused/uninitialized entry it would be better to use different >> constants. Maybe UCLAMP_VAL_NONE and UCLAMP_GROUP_NONE? > > Yes, maybe we can use a > > #define UCLAMP_NOT_VALID -1 > > and get rid the confusing enum entry. > > Will update it on v3. >
Sounds good to me. >> > +} > > [...] > >> > +/** >> > + * uclamp_group_find: finds the group index of a utilization clamp group >> > + * @clamp_id: the utilization clamp index (i.e. min or max clamping) >> > + * @clamp_value: the utilization clamping value lookup for >> > + * >> > + * Verify if a group has been assigned to a certain clamp value and return >> > + * its index to be used for accounting. >> > + * >> > + * Since only a limited number of utilization clamp groups are allowed, >> > if no >> > + * groups have been assigned for the specified value, a new group is >> > assigned >> > + * if possible. Otherwise an error is returned, meaning that an >> > additional clamp >> > + * value is not (currently) supported. >> > + */ >> > +static int >> > +uclamp_group_find(int clamp_id, unsigned int clamp_value) >> > +{ >> > + struct uclamp_map *uc_map = &uclamp_maps[clamp_id][0]; >> > + int free_group_id = UCLAMP_NONE; >> > + unsigned int group_id = 0; >> > + >> > + for ( ; group_id <= CONFIG_UCLAMP_GROUPS_COUNT; ++group_id) { >> > + /* Keep track of first free clamp group */ >> > + if (uclamp_group_available(clamp_id, group_id)) { >> > + if (free_group_id == UCLAMP_NONE) >> > + free_group_id = group_id; >> > + continue; >> > + } >> > + /* Return index of first group with same clamp value */ >> > + if (uc_map[group_id].value == clamp_value) >> > + return group_id; >> > + } >> > + /* Default to first free clamp group */ >> > + if (group_id > CONFIG_UCLAMP_GROUPS_COUNT) >> >> Is the condition above needed? I think it's always true if you got here. >> Also AFAIKT after the for loop you can just do: >> >> return (free_group_id != UCLAMP_NONE) ? free_group_id : -ENOSPC; > > Yes, you right... the code above can be simplified! > >> >> > + group_id = free_group_id; >> > + /* All clamp group already track different clamp values */ >> > + if (group_id == UCLAMP_NONE) >> > + return -ENOSPC; >> > + return group_id; >> > +} > > [...] > >> > +static inline void uclamp_group_put(int clamp_id, int group_id) >> > +{ >> > + struct uclamp_map *uc_map = &uclamp_maps[clamp_id][0]; >> > + unsigned long flags; >> > + >> > + /* Ignore SE's not yet attached */ >> > + if (group_id == UCLAMP_NONE) >> > + return; >> > + >> > + /* Remove SE from this clamp group */ >> > + raw_spin_lock_irqsave(&uc_map[group_id].se_lock, flags); >> > + uc_map[group_id].se_count -= 1; >> >> If uc_map[group_id].se_count was 0 before decrement you end up with >> se_count == -1 and no reset for the element. > > Well... this should never happen, otherwise the refcounting is not > working as expected. > > Maybe we can add (at least) a debug check and warning, something like: > > #ifdef SCHED_DEBUG > if (unlikely(uc_map[group_id].se_count == 0)) { > WARN(1, "invalid clamp group [%d:%d] refcount\n", > clamp_id, group_id); > uc_map[group_id].se_count = 1; > } > #endif > >> > + if (uc_map[group_id].se_count == 0) >> > + uclamp_group_reset(clamp_id, group_id); >> > + raw_spin_unlock_irqrestore(&uc_map[group_id].se_lock, flags); >> > +} >> > + > > [...] > >> > static inline int __setscheduler_uclamp(struct task_struct *p, >> > const struct sched_attr *attr) >> > { >> > + struct uclamp_se *uc_se; >> > + int retval = 0; >> > + >> > if (attr->sched_util_min > attr->sched_util_max) >> > return -EINVAL; >> > if (attr->sched_util_max > SCHED_CAPACITY_SCALE) >> > return -EINVAL; >> > >> > - p->uclamp[UCLAMP_MIN] = attr->sched_util_min; >> > - p->uclamp[UCLAMP_MAX] = attr->sched_util_max; >> > + mutex_lock(&uclamp_mutex); >> > + >> > + /* Update min utilization clamp */ >> > + uc_se = &p->uclamp[UCLAMP_MIN]; >> > + retval |= uclamp_group_get(p, UCLAMP_MIN, uc_se, >> > + attr->sched_util_min); >> > + >> > + /* Update max utilization clamp */ >> > + uc_se = &p->uclamp[UCLAMP_MAX]; >> > + retval |= uclamp_group_get(p, UCLAMP_MAX, uc_se, >> > + attr->sched_util_max); >> > + >> > + mutex_unlock(&uclamp_mutex); >> > + >> > + /* >> > + * If one of the two clamp values should fail, >> > + * let the userspace know. >> > + */ >> > + if (retval) >> > + return -ENOSPC; >> >> Maybe a minor issue but this failure is ambiguous. It might mean: >> 1. no clamp value was updated >> 2. UCLAMP_MIN was updated but UCLAMP_MAX was not >> 3. UCLAMP_MAX was updated but UCLAMP_MIN was not > > That's right, I put a bit of thought on that me too but at the end > I've been convinced that the possibility to use a single syscall to > set both clamps at the same time has some benefits for user-space. > > Maybe the current solution can be improved by supporting an (optional) > strict semantic with an in-kernel roll-back in case one of the two > uclamp_group_get should fail. > > The strict semantic with roll-back could be controller via an > additional flag, e.g. SCHED_FLAG_UTIL_CLAMP_STRICT. > > When the flag is set, either we are able to set both the attributes or > we roll-back. Otherwise, when the flag is not set, we keep the current > behavior. i.e. we set what we can and report an error to notify > userspace that one constraints was not enforced. > > The following snippet should implement this semantics: > > ---8<--- > > /* Uclamp flags */ > #define SCHED_FLAG_UTIL_CLAMP_STRICT 0x11 /* Roll-back on failure */ > #define SCHED_FLAG_UTIL_CLAMP_MIN 0x12 /* Update util_min */ > #define SCHED_FLAG_UTIL_CLAMP_MAX 0x14 /* Update util_max */ > #define SCHED_FLAG_UTIL_CLAMP ( \ > SCHED_FLAG_UTIL_CLAMP_MIN | SCHED_FLAG_UTIL_CLAMP_MAX) > Having ability to update only min or only max this way might be indeed very useful. Instead of rolling back on failure I would suggest to check both inputs first to make sure there won't be any error before updating. This would remove the need for SCHED_FLAG_UTIL_CLAMP_STRICT (which I think any user would want to set to 1 anyway). Looks like uclamp_group_get() can fail only if uclamp_group_find() fails to find a slot for uclamp_value or a free slot. So one way to do this search before update is to call uclamp_group_find() for both UCLAMP_MIN and UCLAMP_MAX beforehand and if they succeed then pass obtained next_group_ids into uclamp_group_get() to avoid doing the same search twice. This requires some refactoring of uclamp_group_get() but I think the end result would be a cleaner and more predictable solution. > static inline int __setscheduler_uclamp(struct task_struct *p, > const struct sched_attr *attr) > { > unsigned int uclamp_value_old = 0; > unsigned int uclamp_value; > struct uclamp_se *uc_se; > int retval = 0; > > if (attr->sched_util_min > attr->sched_util_max) > return -EINVAL; > if (attr->sched_util_max > 100) > return -EINVAL; > > mutex_lock(&uclamp_mutex); > > if (!(attr->sched_flags & SCHED_FLAG_UTIL_CLAMP_MIN)) > goto set_util_max; > > uc_se = &p->uclamp[UCLAMP_MIN]; > uclamp_value = scale_from_percent(attr->sched_util_min); > if (uc_se->value == uclamp_value) > goto set_util_max; > > /* Update min utilization clamp */ > uclamp_value_old = uc_se->value; > retval |= uclamp_group_get(p, NULL, UCLAMP_MIN, uc_se, uclamp_value); > if (retval && > attr->sched_flags & SCHED_FLAG_UTIL_CLAMP_STRICT) > goto exit_failure; > > set_util_max: > > if (!(attr->sched_flags & SCHED_FLAG_UTIL_CLAMP_MAX)) > goto exit_done; > > uc_se = &p->uclamp[UCLAMP_MAX]; > uclamp_value = scale_from_percent(attr->sched_util_max); > if (uc_se->value == uclamp_value) > goto exit_done; > > /* Update max utilization clamp */ > if (uclamp_group_get(p, NULL, UCLAMP_MAX, > uc_se, uclamp_value)) > goto exit_rollback; > > exit_done: > mutex_unlock(&uclamp_mutex); > return retval; > > exit_rollback: > if (attr->sched_flags & SCHED_FLAG_UTIL_CLAMP_MIN && > attr->sched_flags & SCHED_FLAG_UTIL_CLAMP_STRICT) { > uclamp_group_get(p, NULL, UCLAMP_MIN, > uc_se, uclamp_value_old); > } > exit_failure: > mutex_unlock(&uclamp_mutex); > > return -ENOSPC; > } > > ---8<--- > > > Could that work better? > > The code is maybe a bit more convoluted... but perhaps it can be > improved by encoding it in a loop. > > > -- > #include <best/regards.h> > > Patrick Bellasi