Re: [PATCH] sched: fix env->src_cpu for active migration

2013-02-14 Thread Vincent Guittot
On 13 February 2013 21:02, Damien Wyart  wrote:
>> Bingo, that was the problem in my setup: as the patch was applied
>> through a script with others, I had missed the error message about the
>> conflict (I have also another conflict which can be safely ignored so
>> the new one did not catch my eye)... So the patch was only
>> half-applied, and the final code is broken.
>
>> How did you solve the conflict (I am not a scheduler expert)? I can
>> retry running the patched kernel with your resolution, to check if
>> everything is ok.
>
> After looking a bit more, the conflict resolution seemed straighforward,

Yes the resolution is straightforward and your patch is ok.

> so I gave it a go. The attached version booted fine, so the initial
> problem was purely PEBCAK... Sorry for the noise!

Now, i know that my patch on frederic's branch, boots on a x86_64,
Core i7 920 :-)

Vincent

>
> --
> Damien

___
linaro-dev mailing list
linaro-dev@lists.linaro.org
http://lists.linaro.org/mailman/listinfo/linaro-dev


Re: [PATCH] sched: fix env->src_cpu for active migration

2013-02-14 Thread Vincent Guittot
On 13 February 2013 21:03, Paul Turner  wrote:
> On Tue, Feb 12, 2013 at 5:19 AM, Vincent Guittot
>  wrote:
>> need_active_balance uses env->src_cpu which is set only if there is more
>> than 1 task on the run queue.
>> We must set the src_cpu field unconditionnally
>> otherwise the test "env->src_cpu > env->dst_cpu" will always fail if there is
>> only 1 task on the run queue
>>
>> Signed-off-by: Vincent Guittot 
>> ---
>>  kernel/sched/fair.c |6 --
>>  1 file changed, 4 insertions(+), 2 deletions(-)
>>
>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>> index 81fa536..32938ea 100644
>> --- a/kernel/sched/fair.c
>> +++ b/kernel/sched/fair.c
>> @@ -5044,6 +5044,10 @@ redo:
>>
>
> Aside:  Oh dear, it looks like in all the re-factoring here
> update_h_load() has escaped rq->lock?
>
> load_balance()
> ...
> update_h_load(env.src_cpu);
> more_balance:
> local_irq_save(flags);
> double_rq_lock(env.dst_rq, busiest);
>
>
>
>> ld_moved = 0;
>> lb_iterations = 1;
>> +
>> +   env.src_cpu   = busiest->cpu;
>> +   env.src_rq= busiest;
>> +
>
> Hmm -- But shouldn't find_busiest_queue return NULL in this case?
>
> We're admittedly racing but we should have:
>   busiest->nr_running == 1
>   wl = rq->load.weight > env->imbalance since imbalance < (wl -
> this_rq->load.weight / 2)

check_asym_packing overwrites the env->imbalance with (sds->max_load *
sds->busiest->sgp->power / SCHED_POWER_SCALE) so fbq can return a
queue

>
> This would seem specialized to the case when we either
>   A) race (fbq is openly racy)
>   B) have no capacity
>
> Admittedly when we do race current case this is probably a biased
> coinflip depending on whatever was on the stack (src_cpu is
> uninitialized); it's also super easy for this to later become a crash
> if someone tries to dereference src_rq so we should do something.
>
> The case this seems most important for (and something we should add an
> explicit comment regarding) is that we want this case specifically for
> CPU_NEW_IDLE to move a single runnable-task to a lower numbered
> idle-cpu index in the SD_ASYM case (although I suspect we need to push
> this up to fbq also to actually find it...)

The update of imbalance should be enough

>
> In the !SD_ASYM case we'd probably also want to  re-check busiest
> nr_running in need_active_balance (or probably better alternative
> re-arrange the checks) since this is going to potentially now move a
> single large task needlessly to an already loaded rq in balance-failed
> case.

yes, that could be an interesting add-on

>
>
>> if (busiest->nr_running > 1) {
>> /*
>>  * Attempt to move tasks. If find_busiest_group has found
>> @@ -5052,8 +5056,6 @@ redo:
>>  * correctly treated as an imbalance.
>>  */
>> env.flags |= LBF_ALL_PINNED;
>> -   env.src_cpu   = busiest->cpu;
>> -   env.src_rq= busiest;
>> env.loop_max  = min(sysctl_sched_nr_migrate, 
>> busiest->nr_running);
>>
>> update_h_load(env.src_cpu);
>> --
>> 1.7.9.5
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
>> the body of a message to majord...@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>> Please read the FAQ at  http://www.tux.org/lkml/

___
linaro-dev mailing list
linaro-dev@lists.linaro.org
http://lists.linaro.org/mailman/listinfo/linaro-dev