On 10/4/19 4:03 PM, Jürgen Groß wrote:
> On 04.10.19 16:56, George Dunlap wrote:
>> On 10/4/19 3:43 PM, Jürgen Groß wrote:
>>> On 04.10.19 16:34, George Dunlap wrote:
>>>> On 10/4/19 3:24 PM, Jürgen Groß wrote:
>>>>> On 04.10.19 16:08, George Dunlap wrote:
>>>>>> On 10/4/19 7:40 AM, Juergen Gross wrote:
>>>>>>> sched_tick_suspend() and sched_tick_resume() should not call the
>>>>>>> scheduler-specific timer handlers while the cpu they are running on
>>>>>>> is just being moved to or from a cpupool.
>>>>>>>
>>>>>>> Use a new percpu lock for that purpose.
>>>>>>
>>>>>> Is there a reason we can't use the pcpu_schedule_lock() instead of
>>>>>> introducing a new one?  Sorry if this is obvious, but it's been a
>>>>>> while
>>>>>> since I poked around this code.
>>>>>
>>>>> Lock contention would be higher, especially with credit2, which is
>>>>> using a per-core or even per-socket lock. We don't care about other
>>>>> scheduling activity here; all we need is a guard against our per-cpu
>>>>> scheduler data being changed beneath our feet.
>>>>
>>>> Is this code really being called so often that we need to worry about
>>>> this level of contention?
>>>
>>> It's called each time idle is entered and left again.
>>>
>>> Especially with core scheduling there is a high probability of multiple
>>> cpus leaving idle at the same time and the per-scheduler lock being used
>>> in parallel already.
>>
>> Hrm, that does sound pretty bad.
>>
>>>> We already have a *lot* of locks; and in this case you're adding a
>>>> second lock which interacts with the per-scheduler cpu lock.  This just
>>>> seems like asking for trouble.
>>>
>>> In which way does it interact with the per-scheduler cpu lock?
>>>
>>>> I won't Nack the patch, but I don't think I would ack it without clear
>>>> evidence that the extra lock has a performance improvement that's worth
>>>> the cost of the extra complexity.
>>>
>>> I think complexity is lower this way, especially considering that the
>>> per-scheduler lock changes when a cpu is moved to or from a cpupool.
>>
>> The key aspect of the per-scheduler lock is that once you hold it, the
>> pointer to the lock can't change.
>>
>> After this patch, the fact remains that sometimes you need to grab one
>> lock, sometimes the other, and sometimes both.
>>
>> And tick_suspend() lives in the per-scheduler code.  Each scheduler has
>> to remember that tick_suspend and tick_resume are called under a
>> completely different lock from the rest of the scheduling functions.
> 
> Is that really so critical? Today only credit1 has tick_suspend and
> tick_resume hooks, and both are really very simple. I can add a
> comment in sched-if.h if you like.
> 
> And up to now there was no lock at all involved when calling them...
> 
> If you think using the normal scheduler lock is preferable I'd be happy
> to change the patch. But I should mention I was already planning to
> revisit the usage of the scheduler lock and replace it with the new
> per-cpu lock where appropriate (though I'm not sure I'd find any path
> where that replacement is appropriate).

Well, the really annoying thing here is that all the other schedulers --
in particular credit2, which, as you say, is designed to have multiple
runqueues share the same lock -- have to grab & release the lock just to
find out that there's nothing to do.
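
Just to make sure we're looking at the same path: with the per-scheduler
lock, idle entry would be doing roughly the following (a sketch only; the
hook signature and the pcpu_schedule_lock_irq() helpers are written from
memory, not taken from a tested patch):

    void sched_tick_suspend(void)
    {
        unsigned int cpu = smp_processor_id();
        struct scheduler *sched = per_cpu(scheduler, cpu);
        spinlock_t *lock;

        /*
         * With credit2 this lock can be shared by a whole core or even
         * socket, and we would take it on every idle entry only to find
         * out that there is no tick_suspend hook to call.
         */
        lock = pcpu_schedule_lock_irq(cpu);
        if ( sched->tick_suspend )
            sched->tick_suspend(sched, cpu);
        pcpu_schedule_unlock_irq(lock, cpu);

        rcu_idle_enter(cpu);
        rcu_idle_timer_start();
    }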

And even credit1 doesn't do anything particularly clever -- all it does
is stop and start a timer based on a scheduler-global configuration. And
the scheduling lock is grabbed to switch to idle anyway.  It seems like
we should be able to do something more sensible.
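
(For reference, the credit1 hooks are more or less of this shape -- this
is a paraphrase from memory rather than the exact code in the tree, with
the per-cpu ticker being the accounting timer:)

    static void
    csched_tick_suspend(const struct scheduler *ops, unsigned int cpu)
    {
        /* Stop the per-cpu accounting tick while the cpu is idle. */
        stop_timer(&CSCHED_PCPU(cpu)->ticker);
    }

    static void
    csched_tick_resume(const struct scheduler *ops, unsigned int cpu)
    {
        struct csched_private *prv = CSCHED_PRIV(ops);
        s_time_t now = NOW();

        /* Restart the tick, aligned to the scheduler-global tick period. */
        set_timer(&CSCHED_PCPU(cpu)->ticker,
                  now + MICROSECS(prv->tick_period_us)
                      - now % MICROSECS(prv->tick_period_us));
    }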

/me is still thinking

 -George

