On Wed, Jun 19, 2013 at 06:08:29PM +0100, Arjan van de Ven wrote:
> On 6/19/2013 10:00 AM, Morten Rasmussen wrote:
> > On Wed, Jun 19, 2013 at 04:39:39PM +0100, Arjan van de Ven wrote:
> >> On 6/18/2013 10:47 AM, David Lang wrote:
> >>
> >>>
> >>> It's bad enough trying to guess the needs of the processes, but if you
> >>> also are reduced to guessing the capabilities of the cores, how can
> >>> anything be made to work?
> >>
> >> btw one way to look at this is to assume that (with some minimal hinting)
> >> the CPU driver will do the right thing and get you just about the best
> >> performance you can get
> >> (that is appropriate for the task at hand)...
> >> ... and don't do anything in the scheduler proactively.
> >
> > If I understand correctly, you mean if your hardware/firmware is fully
>
> hardware, firmware and the driver
>
> > in control of the p-state selection and changes it fast enough to match
> > the current load, the scheduler doesn't have to care? By fast enough I
> > mean, faster than the scheduler would notice if a cpu was temporarily
> > overloaded at a low p-state. In that case, you wouldn't need
> > cpufreq/p-state hints, and the scheduler would only move tasks between
> > cpus when cpus are fully loaded at their max p-state.
>
> with the migration hint, I'm pretty sure we'll be there today typically.

A hint when a task is moved to a new cpu is too late if the migration
shouldn't have happened at all. If the scheduler knows that the cpu is
able to switch to a higher p-state it can decide to wait for the p-state
change instead of migrating the task and waking up another cpu.

> we'll notice within 10 msec regardless, but the migration hint will take
> the edge of those 10 msec normally.

I'm not sure if 10 msec is fast enough for the scheduler to not notice.
Real use-case studies will tell.

>
> I would argue that the "at their max p-state" in your sentence needs to go
> away.
> since you don't know what you actually are except in hindsight.
> And even then you don't know if you could have gone higher or not.

Yes. What I meant was that if your p-state selection is responsive
enough the scheduler would only see the cpu as overloaded when it is in
its highest available p-state. That may be determined dynamically by
power, thermal, and other factors.

>
> >
> >> the hints I have in mind are not all that complex; we have the biggest
> >> issues today
> >> around task migration (the task migrates to a cold cpu... so a simple
> >> notifier chain
> >> on the new cpu as it is accepting a task and we can bump it up), real time
> >> tasks
> >> (again, simple notifier chain to get you to a predictably high performance
> >> level)
> >> and we're a long way better than we are today in terms of actual problems.
> >>
> >> For all the talk of ondemand (as ARM still uses that today)... that guy
> >> puts you in
> >> either the lowest or highest frequency over 95% of the time. Other
> >> non-cpufreq solutions
> >> like on Intel are bit more advanced (and will grow more so over time), but
> >> even there,
> >> in the grand scheme of things, the scheduler shouldn't have to care
> >> anymore with those
> >> two notifiers in place.
> >
> > You would need more than a few hints to implement more advanced capacity
> > management like proposed for the power scheduler. I believe that Intel
> > would benefit as well from guiding the scheduler to idle the right cpu
> > to enable deeper idle states and/or enable turbo-boost for other cpus.
>
> that's an interesting theory.
> I've yet to see any way to actually have that do something useful.
>
> yes there is some value in grouping a lot of very short tasks together.
> not a lot of value, but at least some.
>
> and there is some value in the grouping within a package (to a degree) thing.
>
> (both are basically "statistically, sort left" as policy)
>

The proposed task packing patches have shown significant benefits for
scenarios with many short tasks. This is a typical scenario on Android.

Morten
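
To make the wait-versus-migrate argument from earlier in this mail concrete, here is a minimal sketch in Python. It is not kernel code and does not correspond to any existing scheduler or cpufreq interface: the Cpu fields, the p-state numbering (0 = lowest), the 0.95 overload threshold and should_migrate() are all hypothetical names chosen for illustration. It only shows the idea that a cpu counts as overloaded for migration purposes once it is busy at its highest currently available p-state.

```python
from dataclasses import dataclass


@dataclass
class Cpu:
    # All fields are hypothetical, for illustration only.
    cur_pstate: int     # current performance level, 0 = lowest
    max_pstate: int     # highest level available right now (power/thermal may cap it)
    utilization: float  # busy fraction at the current p-state, 0.0 .. 1.0


def should_migrate(cpu: Cpu, overload_threshold: float = 0.95) -> bool:
    """Return True if a task should be moved off this cpu.

    Assumption: the threshold value is arbitrary and would need tuning
    against real use-cases.
    """
    if cpu.utilization < overload_threshold:
        # Not overloaded at the current p-state: nothing to do.
        return False
    if cpu.cur_pstate < cpu.max_pstate:
        # Busy, but the cpu can still go faster: wait for the p-state
        # change instead of waking up another cpu.
        return False
    # Busy at the highest available p-state: migration is justified.
    return True
```

For example, should_migrate(Cpu(cur_pstate=0, max_pstate=3, utilization=1.0)) is False because the cpu still has headroom, while the same load at cur_pstate=3 returns True. Lowering max_pstate models a cap imposed by power or thermal limits.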