On Wed, 2016-05-11 at 03:16 +0800, Yuyang Du wrote:
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -3027,6 +3027,9 @@ void remove_entity_load_avg(struct sched_entity *se)
> >
> > static inline unsigned long cfs_rq_runnable_load_avg(struct cfs_rq *cfs_rq)
> > {
> > +	if (sched_feat(LB_TIP_AVG_HIGH) && cfs_rq->load.weight > cfs_rq->runnable_load_avg*2)
On Wed, 2016-05-11 at 09:23 +0800, Yuyang Du wrote:
> > Yeah, just like everything else, it'll cut both ways (why you can't
> > win the sched game). If I can believe tbench, at tasks=cpus, reducing
> > lag increased utilization and reduced latency a wee bit, as did the
> > reserve thing once a b
On Wed, May 11, 2016 at 06:17:51AM +0200, Mike Galbraith wrote:
> > > static inline unsigned long cfs_rq_runnable_load_avg(struct cfs_rq *cfs_rq)
> > > {
> > > +	if (sched_feat(LB_TIP_AVG_HIGH) && cfs_rq->load.weight > cfs_rq->runnable_load_avg*2)
> > > +
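[The quoted tweak above is cut off before its body. A minimal sketch of a plausible completion, assuming the intent is to report a figure closer to the instantaneous queue weight when blocked load makes runnable_load_avg lag behind; the boosted return expression is an assumption for illustration, not taken from the thread:

static inline unsigned long cfs_rq_runnable_load_avg(struct cfs_rq *cfs_rq)
{
	/*
	 * If the queued weight is more than double the decayed runnable
	 * average, the average is probably lagging a burst of wakeups;
	 * tip the reported load toward the instantaneous weight.
	 * (Assumed completion; the posted patch is truncated here.)
	 */
	if (sched_feat(LB_TIP_AVG_HIGH) &&
	    cfs_rq->load.weight > cfs_rq->runnable_load_avg * 2)
		return (cfs_rq->runnable_load_avg + cfs_rq->load.weight) / 2;

	return cfs_rq->runnable_load_avg;
}
]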
On Wed, 2016-05-11 at 03:16 +0800, Yuyang Du wrote:
> On Tue, May 10, 2016 at 05:26:05PM +0200, Mike Galbraith wrote:
> > On Tue, 2016-05-10 at 09:49 +0200, Mike Galbraith wrote:
> >
> > > Only whacking cfs_rq_runnable_load_avg() with a rock makes schbench
> > > -m -t -a work well. 'Course a rock in its gearbox also rendered load
> > > balancing fairly busted for the general case :)
On Tue, May 10, 2016 at 05:26:05PM +0200, Mike Galbraith wrote:
> On Tue, 2016-05-10 at 09:49 +0200, Mike Galbraith wrote:
>
> > Only whacking cfs_rq_runnable_load_avg() with a rock makes schbench
> > -m -t -a work well. 'Course a rock in its gearbox also rendered load
> > balancing fairly busted for the general case :)
On Tue, 2016-05-10 at 09:49 +0200, Mike Galbraith wrote:
> Only whacking cfs_rq_runnable_load_avg() with a rock makes schbench
> -m -t -a work well. 'Course a rock in its gearbox also rendered load
> balancing fairly busted for the general case :)
Smaller rock doesn't injure heavy tbench, b
On Tue, 2016-05-10 at 07:26 +0800, Yuyang Du wrote:
> By cpu reservation, you mean the various averages in select_task_rq_fair?
> It does seem a lot of cleanup should be done.
Nah, I meant claiming an idle cpu with cmpxchg(). It's mostly the
average load business that leads to premature stacking
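[As a sketch of what "claiming an idle cpu with cmpxchg()" could look like: one claimant wins the 0 -> 1 transition on a per-cpu flag, so two concurrent wakeups racing through select_task_rq_fair() cannot both pick the same idle CPU and stack tasks. The per-cpu flag and helper names here are hypothetical, not from a posted patch:

static DEFINE_PER_CPU(int, idle_claimed);	/* hypothetical flag */

static bool try_claim_idle_cpu(int cpu)
{
	/* Exactly one of several racing wakers sees 0 and wins the CPU. */
	return cmpxchg(per_cpu_ptr(&idle_claimed, cpu), 0, 1) == 0;
}

static void release_idle_claim(int cpu)
{
	/* Cleared again once the claimed CPU actually runs the wakee. */
	WRITE_ONCE(per_cpu(idle_claimed, cpu), 0);
}
]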
On Mon, May 09, 2016 at 11:39:05AM +0200, Mike Galbraith wrote:
> On Mon, 2016-05-09 at 09:13 +0800, Yuyang Du wrote:
> > On Mon, May 09, 2016 at 09:44:13AM +0200, Mike Galbraith wrote:
>
> > > In a perfect world, running only Chris' benchmark on an otherwise idle
> > > box, there would never _be_ any work to steal.
On Mon, 2016-05-09 at 09:13 +0800, Yuyang Du wrote:
> On Mon, May 09, 2016 at 09:44:13AM +0200, Mike Galbraith wrote:
> > In a perfect world, running only Chris' benchmark on an otherwise idle
> > box, there would never _be_ any work to steal.
>
> What is the perfect world like? I don't get what
On Mon, 2016-05-09 at 10:33 +0200, Peter Zijlstra wrote:
> On Fri, May 06, 2016 at 08:54:38PM +0200, Mike Galbraith wrote:
> > master
> > Throughput 2722.43 MB/sec 4 clients 4 procs max_latency=2.400 ms
>
> > echo NO_IDLE_SIBLING > /sys/kernel/debug/sched_features
> > Throughput 3484.18 MB/sec 4 clients 4 procs max_latency=3.430 ms
On Mon, May 09, 2016 at 09:44:13AM +0200, Mike Galbraith wrote:
> > Then a valid question is whether it is this selection that is screwed
> > up in cases like this; it should always be asked.
>
> That's a given, it's just a question of how to do a bit better cheaply.
>
> > > > Regarding wa
On Fri, May 06, 2016 at 08:54:38PM +0200, Mike Galbraith wrote:
> master
> Throughput 2722.43 MB/sec 4 clients 4 procs max_latency=2.400 ms
> echo NO_IDLE_SIBLING > /sys/kernel/debug/sched_features
> Throughput 3484.18 MB/sec 4 clients 4 procs max_latency=3.430 ms
Yeah, I know about that bu
On Mon, 2016-05-09 at 04:22 +0800, Yuyang Du wrote:
> On Mon, May 09, 2016 at 05:45:40AM +0200, Mike Galbraith wrote:
> > On Mon, 2016-05-09 at 02:57 +0800, Yuyang Du wrote:
> > > On Sun, May 08, 2016 at 10:08:55AM +0200, Mike Galbraith wrote:
> > > > Maybe give the criteria a bit margin, not just wakees tend to equal
> > > > llc_size, but the numbers are so wild to easily break the fragile
> > > > condition, like:
On Mon, May 09, 2016 at 05:52:51AM +0200, Mike Galbraith wrote:
> On Mon, 2016-05-09 at 02:57 +0800, Yuyang Du wrote:
>
> > In addition, I would argue maybe beefing up idle balancing is a more
> > productive way to spread load, as work-stealing just does what needs
> > to be done. And seems it has been (sub-unconsciously) neglected in this
> > case, :)
On Mon, May 09, 2016 at 05:45:40AM +0200, Mike Galbraith wrote:
> On Mon, 2016-05-09 at 02:57 +0800, Yuyang Du wrote:
> > On Sun, May 08, 2016 at 10:08:55AM +0200, Mike Galbraith wrote:
> > > > Maybe give the criteria a bit margin, not just wakees tend to equal
> > > > llc_size, but the numbers are so wild to easily break the fragile
> > > > condition, like:
On Mon, 2016-05-09 at 02:57 +0800, Yuyang Du wrote:
> In addition, I would argue maybe beefing up idle balancing is a more
> productive way to spread load, as work-stealing just does what needs
> to be done. And seems it has been (sub-unconsciously) neglected in this
> case, :)
P.S. Nope, I'm din
On Mon, 2016-05-09 at 02:57 +0800, Yuyang Du wrote:
> On Sun, May 08, 2016 at 10:08:55AM +0200, Mike Galbraith wrote:
> > > Maybe give the criteria a bit margin, not just wakees tend to equal
> > > llc_size, but the numbers are so wild to easily break the fragile
> > > condition, like:
> >
> > Seems lockless traversal and averages just lets multiple CPUs sele
On Sun, May 08, 2016 at 10:08:55AM +0200, Mike Galbraith wrote:
> > Maybe give the criteria a bit margin, not just wakees tend to equal
> > llc_size, but the numbers are so wild to easily break the fragile
> > condition, like:
>
> Seems lockless traversal and averages just lets multiple CPUs sele
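[For context, the wake_wide() heuristic whose llc_size criterion is being questioned here reads as follows in mainline kernels of this era; a wakeup is sent wide (out of the affine path) only once the waker/wakee flip counts outgrow the LLC size:

static int wake_wide(struct task_struct *p)
{
	unsigned int master = current->wakee_flips;
	unsigned int slave = p->wakee_flips;
	int factor = this_cpu_read(sd_llc_size);

	if (master < slave)
		swap(master, slave);
	/* Stay affine until a one-waker-many-wakees pattern clearly
	 * outgrows the LLC; the margin discussed above would loosen
	 * exactly this comparison. */
	if (slave < factor || master < slave * factor)
		return 0;
	return 1;
}
]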
On Sat, 2016-05-07 at 09:24 +0800, Yuyang Du wrote:
> On Sun, May 01, 2016 at 11:20:25AM +0200, Mike Galbraith wrote:
> > Playing with Chris' benchmark, seems the biggest problem is that we
> > don't buddy up waker of many and its wakees in a node.. ie the wake
> > wide thing isn't necessarily ou
On Sun, May 01, 2016 at 11:20:25AM +0200, Mike Galbraith wrote:
> On Sun, 2016-05-01 at 10:53 +0200, Peter Zijlstra wrote:
> > On Sun, May 01, 2016 at 09:12:33AM +0200, Mike Galbraith wrote:
> > > On Sat, 2016-04-30 at 14:47 +0200, Peter Zijlstra wrote:
> >
> > > > Can you guys have a play with this; I think one and two node tbench
> > > > are good, but I seem to be getting significant run to run variance
> > > > on that
On Thu, 2016-05-05 at 23:03 +0100, Matt Fleming wrote:
> One thing I haven't yet done is twiddled the bits individually to see
> what the best combination is. Have you settled on the right settings
> yet?
Lighter configs, revert "sched/fair: Fix fairness issue on migration",
twiddle knobs. Added a
On Fri, May 06, 2016 at 09:12:51AM +0200, Peter Zijlstra wrote:
> On Thu, May 05, 2016 at 09:58:44AM -0400, Chris Mason wrote:
> > > I'll try and have a prod at the program itself if you have no pending
> > > changes on your end.
> >
> > Sorry, I don't. Look at sleep_for_runtime() and how I test/set the
> > global stopping variable in different places.
On Tue, May 03, 2016 at 11:11:53AM -0400, Chris Mason wrote:
> # pick a single core, in my case cpus 0,20 are the same core
> # cpu_hog is any program that spins
> #
> taskset -c 20 cpu_hog &
>
> # schbench -p 4 means message passing mode with 4 byte messages (like
> # pipe test), no sleeps, just
On Thu, May 05, 2016 at 09:58:44AM -0400, Chris Mason wrote:
> > I'll try and have a prod at the program itself if you have no pending
> > changes on your end.
>
> Sorry, I don't. Look at sleep_for_runtime() and how I test/set the
> global stopping variable in different places. I've almost certa
On Wed, 04 May, at 12:37:01PM, Peter Zijlstra wrote:
>
> tbench wants select_idle_siblings() to just not exist; it goes happy
> when you just return target.
I've been playing with this patch a little bit by hitting it with
tbench on a Xeon, 12 cores with HT enabled, 2 sockets (48 cpus).
I see a
On Thu, May 05, 2016 at 11:33:38AM +0200, Peter Zijlstra wrote:
> On Wed, May 04, 2016 at 01:46:16PM -0400, Chris Mason wrote:
> > It should, make sure you're at the top commit in git.
> >
> > git://git.kernel.org/pub/scm/linux/kernel/git/mason/schbench.git
>
> I did double check; I am on the top commit of that. I refetched and
> rebuilt just to make triple sure.
On Wed, May 04, 2016 at 01:46:16PM -0400, Chris Mason wrote:
> It should, make sure you're at the top commit in git.
>
> git://git.kernel.org/pub/scm/linux/kernel/git/mason/schbench.git
I did double check; I am on the top commit of that. I refetched and
rebuilt just to make triple sure.
> It's
On Wed, May 04, 2016 at 05:45:10PM +0200, Peter Zijlstra wrote:
> On Tue, May 03, 2016 at 11:11:53AM -0400, Chris Mason wrote:
> > # pick a single core, in my case cpus 0,20 are the same core
> > # cpu_hog is any program that spins
> > #
> > taskset -c 20 cpu_hog &
> >
> > # schbench -p 4 means message passing mode with 4 byte messages (like
> > # pipe test), no sleeps, just
On Tue, May 03, 2016 at 11:11:53AM -0400, Chris Mason wrote:
> # pick a single core, in my case cpus 0,20 are the same core
> # cpu_hog is any program that spins
> #
> taskset -c 20 cpu_hog &
>
> # schbench -p 4 means message passing mode with 4 byte messages (like
> # pipe test), no sleeps, just
On Wed, May 04, 2016 at 12:37:01PM +0200, Peter Zijlstra wrote:
> +static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int target)
> +{
> +	struct sched_domain *this_sd = rcu_dereference(*this_cpu_ptr(&sd_llc));
> +	u64 time, cost;
> +	s64 delta;
> +	int c
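[The rest of that patch bounds the scan by its own measured cost: skip the LLC scan when the CPU's average idle time would not pay for it, and fold each measured scan time into a running average. Roughly, as the idea later landed in mainline (reconstructed from memory; the 512 scale factor and loop details are approximate, and the caller validates the returned cpu):

static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int target)
{
	struct sched_domain *this_sd = rcu_dereference(*this_cpu_ptr(&sd_llc));
	u64 avg_cost = this_sd->avg_scan_cost;
	u64 avg_idle = this_rq()->avg_idle / 512;	/* scale factor approximate */
	u64 time;
	s64 delta;
	int cpu;

	/* Scanning costs more than we expect to be idle: don't bother. */
	if (avg_idle < avg_cost)
		return -1;

	time = local_clock();
	for_each_cpu(cpu, sched_domain_span(sd)) {
		if (!cpumask_test_cpu(cpu, tsk_cpus_allowed(p)))
			continue;
		if (idle_cpu(cpu))
			break;
	}
	time = local_clock() - time;

	/* EWMA: fold 1/8th of the error into avg_scan_cost. */
	delta = (s64)(time - avg_cost) / 8;
	this_sd->avg_scan_cost += delta;

	return cpu;	/* caller checks cpu < nr_cpu_ids */
}
]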
On Tue, May 03, 2016 at 11:11:53AM -0400, Chris Mason wrote:
> > + if (cpu_rq(cpu)->cfs.nr_running > 1)
> > + return 1;
>
> The nr_running check is interesting. It is supposed to give the same
> benefit as your "do we have anything idle?" variable, but without having
> to con
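[Chris's point: the nr_running probe is a cheap, distributed stand-in for a global "do we have anything idle?" flag. A runqueue holding more than one runnable CFS task is definitely oversubscribed, so seeing one anywhere makes the idle scan worthwhile. As a tiny illustration (the helper name is hypothetical):

/* Hypothetical helper: a cfs_rq with more than one runnable task is
 * oversubscribed, so an idle scan is worth its cost. */
static inline bool cpu_oversubscribed(int cpu)
{
	return cpu_rq(cpu)->cfs.nr_running > 1;
}
]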
On Tue, May 03, 2016 at 01:31:31PM +0200, Peter Zijlstra wrote:
> Then flip on the last_idle tracking in select_idle_core():
>
> root@ivb-ep:~/bench/sysbench# for i in NO_OLD_IDLE NO_ORDER_IDLE IDLE_CORE NO_FORCE_CORE IDLE IDLE_SMT IDLE_LAST NO_IDLE_FIRST ; do echo $i > /debug/sched_features
On Tue, May 03, 2016 at 04:32:25PM +0200, Peter Zijlstra wrote:
> On Mon, May 02, 2016 at 11:47:25AM -0400, Chris Mason wrote:
> > On Mon, May 02, 2016 at 04:58:17PM +0200, Peter Zijlstra wrote:
> > > On Mon, May 02, 2016 at 04:50:04PM +0200, Mike Galbraith wrote:
> > > > Oh btw, did you know single socket boxen have no sd_busy? That
> > > > doesn't look right.
On Mon, May 02, 2016 at 11:47:25AM -0400, Chris Mason wrote:
> On Mon, May 02, 2016 at 04:58:17PM +0200, Peter Zijlstra wrote:
> > On Mon, May 02, 2016 at 04:50:04PM +0200, Mike Galbraith wrote:
> > > Oh btw, did you know single socket boxen have no sd_busy? That doesn't
> > > look right.
> >
> >
On Mon, May 02, 2016 at 06:04:04PM +0200, Ingo Molnar wrote:
> * Peter Zijlstra wrote:
> > If you want a laugh, modify select_idle_core() to remember the last idle
> > thread it encounters and have it return that when it fails to find an
> > idle core.. I'm still stumped to explain why it behaves
On Mon, 2016-05-02 at 16:58 +0200, Peter Zijlstra wrote:
> On Mon, May 02, 2016 at 04:50:04PM +0200, Mike Galbraith wrote:
> > Oh btw, did you know single socket boxen have no sd_busy? That
> > doesn't look right.
>
> I suspected; didn't bother looking at yet. The 'problem' is that the
> LLC domain is the top-most, so it doesn't have a parent domain.
* Peter Zijlstra wrote:
> On Mon, May 02, 2016 at 04:50:04PM +0200, Mike Galbraith wrote:
> > On Mon, 2016-05-02 at 10:46 +0200, Peter Zijlstra wrote:
> > 5226 /*
> > 5227 * If there are idle cores to be had, go find one.
> > 5228 */
> > 5229 if (sched_feat(IDLE_CORE) && test_idle_cores(target)) {
On Mon, May 02, 2016 at 04:58:17PM +0200, Peter Zijlstra wrote:
> On Mon, May 02, 2016 at 04:50:04PM +0200, Mike Galbraith wrote:
> > Oh btw, did you know single socket boxen have no sd_busy? That doesn't
> > look right.
>
> I suspected; didn't bother looking at yet. The 'problem' is that the LLC
> domain is the top-most, so it doesn't have a parent domain.
On Mon, May 02, 2016 at 04:50:04PM +0200, Mike Galbraith wrote:
> Order is one thing, but what the old behavior does first and foremost
> is that, when the box starts getting really busy, looking only at
> target's sibling shuts select_idle_sibling() down instead of letting it
> wreck things. Once core
On Mon, May 02, 2016 at 04:50:04PM +0200, Mike Galbraith wrote:
> On Mon, 2016-05-02 at 10:46 +0200, Peter Zijlstra wrote:
> 5226 /*
> 5227 * If there are idle cores to be had, go find one.
> 5228 */
> 5229 if (sched_feat(IDLE_CORE) && test_idle_cores(target)) {
>
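[For context, the test_idle_cores() hint checked by that IDLE_CORE gate keeps one shared boolean per LLC, so wakeups can skip the core scan entirely when no whole core is believed idle. The shape it eventually took in mainline is roughly (from memory, approximate):

static inline bool test_idle_cores(int cpu, bool def)
{
	struct sched_domain_shared *sds;

	/* One flag per LLC: set when a whole core goes idle, cleared
	 * when a full select_idle_core() scan comes up empty. */
	sds = rcu_dereference(per_cpu(sd_llc_shared, cpu));
	if (sds)
		return READ_ONCE(sds->has_idle_cores);

	return def;
}
]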
On Mon, May 02, 2016 at 04:50:04PM +0200, Mike Galbraith wrote:
> Oh btw, did you know single socket boxen have no sd_busy? That doesn't
> look right.
I suspected; didn't bother looking at yet. The 'problem' is that the LLC
domain is the top-most, so it doesn't have a parent domain. I'm sure we c
On Mon, 2016-05-02 at 10:46 +0200, Peter Zijlstra wrote:
> On Sun, May 01, 2016 at 09:12:33AM +0200, Mike Galbraith wrote:
>
> > Nah, tbench is just variance prone. It got dinged up at clients=cores
> > on my desktop box, on 4 sockets the high end got seriously dinged up.
>
>
> Ha!, check this:
On Sun, May 01, 2016 at 09:12:33AM +0200, Mike Galbraith wrote:
> Nah, tbench is just variance prone. It got dinged up at clients=cores
> on my desktop box, on 4 sockets the high end got seriously dinged up.
Ha!, check this:
root@ivb-ep:~# echo OLD_IDLE > /debug/sched_features ; echo NO_ORDER_
On Sun, 2016-05-01 at 10:53 +0200, Peter Zijlstra wrote:
> On Sun, May 01, 2016 at 09:12:33AM +0200, Mike Galbraith wrote:
> > On Sat, 2016-04-30 at 14:47 +0200, Peter Zijlstra wrote:
>
> > > Can you guys have a play with this; I think one and two node tbench are
> > > good, but I seem to be getting significant run to run variance on that
On Sun, May 01, 2016 at 09:12:33AM +0200, Mike Galbraith wrote:
> On Sat, 2016-04-30 at 14:47 +0200, Peter Zijlstra wrote:
> > Can you guys have a play with this; I think one and two node tbench are
> > good, but I seem to be getting significant run to run variance on that,
> > so maybe I'm not do
On Sat, 2016-04-30 at 14:47 +0200, Peter Zijlstra wrote:
> On Sat, Apr 09, 2016 at 03:05:54PM -0400, Chris Mason wrote:
> > select_task_rq_fair() can leave cpu utilization a little lumpy,
> > especially as the workload ramps up to the maximum capacity of the
> > machine. The end result can be high p99 response times as apps wait to
> > get scheduled
On Sat, Apr 09, 2016 at 03:05:54PM -0400, Chris Mason wrote:
> select_task_rq_fair() can leave cpu utilization a little lumpy,
> especially as the workload ramps up to the maximum capacity of the
> machine. The end result can be high p99 response times as apps
> wait to get scheduled, even when bo
Another thing you could try is looking at your avg_idle, and twiddling
sched_migration_cost_ns to crank up idle balancing a bit.
-Mike
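[The reason that knob helps: newly-idle balancing is gated on the CPU's measured avg_idle versus the migration cost, so shrinking the sysctl makes the gate pass more often. Roughly, from idle_balance() in mainline of this era (simplified):

static int idle_balance(struct rq *this_rq)
{
	int pulled_task = 0;

	/*
	 * Skip the newly-idle balance when this CPU's average idle
	 * period is too short to amortize a migration; lowering
	 * sched_migration_cost_ns therefore cranks up idle balancing.
	 */
	if (this_rq->avg_idle < sysctl_sched_migration_cost)
		return 0;

	/* ... otherwise walk the sched domains and try to pull a task ... */

	return pulled_task;
}
]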
On Wed, 2016-04-13 at 10:36 -0400, Chris Mason wrote:
> On Wed, Apr 13, 2016 at 04:22:58PM +0200, Mike Galbraith wrote:
> > What exactly do you mean by failed affine wakeups? Failed because
> > wake_wide() said we don't want one, or because wake_affine() said we
> > can't have one? If the latter,
On Wed, Apr 13, 2016 at 04:22:58PM +0200, Mike Galbraith wrote:
> On Wed, 2016-04-13 at 09:44 -0400, Chris Mason wrote:
>
> > So you're interested in numbers where we pass the wake_wide decision
> > into select_idle_sibling(), and then use that instead of (or in addition
> > to?) my should_scan_idle() function?
On Wed, 2016-04-13 at 09:44 -0400, Chris Mason wrote:
> So you're interested in numbers where we pass the wake_wide decision
> into select_idle_sibling(), and then use that instead of (or in addition
> to?) my should_scan_idle() function?
Yeah, I was thinking instead of, and hoping that would be
On Wed, Apr 13, 2016 at 05:18:51AM +0200, Mike Galbraith wrote:
> On Tue, 2016-04-12 at 16:07 -0400, Chris Mason wrote:
>
> > I think that if we're worried about the cost of the idle scan for this
> > workload, find_idlest_group() is either going to hurt much more, or not
> > search enough CPUs to find the idle one.
On Tue, 2016-04-12 at 16:07 -0400, Chris Mason wrote:
> I think that if we're worried about the cost of the idle scan for this
> workload, find_idlest_group() is either going to hurt much more, or not
> search enough CPUs to find the idle one.
find_idlest_group()? No no no, that's not what I mea
On Tue, Apr 12, 2016 at 08:16:17PM +0200, Mike Galbraith wrote:
> On Tue, 2016-04-12 at 09:27 -0400, Chris Mason wrote:
> > I
> > can always add the tunable to flip things on/off but I'd prefer that we
> > find a good set of defaults, mostly so the FB production runtime is the
> > common config instead of the special snowflake.
On Tue, 2016-04-12 at 09:27 -0400, Chris Mason wrote:
> I
> can always add the tunable to flip things on/off but I'd prefer that we
> find a good set of defaults, mostly so the FB production runtime is the
> common config instead of the special snowflake.
Yeah, generic has a much better chance to
On Tue, Apr 12, 2016 at 06:44:08AM +0200, Mike Galbraith wrote:
> On Mon, 2016-04-11 at 20:30 -0400, Chris Mason wrote:
> > On Mon, Apr 11, 2016 at 06:54:21AM +0200, Mike Galbraith wrote:
>
> > > > Ok, I was able to reproduce this by stuffing tbench_srv and tbench onto
> > > > just socket 0. Version 2 below fixes things for me, but I'm hoping
> > > > someone can suggest
On Mon, 2016-04-11 at 20:30 -0400, Chris Mason wrote:
> On Mon, Apr 11, 2016 at 06:54:21AM +0200, Mike Galbraith wrote:
> > > Ok, I was able to reproduce this by stuffing tbench_srv and tbench onto
> > > just socket 0. Version 2 below fixes things for me, but I'm hoping
> > > someone can suggest
On Mon, Apr 11, 2016 at 06:54:21AM +0200, Mike Galbraith wrote:
> On Sun, 2016-04-10 at 15:55 -0400, Chris Mason wrote:
> > On Sun, Apr 10, 2016 at 12:04:21PM +0200, Mike Galbraith wrote:
> > > On Sat, 2016-04-09 at 15:05 -0400, Chris Mason wrote:
> > >
> > > > This does preserve the existing logic to prefer idle cores over idle
> > > > CPU threads, and includes some tests to try and avoid the idle scan
> > > > when we're actually better off sharing a non-idle CPU with someone else.
On Sun, 2016-04-10 at 15:55 -0400, Chris Mason wrote:
> On Sun, Apr 10, 2016 at 12:04:21PM +0200, Mike Galbraith wrote:
> > On Sat, 2016-04-09 at 15:05 -0400, Chris Mason wrote:
> >
> > > This does preserve the existing logic to prefer idle cores over idle
> > > CPU threads, and includes some tests to try and avoid the idle scan
> > > when we're actually better off sharing a non-idle CPU with someone else.
On Sun, Apr 10, 2016 at 12:04:21PM +0200, Mike Galbraith wrote:
> On Sat, 2016-04-09 at 15:05 -0400, Chris Mason wrote:
>
> > This does preserve the existing logic to prefer idle cores over idle
> > CPU threads, and includes some tests to try and avoid the idle scan
> > when we're actually better off sharing a non-idle CPU with someone else.
On Sun, 2016-04-10 at 08:35 -0400, Chris Mason wrote:
> What are you testing with? If it's two sockets or less I may be able to
> find one to reproduce with.
i4790 desktop box.
-Mike
On Sun, Apr 10, 2016 at 12:04:21PM +0200, Mike Galbraith wrote:
> On Sat, 2016-04-09 at 15:05 -0400, Chris Mason wrote:
>
> > This does preserve the existing logic to prefer idle cores over idle
> > CPU threads, and includes some tests to try and avoid the idle scan
> > when we're actually better off sharing a non-idle CPU with someone else.
On Sat, 2016-04-09 at 15:05 -0400, Chris Mason wrote:
> This does preserve the existing logic to prefer idle cores over idle
> CPU threads, and includes some tests to try and avoid the idle scan when we're
> actually better off sharing a non-idle CPU with someone else.
My box says the "oh nevermi