Re: Scheduler wakeup path tuning surface: Interface discussion

2020-11-11 Thread Parth Shah
I was analyzing the LPC 2020 discussion regarding the Latency-nice interface and have the points below to initiate further discussion: 1. There was consensus that having an interface like "Latency-nice" to provide scheduler hints about task latency requirements can be very useful. 2. There are two use-cases regar
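
For readers following along, a minimal userspace sketch of how such a per-task hint might be set via sched_setattr(); the sched_latency_nice field and the SCHED_FLAG_LATENCY_NICE value reflect the proposal under discussion and are assumptions here, not a settled kernel ABI:

    #define _GNU_SOURCE
    #include <stdint.h>
    #include <stdio.h>
    #include <unistd.h>
    #include <sys/syscall.h>

    /* Local definition; glibc does not expose struct sched_attr. The
     * sched_latency_nice field is the proposed extension, not mainline ABI. */
    struct sched_attr {
            uint32_t size;
            uint32_t sched_policy;
            uint64_t sched_flags;
            int32_t  sched_nice;
            uint32_t sched_priority;
            uint64_t sched_runtime, sched_deadline, sched_period;
            uint32_t sched_util_min, sched_util_max;
            int32_t  sched_latency_nice;        /* proposed: -20 .. 19 */
    };

    #define SCHED_FLAG_LATENCY_NICE 0x80        /* assumed flag value */

    int main(void)
    {
            struct sched_attr attr = {
                    .size = sizeof(attr),
                    .sched_flags = SCHED_FLAG_LATENCY_NICE,
                    .sched_latency_nice = -10,  /* hint: latency sensitive */
            };

            /* pid 0 == calling task; on kernels without the proposed field
             * this fails with E2BIG, which is expected for this sketch */
            if (syscall(SYS_sched_setattr, 0, &attr, 0))
                    perror("sched_setattr");
            return 0;
    }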

Re: Scheduler benchmarks

2020-08-19 Thread Greg KH
On Wed, Aug 19, 2020 at 12:43:52PM -0400, Valdis Klētnieks wrote: > On Wed, 19 Aug 2020 12:42:54 +0200, Greg KH said: > > Look up Spectre and Meltdown for many many examples of what happened and > > what went wrong with chip designs and how we had to fix these things in > > the kernel a few years a

Re: Scheduler benchmarks

2020-08-19 Thread Valdis Klētnieks
On Wed, 19 Aug 2020 12:42:54 +0200, Greg KH said: > Look up Spectre and Meltdown for many many examples of what happened and > what went wrong with chip designs and how we had to fix these things in > the kernel a few years ago. And I'm sure that nobody sane thinks we're done with security holes c

RE: Scheduler benchmarks

2020-08-19 Thread David Laight
From: Bernd Petrovitsch > Sent: 19 August 2020 11:22 > > On 19/08/2020 10:16, Muni Sekhar wrote: > > On Tue, Aug 18, 2020 at 11:45 PM peter enderborg > > wrote: > [...] > >> On the 4.4 kernel you don't have > >> > >> +CONFIG_RETPOLINE=y > >> +CONFIG_INTEL_RDT=y > > Thanks! That is helpful. Yes, I

Re: Scheduler benchmarks

2020-08-19 Thread Bernd Petrovitsch
On 19/08/2020 10:16, Muni Sekhar wrote: > On Tue, Aug 18, 2020 at 11:45 PM peter enderborg > wrote: [...] >> On the 4.4 kernel you don't have >> >> +CONFIG_RETPOLINE=y >> +CONFIG_INTEL_RDT=y > Thanks! That is helpful. Yes, I see the 4.4 kernel doesn't have the above > two config options. > What analysis

Re: Scheduler benchmarks

2020-08-19 Thread Greg KH
On Wed, Aug 19, 2020 at 03:46:06PM +0530, Muni Sekhar wrote: > On Tue, Aug 18, 2020 at 11:45 PM peter enderborg > wrote: > > > > On 8/18/20 7:53 PM, Muni Sekhar wrote: > > > On Tue, Aug 18, 2020 at 11:06 PM Greg KH wrote: > > >> On Tue, Aug 18, 2020 at 11:01:35PM +0530, Muni Sekhar wrote: > > >>>

Re: Scheduler benchmarks

2020-08-19 Thread Muni Sekhar
On Tue, Aug 18, 2020 at 11:45 PM peter enderborg wrote: > > On 8/18/20 7:53 PM, Muni Sekhar wrote: > > On Tue, Aug 18, 2020 at 11:06 PM Greg KH wrote: > >> On Tue, Aug 18, 2020 at 11:01:35PM +0530, Muni Sekhar wrote: > >>> On Tue, Aug 18, 2020 at 10:44 PM Greg KH wrote: > On Tue, Aug 18, 20

Re: Scheduler benchmarks

2020-08-18 Thread peter enderborg
On 8/18/20 7:53 PM, Muni Sekhar wrote: > On Tue, Aug 18, 2020 at 11:06 PM Greg KH wrote: >> On Tue, Aug 18, 2020 at 11:01:35PM +0530, Muni Sekhar wrote: >>> On Tue, Aug 18, 2020 at 10:44 PM Greg KH wrote: On Tue, Aug 18, 2020 at 10:24:13PM +0530, Muni Sekhar wrote: > On Tue, Aug 18, 2020

Re: Scheduler benchmarks

2020-08-18 Thread Muni Sekhar
On Tue, Aug 18, 2020 at 11:06 PM Greg KH wrote: > > On Tue, Aug 18, 2020 at 11:01:35PM +0530, Muni Sekhar wrote: > > On Tue, Aug 18, 2020 at 10:44 PM Greg KH wrote: > > > > > > On Tue, Aug 18, 2020 at 10:24:13PM +0530, Muni Sekhar wrote: > > > > On Tue, Aug 18, 2020 at 8:06 PM Greg KH wrote: > >

Re: Scheduler benchmarks

2020-08-18 Thread Greg KH
On Tue, Aug 18, 2020 at 11:01:35PM +0530, Muni Sekhar wrote: > On Tue, Aug 18, 2020 at 10:44 PM Greg KH wrote: > > > > On Tue, Aug 18, 2020 at 10:24:13PM +0530, Muni Sekhar wrote: > > > On Tue, Aug 18, 2020 at 8:06 PM Greg KH wrote: > > > > > > > > On Tue, Aug 18, 2020 at 08:00:11PM +0530, Muni S

Re: Scheduler benchmarks

2020-08-18 Thread Muni Sekhar
On Tue, Aug 18, 2020 at 10:44 PM Greg KH wrote: > > On Tue, Aug 18, 2020 at 10:24:13PM +0530, Muni Sekhar wrote: > > On Tue, Aug 18, 2020 at 8:06 PM Greg KH wrote: > > > > > > On Tue, Aug 18, 2020 at 08:00:11PM +0530, Muni Sekhar wrote: > > > > Hi all, > > > > > > > > I’ve two identical Linux sys

Re: Scheduler benchmarks

2020-08-18 Thread Greg KH
On Tue, Aug 18, 2020 at 10:24:13PM +0530, Muni Sekhar wrote: > On Tue, Aug 18, 2020 at 8:06 PM Greg KH wrote: > > > > On Tue, Aug 18, 2020 at 08:00:11PM +0530, Muni Sekhar wrote: > > > Hi all, > > > > > > I’ve two identical Linux systems with only kernel differences. > > > > What are the differenc

Re: Scheduler benchmarks

2020-08-18 Thread Muni Sekhar
On Tue, Aug 18, 2020 at 8:06 PM Greg KH wrote: > > On Tue, Aug 18, 2020 at 08:00:11PM +0530, Muni Sekhar wrote: > > Hi all, > > > > I’ve two identical Linux systems with only kernel differences. > > What are the differences in the kernels? > > > While doing kernel profiling with perf, I got the be

Re: Scheduler benchmarks

2020-08-18 Thread Muni Sekhar
On Tue, Aug 18, 2020 at 8:06 PM Greg KH wrote: > > On Tue, Aug 18, 2020 at 08:00:11PM +0530, Muni Sekhar wrote: > > Hi all, > > > > I’ve two identical Linux systems with only kernel differences. > > What are the differences in the kernels? > > > While doing kernel profiling with perf, I got the be

Re: Scheduler benchmarks

2020-08-18 Thread Greg KH
On Tue, Aug 18, 2020 at 08:00:11PM +0530, Muni Sekhar wrote: > Hi all, > > I’ve two identical Linux systems with only kernel differences. What are the differences in the kernels? > While doing kernel profiling with perf, I got the below mentioned > metrics for Scheduler benchmarks. > > 1st syst

Re: [SCHEDULER] Performance drop in 4.19 compared to 4.18 kernel

2018-10-04 Thread Jon Masters
On 9/7/18 5:34 AM, Jirka Hladky wrote: > We would also be more than happy to test the new patches for the > performance - please let us know if you are interested. We have a > pool of 1 NUMA up to 8 NUMA boxes for that, both AMD and Intel, > covering different CPU generations from Sandy Bridge ti

Re: [SCHEDULER] Performance drop in 4.19 compared to 4.18 kernel

2018-09-17 Thread Jirka Hladky
Resending in the plain text mode. > I'm travelling at the moment but when I get back, I'll see what's in the > tip tree with respect to Srikar's patches and then rebase the fast-migration > patches on top and reconfirm they still behave as expected. Assuming > they do, I'll resend them. Sounds g

Re: [SCHEDULER] Performance drop in 4.19 compared to 4.18 kernel

2018-09-17 Thread Mel Gorman
On Fri, Sep 14, 2018 at 04:50:20PM +0200, Jirka Hladky wrote: > Hi Peter and Srikar, > > > I have bounced the 5 patches to you, (one of the 6 has not been applied by > > Peter) so I have skipped that. > > They can also be fetched from > > http://lore.kernel.org/lkml/1533276841-16341-1-git-send-ema

Re: [SCHEDULER] Performance drop in 4.19 compared to 4.18 kernel

2018-09-14 Thread Jirka Hladky
Hi Peter and Srikar, > I have bounced the 5 patches to you, (one of the 6 has not been applied by > Peter) so I have skipped that. > They can also be fetched from > http://lore.kernel.org/lkml/1533276841-16341-1-git-send-email-sri...@linux.vnet.ibm.com I'm sorry for the delay, we have finally the

Re: [SCHEDULER] Performance drop in 4.19 compared to 4.18 kernel

2018-09-09 Thread Jirka Hladky
Hi Peter and Srikar, thanks a lot for the information and for the patches to test! > I have bounced the 5 patches to you, (one of the 6 has not been applied by > Peter) so I have skipped that. > They can also be fetched from > http://lore.kernel.org/lkml/1533276841-16341-1-git-send-email-sri...@l

Re: [SCHEDULER] Performance drop in 4.19 compared to 4.18 kernel

2018-09-07 Thread Peter Zijlstra
On Fri, Sep 07, 2018 at 07:14:20PM +0530, Srikar Dronamraju wrote: > * Peter Zijlstra [2018-09-07 15:19:23]: > > > On Fri, Sep 07, 2018 at 06:26:49PM +0530, Srikar Dronamraju wrote: > > > > > Can you please pick > > > > > > > > > 1. 69bb3230297e881c797bbc4b3dbf73514078bc9d sched/numa: Stop mul

Re: [SCHEDULER] Performance drop in 4.19 compared to 4.18 kernel

2018-09-07 Thread Srikar Dronamraju
* Peter Zijlstra [2018-09-07 15:19:23]: > On Fri, Sep 07, 2018 at 06:26:49PM +0530, Srikar Dronamraju wrote: > > > Can you please pick > > > > > > 1. 69bb3230297e881c797bbc4b3dbf73514078bc9d sched/numa: Stop multiple tasks > > from moving to the cpu at the same time > > 2. dc62cfdac5e5b7a61cd8

Re: [SCHEDULER] Performance drop in 4.19 compared to 4.18 kernel

2018-09-07 Thread Peter Zijlstra
On Fri, Sep 07, 2018 at 06:26:49PM +0530, Srikar Dronamraju wrote: > Can you please pick > > > 1. 69bb3230297e881c797bbc4b3dbf73514078bc9d sched/numa: Stop multiple tasks > from moving to the cpu at the same time > 2. dc62cfdac5e5b7a61cd8a2bd4190e80b9bb408fc sched/numa: Avoid task migration > fo

Re: [SCHEDULER] Performance drop in 4.19 compared to 4.18 kernel

2018-09-07 Thread Srikar Dronamraju
* Jirka Hladky [2018-09-07 11:34:49]: Hi Jirka, > > We have detected a significant performance drop (20% and more) with > 4.19rc1 relatively to 4.18 vanilla. We see the regression on different > 2 NUMA and 4 NUMA boxes with pretty much all the benchmarks we use - > NAS, Stream, SPECjbb2005, SPE

Re: Scheduler patches: 6x performance increase when system is under heavy load

2016-12-20 Thread Peter Zijlstra
Sorry for the delay, got side-tracked for a bit.. On Wed, Dec 14, 2016 at 12:15:25AM -0500, Alexandre-Xavier Labonté-Lamoureux wrote: > > Which of the 4 patches does this? > > I used all the 4 patches at the same time. Each patch fixes a > different bug. Would you like me to try each of them in

Re: Scheduler patches: 6x performance increase when system is under heavy load

2016-12-13 Thread Alexandre-Xavier Labonté-Lamoureux
> Which of the 4 patches does this? I used all the 4 patches at the same time. Each patch fixes a different bug. Would you like me to try each of them individually? Were you already aware of each of these bugs? > Also, what hypervisor are you using and what does the output of booting > with "sche

Re: Scheduler patches: 6x performance increase when system is under heavy load

2016-12-13 Thread Peter Zijlstra
On Sun, Dec 11, 2016 at 04:41:51PM -0500, Alexandre-Xavier Labonté-Lamoureux wrote: > > Here are my results (using "time make -j32" on my VM that has 4 cores): > > Kernel 4.8.14 > real 26m56.151s > user 79m52.472s > sys 7m42.964s > > Same kernel, but patched: > real 4m25.238s > user 1

Re: scheduler crash on Power

2014-08-04 Thread Dietmar Eggemann
On 04/08/14 04:20, Michael Ellerman wrote: > On Fri, 2014-08-01 at 14:24 -0700, Sukadev Bhattiprolu wrote: >> Dietmar Eggemann [dietmar.eggem...@arm.com] wrote: >> | > ltcbrazos2-lp07 login: [ 181.915974] [ cut here >> ] >> | > [ 181.915991] WARNING: at ../kernel/sched/co

Re: scheduler crash on Power

2014-08-03 Thread Michael Ellerman
On Fri, 2014-08-01 at 14:24 -0700, Sukadev Bhattiprolu wrote: > Dietmar Eggemann [dietmar.eggem...@arm.com] wrote: > | > ltcbrazos2-lp07 login: [ 181.915974] [ cut here ] > | > [ 181.915991] WARNING: at ../kernel/sched/core.c:5881 > | > | This warning indicates the proble

Re: scheduler crash on Power

2014-08-01 Thread Sukadev Bhattiprolu
Dietmar Eggemann [dietmar.eggem...@arm.com] wrote: | > ltcbrazos2-lp07 login: [ 181.915974] [ cut here ] | > [ 181.915991] WARNING: at ../kernel/sched/core.c:5881 | | This warning indicates the problem. One of the struct sched_domains does | not have its groups member se

Re: [scheduler] BUG: unable to handle kernel paging request at 000000000000ce50

2014-08-01 Thread Christoph Lameter
On Thu, 31 Jul 2014, Lai Jiangshan wrote: > > this_cpu_ptr instead. > > > - struct cpumask *cpus = __get_cpu_var(load_balance_mask); > + struct cpumask *cpus = this_cpu_ptr(load_balance_mask); > > > I think the conversion is wrong. it should be > *this_cpu_ptr(&lo
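
For context, the accessor semantics Lai is pointing at, sketched in kernel-style C (assuming CONFIG_CPUMASK_OFFSTACK, where cpumask_var_t is a pointer stored in the per-cpu variable):

    #include <linux/percpu.h>
    #include <linux/cpumask.h>

    static DEFINE_PER_CPU(cpumask_var_t, load_balance_mask);

    static void pick_mask_example(void)
    {
            /*
             * Old form -- __get_cpu_var() named the per-cpu variable
             * itself, yielding the stored cpumask pointer:
             *
             *     struct cpumask *cpus = __get_cpu_var(load_balance_mask);
             *
             * Broken conversion -- this_cpu_ptr() expects the *address* of
             * a per-cpu variable; passing the variable's value treats the
             * stored pointer as a per-cpu offset:
             *
             *     struct cpumask *cpus = this_cpu_ptr(load_balance_mask);
             *
             * Correct conversion -- take the variable's address, then
             * dereference once to recover the stored pointer:
             */
            struct cpumask *cpus = *this_cpu_ptr(&load_balance_mask);

            cpumask_clear(cpus);
    }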

Re: [scheduler] BUG: unable to handle kernel paging request at 000000000000ce50

2014-08-01 Thread Christoph Lameter
On Thu, 31 Jul 2014, Fengguang Wu wrote: > Sorry, I find that next-20140730 no longer shows the BUG. So there is no > way to test whether this patch fixed the problem. I guess this means that the bug was unrelated to this patch. Nevertheless I think this patch cleans up two minor issues.

Re: scheduler crash on Power

2014-07-31 Thread Michael Ellerman
On Wed, 2014-07-30 at 00:22 -0700, Sukadev Bhattiprolu wrote: > I am getting this crash on a Powerpc system using 3.16.0-rc7 kernel plus > some patches related to perf (24x7 counters) that Cody Schafer posted here: > > https://lkml.org/lkml/2014/5/27/768 > > I don't get the crash on an unpa

Re: scheduler crash on Power

2014-07-31 Thread Dietmar Eggemann
Hi Sukadev, On 30/07/14 08:22, Sukadev Bhattiprolu wrote: > > I am getting this crash on a Powerpc system using 3.16.0-rc7 kernel plus > some patches related to perf (24x7 counters) that Cody Schafer posted here: > > https://lkml.org/lkml/2014/5/27/768 > > I don't get the crash on an unpa

Re: [scheduler] BUG: unable to handle kernel paging request at 000000000000ce50

2014-07-31 Thread Fengguang Wu
Christoph, On Wed, Jul 30, 2014 at 09:55:29AM -0500, Christoph Lameter wrote: > On Wed, 30 Jul 2014, Fengguang Wu wrote: > > > FYI, this commit seems to convert some kernel boot hang bug into > > different BUG messages. > > Hmmm. Still a bit confused as to why these messages occur.. Does this >

Re: [scheduler] BUG: unable to handle kernel paging request at 000000000000ce50

2014-07-31 Thread Lai Jiangshan
On 07/30/2014 10:55 PM, Christoph Lameter wrote: > On Wed, 30 Jul 2014, Fengguang Wu wrote: > >> FYI, this commit seems to convert some kernel boot hang bug into >> different BUG messages. > > Hmmm. Still a bit confused as to why these messages occur.. Does this > patch do any good? The vmstat b

Re: [scheduler] BUG: unable to handle kernel paging request at 000000000000ce50

2014-07-31 Thread Lai Jiangshan
On 07/30/2014 09:56 PM, Fengguang Wu wrote: > Hi Christoph, > > FYI, this commit seems to convert some kernel boot hang bug into > different BUG messages. > > git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu.git > for-3.17-consistent-ops > commit 9b0c63851edaf54e909475fe2a0946f57810e98a >

Re: [scheduler] BUG: unable to handle kernel paging request at 000000000000d110

2014-07-30 Thread David Rientjes
On Wed, 30 Jul 2014, Fengguang Wu wrote: > > Hi Christoph, > > The parent commit is clean in this case. > > git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu.git > for-3.17-consistent-ops > commit 9b0c63851edaf54e909475fe2a0946f57810e98a > Author: Christoph Lameter > AuthorDate: Fri

Re: [scheduler] BUG: unable to handle kernel paging request at 000000000000ce50

2014-07-30 Thread Christoph Lameter
On Wed, 30 Jul 2014, Fengguang Wu wrote: > FYI, this commit seems to convert some kernel boot hang bug into > different BUG messages. Hmmm. Still a bit confused as to why these messages occur.. Does this patch do any good? Subject: vmstat ondemand: Fix online/offline races Do not allow onlinin

Re: Scheduler regression from caffcdd8d27ba78730d5540396ce72ad022aff2c

2014-07-23 Thread Bruno Wolff III
On Wed, Jul 23, 2014 at 17:11:40 +0200, Peter Zijlstra wrote: OK, so that's become the below patch. I'll feed it to Ingo if that's OK with hpa. I tested this patch on 3 machines and it continued to fix the one that was broken and didn't seem to break anything on the two that weren't broken

Re: Scheduler regression from caffcdd8d27ba78730d5540396ce72ad022aff2c

2014-07-23 Thread H. Peter Anvin
On 07/23/2014 08:11 AM, Peter Zijlstra wrote: > > OK, so that's become the below patch. I'll feed it to Ingo if that's OK > with hpa. > I'll grab it directly, it is a bit quicker that way. -hpa

Re: Scheduler regression from caffcdd8d27ba78730d5540396ce72ad022aff2c

2014-07-23 Thread Peter Zijlstra
OK, so that's become the below patch. I'll feed it to Ingo if that's OK with hpa. --- Subject: x86: Fix cache topology for early P4-SMT From: Peter Zijlstra Date: Tue, 22 Jul 2014 15:35:14 +0200 P4 systems with cpuid level < 4 can have SMT, but the cache topology description available (cpuid2)

Re: Scheduler regression from caffcdd8d27ba78730d5540396ce72ad022aff2c

2014-07-22 Thread Peter Zijlstra
On Tue, Jul 22, 2014 at 08:37:19PM -0500, Bruno Wolff III wrote: > build_sched_domain: cpu: 0 level: SMT cpu_map: 0-3 tl->mask: 0,2 > [0.252441] build_sched_domain: cpu: 0 level: MC cpu_map: 0-3 tl->mask: 0,2 > [0.252526] build_sched_domain: cpu: 0 level: DIE cpu_map: 0-3

Re: Scheduler regression from caffcdd8d27ba78730d5540396ce72ad022aff2c

2014-07-22 Thread Bruno Wolff III
On Tue, Jul 22, 2014 at 16:18:55 +0200, Peter Zijlstra wrote: You can put this on top of them. I hope that this will make the pr_err() introduced in the robustify patch go away. I went to 3.16-rc6 and then reapplied three patches from your previous email messages. The dmesg output and the d

Re: Scheduler regression from caffcdd8d27ba78730d5540396ce72ad022aff2c

2014-07-22 Thread H. Peter Anvin
On 07/22/2014 06:35 AM, Peter Zijlstra wrote: > On Tue, Jul 22, 2014 at 03:26:03PM +0200, Peter Zijlstra wrote: >> On Tue, Jul 22, 2014 at 03:03:43PM +0200, Peter Zijlstra wrote: >>> Oh, of course we do SMP detection and setup after the cache setup... >>> lovely. >>> >>> /me goes bang head against

Re: Scheduler regression from caffcdd8d27ba78730d5540396ce72ad022aff2c

2014-07-22 Thread Peter Zijlstra
On Tue, Jul 22, 2014 at 09:09:12AM -0500, Bruno Wolff III wrote: > On Tue, Jul 22, 2014 at 15:35:14 +0200, > Peter Zijlstra wrote: > >On Tue, Jul 22, 2014 at 03:26:03PM +0200, Peter Zijlstra wrote: > > > >Something like so.. anything obviously broken? > > Do you want me to test this change inste

Re: Scheduler regression from caffcdd8d27ba78730d5540396ce72ad022aff2c

2014-07-22 Thread Bruno Wolff III
On Tue, Jul 22, 2014 at 15:35:14 +0200, Peter Zijlstra wrote: On Tue, Jul 22, 2014 at 03:26:03PM +0200, Peter Zijlstra wrote: Something like so.. anything obviously broken? Do you want me to test this change instead of, or combined with the other patch you wanted tested earlier? --- arc

Re: Scheduler regression from caffcdd8d27ba78730d5540396ce72ad022aff2c

2014-07-22 Thread Peter Zijlstra
On Tue, Jul 22, 2014 at 03:26:03PM +0200, Peter Zijlstra wrote: > On Tue, Jul 22, 2014 at 03:03:43PM +0200, Peter Zijlstra wrote: > > Oh, of course we do SMP detection and setup after the cache setup... > > lovely. > > > > /me goes bang head against wall > > hpa, could we move the legacy cpuid1/c

Re: Scheduler regression from caffcdd8d27ba78730d5540396ce72ad022aff2c

2014-07-22 Thread Peter Zijlstra
On Tue, Jul 22, 2014 at 03:03:43PM +0200, Peter Zijlstra wrote: > Oh, of course we do SMP detection and setup after the cache setup... > lovely. > > /me goes bang head against wall hpa, could we move the legacy cpuid1/cpuid4 topology detection muck up, preferably right after detect_extended_topol

Re: Scheduler regression from caffcdd8d27ba78730d5540396ce72ad022aff2c

2014-07-22 Thread Peter Zijlstra
On Tue, Jul 22, 2014 at 07:10:01AM -0500, Bruno Wolff III wrote: > On Tue, Jul 22, 2014 at 12:38:57 +0200, > Peter Zijlstra wrote: > > > >Could you provide the output of cpuid and cpuid -r for your machine? > >This code is magic and I've no idea what your machine is telling it to > >do :/ > > I

Re: Scheduler regression from caffcdd8d27ba78730d5540396ce72ad022aff2c

2014-07-22 Thread Bruno Wolff III
On Tue, Jul 22, 2014 at 11:47:40 +0200, Peter Zijlstra wrote: On Mon, Jul 21, 2014 at 06:52:12PM +0200, Peter Zijlstra wrote: On Mon, Jul 21, 2014 at 11:35:28AM -0500, Bruno Wolff III wrote: > Is there more I can do to help with this now? Or should I just wait for > patches to test? Yeah, sor

Re: Scheduler regression from caffcdd8d27ba78730d5540396ce72ad022aff2c

2014-07-22 Thread Bruno Wolff III
On Tue, Jul 22, 2014 at 12:38:57 +0200, Peter Zijlstra wrote: Could you provide the output of cpuid and cpuid -r for your machine? This code is magic and I've no idea what your machine is telling it to do :/ I am attaching both sets of output. (I also added copies to the bug report.) CPU 0:

Re: Scheduler regression from caffcdd8d27ba78730d5540396ce72ad022aff2c

2014-07-22 Thread Dietmar Eggemann
On 22/07/14 10:47, Peter Zijlstra wrote: > On Mon, Jul 21, 2014 at 06:52:12PM +0200, Peter Zijlstra wrote: >> On Mon, Jul 21, 2014 at 11:35:28AM -0500, Bruno Wolff III wrote: >>> Is there more I can do to help with this now? Or should I just wait for >>> patches to test? >> >> Yeah, sorry, was wipe

Re: Scheduler regression from caffcdd8d27ba78730d5540396ce72ad022aff2c

2014-07-22 Thread Peter Zijlstra
On Tue, Jul 22, 2014 at 11:47:40AM +0200, Peter Zijlstra wrote: > On Mon, Jul 21, 2014 at 06:52:12PM +0200, Peter Zijlstra wrote: > > On Mon, Jul 21, 2014 at 11:35:28AM -0500, Bruno Wolff III wrote: > > > Is there more I can do to help with this now? Or should I just wait for > > > patches to test?

Re: Scheduler regression from caffcdd8d27ba78730d5540396ce72ad022aff2c

2014-07-22 Thread Peter Zijlstra
On Mon, Jul 21, 2014 at 06:52:12PM +0200, Peter Zijlstra wrote: > On Mon, Jul 21, 2014 at 11:35:28AM -0500, Bruno Wolff III wrote: > > Is there more I can do to help with this now? Or should I just wait for > > patches to test? > > Yeah, sorry, was wiped out today. I'll go stare harder at the P4 >

Re: Scheduler regression from caffcdd8d27ba78730d5540396ce72ad022aff2c

2014-07-21 Thread Peter Zijlstra
On Mon, Jul 21, 2014 at 11:35:28AM -0500, Bruno Wolff III wrote: > Is there more I can do to help with this now? Or should I just wait for > patches to test? Yeah, sorry, was wiped out today. I'll go stare harder at the P4 topology setup code tomorrow. Something fishy there.

Re: Scheduler regression from caffcdd8d27ba78730d5540396ce72ad022aff2c

2014-07-21 Thread Bruno Wolff III
Is there more I can do to help with this now? Or should I just wait for patches to test?

Re: Scheduler regression from caffcdd8d27ba78730d5540396ce72ad022aff2c

2014-07-18 Thread Peter Zijlstra
On Fri, Jul 18, 2014 at 04:50:40PM +0200, Peter Zijlstra wrote: > On Fri, Jul 18, 2014 at 04:16:48PM +0200, Peter Zijlstra wrote: > > On Fri, Jul 18, 2014 at 08:01:26AM -0500, Bruno Wolff III wrote: > > > build_sched_domain: cpu: 0 level: SMT cpu_map: 0-3 tl->mask: 0,2 > > > [0.254433] build_sc

Re: Scheduler regression from caffcdd8d27ba78730d5540396ce72ad022aff2c

2014-07-18 Thread Peter Zijlstra
On Fri, Jul 18, 2014 at 04:16:48PM +0200, Peter Zijlstra wrote: > On Fri, Jul 18, 2014 at 08:01:26AM -0500, Bruno Wolff III wrote: > > build_sched_domain: cpu: 0 level: SMT cpu_map: 0-3 tl->mask: 0,2 > > [0.254433] build_sched_domain: cpu: 0 level: MC cpu_map: 0-3 tl->mask: 0 > > [0.254516]

Re: Scheduler regression from caffcdd8d27ba78730d5540396ce72ad022aff2c

2014-07-18 Thread Dietmar Eggemann
On 18/07/14 15:01, Bruno Wolff III wrote: On Fri, Jul 18, 2014 at 12:16:33 +0200, Peter Zijlstra wrote: So it looks like the actual domain tree is broken, and not what we assumed it was. Could I bother you to run with the below instead? It should also print out the sched domain masks so we

Re: Scheduler regression from caffcdd8d27ba78730d5540396ce72ad022aff2c

2014-07-18 Thread Peter Zijlstra
On Fri, Jul 18, 2014 at 08:01:26AM -0500, Bruno Wolff III wrote: > build_sched_domain: cpu: 0 level: SMT cpu_map: 0-3 tl->mask: 0,2 > [0.254433] build_sched_domain: cpu: 0 level: MC cpu_map: 0-3 tl->mask: 0 > [0.254516] build_sched_domain: cpu: 0 level: DIE cpu_map: 0-3 tl->mask: 0-3 [

Re: Scheduler regression from caffcdd8d27ba78730d5540396ce72ad022aff2c

2014-07-18 Thread Bruno Wolff III
On Fri, Jul 18, 2014 at 12:16:33 +0200, Peter Zijlstra wrote: So it looks like the actual domain tree is broken, and not what we assumed it was. Could I bother you to run with the below instead? It should also print out the sched domain masks so we don't need to guess about them. The full dm

Re: Scheduler regression from caffcdd8d27ba78730d5540396ce72ad022aff2c

2014-07-18 Thread Bruno Wolff III
On Fri, Jul 18, 2014 at 11:28:14 +0200, Dietmar Eggemann wrote: Didn't see what I was looking for in your dmesg output. Did you use 'earlyprintk=keep sched_debug' I was missing a space. I'll get it on the next run.

Re: Scheduler regression from caffcdd8d27ba78730d5540396ce72ad022aff2c

2014-07-18 Thread Peter Zijlstra
On Fri, Jul 18, 2014 at 12:34:49AM -0500, Bruno Wolff III wrote: > On Thu, Jul 17, 2014 at 14:35:02 +0200, > Peter Zijlstra wrote: > > > >In any case, can someone who can trigger this run with the below; it's > >'clean' for me, but supposedly you'll trigger a FAIL somewhere. > > I got a couple of

Re: Scheduler regression from caffcdd8d27ba78730d5540396ce72ad022aff2c

2014-07-18 Thread Dietmar Eggemann
On 18/07/14 07:34, Bruno Wolff III wrote: On Thu, Jul 17, 2014 at 14:35:02 +0200, Peter Zijlstra wrote: In any case, can someone who can trigger this run with the below; it's 'clean' for me, but supposedly you'll trigger a FAIL somewhere. I got a couple of fail messages. dmesg output is a

Re: Scheduler regression from caffcdd8d27ba78730d5540396ce72ad022aff2c

2014-07-17 Thread Bruno Wolff III
On Thu, Jul 17, 2014 at 14:35:02 +0200, Peter Zijlstra wrote: In any case, can someone who can trigger this run with the below; it's 'clean' for me, but supposedly you'll trigger a FAIL somewhere. I got a couple of fail messages. dmesg output is available in the bug as the following attachme

Re: Scheduler regression from caffcdd8d27ba78730d5540396ce72ad022aff2c

2014-07-17 Thread Bruno Wolff III
On Thu, Jul 17, 2014 at 20:43:16 +0200, Dietmar Eggemann wrote: If you could apply the patch: https://lkml.org/lkml/2014/7/17/288 and then run it on your machine, that would give us more details, i.e. the information on which sched_group(s) and in which sched domain level (SMT and/or DIE)

Re: Scheduler regression from caffcdd8d27ba78730d5540396ce72ad022aff2c

2014-07-17 Thread Dietmar Eggemann
On 17/07/14 18:36, Bruno Wolff III wrote: I did a few quick boots this morning while taking a bunch of pictures. I have gone through some of them and found one that shows the BUG_ON at 5850 was triggered, which is from: BUG_ON(!cpumask_empty(sched_group_cpus(sg))); You can see the JPEG a

Re: Scheduler regression from caffcdd8d27ba78730d5540396ce72ad022aff2c

2014-07-17 Thread Bruno Wolff III
I did a few quick boots this morning while taking a bunch of pictures. I have gone through some of them and found one that shows the BUG_ON at 5850 was triggered, which is from: BUG_ON(!cpumask_empty(sched_group_cpus(sg))); You can see the JPEG at: https://bugzilla.kernel.org/attachment

Re: Scheduler regression from caffcdd8d27ba78730d5540396ce72ad022aff2c

2014-07-17 Thread Peter Zijlstra
On Thu, Jul 17, 2014 at 01:23:51PM +0200, Dietmar Eggemann wrote: > On 17/07/14 11:04, Peter Zijlstra wrote: > >On Thu, Jul 17, 2014 at 10:57:55AM +0200, Dietmar Eggemann wrote: > >>There is also the possibility that the memory for sched_group sg is not > >>(completely) zeroed out: > >> > >> sg =

Re: Scheduler regression from caffcdd8d27ba78730d5540396ce72ad022aff2c

2014-07-17 Thread Dietmar Eggemann
On 17/07/14 11:04, Peter Zijlstra wrote: On Thu, Jul 17, 2014 at 10:57:55AM +0200, Dietmar Eggemann wrote: There is also the possibility that the memory for sched_group sg is not (completely) zeroed out: sg = kzalloc_node(sizeof(struct sched_group) + cpumask_size(), G

Re: Scheduler regression from caffcdd8d27ba78730d5540396ce72ad022aff2c

2014-07-17 Thread Peter Zijlstra
On Thu, Jul 17, 2014 at 10:57:55AM +0200, Dietmar Eggemann wrote: > There is also the possibility that the memory for sched_group sg is not > (completely) zeroed out: > > sg = kzalloc_node(sizeof(struct sched_group) + cpumask_size(), > GFP_KERNEL, cpu_to_node(j)); > > >
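
Unwrapped, the allocation quoted above is: sg = kzalloc_node(sizeof(struct sched_group) + cpumask_size(), GFP_KERNEL, cpu_to_node(j)). A simplified sketch of the pattern with a stand-in struct; the relevant property is that kzalloc_node() returns zeroed memory, so the cpumask living in the trailing bytes starts out empty:

    #include <linux/slab.h>
    #include <linux/cpumask.h>

    /* Simplified stand-in for struct sched_group: the cpumask lives in
     * trailing storage sized by cpumask_size(). */
    struct group_sketch {
            struct group_sketch *next;
            unsigned long cpumask[];
    };

    static struct group_sketch *alloc_group_on(int node)
    {
            /* kzalloc_node() zeroes the entire allocation, including the
             * flexible-array tail, so cpumask_empty() holds initially. */
            return kzalloc_node(sizeof(struct group_sketch) + cpumask_size(),
                                GFP_KERNEL, node);
    }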

Re: Scheduler regression from caffcdd8d27ba78730d5540396ce72ad022aff2c

2014-07-17 Thread Dietmar Eggemann
On 17/07/14 05:09, Bruno Wolff III wrote: On Thu, Jul 17, 2014 at 01:18:36 +0200, Dietmar Eggemann wrote: So the output of $ cat /proc/sys/kernel/sched_domain/cpu*/domain*/* would be handy too. Thanks, this was helpful. I see from the sched domain layout that you have SMT (domain0) and D

Re: Scheduler regression from caffcdd8d27ba78730d5540396ce72ad022aff2c

2014-07-16 Thread Bruno Wolff III
On Wed, Jul 16, 2014 at 21:17:32 +0200, Dietmar Eggemann wrote: Could you please share: cat /proc/cpuinfo and cat /proc/schedstat (kernel config w/ CONFIG_SCHEDSTATS=y) /proc/schedstat output is attached. version 15 timestamp 4294858660 cpu0 12 0 85767 30027 61826 37767 15709950719 562024106

Re: Scheduler regression from caffcdd8d27ba78730d5540396ce72ad022aff2c

2014-07-16 Thread Bruno Wolff III
Could you also put the two BUG_ON lines into build_sched_groups() [kernel/sched/core.c] w/o the cpumask_clear() and setting sg->sgc->capacity to 0 and share the possible crash output as well? I can try a new build with this. I can probably get results back tomorrow before I leave for work. The c

Re: Scheduler regression from caffcdd8d27ba78730d5540396ce72ad022aff2c

2014-07-16 Thread Bruno Wolff III
On Thu, Jul 17, 2014 at 01:18:36 +0200, Dietmar Eggemann wrote: So the output of $ cat /proc/sys/kernel/sched_domain/cpu*/domain*/* would be handy too. Attached and added to the bug. Just to make sure, you do have 'CONFIG_X86_32=y' and '# CONFIG_NUMA is not set' in your build? Yes. I p

Re: Scheduler regression from caffcdd8d27ba78730d5540396ce72ad022aff2c

2014-07-16 Thread Dietmar Eggemann
On 16/07/14 21:54, Bruno Wolff III wrote: On Wed, Jul 16, 2014 at 21:17:32 +0200, Dietmar Eggemann wrote: Hi Bruno and Josh, From the issue, I see that the machine making trouble is an Xeon (2 processors w/ hyper-threading). Could you please share: cat /proc/cpuinfo and I have attached

Re: Scheduler regression from caffcdd8d27ba78730d5540396ce72ad022aff2c

2014-07-16 Thread Bruno Wolff III
On Wed, Jul 16, 2014 at 21:17:32 +0200, Dietmar Eggemann wrote: Hi Bruno and Josh, From the issue, I see that the machine making trouble is an Xeon (2 processors w/ hyper-threading). Could you please share: cat /proc/cpuinfo and I have attached it to the bug and to this message. cat /p

Re: Scheduler regression from caffcdd8d27ba78730d5540396ce72ad022aff2c

2014-07-16 Thread Dietmar Eggemann
Hi Bruno and Josh, On 16/07/14 17:17, Josh Boyer wrote: Adding Dietmar in since he is the original author. josh On Wed, Jul 16, 2014 at 09:55:46AM -0500, Bruno Wolff III wrote: caffcdd8d27ba78730d5540396ce72ad022aff2c has been causing crashes early in the boot process on one of three machines

Re: Scheduler regression from caffcdd8d27ba78730d5540396ce72ad022aff2c

2014-07-16 Thread Josh Boyer
Adding Dietmar in since he is the original author. josh On Wed, Jul 16, 2014 at 09:55:46AM -0500, Bruno Wolff III wrote: > caffcdd8d27ba78730d5540396ce72ad022aff2c has been causing crashes > early in the boot process on one of three machines I have been > testing the kernel on. On that one machin

Re: Scheduler accounting inflated for io bound processes.

2013-06-26 Thread David Ahern
On 6/26/13 10:10 AM, Ingo Molnar wrote: Sampled H/W events have an adaptive period that converges to the desired sampling rate. The first few samples come in 10 usecs or so apart and the time period expands to the desired rate. As I recall that adaptive algorithm starts over every time the event
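
A toy simulation of the adaptive-period behaviour described here: early samples arrive roughly 10 us apart and the period grows until the sampling frequency converges on the target. The constants and the averaging step are illustrative stand-ins, not the kernel's actual perf code:

    #include <stdio.h>
    #include <stdint.h>

    int main(void)
    {
            const uint64_t event_rate  = 2000000000ULL; /* events/sec (~2 GHz cycles) */
            const uint64_t target_freq = 4000;          /* desired samples/sec */
            uint64_t period = 20000;                    /* ~10 us apart at this rate */

            for (int sample = 0; sample < 10; sample++) {
                    double interval_us = (double)period / event_rate * 1e6;
                    printf("sample %2d: period=%8llu events  interval=%7.2f us\n",
                           sample, (unsigned long long)period, interval_us);

                    /* Re-estimate toward the period that would hit target_freq
                     * at the observed event rate; the averaging stands in for
                     * the kernel's damped adjustment. */
                    period = (period + event_rate / target_freq) / 2;
            }
            return 0;
    }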

Re: Scheduler accounting inflated for io bound processes.

2013-06-26 Thread Ingo Molnar
* David Ahern wrote: > On 6/26/13 9:50 AM, Ingo Molnar wrote: > > > >* Peter Zijlstra wrote: > > > >>On Wed, Jun 26, 2013 at 11:37:13AM +0200, Ingo Molnar wrote: > >>>Would be very nice to randomize the sampling rate, by randomizing the > >>>intervals within a 1% range or so - perf tooling will

Re: Scheduler accounting inflated for io bound processes.

2013-06-26 Thread David Ahern
On 6/26/13 9:50 AM, Ingo Molnar wrote: * Peter Zijlstra wrote: On Wed, Jun 26, 2013 at 11:37:13AM +0200, Ingo Molnar wrote: Would be very nice to randomize the sampling rate, by randomizing the intervals within a 1% range or so - perf tooling will probably recognize the different weights.

Re: Scheduler accounting inflated for io bound processes.

2013-06-26 Thread Mike Galbraith
On Wed, 2013-06-26 at 17:50 +0200, Ingo Molnar wrote: > * Peter Zijlstra wrote: > > > On Wed, Jun 26, 2013 at 11:37:13AM +0200, Ingo Molnar wrote: > > > Would be very nice to randomize the sampling rate, by randomizing the > > > intervals within a 1% range or so - perf tooling will probably rec

Re: Scheduler accounting inflated for io bound processes.

2013-06-26 Thread Ingo Molnar
* Peter Zijlstra wrote: > On Wed, Jun 26, 2013 at 11:37:13AM +0200, Ingo Molnar wrote: > > Would be very nice to randomize the sampling rate, by randomizing the > > intervals within a 1% range or so - perf tooling will probably recognize > > the different weights. > > You're suggesting adding

Re: Scheduler accounting inflated for io bound processes.

2013-06-26 Thread Peter Zijlstra
On Wed, Jun 26, 2013 at 11:37:13AM +0200, Ingo Molnar wrote: > Would be very nice to randomize the sampling rate, by randomizing the > intervals within a 1% range or so - perf tooling will probably recognize > the different weights. You're suggesting adding noise to the regular kernel tick? -- T
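
Sketched in plain C, the suggestion amounts to jittering each nominal period by a uniform random offset within +/-1%, so samples stop beating against strictly periodic application activity. Names and constants are illustrative only:

    #include <stdio.h>
    #include <stdlib.h>
    #include <stdint.h>

    /* Return the nominal period perturbed by a uniform offset in [-1%, +1%]. */
    static uint64_t jittered_period(uint64_t nominal)
    {
            int64_t span = nominal / 100;                    /* 1% of nominal */
            int64_t off  = (int64_t)(rand() % (2 * span + 1)) - span;
            return nominal + off;
    }

    int main(void)
    {
            srand(42);
            for (int i = 0; i < 5; i++)
                    printf("%llu\n", (unsigned long long)jittered_period(1000000));
            return 0;
    }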

Re: Scheduler accounting inflated for io bound processes.

2013-06-26 Thread Ingo Molnar
* Mike Galbraith wrote: > On Tue, 2013-06-25 at 18:01 +0200, Mike Galbraith wrote: > > On Thu, 2013-06-20 at 14:46 -0500, Dave Chiluk wrote: > > > Running the below testcase shows each process consuming 41-43% of its > > > respective cpu while per core idle numbers show 63-65%, a disparity of

Re: Scheduler accounting inflated for io bound processes.

2013-06-25 Thread Mike Galbraith
On Tue, 2013-06-25 at 18:01 +0200, Mike Galbraith wrote: > On Thu, 2013-06-20 at 14:46 -0500, Dave Chiluk wrote: > > Running the below testcase shows each process consuming 41-43% of its > > respective cpu while per core idle numbers show 63-65%, a disparity of > > roughly 4-8%. Is this a bug,

Re: Scheduler accounting inflated for io bound processes.

2013-06-25 Thread Mike Galbraith
On Thu, 2013-06-20 at 14:46 -0500, Dave Chiluk wrote: > Running the below testcase shows each process consuming 41-43% of its > respective cpu while per core idle numbers show 63-65%, a disparity of > roughly 4-8%. Is this a bug, known behaviour, or consequence of the > process being io bound?

Re: scheduler context

2013-04-23 Thread Henrik Austad
Hi Ratessh, Before digging into your questions: - LKML and linux-rt are probably not the best places to ask these kinds of questions. kernelnewbies is an excellent place to go for guidance, and you may get a faster response there as well. Furthermore, I recommend you pick up a book about kern

Re: Scheduler queues for less os-jitter?

2012-11-04 Thread Mike Galbraith
On Sun, 2012-11-04 at 10:20 +0100, Uwaysi Bin Kareem wrote: > Ok, anyway realtime processes did not work quite as expected. > ("overloaded" machine, even though cpu-time is only 10%). So I guess I > have to enable cgroups and live with the overhead then. > > If I set cpu-limits there, does th

Re: Scheduler queues for less os-jitter?

2012-11-04 Thread Uwaysi Bin Kareem
Ok, anyway realtime processes did not work quite as expected. ("overloaded" machine, even though cpu-time is only 10%). So I guess I have to enable cgroups and live with the overhead then. If I set cpu-limits there, does that involve an absolute value, or is it normalized, so that even if I

Fwd: Re: Scheduler queues for less os-jitter?

2012-11-03 Thread Uwaysi Bin Kareem
--- Forwarded message --- From: "Uwaysi Bin Kareem" To: "Mike Galbraith" Cc: Subject: Re: Scheduler queues for less os-jitter? Date: Sun, 04 Nov 2012 02:19:39 +0100 On Thu, 11 Oct 2012 04:46:34 +0200, Mike Galbraith wrote: On Wed, 2012-10-10 at 20:13 +0200, Uwa

Re: Scheduler queues for less os-jitter?

2012-10-10 Thread Mike Galbraith
On Wed, 2012-10-10 at 20:13 +0200, Uwaysi Bin Kareem wrote: > I was just wondering, have you considered this? > > If daemons are contributing to os-jitter, wouldn't having them all on > their own queue reduce jitter? So people could have the stuff like in > Ubuntu they want, without affecting

Re: Scheduler(?) regression from 2.6.22 to 2.6.24 for short-lived threads

2008-02-12 Thread Mike Galbraith
On Tue, 2008-02-12 at 10:23 +0100, Mike Galbraith wrote: > If you plunk a usleep(1) in prior to calling thread_func() does your > testcase performance change radically? If so, I wonder if the real > application has the same kind of dependency. The answer is yes for 2.6.22, and no for 2.6.24, wh

Re: Scheduler(?) regression from 2.6.22 to 2.6.24 for short-lived threads

2008-02-12 Thread Mike Galbraith
On Mon, 2008-02-11 at 14:31 -0600, Olof Johansson wrote: > On Mon, Feb 11, 2008 at 08:58:46PM +0100, Mike Galbraith wrote: > > It shouldn't matter if you yield or not really, that should reduce the > > number of non-work spin cycles wasted awaiting preemption as threads > > execute in series (the

Re: Scheduler(?) regression from 2.6.22 to 2.6.24 for short-lived threads

2008-02-11 Thread Mike Galbraith
On Mon, 2008-02-11 at 16:45 -0500, Bill Davidsen wrote: > I think the moving to another CPU gets really dependent on the CPU type. > On a P4+HT the caches are shared, and moving costs almost nothing for > cache hits, while on CPUs which have other cache layouts the migration > cost is higher.

Re: Scheduler(?) regression from 2.6.22 to 2.6.24 for short-lived threads

2008-02-11 Thread Bill Davidsen
Olof Johansson wrote: However, I fail to understand the goal of the reproducer. Granted it shows irregularities in the scheduler under such conditions, but what *real* workload would spend its time sequentially creating then immediately killing threads, never using more than 2 at a time? If th
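
For reference, the rough shape of the reproducer being debated, as a self-contained sketch: threads are created and joined back-to-back, so each is short-lived and at most two contexts exist at once. thread_func() here is a hypothetical stand-in for the real testcase's work:

    #include <pthread.h>
    #include <stdio.h>

    /* Stand-in for the testcase's per-thread work: a brief burst, then exit. */
    static void *thread_func(void *arg)
    {
            volatile unsigned long sum = 0;
            for (unsigned long i = 0; i < 10000; i++)
                    sum += i;
            (void)arg;
            return NULL;
    }

    int main(void)
    {
            for (int i = 0; i < 100000; i++) {
                    pthread_t t;
                    if (pthread_create(&t, NULL, thread_func, NULL))
                            return 1;
                    pthread_join(t, NULL);   /* thread exits almost immediately */
            }
            puts("done");
            return 0;
    }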

Re: Scheduler(?) regression from 2.6.22 to 2.6.24 for short-lived threads

2008-02-11 Thread Olof Johansson
On Mon, Feb 11, 2008 at 08:58:46PM +0100, Mike Galbraith wrote: > > On Mon, 2008-02-11 at 11:26 -0600, Olof Johansson wrote: > > On Mon, Feb 11, 2008 at 09:15:55AM +0100, Mike Galbraith wrote: > > > Piddling around with your testcase, it still looks to me like things > > > improved considerably i

Re: Scheduler(?) regression from 2.6.22 to 2.6.24 for short-lived threads

2008-02-11 Thread Mike Galbraith
On Mon, 2008-02-11 at 11:26 -0600, Olof Johansson wrote: > On Mon, Feb 11, 2008 at 09:15:55AM +0100, Mike Galbraith wrote: > > Piddling around with your testcase, it still looks to me like things > > improved considerably in latest greatest git. Hopefully that means > > happiness is in the pipe
