Re: Random panic in load_balance() with 3.16-rc

2014-08-04 Thread Steven Rostedt
On Fri, 25 Jul 2014 11:29:06 -0700 Linus Torvalds wrote:
> On Fri, Jul 25, 2014 at 7:02 AM, Steven Rostedt wrote:
> >
> > But wouldn't it be rather trivial to run a static analyzer on the final
> > vmlinux to make sure there are no red zones? I mean, you would only need
> > to read each function

Re: Random panic in load_balance() with 3.16-rc

2014-07-29 Thread Michel Dänzer
On 27.07.2014 03:02, Steven Chamberlain wrote:
> On 25/07/14 02:25, Michel Dänzer wrote:
>> Attached is fair.s from Debian gcc 4.8.3-5. Does that look better? I'm
>> going to try reproducing the problem with a kernel built by that now.
>
> It looks like gcc-4.9 Debian package version 4.9.1-2 avail

Re: Random panic in load_balance() with 3.16-rc

2014-07-29 Thread Jakub Jelinek
On Mon, Jul 28, 2014 at 08:09:02PM +0200, Markus Trippelsdorf wrote:
> Here's the testcase:
>
> int a, b, c;
> void fn1 ()
> {
>   int d;
>   if (fn2 () && !0)
>     {
>       b = (
>            {
>              int e;
>              fn3 ();
>              switch (0)
>                default:
>                a

Re: Random panic in load_balance() with 3.16-rc

2014-07-28 Thread Michel Dänzer
On 29.07.2014 01:48, Linus Torvalds wrote:
> On Sun, Jul 27, 2014 at 8:47 PM, Michel Dänzer wrote:
>> On 27.07.2014 04:56, Linus Torvalds wrote:
>>>
>>> Also, Michel - can you try this patch if you still have your
>>> gcc-4.9.0 install, and send me the resulting fair.s file again?
>>
>> Attached.

Re: Random panic in load_balance() with 3.16-rc

2014-07-28 Thread Theodore Ts'o
On Mon, Jul 28, 2014 at 10:27:39AM -0700, Alexei Starovoitov wrote:
>
> It's not pretty, but adding it unconditionally was the right thing to do.
> Black listing compiler versions is too fragile.
> Look at the flip side: now size of build dir will be much smaller :)
White-listing the fixed compil

Re: Random panic in load_balance() with 3.16-rc

2014-07-28 Thread Markus Trippelsdorf
On 2014.07.28 at 11:28 -0700, Linus Torvalds wrote:
> On Mon, Jul 28, 2014 at 11:09 AM, Markus Trippelsdorf
> wrote:
> >
> > It shouldn't be too hard to implement a simple check for the bug in the
> > next release. Just compile the gcc/testsuite/gcc.target/i386/pr61801.c
> > testcase with -fcompar

Re: Random panic in load_balance() with 3.16-rc

2014-07-28 Thread Linus Torvalds
On Mon, Jul 28, 2014 at 11:09 AM, Markus Trippelsdorf wrote:
>
> It shouldn't be too hard to implement a simple check for the bug in the
> next release. Just compile the gcc/testsuite/gcc.target/i386/pr61801.c
> testcase with -fcompare-debug. If gcc returns 0 then
> -fvar-tracking-assignments coul
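[ Editorial sketch, not part of the thread. ] The check quoted above (compile the testcase with -fcompare-debug and look at gcc's exit status) could be wrapped roughly as follows. The function name, the `-Os` optimization level, the output file name and the injectable `run` parameter are all assumptions made for illustration; only the `-fcompare-debug` flag and the "returns 0" criterion come from the message. What the build should do with the answer is cut off in the preview, so the sketch only reports it.

```python
import os
import subprocess
import tempfile


def compare_debug_passes(compiler, testcase, run=subprocess.run):
    """Return True if `compiler` exits with status 0 for `testcase`
    when invoked with -fcompare-debug.

    `run` is injectable so the logic can be exercised without a compiler;
    it must accept the same arguments as subprocess.run.
    """
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical output name; written to a scratch directory so the
        # probe leaves nothing behind.
        out = os.path.join(tmp, "probe.o")
        # -Os is an assumed optimization level, not taken from the thread.
        cmd = [compiler, "-Os", "-fcompare-debug", "-c", testcase, "-o", out]
        result = run(cmd, capture_output=True)
        return result.returncode == 0
```

A caller would pass the path of the testcase named in the message, for example `compare_debug_passes("gcc", "pr61801.c")`, and decide from the boolean whether a workaround flag is needed.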

Re: Random panic in load_balance() with 3.16-rc

2014-07-28 Thread Markus Trippelsdorf
On 2014.07.28 at 10:27 -0700, Alexei Starovoitov wrote:
> On Mon, Jul 28, 2014 at 09:45:45AM -0700, Linus Torvalds wrote:
> > On Mon, Jul 28, 2014 at 5:26 AM, Frank Ch. Eigler wrote:
> > >
> > > Please note that the data produced by "-g -fvar-tracking" is consumed
> > > by tools like systemtap, pe

Re: Random panic in load_balance() with 3.16-rc

2014-07-28 Thread Alexei Starovoitov
On Mon, Jul 28, 2014 at 09:45:45AM -0700, Linus Torvalds wrote:
> On Mon, Jul 28, 2014 at 5:26 AM, Frank Ch. Eigler wrote:
> >
> > Please note that the data produced by "-g -fvar-tracking" is consumed
> > by tools like systemtap, perf, crash, and makes a significant
> > difference to the observabi

Re: Random panic in load_balance() with 3.16-rc

2014-07-28 Thread Linus Torvalds
On Sun, Jul 27, 2014 at 8:47 PM, Michel Dänzer wrote:
> On 27.07.2014 04:56, Linus Torvalds wrote:
>>
>> Also, Michel - can you try this patch if you still have your
>> gcc-4.9.0 install, and send me the resulting fair.s file again?
>
> Attached. The frame setup looks fine to me now (apart from

Re: Random panic in load_balance() with 3.16-rc

2014-07-28 Thread Linus Torvalds
On Mon, Jul 28, 2014 at 5:26 AM, Frank Ch. Eigler wrote:
>
> Please note that the data produced by "-g -fvar-tracking" is consumed
> by tools like systemtap, perf, crash, and makes a significant
> difference to the observability of debug AND non-debug kernels.
Yeah, and compared to having a buggy

Re: Random panic in load_balance() with 3.16-rc

2014-07-28 Thread Frank Ch. Eigler
Hi -
On Mon, Jul 28, 2014 at 09:10:04AM -0400, Theodore Ts'o wrote:
> [...]
> I thought Markus told us that -fno-var-tracking-assignments makes
> absolutely no difference for non-debug kernels?
It does affect CONFIG_DEBUG_INFO kernels, and that config option is set for all Red Hat kernels (-debug

Re: Random panic in load_balance() with 3.16-rc

2014-07-28 Thread Frank Ch. Eigler
torvalds wrote:
> [...]
> Actually, I prefer my patch that did it with cc-option checking, and
> does it unconditionally.
>
> Because if we do it even for non-debug builds - where it ostensibly
> shouldn't matter - we then have that GCC_COMPARE_DEBUG thing working
> regardless of configuration.

Re: Random panic in load_balance() with 3.16-rc

2014-07-28 Thread Theodore Ts'o
On Mon, Jul 28, 2014 at 08:26:59AM -0400, Frank Ch. Eigler wrote:
> Please note that the data produced by "-g -fvar-tracking" is consumed
> by tools like systemtap, perf, crash, and makes a significant
> difference to the observability of debug AND non-debug kernels. (The
> presence of compiled-in

Re: Random panic in load_balance() with 3.16-rc

2014-07-26 Thread Jakub Jelinek
On Sat, Jul 26, 2014 at 10:20:55PM +0200, Markus Trippelsdorf wrote:
> On 2014.07.26 at 15:55 -0400, Theodore Ts'o wrote:
> > On Sat, Jul 26, 2014 at 09:35:57PM +0200, Markus Trippelsdorf wrote:
> > >
> > > But fortunately the workaround for the new inode.c bug is the same as
> > > for the origina

Re: Random panic in load_balance() with 3.16-rc

2014-07-26 Thread Linus Torvalds
On Sat, Jul 26, 2014 at 1:19 PM, Markus Trippelsdorf wrote:
>
> Yes. The option only affects -g builds.
Ok, good. I'll wait a bit to hopefully get confirmation from Michel's setup, but this does seem to be the solution.
> So, the option should only be enabled for debugging builds. Something
> li

Re: Random panic in load_balance() with 3.16-rc

2014-07-26 Thread Markus Trippelsdorf
On 2014.07.26 at 15:55 -0400, Theodore Ts'o wrote:
> On Sat, Jul 26, 2014 at 09:35:57PM +0200, Markus Trippelsdorf wrote:
> >
> > But fortunately the workaround for the new inode.c bug is the same as
> > for the original bug: -fno-var-tracking-assignments.
> >
> > It would make sense to enable

Re: Random panic in load_balance() with 3.16-rc

2014-07-26 Thread Markus Trippelsdorf
On 2014.07.26 at 12:56 -0700, Linus Torvalds wrote:
> On Sat, Jul 26, 2014 at 12:35 PM, Markus Trippelsdorf
> wrote:
> >
> > But fortunately the workaround for the new inode.c bug is the same as
> > for the original bug: -fno-var-tracking-assignments.
> >
> > It would make sense to enable it unco

Re: Random panic in load_balance() with 3.16-rc

2014-07-26 Thread Linus Torvalds
On Sat, Jul 26, 2014 at 12:56 PM, Linus Torvalds wrote:
>
> Also, Michel - can you try this patch if you still have your
> gcc-4.9.0 install, and send me the resulting fair.s file again?
Hmm. The good news is that with that patch, the GCC_COMPARE_DEBUG build succeeds. At least for my small local

Re: Random panic in load_balance() with 3.16-rc

2014-07-26 Thread Linus Torvalds
On Sat, Jul 26, 2014 at 12:35 PM, Markus Trippelsdorf wrote:
>
> But fortunately the workaround for the new inode.c bug is the same as
> for the original bug: -fno-var-tracking-assignments.
>
> It would make sense to enable it unconditionally for all debug
> configurations for now.
So how is cod

Re: Random panic in load_balance() with 3.16-rc

2014-07-26 Thread Markus Trippelsdorf
On 2014.07.26 at 11:39 -0700, Linus Torvalds wrote:
> On Sat, Jul 26, 2014 at 11:28 AM, Linus Torvalds
> wrote:
> >
> > That's a bit worrisome. I haven't actually checked if the code
> > generation differs in significant ways yet..
>
> Nope. Just three instructions that got re-ordered from ABC to

Re: Random panic in load_balance() with 3.16-rc

2014-07-26 Thread Theodore Ts'o
On Sat, Jul 26, 2014 at 09:35:57PM +0200, Markus Trippelsdorf wrote:
>
> But fortunately the workaround for the new inode.c bug is the same as
> for the original bug: -fno-var-tracking-assignments.
>
> It would make sense to enable it unconditionally for all debug
> configurations for now.
Wha

Re: Random panic in load_balance() with 3.16-rc

2014-07-26 Thread Linus Torvalds
On Sat, Jul 26, 2014 at 11:28 AM, Linus Torvalds wrote:
>
> That's a bit worrisome. I haven't actually checked if the code
> generation differs in significant ways yet..
Nope. Just three instructions that got re-ordered from ABC to CAB in a way that makes no difference. But just the knowledge tha

Re: Random panic in load_balance() with 3.16-rc

2014-07-26 Thread Linus Torvalds
On Fri, Jul 25, 2014 at 11:29 AM, Linus Torvalds wrote:
>
> I'm sure it's possible, but it sounds potentially complicated.
Hmm. The bugzilla entry just taught me a new gcc flag: "-fcompare-debug". That apparently makes gcc compile things twice, once with debugging and once without, and verify tha

Re: Random panic in load_balance() with 3.16-rc

2014-07-26 Thread Steven Chamberlain
Hi Michel,
On 25/07/14 02:25, Michel Dänzer wrote:
> Attached is fair.s from Debian gcc 4.8.3-5. Does that look better? I'm
> going to try reproducing the problem with a kernel built by that now.
It looks like gcc-4.9 Debian package version 4.9.1-2 available in sid/jessie may have already fixed t

Re: Random panic in load_balance() with 3.16-rc

2014-07-25 Thread Jakub Jelinek
On Fri, Jul 25, 2014 at 01:01:11PM -0700, Linus Torvalds wrote:
> For example, gcc will not create a small stack frame with "sub
> $8,%rsp". No, what gcc does is to use a random "push" instruction.
> Fair enough, but that really makes things much harder to see. Here's
> an example:
That is because

Re: Random panic in load_balance() with 3.16-rc

2014-07-25 Thread Steven Rostedt
On Fri, 25 Jul 2014 13:01:11 -0700 Linus Torvalds wrote:
> For example, gcc will not create a small stack frame with "sub
> $8,%rsp". No, what gcc does is to use a random "push" instruction.
> Fair enough, but that really makes things much harder to see. Here's
> an example:
>
> 81314

Re: Random panic in load_balance() with 3.16-rc

2014-07-25 Thread Linus Torvalds
On Fri, Jul 25, 2014 at 11:29 AM, Linus Torvalds wrote:
>
> Some simple pattern to make sure that the "sub $frame-size,%rsp" comes
> before any accesses to (%rbp) (when frame pointers are enabled)
> *might* work, but it might also end up missing things.
You're going to have a hard time doing that
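[ Editorial sketch, not part of the thread. ] The "simple pattern" discussed above can be illustrated on compiler-generated AT&T assembly text. The sketch below tracks how many bytes lie between %rbp and %rsp once the frame pointer is set up, and flags any negative %rbp offset that reaches past that. Every name in it is hypothetical, and it deliberately shows the weaknesses raised in the thread: it assumes frame pointers, straight-line code and decimal immediates, and it ignores pop, add, leave and branches, so it will both miss cases and misjudge others.

```python
import re

# Negative displacement off %rbp, e.g. "-136(%rbp)".
RBP_ACCESS = re.compile(r"-(\d+)\(%rbp\)")
# "sub $N, %rsp" or "subq $N, %rsp" with a decimal immediate.
SUB_RSP = re.compile(r"^subq?\s+\$(\d+),\s*%rsp")
# Any push; assumed to be 8 bytes wide (x86-64).
PUSH = re.compile(r"^pushq?\s")
# Frame pointer setup: "mov %rsp, %rbp" or "movq %rsp, %rbp".
SET_FP = re.compile(r"^movq?\s+%rsp,\s*%rbp")


def find_below_stack_accesses(lines):
    """Return (line_number, offset, allocated) for each access below %rsp."""
    allocated = None  # bytes between %rbp and %rsp; None until %rbp is set
    hits = []
    for number, raw in enumerate(lines, start=1):
        # Drop trailing "#" comments such as the "#, %sfp" gcc emits.
        insn = raw.split("#", 1)[0].strip()
        if not insn:
            continue
        if SET_FP.match(insn):
            allocated = 0
            continue
        if allocated is None:
            continue  # still before the frame pointer is established
        if PUSH.match(insn):
            allocated += 8
            continue
        sub = SUB_RSP.match(insn)
        if sub:
            allocated += int(sub.group(1))
            continue
        for match in RBP_ACCESS.finditer(insn):
            offset = int(match.group(1))
            if offset > allocated:
                hits.append((number, offset, allocated))
    return hits
```

Fed a prologue in which `movq $load_balance_mask, -136(%rbp)` precedes `subq $184, %rsp` (the two instructions quoted elsewhere in this thread; any surrounding pushes would be an assumption), the function reports that store, because at that point fewer than 136 bytes have been reserved below %rbp.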

Re: Random panic in load_balance() with 3.16-rc

2014-07-25 Thread Steven Rostedt
On Fri, 25 Jul 2014 11:29:06 -0700 Linus Torvalds wrote:
> On Fri, Jul 25, 2014 at 7:02 AM, Steven Rostedt wrote:
> >
> > But wouldn't it be rather trivial to run a static analyzer on the final
> > vmlinux to make sure there are no red zones? I mean, you would only need
> > to read each function

Re: Random panic in load_balance() with 3.16-rc

2014-07-25 Thread Linus Torvalds
On Fri, Jul 25, 2014 at 7:02 AM, Steven Rostedt wrote:
>
> But wouldn't it be rather trivial to run a static analyzer on the final
> vmlinux to make sure there are no red zones? I mean, you would only need
> to read each function and check to make sure that the offset of rbp is
> within the change

Re: Random panic in load_balance() with 3.16-rc

2014-07-25 Thread Steven Rostedt
On Thu, Jul 24, 2014 at 08:55:28PM -0700, Alexei Starovoitov wrote:
>
> -mno-red-zone only affected prologue emission in gcc. This part didn't
> change between the releases. So the bug is quite deep.
> What seems to be happening is that 2nd pass of instruction scheduler
> (after emit prologue and r

Re: Random panic in load_balance() with 3.16-rc

2014-07-24 Thread Alexei Starovoitov
On Fri, Jul 25, 2014 at 10:25:03AM +0900, Michel Dänzer wrote:
> [ Adding the Debian kernel and gcc teams to Cc ]
>
> > movq $load_balance_mask, -136(%rbp) #, %sfp
> > subq $184, %rsp #,
> >
> > Anyway, this is not a kernel bug. This is your compiler creating
> > compl

Re: Random panic in load_balance() with 3.16-rc

2014-07-24 Thread Nick Krause
On Thu, Jul 24, 2014 at 11:55 PM, Alexei Starovoitov wrote:
> On Fri, Jul 25, 2014 at 10:25:03AM +0900, Michel Dänzer wrote:
>> [ Adding the Debian kernel and gcc teams to Cc ]
>>
>> > movq $load_balance_mask, -136(%rbp) #, %sfp
>> > subq $184, %rsp #,
>> >
>> > Anyway,

Re: Random panic in load_balance() with 3.16-rc

2014-07-24 Thread Nick Krause
On Thu, Jul 24, 2014 at 10:33 PM, Linus Torvalds wrote:
> On Thu, Jul 24, 2014 at 6:25 PM, Michel Dänzer wrote:
>>
>> Attached is fair.s from Debian gcc 4.8.3-5. Does that look better? I'm
>> going to try reproducing the problem with a kernel built by that now.
>
> This looks better. For roughly

Re: Random panic in load_balance() with 3.16-rc

2014-07-24 Thread Nick Krause
On Thu, Jul 24, 2014 at 9:25 PM, Michel Dänzer wrote:
> [ Adding the Debian kernel and gcc teams to Cc ]
>
> On 25.07.2014 03:47, Linus Torvalds wrote:
>> On Wed, Jul 23, 2014 at 6:43 PM, Michel Dänzer wrote: Michel, mind doing make kernel/sched/fair.s and sendin

Re: Random panic in load_balance() with 3.16-rc

2014-07-24 Thread Linus Torvalds
On Thu, Jul 24, 2014 at 6:25 PM, Michel Dänzer wrote:
>
> Attached is fair.s from Debian gcc 4.8.3-5. Does that look better? I'm
> going to try reproducing the problem with a kernel built by that now.
This looks better. For roughly that same code sequence it does (ignoring the debug line and cfi