Re: RCU stall leading to deadlock warning

2020-12-16 Thread Paul E. McKenney
On Wed, Dec 16, 2020 at 09:54:42AM -0800, Paul E. McKenney wrote: > On Wed, Dec 16, 2020 at 05:29:39PM +, Qais Yousef wrote: > > Hi Paul > > > > We hit the below splat a couple of days ago in our testing. Sadly I can't > > reproduce it. And it was on android-mainline branch.. > > > > It's the

Re: RCU stall leading to deadlock warning

2020-12-16 Thread Paul E. McKenney
On Wed, Dec 16, 2020 at 05:29:39PM +, Qais Yousef wrote: > Hi Paul > > We hit the below splat a couple of days ago in our testing. Sadly I can't > reproduce it. And it was on android-mainline branch.. > > It's the deadlock message that bothers me. I can't see how we could have ended > there.

Re: RCU stall leading to deadlock warning

2020-12-16 Thread Qais Yousef
On 12/16/20 10:00, Paul E. McKenney wrote: > On Wed, Dec 16, 2020 at 09:54:42AM -0800, Paul E. McKenney wrote: > > On Wed, Dec 16, 2020 at 05:29:39PM +, Qais Yousef wrote: > > > Hi Paul > > > > > > We hit the below splat a couple of days ago in our testing. Sadly I can't > > > reproduce it. An

Re: RCU stall in 8250 serial driver Linux 4.15-rc1

2018-01-24 Thread Andy Shevchenko
On Tue, Jan 23, 2018 at 5:52 PM, Alan Cox wrote: > On Wed, 17 Jan 2018 09:24:32 -0800 > Shankara Pailoor wrote: > >> Hi Greg, >> >> Sorry for that. Here is the stack trace. C Program below > > > >> serial_in drivers/tty/serial/8250/8250.h:111 [inline] >> wait_for_xmitr+0x8a/0x1d0 drivers/tty/se

Re: RCU stall in 8250 serial driver Linux 4.15-rc1

2018-01-23 Thread Alan Cox
On Wed, 17 Jan 2018 09:24:32 -0800 Shankara Pailoor wrote: > Hi Greg, > > Sorry for that. Here is the stack trace. C Program below > serial_in drivers/tty/serial/8250/8250.h:111 [inline] > wait_for_xmitr+0x8a/0x1d0 drivers/tty/serial/8250/8250_port.c:2033 > serial8250_console_putchar+0x19/

Re: RCU stall in 8250 serial driver Linux 4.15-rc1

2018-01-17 Thread Shankara Pailoor
Hi Greg, Sorry for that. Here is the stack trace. C Program below TCP: request_sock_TCP: Possible SYN flooding on port 20003. Sending cookies. Check SNMP counters. TCP: request_sock_TCP: Possible SYN flooding on port 20003. Sending cookies. Check SNMP counters. TCP: request_sock_TCP: Possible S

Re: RCU stall in 8250 serial driver Linux 4.15-rc1

2018-01-17 Thread Greg KH
On Wed, Jan 17, 2018 at 08:53:06AM -0800, Shankara Pailoor wrote: > Hi, > > Syzkaller found the following rcu stall report in Linux 4.15-rc1: > https://pastebin.com/NyZ9JdRv > > The following C program reproduces it: https://pastebin.com/gqwDWWpA > > Configs Here: https://pastebin.com/v6M3iKi1

Re: RCU stall/SOFT-Lockup on 4.11.3/4.13.11 after multiple days uptime

2017-11-13 Thread Paul E. McKenney
On Sun, Nov 12, 2017 at 07:30:08PM +0100, Bruno Prémont wrote: > On Sun, 12 Nov 2017 18:29:06 Bruno Prémont wrote: > > On Sun, 12 November 2017 "Paul E. McKenney" wrote: > > > On Sun, Nov 12, 2017 at 12:09:28PM +0100, Bruno Prémont wrote: > > > > On Sat, 11 November 2017 "Paul E. McKenney" wrote:

Re: RCU stall/SOFT-Lockup on 4.11.3/4.13.11 after multiple days uptime

2017-11-12 Thread Bruno Prémont
On Sun, 12 Nov 2017 18:29:06 Bruno Prémont wrote: > On Sun, 12 November 2017 "Paul E. McKenney" wrote: > > On Sun, Nov 12, 2017 at 12:09:28PM +0100, Bruno Prémont wrote: > > > On Sat, 11 November 2017 "Paul E. McKenney" wrote: > > > > On Sat, Nov 11, 2017 at 08:38:32PM +0100, Bruno Prémont wr

Re: RCU stall/SOFT-Lockup on 4.11.3/4.13.11 after multiple days uptime

2017-11-12 Thread Bruno Prémont
On Sun, 12 November 2017 "Paul E. McKenney" wrote: > On Sun, Nov 12, 2017 at 12:09:28PM +0100, Bruno Prémont wrote: > > On Sat, 11 November 2017 "Paul E. McKenney" > > wrote: > > > On Sat, Nov 11, 2017 at 08:38:32PM +0100, Bruno Prémont wrote: > > > > Hi, > > > > > > > > On a single-CPU KV

Re: RCU stall/SOFT-Lockup on 4.11.3/4.13.11 after multiple days uptime

2017-11-12 Thread Paul E. McKenney
On Sun, Nov 12, 2017 at 12:09:28PM +0100, Bruno Prémont wrote: > On Sat, 11 November 2017 "Paul E. McKenney" > wrote: > > On Sat, Nov 11, 2017 at 08:38:32PM +0100, Bruno Prémont wrote: > > > Hi, > > > > > > On a single-CPU KVM-based virtual machine I'm suffering from RCU stall > > > and soft-loc

Re: RCU stall/SOFT-Lockup on 4.11.3/4.13.11 after multiple days uptime

2017-11-12 Thread Bruno Prémont
On Sat, 11 November 2017 "Paul E. McKenney" wrote: > On Sat, Nov 11, 2017 at 08:38:32PM +0100, Bruno Prémont wrote: > > Hi, > > > > On a single-CPU KVM-based virtual machine I'm suffering from RCU stall > > and soft-lockup. 4.10.x kernels run fine (4.10.12) but starting with > > 4.11.x (4.11.3, 4

Re: RCU stall/SOFT-Lockup on 4.11.3/4.13.11 after multiple days uptime

2017-11-11 Thread Paul E. McKenney
On Sat, Nov 11, 2017 at 08:38:32PM +0100, Bruno Prémont wrote: > Hi, > > On a single-CPU KVM-based virtual machine I'm suffering from RCU stall > and soft-lockup. 4.10.x kernels run fine (4.10.12) but starting with > 4.11.x (4.11.3, 4.13.11) I'm getting system freezes for no apparent > reason. >

Re: RCU stall when using function_graph

2017-08-30 Thread Paul E. McKenney
On Wed, Aug 16, 2017 at 10:58:05AM -0700, Paul E. McKenney wrote: > On Wed, Aug 16, 2017 at 12:41:40PM -0400, Steven Rostedt wrote: > > On Wed, 16 Aug 2017 09:32:28 -0700 > > "Paul E. McKenney" wrote: > > > > > Let me see if I understand you... About halfway to the stall limit, > > > RCU trigger

Re: RCU stall when using function_graph

2017-08-16 Thread Paul E. McKenney
On Wed, Aug 16, 2017 at 12:41:40PM -0400, Steven Rostedt wrote: > On Wed, 16 Aug 2017 09:32:28 -0700 > "Paul E. McKenney" wrote: > > > Let me see if I understand you... About halfway to the stall limit, > > RCU triggers an irq_work (on each CPU that has not yet passed through > > a quiescent sta

Re: RCU stall when using function_graph

2017-08-16 Thread Steven Rostedt
On Wed, 16 Aug 2017 09:32:28 -0700 "Paul E. McKenney" wrote: > Let me see if I understand you... About halfway to the stall limit, > RCU triggers an irq_work (on each CPU that has not yet passed through > a quiescent state, IPIing them in turn?), and if the irq_work has > not completed by the en

Re: RCU stall when using function_graph

2017-08-16 Thread Paul E. McKenney
On Wed, Aug 16, 2017 at 10:04:21AM -0400, Steven Rostedt wrote: > On Wed, 16 Aug 2017 10:42:15 +0200 > Daniel Lezcano wrote: > > > Hi Steven, > > > > > > On 15/08/2017 15:29, Steven Rostedt wrote: > > > > > > [ I'm back from vacation! ] > > > > Did you get the tapes? :) > > Yes, but nothin

Re: RCU stall when using function_graph

2017-08-16 Thread Steven Rostedt
On Wed, 16 Aug 2017 10:42:15 +0200 Daniel Lezcano wrote: > Hi Steven, > > > On 15/08/2017 15:29, Steven Rostedt wrote: > > > > [ I'm back from vacation! ] > > Did you get the tapes? :) Yes, but nothing in them would cause the reputation of the POTUS to become any worse than it already is.

Re: RCU stall when using function_graph

2017-08-16 Thread Daniel Lezcano
Hi Steven, On 15/08/2017 15:29, Steven Rostedt wrote: > > [ I'm back from vacation! ] Did you get the tapes? :) > On Wed, 9 Aug 2017 17:51:33 +0200 > Daniel Lezcano wrote: > >> Well, may be the instruction pointer thing is not a good idea. >> >> I learnt from this experience, an overloaded

Re: RCU stall when using function_graph

2017-08-15 Thread Steven Rostedt
[ I'm back from vacation! ] On Wed, 9 Aug 2017 17:51:33 +0200 Daniel Lezcano wrote: > Well, may be the instruction pointer thing is not a good idea. > > I learnt from this experience, an overloaded kernel with a lot of > interrupts can hang the console and issue RCU stall. > > However, someon

Re: RCU stall when using function_graph

2017-08-11 Thread Daniel Lezcano
On 10/08/2017 23:39, Paul E. McKenney wrote: > On Thu, Aug 10, 2017 at 11:45:09AM +0200, Daniel Lezcano wrote: [ ... ] >> Nothing coming in mind but may be worth to mention the slowness of the >> CPU is the aggravating factor. In particular I was able to reproduce the >> issue by setting to the m

Re: RCU stall when using function_graph

2017-08-10 Thread Paul E. McKenney
On Thu, Aug 10, 2017 at 11:45:09AM +0200, Daniel Lezcano wrote: > On 09/08/2017 19:22, Paul E. McKenney wrote: > > On Wed, Aug 09, 2017 at 05:51:33PM +0200, Daniel Lezcano wrote: > >> On 09/08/2017 16:40, Paul E. McKenney wrote: > >>> On Wed, Aug 09, 2017 at 03:28:05PM +0200, Daniel Lezcano wrote:

Re: RCU stall when using function_graph

2017-08-10 Thread Daniel Lezcano
On 09/08/2017 19:22, Paul E. McKenney wrote: > On Wed, Aug 09, 2017 at 05:51:33PM +0200, Daniel Lezcano wrote: >> On 09/08/2017 16:40, Paul E. McKenney wrote: >>> On Wed, Aug 09, 2017 at 03:28:05PM +0200, Daniel Lezcano wrote: On 09/08/2017 14:58, Paul E. McKenney wrote: > On Wed, Aug 09,

Re: RCU stall when using function_graph

2017-08-09 Thread Paul E. McKenney
On Wed, Aug 09, 2017 at 05:51:33PM +0200, Daniel Lezcano wrote: > On 09/08/2017 16:40, Paul E. McKenney wrote: > > On Wed, Aug 09, 2017 at 03:28:05PM +0200, Daniel Lezcano wrote: > >> On 09/08/2017 14:58, Paul E. McKenney wrote: > >>> On Wed, Aug 09, 2017 at 02:43:49PM +0530, Pratyush Anand wrote:

Re: RCU stall when using function_graph

2017-08-09 Thread Daniel Lezcano
On 09/08/2017 16:40, Paul E. McKenney wrote: > On Wed, Aug 09, 2017 at 03:28:05PM +0200, Daniel Lezcano wrote: >> On 09/08/2017 14:58, Paul E. McKenney wrote: >>> On Wed, Aug 09, 2017 at 02:43:49PM +0530, Pratyush Anand wrote: On Sunday 06 August 2017 10:32 PM, Paul E. McKenney wrote

Re: RCU stall when using function_graph

2017-08-09 Thread Paul E. McKenney
On Wed, Aug 09, 2017 at 03:28:05PM +0200, Daniel Lezcano wrote: > On 09/08/2017 14:58, Paul E. McKenney wrote: > > On Wed, Aug 09, 2017 at 02:43:49PM +0530, Pratyush Anand wrote: > >> > >> > >> On Sunday 06 August 2017 10:32 PM, Paul E. McKenney wrote: > >>> On Sat, Aug 05, 2017 at 02:24:21PM +0900

Re: RCU stall when using function_graph

2017-08-09 Thread Daniel Lezcano
On 09/08/2017 14:58, Paul E. McKenney wrote: > On Wed, Aug 09, 2017 at 02:43:49PM +0530, Pratyush Anand wrote: >> >> >> On Sunday 06 August 2017 10:32 PM, Paul E. McKenney wrote: >>> On Sat, Aug 05, 2017 at 02:24:21PM +0900, 김동현 wrote: Dear All As for me, after configuring function_g

Re: RCU stall when using function_graph

2017-08-09 Thread Paul E. McKenney
On Wed, Aug 09, 2017 at 02:43:49PM +0530, Pratyush Anand wrote: > > > On Sunday 06 August 2017 10:32 PM, Paul E. McKenney wrote: > >On Sat, Aug 05, 2017 at 02:24:21PM +0900, 김동현 wrote: > >>Dear All > >> > >>As for me, after configuring function_graph as below, crash disappears. > >>"echo 0 > d/tr

Re: RCU stall when using function_graph

2017-08-09 Thread Pratyush Anand
On Sunday 06 August 2017 10:32 PM, Paul E. McKenney wrote: On Sat, Aug 05, 2017 at 02:24:21PM +0900, 김동현 wrote: Dear All As for me, after configuring function_graph as below, crash disappears. "echo 0 > d/tracing/tracing_on" "sleep 1" "echo function_graph > d/tracing/current_tracer" "sleep 1

Re: RCU stall when using function_graph

2017-08-06 Thread Paul E. McKenney
On Sat, Aug 05, 2017 at 02:24:21PM +0900, 김동현 wrote: > Dear All > > As for me, after configuring function_graph as below, crash disappears. > "echo 0 > d/tracing/tracing_on" > "sleep 1" > > "echo function_graph > d/tracing/current_tracer" > "sleep 1" > > "echo smp_call_function_single > d/tracin

Re: RCU stall when using function_graph

2017-08-03 Thread Daniel Lezcano
On Thu, Aug 03, 2017 at 05:44:21AM -0700, Paul E. McKenney wrote: [ ... ] > > > BTW, function_graph tracer is the most invasive of the tracers. It's 4x > > > slower than function tracer. I'm wondering if the tracer isn't the > > > cause, but just slows things down enough to cause a some other rac

Re: RCU stall when using function_graph

2017-08-03 Thread Paul E. McKenney
On Thu, Aug 03, 2017 at 01:41:11PM +0200, Daniel Lezcano wrote: > On 02/08/2017 15:07, Steven Rostedt wrote: > > On Wed, 2 Aug 2017 14:42:39 +0200 > > Daniel Lezcano wrote: > > > >> On Tue, Aug 01, 2017 at 08:12:14PM -0400, Steven Rostedt wrote: > >>> On Wed, 2 Aug 2017 00:15:44 +0200 > >>> Danie

Re: RCU stall when using function_graph

2017-08-03 Thread Daniel Lezcano
On 02/08/2017 15:07, Steven Rostedt wrote: > On Wed, 2 Aug 2017 14:42:39 +0200 > Daniel Lezcano wrote: > >> On Tue, Aug 01, 2017 at 08:12:14PM -0400, Steven Rostedt wrote: >>> On Wed, 2 Aug 2017 00:15:44 +0200 >>> Daniel Lezcano wrote: >>> On 02/08/2017 00:04, Paul E. McKenney wrote: >

Re: RCU stall when using function_graph

2017-08-02 Thread Paul E. McKenney
On Wed, Aug 02, 2017 at 09:07:44AM -0400, Steven Rostedt wrote: > On Wed, 2 Aug 2017 14:42:39 +0200 > Daniel Lezcano wrote: > > > On Tue, Aug 01, 2017 at 08:12:14PM -0400, Steven Rostedt wrote: > > > On Wed, 2 Aug 2017 00:15:44 +0200 > > > Daniel Lezcano wrote: > > > > > > > On 02/08/2017 00:

Re: RCU stall when using function_graph

2017-08-02 Thread Paul E. McKenney
On Wed, Aug 02, 2017 at 02:42:39PM +0200, Daniel Lezcano wrote: > On Tue, Aug 01, 2017 at 08:12:14PM -0400, Steven Rostedt wrote: > > On Wed, 2 Aug 2017 00:15:44 +0200 > > Daniel Lezcano wrote: > > > > > On 02/08/2017 00:04, Paul E. McKenney wrote: > > > >> Hi Paul, > > > >> > > > >> I have been

Re: RCU stall when using function_graph

2017-08-02 Thread Steven Rostedt
On Wed, 2 Aug 2017 14:42:39 +0200 Daniel Lezcano wrote: > On Tue, Aug 01, 2017 at 08:12:14PM -0400, Steven Rostedt wrote: > > On Wed, 2 Aug 2017 00:15:44 +0200 > > Daniel Lezcano wrote: > > > > > On 02/08/2017 00:04, Paul E. McKenney wrote: > > > >> Hi Paul, > > > >> > > > >> I have been tr

Re: RCU stall when using function_graph

2017-08-02 Thread Paul E. McKenney
On Tue, Aug 01, 2017 at 03:04:05PM -0700, Paul E. McKenney wrote: > > Hi Paul, > > > > I have been trying to set the function_graph tracer for ftrace and each > > time I > > get a CPU stall. > > > > How to reproduce: > > - > > > > echo function_graph > /sys/kernel/d

Re: RCU stall when using function_graph

2017-08-02 Thread Daniel Lezcano
On Tue, Aug 01, 2017 at 08:12:14PM -0400, Steven Rostedt wrote: > On Wed, 2 Aug 2017 00:15:44 +0200 > Daniel Lezcano wrote: > > > On 02/08/2017 00:04, Paul E. McKenney wrote: > > >> Hi Paul, > > >> > > >> I have been trying to set the function_graph tracer for ftrace and each > > >> time I > > >

Re: RCU stall when using function_graph

2017-08-01 Thread Steven Rostedt
On Wed, 2 Aug 2017 00:15:44 +0200 Daniel Lezcano wrote: > On 02/08/2017 00:04, Paul E. McKenney wrote: > >> Hi Paul, > >> > >> I have been trying to set the function_graph tracer for ftrace and each > >> time I > >> get a CPU stall. > >> > >> How to reproduce: > >> - > >> > >>

Re: RCU stall when using function_graph

2017-08-01 Thread Daniel Lezcano
On 02/08/2017 00:04, Paul E. McKenney wrote: >> Hi Paul, >> >> I have been trying to set the function_graph tracer for ftrace and each time >> I >> get a CPU stall. >> >> How to reproduce: >> - >> >> echo function_graph > /sys/kernel/debug/tracing/current_tracer >> >>

Re: RCU stall when using function_graph

2017-08-01 Thread Paul E. McKenney
> Hi Paul, > > I have been trying to set the function_graph tracer for ftrace and each time I > get a CPU stall. > > How to reproduce: > - > >echo function_graph > /sys/kernel/debug/tracing/current_tracer > > This error appears with v4.13-rc3 and v4.12-rc6. > >

Re: RCU stall warnings...

2017-07-24 Thread Stephen Rothwell
Hi Dave, On Mon, 24 Jul 2017 16:34:58 -0700 (PDT) David Miller wrote: > > Showing my ignorance as well, after reading this, for some reason this > commit below sticks out to me. Maybe I should do a bisect and see if > it lands on this commit. > > That would take a while as it's hard to forcibly

Re: RCU stall warnings...

2017-07-24 Thread Paul E. McKenney
On Mon, Jul 24, 2017 at 04:49:27PM -0700, Paul E. McKenney wrote: > On Mon, Jul 24, 2017 at 04:34:58PM -0700, David Miller wrote: > > From: "Paul E. McKenney" > > Date: Mon, 24 Jul 2017 16:20:33 -0700 [ . . . ] > > That would take a while as it's hard to forcibly set this thing off. > > And my

Re: RCU stall warnings...

2017-07-24 Thread Paul E. McKenney
On Mon, Jul 24, 2017 at 04:34:58PM -0700, David Miller wrote: > From: "Paul E. McKenney" > Date: Mon, 24 Jul 2017 16:20:33 -0700 > > > It looks like the system isn't letting the rcu_sched grace-period kthread > > run: > > > > [402138.240512] rcu_sched kthread starved for 2757 jiffies! g53669 c53

Re: RCU stall warnings...

2017-07-24 Thread David Miller
From: "Paul E. McKenney" Date: Mon, 24 Jul 2017 16:20:33 -0700 > It looks like the system isn't letting the rcu_sched grace-period kthread > run: > > [402138.240512] rcu_sched kthread starved for 2757 jiffies! g53669 c53668 > f0x0 RCU_GP_WAIT_FQS(3) ->state=0x1 > > This kthread tried to wait f

Re: RCU stall warnings...

2017-07-24 Thread Paul E. McKenney
On Mon, Jul 24, 2017 at 03:32:48PM -0700, David Miller wrote: > > Paul and other RCU experts, > > Starting with 4.13-rc1 we're getting RCU stall dumps on sparc64, they > definitely didn't happen in 4.12 > > I tried to look for low hanging fruit in the kernel/rcu/ changes this > merge window, but

Re: RCU stall

2016-03-24 Thread Paul E. McKenney
On Thu, Mar 24, 2016 at 01:24:02PM -0700, Bart Van Assche wrote: > On 03/22/2016 07:29 PM, Paul E. McKenney wrote: > >Note that a soft lockup triggered at 10509.568010, well before the RCU > >CPU stall warning. And you have a second soft lockup at 10537.567212, > >with the same function scsi_reque

Re: RCU stall

2016-03-24 Thread Bart Van Assche
On 03/22/2016 07:29 PM, Paul E. McKenney wrote: Note that a soft lockup triggered at 10509.568010, well before the RCU CPU stall warning. And you have a second soft lockup at 10537.567212, with the same function scsi_request_fn() at the top of the stack in both stack traces. That function has a

Re: RCU stall

2016-03-22 Thread Paul E. McKenney
On Tue, Mar 22, 2016 at 06:59:32PM -0700, Paul E. McKenney wrote: > On Tue, Mar 22, 2016 at 04:53:26PM -0700, Bart Van Assche wrote: > > On 03/22/2016 01:45 PM, Paul E. McKenney wrote: > > >You are getting a soft lockup as well as an RCU CPU stall warning, so > > >it looks like something is taking

Re: RCU stall

2016-03-22 Thread Paul E. McKenney
On Tue, Mar 22, 2016 at 04:53:26PM -0700, Bart Van Assche wrote: > On 03/22/2016 01:45 PM, Paul E. McKenney wrote: > >You are getting a soft lockup as well as an RCU CPU stall warning, so > >it looks like something is taking a very long time in blk_done_softirq(). > > > >You have multiple occurrenc

Re: RCU stall

2016-03-22 Thread Bart Van Assche
On 03/22/2016 01:45 PM, Paul E. McKenney wrote: You are getting a soft lockup as well as an RCU CPU stall warning, so it looks like something is taking a very long time in blk_done_softirq(). You have multiple occurrences at different times, so it looks to be a long time as opposed to an infinit

Re: RCU stall and the system boot hang with nfsroot

2016-01-05 Thread Paul E. McKenney
On Tue, Jan 05, 2016 at 03:57:54PM +0800, Aaron Ma wrote: > On Tue, Jan 5, 2016 at 5:18 AM, Paul E. McKenney > wrote: > > On Mon, Jan 04, 2016 at 06:01:37PM +0800, Aaron Ma wrote: > >> On Fri, Jan 1, 2016 at 3:49 AM, Paul E. McKenney > >> wrote: > >> > On Wed, Dec 30, 2015 at 09:41:45AM -0800, Pa

Re: RCU stall and the system boot hang with nfsroot

2016-01-04 Thread Aaron Ma
On Tue, Jan 5, 2016 at 5:18 AM, Paul E. McKenney wrote: > On Mon, Jan 04, 2016 at 06:01:37PM +0800, Aaron Ma wrote: >> On Fri, Jan 1, 2016 at 3:49 AM, Paul E. McKenney >> wrote: >> > On Wed, Dec 30, 2015 at 09:41:45AM -0800, Paul E. McKenney wrote: >> >> On Wed, Dec 30, 2015 at 03:03:33PM +0800,

Re: RCU stall and the system boot hang with nfsroot

2016-01-04 Thread Paul E. McKenney
On Mon, Jan 04, 2016 at 06:01:37PM +0800, Aaron Ma wrote: > On Fri, Jan 1, 2016 at 3:49 AM, Paul E. McKenney > wrote: > > On Wed, Dec 30, 2015 at 09:41:45AM -0800, Paul E. McKenney wrote: > >> On Wed, Dec 30, 2015 at 03:03:33PM +0800, Aaron Ma wrote: > >> > On Wed, Dec 30, 2015 at 7:42 AM, Paul E.

Re: RCU stall and the system boot hang with nfsroot

2016-01-04 Thread Aaron Ma
On Fri, Jan 1, 2016 at 3:49 AM, Paul E. McKenney wrote: > On Wed, Dec 30, 2015 at 09:41:45AM -0800, Paul E. McKenney wrote: >> On Wed, Dec 30, 2015 at 03:03:33PM +0800, Aaron Ma wrote: >> > On Wed, Dec 30, 2015 at 7:42 AM, Paul E. McKenney >> > wrote: > > [ . . . ] > >> > cfg80211: Calling CRDA t

Re: RCU stall and the system boot hang with nfsroot

2015-12-31 Thread Paul E. McKenney
On Wed, Dec 30, 2015 at 09:41:45AM -0800, Paul E. McKenney wrote: > On Wed, Dec 30, 2015 at 03:03:33PM +0800, Aaron Ma wrote: > > On Wed, Dec 30, 2015 at 7:42 AM, Paul E. McKenney > > wrote: [ . . . ] > > cfg80211: Calling CRDA to update world regulatory domain > > cfg80211: Calling CRDA to upda

Re: RCU stall and the system boot hang with nfsroot

2015-12-30 Thread Paul E. McKenney
On Wed, Dec 30, 2015 at 03:03:33PM +0800, Aaron Ma wrote: > On Wed, Dec 30, 2015 at 7:42 AM, Paul E. McKenney > wrote: > > On Tue, Dec 29, 2015 at 05:34:38PM +0800, Aaron Ma wrote: > >> Add paul...@linux.vnet.ibm.com > >> > >> On Tue, Dec 29, 2015 at 5:32 PM, Aaron Ma wrote: > >> > Hi, Paul: > >>

Re: RCU stall and the system boot hang with nfsroot

2015-12-29 Thread Aaron Ma
On Wed, Dec 30, 2015 at 7:42 AM, Paul E. McKenney wrote: > On Tue, Dec 29, 2015 at 05:34:38PM +0800, Aaron Ma wrote: >> Add paul...@linux.vnet.ibm.com >> >> On Tue, Dec 29, 2015 at 5:32 PM, Aaron Ma wrote: >> > Hi, Paul: >> > I found the linux-stable-4.1.15 with rt15 patches boot hang sometimes.

Re: RCU stall and the system boot hang with nfsroot

2015-12-29 Thread Paul E. McKenney
On Tue, Dec 29, 2015 at 05:34:38PM +0800, Aaron Ma wrote: > Add paul...@linux.vnet.ibm.com > > On Tue, Dec 29, 2015 at 5:32 PM, Aaron Ma wrote: > > Hi, Paul: > > I found the linux-stable-4.1.15 with rt15 patches boot hang sometimes. > > Hardware is Grantley-EP and WildcatPass. I must confess tha

Re: RCU stall and the system boot hang with nfsroot

2015-12-29 Thread Aaron Ma
Add paul...@linux.vnet.ibm.com On Tue, Dec 29, 2015 at 5:32 PM, Aaron Ma wrote: > Hi, Paul: > I found the linux-stable-4.1.15 with rt15 patches boot hang sometimes. > Hardware is Grantley-EP and WildcatPass. > No response by sysrq. > > Did you find any issue about this? Or how can I address this

Re: RCU stall and the system boot hang

2015-12-01 Thread Paul E. McKenney
On Mon, Nov 30, 2015 at 09:19:18AM -0800, Paul E. McKenney wrote: > On Mon, Nov 30, 2015 at 02:54:13PM +0800, fupan li wrote: [ . . . ] > > No, just a normal boot, and these stalls were happened before > > systemd services running. > > Interesting. My testing show v4.1 being OK, with the first

Re: RCU stall and the system boot hang

2015-11-30 Thread Paul E. McKenney
On Mon, Nov 30, 2015 at 02:54:13PM +0800, fupan li wrote: > 2015-11-29 14:05 GMT+08:00 Paul E. McKenney : > > > On Sun, Nov 29, 2015 at 12:46:10PM +0800, fupan li wrote: > > > 2015-11-28 22:53 GMT+08:00 Paul E. McKenney > >: > > > > > > > On Sat, Nov 28, 2015 at 01:05:52PM +0800, fupan li wrote:

Re: RCU stall and the system boot hang

2015-11-28 Thread Paul E. McKenney
On Sun, Nov 29, 2015 at 12:46:10PM +0800, fupan li wrote: > 2015-11-28 22:53 GMT+08:00 Paul E. McKenney : > > > On Sat, Nov 28, 2015 at 01:05:52PM +0800, fupan li wrote: > > > 2015-11-28 0:28 GMT+08:00 Paul E. McKenney : > > > > > > > On Fri, Nov 27, 2015 at 08:23:24PM +0800, fupan li wrote: > > >

Re: RCU stall and the system boot hang

2015-11-28 Thread Paul E. McKenney
On Sat, Nov 28, 2015 at 01:05:52PM +0800, fupan li wrote: > 2015-11-28 0:28 GMT+08:00 Paul E. McKenney : > > > On Fri, Nov 27, 2015 at 08:23:24PM +0800, fupan li wrote: > > > Hi, Paul > > > > > > On my Wildcat_Pass (Haswell) board, the system boot will hang as below. > > > The kernel is preempt-rt

Re: RCU stall and the system boot hang

2015-11-27 Thread Paul E. McKenney
On Fri, Nov 27, 2015 at 08:23:24PM +0800, fupan li wrote: > Hi, Paul > > On my Wildcat_Pass (Haswell) board, the system boot will hang as below. > The kernel is preempt-rt kernel. > But it was not reproduced every time, about 1 time per 5-10 boots. CCing LKML and linux-rt-users in case anyone els

Re: RCU stall in af_unix.c, should use spin_lock_irqsave?

2014-10-21 Thread Thomas Petazzoni
Dear Eric Dumazet, On Tue, 21 Oct 2014 03:28:20 -0700, Eric Dumazet wrote: > > Ok. So it's actually safe to mix spin_lock() and spin_lock_irqsave() on > > the same lock, if you know that this lock will never ever be taken in > > an interrupt context? > > Sure. Ok, thanks. > > > mvpp2 is seriou

Re: RCU stall in af_unix.c, should use spin_lock_irqsave?

2014-10-21 Thread Eric Dumazet
On Tue, 2014-10-21 at 12:10 +0200, Thomas Petazzoni wrote: > Ok. So it's actually safe to mix spin_lock() and spin_lock_irqsave() on > the same lock, if you know that this lock will never ever be taken in > an interrupt context? Sure. > > > mvpp2 is seriously brain damaged : on_each_cpu() canno

Re: RCU stall in af_unix.c, should use spin_lock_irqsave?

2014-10-21 Thread Thomas Petazzoni
Dear Hannes Frederic Sowa, On Tue, 21 Oct 2014 12:08:52 +0200, Hannes Frederic Sowa wrote: > On Di, 2014-10-21 at 10:03 +0200, Thomas Petazzoni wrote: > > So, the question is: is this patch the correct solution (but then other > > usage of spin_lock in af_unix.c might also need fixing) ? Or is the

Re: RCU stall in af_unix.c, should use spin_lock_irqsave?

2014-10-21 Thread Thomas Petazzoni
Dear Eric Dumazet, On Tue, 21 Oct 2014 03:04:34 -0700, Eric Dumazet wrote: > > So, the question is: is this patch the correct solution (but then other > > usage of spin_lock in af_unix.c might also need fixing) ? Or is the > > network driver at fault? > > > > Thanks for your input, > > > > Thom

Re: RCU stall in af_unix.c, should use spin_lock_irqsave?

2014-10-21 Thread Hannes Frederic Sowa
On Di, 2014-10-21 at 10:03 +0200, Thomas Petazzoni wrote: > So, the question is: is this patch the correct solution (but then other > usage of spin_lock in af_unix.c might also need fixing) ? Or is the > network driver at fault? It feels like a false positive. Do you see one core spinning tightly

Re: RCU stall in af_unix.c, should use spin_lock_irqsave?

2014-10-21 Thread Eric Dumazet
On Tue, 2014-10-21 at 10:03 +0200, Thomas Petazzoni wrote: > Hello, > > I stumbled across a reproducible RCU stall related to the AF_UNIX code > (on 3.17, on an ARM SMP system), and I'm not sure whether the problem > is caused by: > > * The af_unix.c code using spin_lock() on sk->sk_receive_queu

Re: rcu stall warning, again.

2013-08-07 Thread Paul E. McKenney
On Wed, Aug 07, 2013 at 01:05:11AM -0400, Dave Jones wrote: > Still seeing these (though not as frequently) > > > INFO: rcu_preempt self-detected stall on CPU { 2} (t=6500 jiffies g=4433279 > c=4433278 q=0) > sending NMI to all CPUs: > NMI backtrace for cpu 0 > CPU: 0 PID: 0 Comm: swapper/0 Not