Re: Filesystem lockup with CONFIG_PREEMPT_RT

2014-07-05 Thread Austin Schuh
On Sat, Jul 5, 2014 at 1:26 PM, Thomas Gleixner wrote: > On Mon, 30 Jun 2014, Austin Schuh wrote: >> I think I might have an answer for my own question, but I would >> appreciate someone else to double check. If list_empty erroneously >> returns that there is work to do when there isn't work to d

Re: Filesystem lockup with CONFIG_PREEMPT_RT

2014-07-05 Thread Thomas Gleixner
On Wed, 21 May 2014, John Blackwood wrote: > I'm not 100% sure that the patch below will fix your problem, but we > saw something that sounds pretty familiar to your issue involving the > nvidia driver and the preempt-rt patch. The nvidia driver uses the > completion support to create their own dr

Re: Filesystem lockup with CONFIG_PREEMPT_RT

2014-07-05 Thread Thomas Gleixner
On Mon, 30 Jun 2014, Austin Schuh wrote: > I think I might have an answer for my own question, but I would > appreciate someone else to double check. If list_empty erroneously > returns that there is work to do when there isn't work to do, we wake > up an extra worker which then goes back to sleep

Re: Filesystem lockup with CONFIG_PREEMPT_RT

2014-07-03 Thread Mike Galbraith
On Thu, 2014-07-03 at 16:08 -0700, Austin Schuh wrote: > On Tue, Jul 1, 2014 at 12:32 PM, Austin Schuh wrote: > > On Mon, Jun 30, 2014 at 8:01 PM, Austin Schuh > > wrote: > >> On Fri, Jun 27, 2014 at 7:24 AM, Thomas Gleixner > >> wrote: > >>> Completely untested patch below. > > I've tested

Re: Filesystem lockup with CONFIG_PREEMPT_RT

2014-07-03 Thread Austin Schuh
On Tue, Jul 1, 2014 at 12:32 PM, Austin Schuh wrote: > On Mon, Jun 30, 2014 at 8:01 PM, Austin Schuh wrote: >> On Fri, Jun 27, 2014 at 7:24 AM, Thomas Gleixner wrote: >>> Completely untested patch below. I've tested it and looked it over now, and feel pretty confident in the patch. Thanks Thom

Re: Filesystem lockup with CONFIG_PREEMPT_RT

2014-07-01 Thread Austin Schuh
On Mon, Jun 30, 2014 at 8:01 PM, Austin Schuh wrote: > On Fri, Jun 27, 2014 at 7:24 AM, Thomas Gleixner wrote: >> Completely untested patch below. > > By chance, I found this in my boot logs. I'll do some more startup > testing tomorrow. > > Jun 30 19:54:40 vpc5 kernel: [0.670955] --

Re: Filesystem lockup with CONFIG_PREEMPT_RT

2014-06-30 Thread Austin Schuh
On Fri, Jun 27, 2014 at 7:24 AM, Thomas Gleixner wrote: > Completely untested patch below. By chance, I found this in my boot logs. I'll do some more startup testing tomorrow. Jun 30 19:54:40 vpc5 kernel: [0.670955] ------------[ cut here ]------------ Jun 30 19:54:40 vpc5 kernel: [0.67

Re: Filesystem lockup with CONFIG_PREEMPT_RT

2014-06-30 Thread Austin Schuh
On Mon, Jun 30, 2014 at 5:12 PM, Austin Schuh wrote: > On Fri, Jun 27, 2014 at 7:24 AM, Thomas Gleixner wrote: >> On Thu, 26 Jun 2014, Austin Schuh wrote: >>> If I'm reading the rt patch correctly, wq_worker_sleeping was moved >>> out of __schedule to sched_submit_work. It looks like that change

Re: Filesystem lockup with CONFIG_PREEMPT_RT

2014-06-30 Thread Austin Schuh
On Fri, Jun 27, 2014 at 7:24 AM, Thomas Gleixner wrote: > On Thu, 26 Jun 2014, Austin Schuh wrote: >> If I'm reading the rt patch correctly, wq_worker_sleeping was moved >> out of __schedule to sched_submit_work. It looks like that changes >> the conditions under which wq_worker_sleeping is calle

Re: Filesystem lockup with CONFIG_PREEMPT_RT

2014-06-28 Thread Mike Galbraith
On Fri, 2014-06-27 at 23:20 -0700, Austin Schuh wrote: > For workqueues, as long as the helper doesn't block on a lock which > requires the work queue to be freed up, it will eventually become > unblocked and make progress. The helper _should_ only need the pool > lock, which will wake the helper

Re: Filesystem lockup with CONFIG_PREEMPT_RT

2014-06-27 Thread Austin Schuh
On Fri, Jun 27, 2014 at 8:32 PM, Mike Galbraith wrote: > On Fri, 2014-06-27 at 18:18 -0700, Austin Schuh wrote: > >> It would be more context switches, but I wonder if we could kick the >> workqueue logic completely out of the scheduler into a thread. Have >> the scheduler increment/decrement an

Re: Filesystem lockup with CONFIG_PREEMPT_RT

2014-06-27 Thread Mike Galbraith
On Fri, 2014-06-27 at 16:24 +0200, Thomas Gleixner wrote: > Completely untested patch below. It's no longer completely untested, killer_module is no longer a killer. I'll let box (lockdep etc is enabled) chew on it a while, no news is good news as usual. -Mike

Re: Filesystem lockup with CONFIG_PREEMPT_RT

2014-06-27 Thread Mike Galbraith
On Fri, 2014-06-27 at 18:18 -0700, Austin Schuh wrote: > It would be more context switches, but I wonder if we could kick the > workqueue logic completely out of the scheduler into a thread. Have > the scheduler increment/decrement an atomic pool counter, and wake up > the monitoring thread to sp

Re: Filesystem lockup with CONFIG_PREEMPT_RT

2014-06-27 Thread Austin Schuh
On Fri, Jun 27, 2014 at 11:19 AM, Steven Rostedt wrote: > On Fri, 27 Jun 2014 20:07:54 +0200 > Mike Galbraith wrote: > >> > Why do we need the wakeup? the owner of the lock should wake it up >> > shouldn't it? >> >> True, but that can take ages. > > Can it? If the workqueue is of some higher prio

Re: Filesystem lockup with CONFIG_PREEMPT_RT

2014-06-27 Thread Mike Galbraith
On Fri, 2014-06-27 at 14:19 -0400, Steven Rostedt wrote: > On Fri, 27 Jun 2014 20:07:54 +0200 > Mike Galbraith wrote: > > > > Why do we need the wakeup? the owner of the lock should wake it up > > > shouldn't it? > > > > True, but that can take ages. > > Can it? If the workqueue is of some hi

Re: Filesystem lockup with CONFIG_PREEMPT_RT

2014-06-27 Thread Steven Rostedt
On Fri, 27 Jun 2014 20:07:54 +0200 Mike Galbraith wrote: > > Why do we need the wakeup? the owner of the lock should wake it up > > shouldn't it? > > True, but that can take ages. Can it? If the workqueue is of some higher priority, it should boost the process that owns the lock. Otherwise it

Re: Filesystem lockup with CONFIG_PREEMPT_RT

2014-06-27 Thread Mike Galbraith
On Fri, 2014-06-27 at 13:54 -0400, Steven Rostedt wrote: > On Fri, 27 Jun 2014 19:34:53 +0200 > Mike Galbraith wrote: > > > On Fri, 2014-06-27 at 10:01 -0400, Steven Rostedt wrote: > > > > > This seems like a lot of hacks. > > > > It is exactly that, lacking proper pooper-scooper, show rt kern

Re: Filesystem lockup with CONFIG_PREEMPT_RT

2014-06-27 Thread Steven Rostedt
On Fri, 27 Jun 2014 19:34:53 +0200 Mike Galbraith wrote: > On Fri, 2014-06-27 at 10:01 -0400, Steven Rostedt wrote: > > > This seems like a lot of hacks. > > It is exactly that, lacking proper pooper-scooper, show rt kernel how to > not step in it. > > > I'm wondering if it would work if we >

Re: Filesystem lockup with CONFIG_PREEMPT_RT

2014-06-27 Thread Mike Galbraith
On Fri, 2014-06-27 at 10:01 -0400, Steven Rostedt wrote: > This seems like a lot of hacks. It is exactly that, lacking proper pooper-scooper, show rt kernel how to not step in it. > I'm wondering if it would work if we > just have the rt_spin_lock_slowlock not call schedule(), but call > __sched

Re: Filesystem lockup with CONFIG_PREEMPT_RT

2014-06-27 Thread Thomas Gleixner
On Thu, 26 Jun 2014, Austin Schuh wrote: > If I'm reading the rt patch correctly, wq_worker_sleeping was moved > out of __schedule to sched_submit_work. It looks like that changes > the conditions under which wq_worker_sleeping is called. It used to > be called whenever a task was going to sleep

Re: Filesystem lockup with CONFIG_PREEMPT_RT

2014-06-27 Thread Steven Rostedt
On Fri, 27 Jun 2014 14:57:36 +0200 Mike Galbraith wrote: > On Thu, 2014-06-26 at 17:07 -0700, Austin Schuh wrote: > > > I'm not sure where to go from there. Any changes to the workpool to > > try to fix that will be hard, or could affect latency significantly. > > Oh what the hell, I'm out of

Re: Filesystem lockup with CONFIG_PREEMPT_RT

2014-06-27 Thread Mike Galbraith
On Thu, 2014-06-26 at 17:07 -0700, Austin Schuh wrote: > I'm not sure where to go from there. Any changes to the workpool to > try to fix that will be hard, or could affect latency significantly. Oh what the hell, I'm out of frozen shark, may as well stock up. Nobody else has posted spit yet.

Re: Filesystem lockup with CONFIG_PREEMPT_RT

2014-06-26 Thread Mike Galbraith
On Thu, 2014-06-26 at 17:07 -0700, Austin Schuh wrote: > If I'm reading the rt patch correctly, wq_worker_sleeping was moved > out of __schedule to sched_submit_work. It looks like that changes > the conditions under which wq_worker_sleeping is called. It used to > be called whenever a task was

Re: Filesystem lockup with CONFIG_PREEMPT_RT

2014-06-26 Thread Austin Schuh
On Thu, Jun 26, 2014 at 3:35 PM, Thomas Gleixner wrote: > On Thu, 26 Jun 2014, Austin Schuh wrote: >> On Wed, May 21, 2014 at 12:33 AM, Richard Weinberger >> wrote: >> > CC'ing RT folks >> > >> > On Wed, May 21, 2014 at 8:23 AM, Austin Schuh >> > wrote: >> >> On Tue, May 13, 2014 at 7:29 PM, Au

Re: Filesystem lockup with CONFIG_PREEMPT_RT

2014-06-26 Thread Thomas Gleixner
On Thu, 26 Jun 2014, Austin Schuh wrote: > On Wed, May 21, 2014 at 12:33 AM, Richard Weinberger > wrote: > > CC'ing RT folks > > > > On Wed, May 21, 2014 at 8:23 AM, Austin Schuh > > wrote: > >> On Tue, May 13, 2014 at 7:29 PM, Austin Schuh > >> wrote: > >>> Hi, > >>> > >>> I am observing a fi

Re: Filesystem lockup with CONFIG_PREEMPT_RT

2014-06-26 Thread Austin Schuh
On Wed, May 21, 2014 at 12:33 AM, Richard Weinberger wrote: > CC'ing RT folks > > On Wed, May 21, 2014 at 8:23 AM, Austin Schuh wrote: >> On Tue, May 13, 2014 at 7:29 PM, Austin Schuh >> wrote: >>> Hi, >>> >>> I am observing a filesystem lockup with XFS on a CONFIG_PREEMPT_RT >>> patched kernel

Re: Filesystem lockup with CONFIG_PREEMPT_RT

2014-05-21 Thread Austin Schuh
On Wed, May 21, 2014 at 12:30 PM, John Blackwood wrote: >> Date: Wed, 21 May 2014 03:33:49 -0400 >> From: Richard Weinberger >> To: Austin Schuh >> CC: LKML, xfs, rt-users >> >> Subject: Re: Filesystem lockup with CONFIG_PREEMPT_RT > >> >

Re: Filesystem lockup with CONFIG_PREEMPT_RT

2014-05-21 Thread John Blackwood
> Date: Wed, 21 May 2014 03:33:49 -0400 > From: Richard Weinberger > To: Austin Schuh > CC: LKML, xfs, rt-users > > Subject: Re: Filesystem lockup with CONFIG_PREEMPT_RT > > CC'ing RT folks > > On Wed, May 21, 2014 at 8:23 AM, Austin Schuh wrote: >

Re: Filesystem lockup with CONFIG_PREEMPT_RT

2014-05-21 Thread Richard Weinberger
CC'ing RT folks On Wed, May 21, 2014 at 8:23 AM, Austin Schuh wrote: > On Tue, May 13, 2014 at 7:29 PM, Austin Schuh wrote: >> Hi, >> >> I am observing a filesystem lockup with XFS on a CONFIG_PREEMPT_RT >> patched kernel. I have currently only triggered it using dpkg. Dave >> Chinner on the X

Re: Filesystem lockup with CONFIG_PREEMPT_RT

2014-05-20 Thread Austin Schuh
On Tue, May 13, 2014 at 7:29 PM, Austin Schuh wrote: > Hi, > > I am observing a filesystem lockup with XFS on a CONFIG_PREEMPT_RT > patched kernel. I have currently only triggered it using dpkg. Dave > Chinner on the XFS mailing list suggested that it was a rt-kernel > workqueue issue as opposed