Re: Crashes with 874bbfe600a6 in 3.18.25

2016-02-09 Thread Tejun Heo
Hello, Mike. On Tue, Feb 09, 2016 at 07:02:35PM +0100, Mike Galbraith wrote: > > It doesn't do anything unless the user twiddles the mask to exclude > > certain (think no_hz_full) CPUs, so there are no clueless victims. > > (a plus: testers/robots can twiddle mask to help find bugs, _and_ > nohz_

Re: Crashes with 874bbfe600a6 in 3.18.25

2016-02-09 Thread Linus Torvalds
On Tue, Feb 9, 2016 at 9:51 AM, Tejun Heo wrote: >> >> (a) actually dequeue timers and work queues that are bound to a >> particular CPU when a CPU goes down. >> > This goes the same for work items and timers. If we want to do > explicit dequeueing or flushing of cpu-bound stuff on cpu down, we'

Re: Crashes with 874bbfe600a6 in 3.18.25

2016-02-09 Thread Mike Galbraith
On Tue, 2016-02-09 at 18:56 +0100, Mike Galbraith wrote: > On Tue, 2016-02-09 at 12:54 -0500, Tejun Heo wrote: > > Hello, Mike. > > > > On Tue, Feb 09, 2016 at 06:04:04PM +0100, Mike Galbraith wrote: > > > workqueue: schedule WORK_CPU_UNBOUND work on wq_unbound_cpumask > > > CPUs > > > > > > WORK

Re: Crashes with 874bbfe600a6 in 3.18.25

2016-02-09 Thread Mike Galbraith
On Tue, 2016-02-09 at 12:54 -0500, Tejun Heo wrote: > Hello, Mike. > > On Tue, Feb 09, 2016 at 06:04:04PM +0100, Mike Galbraith wrote: > > workqueue: schedule WORK_CPU_UNBOUND work on wq_unbound_cpumask > > CPUs > > > > WORK_CPU_UNBOUND work items queued to a bound workqueue always run > > locall

Re: Crashes with 874bbfe600a6 in 3.18.25

2016-02-09 Thread Tejun Heo
Hello, Mike. On Tue, Feb 09, 2016 at 06:04:04PM +0100, Mike Galbraith wrote: > workqueue: schedule WORK_CPU_UNBOUND work on wq_unbound_cpumask CPUs > > WORK_CPU_UNBOUND work items queued to a bound workqueue always run > locally. This is a good thing normally, but not when the user has > asked u

Re: Crashes with 874bbfe600a6 in 3.18.25

2016-02-09 Thread Tejun Heo
Hello, On Tue, Feb 09, 2016 at 09:04:18AM -0800, Linus Torvalds wrote: > On Tue, Feb 9, 2016 at 8:50 AM, Tejun Heo wrote: > > idk, not doing so is likely to cause subtle bugs which are difficult > > to track down. The problem with -stable is 874bbfe6 being backported > > without the matching tim

Re: Crashes with 874bbfe600a6 in 3.18.25

2016-02-09 Thread Mike Galbraith
On Tue, 2016-02-09 at 11:50 -0500, Tejun Heo wrote: > Hello, > > On Tue, Feb 09, 2016 at 08:39:15AM -0800, Linus Torvalds wrote: > > > A niggling question remaining is when is it gonna be killed? > > > > It probably should be killed sooner rather than later. > > > > Just document that if you nee

Re: Crashes with 874bbfe600a6 in 3.18.25

2016-02-09 Thread Linus Torvalds
On Tue, Feb 9, 2016 at 8:50 AM, Tejun Heo wrote: > > idk, not doing so is likely to cause subtle bugs which are difficult > to track down. The problem with -stable is 874bbfe6 being backported > without the matching timer fix. Well, according to this thread, even with the timer fix the end resul

Re: Crashes with 874bbfe600a6 in 3.18.25

2016-02-09 Thread Tejun Heo
Hello, On Tue, Feb 09, 2016 at 08:39:15AM -0800, Linus Torvalds wrote: > > A niggling question remaining is when is it gonna be killed? > > It probably should be killed sooner rather than later. > > Just document that if you need something to run on a _particular_ cpu, > you need to use "schedul

Re: Crashes with 874bbfe600a6 in 3.18.25

2016-02-09 Thread Linus Torvalds
On Tue, Feb 9, 2016 at 7:31 AM, Mike Galbraith wrote: > On Fri, 2016-02-05 at 16:06 -0500, Tejun Heo wrote: >> > >> > That 874bbfe6 should die. >> >> Yeah, it's gonna be killed. The commit is there because the behavior >> change broke things. We don't want to guarantee it but have been and >> ca

Re: Crashes with 874bbfe600a6 in 3.18.25

2016-02-09 Thread Mike Galbraith
On Fri, 2016-02-05 at 16:06 -0500, Tejun Heo wrote: > On Fri, Feb 05, 2016 at 09:59:49PM +0100, Mike Galbraith wrote: > > On Fri, 2016-02-05 at 15:54 -0500, Tejun Heo wrote: > > > > > What are you suggesting? > > > > That 874bbfe6 should die. > > Yeah, it's gonna be killed. The commit is there

Re: Crashes with 874bbfe600a6 in 3.18.25

2016-02-06 Thread Mike Galbraith
On Sun, 2016-02-07 at 06:19 +0100, Mike Galbraith wrote: > On Sat, 2016-02-06 at 11:07 -0200, Henrique de Moraes Holschuh wrote: > > On Fri, 05 Feb 2016, Tejun Heo wrote: > > > On Fri, Feb 05, 2016 at 09:59:49PM +0100, Mike Galbraith wrote: > > > > On Fri, 2016-02-05 at 15:54 -0500, Tejun Heo wrote

Re: Crashes with 874bbfe600a6 in 3.18.25

2016-02-06 Thread Mike Galbraith
On Sat, 2016-02-06 at 11:07 -0200, Henrique de Moraes Holschuh wrote: > On Fri, 05 Feb 2016, Tejun Heo wrote: > > On Fri, Feb 05, 2016 at 09:59:49PM +0100, Mike Galbraith wrote: > > > On Fri, 2016-02-05 at 15:54 -0500, Tejun Heo wrote: > > > > > > > What are you suggesting? > > > > > > That 874bb

Re: Crashes with 874bbfe600a6 in 3.18.25

2016-02-06 Thread Henrique de Moraes Holschuh
On Fri, 05 Feb 2016, Tejun Heo wrote: > On Fri, Feb 05, 2016 at 09:59:49PM +0100, Mike Galbraith wrote: > > On Fri, 2016-02-05 at 15:54 -0500, Tejun Heo wrote: > > > > > What are you suggesting? > > > > That 874bbfe6 should die. > > Yeah, it's gonna be killed. The commit is there because the be

Re: Crashes with 874bbfe600a6 in 3.18.25

2016-02-05 Thread Tejun Heo
On Fri, Feb 05, 2016 at 09:59:49PM +0100, Mike Galbraith wrote: > On Fri, 2016-02-05 at 15:54 -0500, Tejun Heo wrote: > > > What are you suggesting? > > That 874bbfe6 should die. Yeah, it's gonna be killed. The commit is there because the behavior change broke things. We don't want to guarante

Re: Crashes with 874bbfe600a6 in 3.18.25

2016-02-05 Thread Mike Galbraith
On Fri, 2016-02-05 at 15:54 -0500, Tejun Heo wrote: > What are you suggesting? That 874bbfe6 should die. -Mike

Re: Crashes with 874bbfe600a6 in 3.18.25

2016-02-05 Thread Tejun Heo
Hello, Mike. On Fri, Feb 05, 2016 at 09:47:11PM +0100, Mike Galbraith wrote: > That very point is what makes it wrong for the workqueue code to ever > target a work item. The instant it does target selection, correctness > may be at stake, it doesn't know, thus it must assume the full onus, > whi

Re: Crashes with 874bbfe600a6 in 3.18.25

2016-02-05 Thread Mike Galbraith
On Fri, 2016-02-05 at 11:49 -0500, Tejun Heo wrote: > Hello, Mike. > > On Thu, Feb 04, 2016 at 03:00:17AM +0100, Mike Galbraith wrote: > > Isn't it the case that, currently at least, each and every spot that > > requires execution on a specific CPU yet does not take active measures > > to deal wit

Re: Crashes with 874bbfe600a6 in 3.18.25

2016-02-05 Thread Tejun Heo
Hello, Mike. On Thu, Feb 04, 2016 at 03:00:17AM +0100, Mike Galbraith wrote: > Isn't it the case that, currently at least, each and every spot that > requires execution on a specific CPU yet does not take active measures > to deal with hotplug events is in fact buggy? The timer code clearly > sta

Re: Crashes with 874bbfe600a6 in 3.18.25

2016-02-05 Thread Mike Galbraith
On Fri, 2016-02-05 at 09:11 +0100, Daniel Bilik wrote: > On Fri, 05 Feb 2016 03:40:46 +0100 > Mike Galbraith wrote: > > IMHO you should restore the CC list and re-post. (If I were the > > maintainer of either the workqueue code or 3.18-stable, I'd be highly > > interested in this finding). > >

Re: Crashes with 874bbfe600a6 in 3.18.25

2016-02-05 Thread Daniel Bilik
On Fri, 05 Feb 2016 03:40:46 +0100 Mike Galbraith wrote: > On Thu, 2016-02-04 at 17:39 +0100, Daniel Bilik wrote: > > On Thu, 4 Feb 2016 12:20:44 +0100 > > Jan Kara wrote: > > > > > Thanks for backport Thomas and to Mike for persistence :). I've > > > asked my friend seeing crashes with 3.18.25

Re: Crashes with 874bbfe600a6 in 3.18.25

2016-02-04 Thread Mike Galbraith
On Wed, 2016-02-03 at 11:24 -0500, Tejun Heo wrote: > On Wed, Feb 03, 2016 at 01:28:56PM +0100, Michal Hocko wrote: > > > The CPU was 168, and that one was offlined in the meantime. So > > > __queue_work fails at: > > > if (!(wq->flags & WQ_UNBOUND)) > > > pwq = per_cpu_ptr(wq->cpu_pwqs, cpu)

Re: Crashes with 874bbfe600a6 in 3.18.25

2016-02-04 Thread Mike Galbraith
On Thu, 2016-02-04 at 17:39 +0100, Daniel Bilik wrote: > On Thu, 4 Feb 2016 12:20:44 +0100 > Jan Kara wrote: > > > Thanks for backport Thomas and to Mike for persistence :). I've asked my > > friend seeing crashes with 3.18.25 to try whether this patch fixes the > > issues. It may take some time

Re: Crashes with 874bbfe600a6 in 3.18.25

2016-02-04 Thread Daniel Bilik
On Thu, 4 Feb 2016 12:20:44 +0100 Jan Kara wrote: > Thanks for backport Thomas and to Mike for persistence :). I've asked my > friend seeing crashes with 3.18.25 to try whether this patch fixes the > issues. It may take some time so stay tuned... Patch tested and it really fixes the crash we wer

Re: Crashes with 874bbfe600a6 in 3.18.25

2016-02-04 Thread Jan Kara
On Thu 04-02-16 11:46:47, Thomas Gleixner wrote: > On Thu, 4 Feb 2016, Mike Galbraith wrote: > > On Wed, 2016-02-03 at 12:06 -0500, Tejun Heo wrote: > > > On Wed, Feb 03, 2016 at 06:01:53PM +0100, Mike Galbraith wrote: > > > > Hm, so it's ok to queue work to an offline CPU? What happens if it > >

Re: Crashes with 874bbfe600a6 in 3.18.25

2016-02-04 Thread Mike Galbraith
On Thu, 2016-02-04 at 11:46 +0100, Thomas Gleixner wrote: > On Thu, 4 Feb 2016, Mike Galbraith wrote: > > I'm also wondering why 22b886dd only applies to kernels >= 4.2. > > > > > > Regardless of the previous CPU a timer was on, add_timer_on() > > currently simply sets timer->flags to the new CP

Re: Crashes with 874bbfe600a6 in 3.18.25

2016-02-04 Thread Thomas Gleixner
On Thu, 4 Feb 2016, Mike Galbraith wrote: > On Wed, 2016-02-03 at 12:06 -0500, Tejun Heo wrote: > > On Wed, Feb 03, 2016 at 06:01:53PM +0100, Mike Galbraith wrote: > > > Hm, so it's ok to queue work to an offline CPU? What happens if it > > > doesn't come back for an eternity or two? > > > > Righ

Re: Crashes with 874bbfe600a6 in 3.18.25

2016-02-04 Thread Mike Galbraith
On Wed, 2016-02-03 at 12:06 -0500, Tejun Heo wrote: > On Wed, Feb 03, 2016 at 06:01:53PM +0100, Mike Galbraith wrote: > > Hm, so it's ok to queue work to an offline CPU? What happens if it > > doesn't come back for an eternity or two? > > Right now, it just loses affinity WRT affinity... So

Re: Crashes with 874bbfe600a6 in 3.18.25

2016-02-03 Thread Michal Hocko
On Thu 04-02-16 07:37:23, Michal Hocko wrote: > On Wed 03-02-16 11:59:01, Tejun Heo wrote: > > On Wed, Feb 03, 2016 at 05:48:52PM +0100, Michal Hocko wrote: > [...] > > > anything and add_timer_on also for WORK_CPU_UNBOUND is really required > > > then we should at least preserve WORK_CPU_UNBOUND i

Re: Crashes with 874bbfe600a6 in 3.18.25

2016-02-03 Thread Michal Hocko
On Wed 03-02-16 11:59:01, Tejun Heo wrote: > On Wed, Feb 03, 2016 at 05:48:52PM +0100, Michal Hocko wrote: [...] > > anything and add_timer_on also for WORK_CPU_UNBOUND is really required > > then we should at least preserve WORK_CPU_UNBOUND in dwork->cpu so that > > __queue_work can actually move

Re: Crashes with 874bbfe600a6 in 3.18.25

2016-02-03 Thread Mike Galbraith
On Wed, 2016-02-03 at 12:06 -0500, Tejun Heo wrote: > On Wed, Feb 03, 2016 at 06:01:53PM +0100, Mike Galbraith wrote: > > Hm, so it's ok to queue work to an offline CPU? What happens if it > > doesn't come back for an eternity or two? > > Right now, it just loses affinity. A more interesting cas

Re: Crashes with 874bbfe600a6 in 3.18.25

2016-02-03 Thread Tejun Heo
On Wed, Feb 03, 2016 at 08:05:57PM +0100, Thomas Gleixner wrote: > > Well, you're in an unnecessary escalation mode as usual. Was the > > attitude really necessary? Chill out and read the thread again. > > Michal is saying the dwork->cpu assignment was bogus and I was > > refuting that. > > Right

Re: Crashes with 874bbfe600a6 in 3.18.25

2016-02-03 Thread Thomas Gleixner
On Wed, 3 Feb 2016, Tejun Heo wrote: > On Wed, Feb 03, 2016 at 07:46:11PM +0100, Thomas Gleixner wrote: > > > > So I think 874bbfe600a6 is really bogus. It should be reverted. We > > > > already have a proper fix for vmstat 176bed1de5bf ("vmstat: explicitly > > > > schedule per-cpu work on the CPU

Re: Crashes with 874bbfe600a6 in 3.18.25

2016-02-03 Thread Tejun Heo
Hello, Thomas. On Wed, Feb 03, 2016 at 07:46:11PM +0100, Thomas Gleixner wrote: > > > So I think 874bbfe600a6 is really bogus. It should be reverted. We > > > already have a proper fix for vmstat 176bed1de5bf ("vmstat: explicitly > > > schedule per-cpu work on the CPU we need it to run on"). This

Re: Crashes with 874bbfe600a6 in 3.18.25

2016-02-03 Thread Thomas Gleixner
On Wed, 3 Feb 2016, Tejun Heo wrote: > On Wed, Feb 03, 2016 at 01:28:56PM +0100, Michal Hocko wrote: > > > The CPU was 168, and that one was offlined in the meantime. So > > > __queue_work fails at: > > > if (!(wq->flags & WQ_UNBOUND)) > > > pwq = per_cpu_ptr(wq->cpu_pwqs, cpu); > > > else

Re: Crashes with 874bbfe600a6 in 3.18.25

2016-02-03 Thread Tejun Heo
On Wed, Feb 03, 2016 at 06:13:15PM +0100, Mike Galbraith wrote: > Ah, and the rest (the vast majority) can then be safely deflected away > from nohz_full cpus. Yeap, it should be possible to bounce majority of work items across CPUs all we want. -- tejun

Re: Crashes with 874bbfe600a6 in 3.18.25

2016-02-03 Thread Mike Galbraith
On Wed, 2016-02-03 at 12:06 -0500, Tejun Heo wrote: > On Wed, Feb 03, 2016 at 06:01:53PM +0100, Mike Galbraith wrote: > > Hm, so it's ok to queue work to an offline CPU? What happens if it > > doesn't come back for an eternity or two? > > Right now, it just loses affinity. A more interesting cas

Re: Crashes with 874bbfe600a6 in 3.18.25

2016-02-03 Thread Tejun Heo
On Wed, Feb 03, 2016 at 05:48:52PM +0100, Michal Hocko wrote: > > So, the proper fix here is keeping cpu <-> node mapping stable across > > cpu on/offlining which has been being worked on for a long time now. > > The patchset is pending and it fixes other issues too. > > What if that node was memor

Re: Crashes with 874bbfe600a6 in 3.18.25

2016-02-03 Thread Tejun Heo
On Wed, Feb 03, 2016 at 06:01:53PM +0100, Mike Galbraith wrote: > Hm, so it's ok to queue work to an offline CPU? What happens if it > doesn't come back for an eternity or two? Right now, it just loses affinity. A more interesting case is a cpu going offline while work items bound to the cpu are

Re: Crashes with 874bbfe600a6 in 3.18.25

2016-02-03 Thread Mike Galbraith
On Wed, 2016-02-03 at 11:24 -0500, Tejun Heo wrote: > On Wed, Feb 03, 2016 at 01:28:56PM +0100, Michal Hocko wrote: > > > The CPU was 168, and that one was offlined in the meantime. So > > > __queue_work fails at: > > > if (!(wq->flags & WQ_UNBOUND)) > > > pwq = per_cpu_ptr(wq->cpu_pwqs, cpu)

Re: Crashes with 874bbfe600a6 in 3.18.25

2016-02-03 Thread Michal Hocko
On Wed 03-02-16 11:24:41, Tejun Heo wrote: > On Wed, Feb 03, 2016 at 01:28:56PM +0100, Michal Hocko wrote: > > > The CPU was 168, and that one was offlined in the meantime. So > > > __queue_work fails at: > > > if (!(wq->flags & WQ_UNBOUND)) > > > pwq = per_cpu_ptr(wq->cpu_pwqs, cpu); > > >

Re: Crashes with 874bbfe600a6 in 3.18.25

2016-02-03 Thread Tejun Heo
On Wed, Feb 03, 2016 at 01:28:56PM +0100, Michal Hocko wrote: > > The CPU was 168, and that one was offlined in the meantime. So > > __queue_work fails at: > > if (!(wq->flags & WQ_UNBOUND)) > > pwq = per_cpu_ptr(wq->cpu_pwqs, cpu); > > else > > pwq = unbound_pwq_by_node(wq, cpu_to_node

Re: Crashes with 874bbfe600a6 in 3.18.25

2016-02-03 Thread Michal Hocko
[I wasn't aware of this email thread before so I am jumping in late] On Wed 03-02-16 10:35:32, Jiri Slaby wrote: > On 01/26/2016, 02:09 PM, Thomas Gleixner wrote: > > On Tue, 26 Jan 2016, Petr Mladek wrote: [...] > >> The commit 874bbfe600a6 ("workqueue: make sure delayed work run in > >> local cp

Re: Crashes with 874bbfe600a6 in 3.18.25

2016-02-03 Thread Thomas Gleixner
On Wed, 3 Feb 2016, Jiri Slaby wrote: > On 01/26/2016, 02:09 PM, Thomas Gleixner wrote: > What happens in later kernels, when the cpu is offlined before the > delayed_work timer ticks? In stable 3.12, with the patch, this scenario > results in an oops: > #5 [8c03fdd63d80] page_fault at ff

Re: Crashes with 874bbfe600a6 in 3.18.25

2016-02-03 Thread Jiri Slaby
On 01/26/2016, 02:09 PM, Thomas Gleixner wrote: > On Tue, 26 Jan 2016, Petr Mladek wrote: >> On Tue 2016-01-26 10:34:00, Jan Kara wrote: >>> On Sat 23-01-16 17:11:54, Thomas Gleixner wrote: On Sat, 23 Jan 2016, Ben Hutchings wrote: > On Fri, 2016-01-22 at 11:09 -0500, Tejun Heo wrote:

Re: Crashes with 874bbfe600a6 in 3.18.25

2016-01-26 Thread Thomas Gleixner
On Tue, 26 Jan 2016, Petr Mladek wrote: > On Tue 2016-01-26 10:34:00, Jan Kara wrote: > > On Sat 23-01-16 17:11:54, Thomas Gleixner wrote: > > > On Sat, 23 Jan 2016, Ben Hutchings wrote: > > > > On Fri, 2016-01-22 at 11:09 -0500, Tejun Heo wrote: > > > > > > Looks like it requires more than trivial

Re: Crashes with 874bbfe600a6 in 3.18.25

2016-01-26 Thread Petr Mladek
On Tue 2016-01-26 10:34:00, Jan Kara wrote: > On Sat 23-01-16 17:11:54, Thomas Gleixner wrote: > > On Sat, 23 Jan 2016, Ben Hutchings wrote: > > > On Fri, 2016-01-22 at 11:09 -0500, Tejun Heo wrote: > > > > > Looks like it requires more than trivial backport (I think). Tejun? > > > > > > > > The t

Re: Crashes with 874bbfe600a6 in 3.18.25

2016-01-26 Thread Thomas Gleixner
On Tue, 26 Jan 2016, Jan Kara wrote: > On Sat 23-01-16 17:11:54, Thomas Gleixner wrote: > > On Sat, 23 Jan 2016, Ben Hutchings wrote: > > > On Fri, 2016-01-22 at 11:09 -0500, Tejun Heo wrote: > > > > > Looks like it requires more than trivial backport (I think). Tejun? > > > > > > > > The timer mi

Re: Crashes with 874bbfe600a6 in 3.18.25

2016-01-26 Thread Jan Kara
On Sat 23-01-16 17:11:54, Thomas Gleixner wrote: > On Sat, 23 Jan 2016, Ben Hutchings wrote: > > On Fri, 2016-01-22 at 11:09 -0500, Tejun Heo wrote: > > > > Looks like it requires more than trivial backport (I think). Tejun? > > > > > > The timer migration has changed quite a bit.  Given that we'v

Re: Crashes with 874bbfe600a6 in 3.18.25

2016-01-23 Thread Thomas Gleixner
On Sat, 23 Jan 2016, Ben Hutchings wrote: > On Fri, 2016-01-22 at 11:09 -0500, Tejun Heo wrote: > > > Looks like it requires more than trivial backport (I think). Tejun? > > > > The timer migration has changed quite a bit.  Given that we've never > > seen vmstat work crashing in 3.18 era, I wonder

Re: Crashes with 874bbfe600a6 in 3.18.25

2016-01-22 Thread Ben Hutchings
On Fri, 2016-01-22 at 11:09 -0500, Tejun Heo wrote: > (cc'ing Thomas) > > On Thu, Jan 21, 2016 at 08:10:20PM -0500, Sasha Levin wrote: > > On 01/21/2016 04:52 AM, Jan Kara wrote: > > > On Wed 20-01-16 13:39:01, Shaohua Li wrote: > > > > On Wed, Jan 20, 2016 at 10:19:26PM +0100, Jan Kara wrote: > >

Re: Crashes with 874bbfe600a6 in 3.18.25

2016-01-22 Thread Tejun Heo
(cc'ing Thomas) On Thu, Jan 21, 2016 at 08:10:20PM -0500, Sasha Levin wrote: > On 01/21/2016 04:52 AM, Jan Kara wrote: > > On Wed 20-01-16 13:39:01, Shaohua Li wrote: > >> On Wed, Jan 20, 2016 at 10:19:26PM +0100, Jan Kara wrote: > >>> Hello, > >>> > >>> a friend of mine started seeing crashes wit

Re: Crashes with 874bbfe600a6 in 3.18.25

2016-01-21 Thread Sasha Levin
On 01/21/2016 04:52 AM, Jan Kara wrote: > On Wed 20-01-16 13:39:01, Shaohua Li wrote: >> On Wed, Jan 20, 2016 at 10:19:26PM +0100, Jan Kara wrote: >>> Hello, >>> >>> a friend of mine started seeing crashes with 3.18.25 kernel - once >>> appropriate load is put on the machine it crashes within minut

Re: Crashes with 874bbfe600a6 in 3.18.25

2016-01-21 Thread Sasha Levin
On 01/21/2016 04:52 AM, Jan Kara wrote: > On Wed 20-01-16 13:39:01, Shaohua Li wrote: >> > On Wed, Jan 20, 2016 at 10:19:26PM +0100, Jan Kara wrote: >>> > > Hello, >>> > > >>> > > a friend of mine started seeing crashes with 3.18.25 kernel - once >>> > > appropriate load is put on the machine it c

Re: Crashes with 874bbfe600a6 in 3.18.25

2016-01-21 Thread Jan Kara
On Wed 20-01-16 13:39:01, Shaohua Li wrote: > On Wed, Jan 20, 2016 at 10:19:26PM +0100, Jan Kara wrote: > > Hello, > > > > a friend of mine started seeing crashes with 3.18.25 kernel - once > > appropriate load is put on the machine it crashes within minutes. He > > tracked down that reverting com

Re: Crashes with 874bbfe600a6 in 3.18.25

2016-01-20 Thread Shaohua Li
On Wed, Jan 20, 2016 at 10:19:26PM +0100, Jan Kara wrote: > Hello, > > a friend of mine started seeing crashes with 3.18.25 kernel - once > appropriate load is put on the machine it crashes within minutes. He > tracked down that reverting commit 874bbfe600a6 (this is the commit ID from > Linus' tr

Crashes with 874bbfe600a6 in 3.18.25

2016-01-20 Thread Jan Kara
Hello, a friend of mine started seeing crashes with 3.18.25 kernel - once appropriate load is put on the machine it crashes within minutes. He tracked down that reverting commit 874bbfe600a6 (this is the commit ID from Linus' tree, in stable tree the commit ID is 1e7af294dd03) "workqueue: make sur