On Tue, Feb 09, 2021 at 07:43:14AM +1100, Dave Chinner wrote:
> On Mon, Feb 08, 2021 at 09:28:24AM -0800, Darrick J. Wong wrote:
> > On Mon, Feb 09, 2021 at 09:11:40AM -0800, Paul E. McKenney wrote:
> > > On Mon, Feb 08, 2021 at 10:44:58AM -0500, Brian Foster wrote:
> > > > There was a v2 inline that incorporated some directed feedback.
> > > > Otherwise there were que
saw a self-detected stall on a CPU (October 27th, 2020, January 18th,
2021).

Both times, the workqueue is `xfs-conv/md0 xfs_end_io`.

```
[    0.000000] Linux version 5.4.57.mx64.340
(r...@theinternet.molgen.mpg.de) (gcc version 7.5.0 (GCC)) #1 SMP Tue Aug 11
13:20:33 CEST 2020
[…]
[48962.981257] rcu: INFO: rcu_sched self-detected stall on CPU
[48962.987511] rcu: 4-: (20999 ticks this GP)
idle=fe6/1/0x4002 softirq=3630188/3630188 fqs=4696
[48962.998805]
```
that point I would need to defer to the tracing folks.
Thanx, Paul
/SQySbShzDnHK3CzpR1T7GA/kernel.config

[ 99.868127] rcu: INFO: rcu_sched self-detected stall on CPU
[ 99.868127] rcu: 3-: (1 GPs behind)
idle=d66/1/0x4000 softirq=2573/2600 fqs=3631
[ 99.868127] (t=21003 jiffies g=2909 q=4480)
[ 99.868127] NMI backtrace for cpu 3
[ 99.868127
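The `t=21003 jiffies` above is just past the default RCU CPU stall timeout
of 21 seconds at HZ=1000, which is why the detector fired at that point. As
a generic aside (not from this thread; the paths are the standard rcupdate
module-parameter interface), the window can be inspected and temporarily
widened while debugging, to separate a slow-but-progressing CPU from a
truly wedged one:

```
# Default stall timeout is 21 (seconds); a report fires after roughly
# timeout * HZ jiffies with no quiescent state from the CPU.
cat /sys/module/rcupdate/parameters/rcu_cpu_stall_timeout

# Temporarily widen the window while collecting traces:
echo 60 > /sys/module/rcupdate/parameters/rcu_cpu_stall_timeout
```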
On Wed, Jan 04, 2017 at 09:59:13PM +0100, Enrico Mioso wrote:
> Here is my .config: I send it 'cause I wasn't able to determine if I selected
> the right options.
> Sorry for this long config: I don't know how to represent those infos more
> efficiently.

You have RCU tracing configured, which is reassuring. Maybe fewer rather
than more bugs in play here.

Thanx, Paul
Here is a new trace in the meanwhile: reporting it in case it proves useful.
Thank you very much for your help and patience.
Enrico
[34839.019680] INFO: rcu_sched self-detected stall on CPU
[34839.019694] INFO: rcu_sched detected stalls on CPUs/tasks:
[34839.019711] 0-...: (1 GPs behind) idle
Here is my .config: I send it 'cause I wasn't able to determine if I selected
the right options.
Sorry for this long config: I don't know how to represent those infos more
efficiently.
I would be very glad if you could send me some hints on how to perform ftracing
the right way. From past work
On Wed, Jan 04, 2017 at 09:16:31AM -0500, Steven Rostedt wrote:
> On Wed, 4 Jan 2017 05:46:08 -0800
> "Paul E. McKenney" wrote:
>
> > I suggest enabling tracing for timers with a goal of working out why
> > the rcu_sched task is not being awakened regularly -- during a grace
> > period, it should
Thank you very much guys. Can you give me some guidance on how to actually use
ftrace? I'll recompile the kernel as needed and read the ftrace docs, but I
would like to be sure to be able to produce the needed data.
Thank you very much,
Enrico
On Wed, 4 Jan 2017 05:46:08 -0800
"Paul E. McKenney" wrote:
> I suggest enabling tracing for timers with a goal of working out why
> the rcu_sched task is not being awakened regularly -- during a grace
> period, it should be awakened every three jiffies or so (depending on
> the value of HZ).
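A minimal sketch of that tracing setup (assuming tracefs is mounted at the
usual /sys/kernel/debug/tracing; the exact event selection is one reading of
the suggestion above, not a recipe from this thread):

```
cd /sys/kernel/debug/tracing

# Timer events plus wakeups: enough to see whether the rcu_sched
# kthread's timer fires and whether a wakeup follows it.
echo 1 > events/timer/enable
echo 1 > events/sched/sched_wakeup/enable
echo 1 > tracing_on

# ... reproduce or wait for the stall, then save the ring buffer:
cat trace > /tmp/rcu-timer-trace.txt
echo 0 > tracing_on
```

Filtering the resulting trace for `rcu_sched` should show whether the
expected every-few-jiffies wakeups stop arriving before the stall message
appears.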
> [27749.836023] cpuidle_enter+0xf/0x20
> [27749.836023] call_cpuidle+0x1c/0x30
> [27749.836023] do_idle+0xca/0x1a0
> [27749.836023] cpu_startup_entry+0x65/0x70
> [27749.836023] rest_init+0x5d/0x60
> [27749.836023] start_kernel+0x313/0x329
> [27749.836023] i386_start_kernel+0
On Wed, May 04, 2016 at 01:11:46AM +1000, Steven Haigh wrote:
> On 03/05/16 06:54, gre...@linuxfoundation.org wrote:
> > On Wed, Mar 30, 2016 at 05:04:28AM +1100, Steven Haigh wrote:
> > > Greg, please see below - this is probably more for you...

On 30/03/2016 1:14 AM, Boris Ostrovsky wrote:
> On 03/29/2016 04:56 AM, Steven Haigh wrote:
>>
>> Interestingly enough, this just happened again - but on a different
>> virtual machine. I'm starting to wonder if this may have something to do
>> with the uptime of the machine - as the system that this seems to happen
>> to is always different. Destroying it
On 26/03/2016 8:07 AM, Steven Haigh wrote:
> On 26/03/2016 3:20 AM, Boris Ostrovsky wrote:
>> On 03/25/2016 12:04 PM, Steven Haigh wrote:
>>> It may not actually be the full logs. Once the system gets really upset,
>>> you can't run anything - as such, grabbing anything from dmesg is not
>>> possible.
>>>
>>> The logs provided above are all that gets spat out to the syslog server.
>>> I'll try tinkering with a
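One generic way to get logs out of a guest that is too far gone to run
commands (an aside, not something proposed in this thread; the addresses
and MAC below are placeholders) is netconsole, which sends printk output
over UDP from the kernel itself and so keeps working after userspace
wedges:

```
# netconsole=[src-port]@[src-ip]/[dev],[dst-port]@[dst-ip]/[dst-mac]
modprobe netconsole netconsole=6665@192.168.1.50/eth0,6666@192.168.1.1/00:11:22:33:44:55

# On the receiving host (BSD netcat syntax; GNU netcat wants -p 6666):
nc -u -l 6666 | tee vm-console.log
```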
and
eventually processes start segfaulting and dying. The only fix to
recover the system is to use 'xl destroy' to force-kill the VM and to
start it again.

The majority of these issues seem to mention ext4 in the trace. This may
indicate an issue there - or may be a red herring.

The gritty details:
INFO: rcu_sched self-detected stall on CPU
	0-...: (20999 ticks this GP) idle=327/141/0
	softirq=1101493/1101493 fqs=6973
	(t=21000 jiffies g=827095 c=827094 q=524)
Task dump for CPU 0:
rsync           R  running task
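When a guest is in this state but a console is still attached, the stock
sysrq triggers can often still capture CPU and task state before resorting
to 'xl destroy' (a generic debugging aside, not advice given in this
thread):

```
# Make sure sysrq is enabled, then dump backtraces of all active CPUs
# and the stacks of all blocked (D-state) tasks into the kernel log:
echo 1 > /proc/sys/kernel/sysrq
echo l > /proc/sysrq-trigger
echo w > /proc/sysrq-trigger
```

From dom0 the same keys can be injected with `xl sysrq <domain> l` even
when the guest console no longer accepts input.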
On 10/04/2014 06:06 AM, Chuck Ebbert wrote:
> On Fri, 03 Oct 2014 23:27:58 -0400
> Waiman Long wrote:
>> On 10/03/2014 09:33 AM, Fengguang Wu wrote:
>>> Hi Waiman,
>>> FYI, we noticed the below changes on commit
>>> bd01ec1a13f9a327950c8e3080096446c7804753 ("x86, locking/rwlocks:
>>> Enable qrwlocks on x86")
run: /lkp/lkp/src/monitors/wrapper sched_debug {"interval"=>"10"}
run: /usr/bin/time -v -o /lkp/lkp/src/tmp/time /lkp/lkp/src/tests/wrapper fsmark
    {"filesize"=>"9B", "test_size"=>"400M", "sync_method"=>"fsyncBeforeClose",
     "nr_directories"=>"16d", "nr_files_per_
e task will be stopped
> on the source cpu and wait for the destination cpu to come up. That hurts
> performance. Letting the destination cpu do active balance will give the task
>
>
> <3>[ 614.504149] INFO: rcu_sched self-detected stall on CPU { 3}
> (t=17 jiffies g=1455 c=1454 q=878
On Thu, 2014-02-06 at 13:19 +0100, Peter Zijlstra wrote:
> On Thu, Feb 06, 2014 at 12:08:54PM +, Bockholdt Arne wrote:
> > This is on an Intel Rangeley Silvermont Atom 8-core machine running kernel
> > 3.13.1/i386 as KVM host with several KVM guests. Tested with the same
> > configuration on ker
On Thu, Feb 06, 2014 at 12:08:54PM +, Bockholdt Arne wrote:
> Hi all,
>
> I've got the same problem with unpatched vanilla 3.13.x kernel on a KVM
> host. Here's a snippet from the dmesg output :
>
>
> [ 3928.132061] INFO: rcu_sched self-detected stall on
>  arch/x86/include/asm/mwait.h |  2 +-
>  include/linux/preempt.h      | 15 +++
>  include/linux/sched.h        | 15 +++
>  kernel/cpu/idle.c            | 17 ++++++---
>  kernel/sched/core.c          |  3 +--
>  5 files changed, 42 insertions(+), 10 deletions(-)
>
> [ 85.786775] INFO: rcu_sched self-detected stall on CPU { 1} (t=15000
> jif
sts passed
[ 23.340041] INFO: rcu_sched self-detected stall on CPU { 0} (t=2101 jiffies
g=4294967081 c=4294967080 q=41)
[ 23.340041] sending NMI to all CPUs:
[ 23.340041] NMI backtrace for cpu 0
[ 23.340041] CPU: 0 PID: 1 Comm: swapper/0 Not tainted
3.12.0-rc6-01322-gf421436 #414
[ 23.34
On Mon, Apr 29, 2013 at 09:29:52AM -0400, Vivek Goyal wrote:
> On Mon, Apr 29, 2013 at 01:57:18AM -0700, Michel Lespinasse wrote:
> > On Mon, Apr 15, 2013 at 6:27 PM, Hugh Dickins wrote:
> > > On Mon, 15 Apr 2013, Michel Lespinasse wrote:
> > >> sys_brk() passes the length as the difference of two page aligned
> > >> addresses, so it's fine. But vm_brk() doesn't - it calls do_brk() which
> > >> page aligns the length, but then vm_brk passes t
On Mon, 15 Apr 2013, Michel Lespinasse wrote:
> On Mon, Apr 15, 2013 at 2:47 PM, Hugh Dickins wrote:
> > --- 3.9-rc7/mm/mlock.c	2013-04-01 09:08:05.736012852 -0700
> > +++ linux/mm/mlock.c	2013-04-15 14:20:24.454773245 -0700
> > @@ -397,8 +397,7 @@ int __mm_populate(unsigned long start, u
> > 	long ret = 0;
> >
> > 	VM_BUG_ON(start & ~PAGE_MASK);
On Mon, 15 Apr 2013, Vivek Goyal wrote:
> On Mon, Apr 15, 2013 at 01:59:29PM -0400, Vivek Goyal wrote:
> > CCing akpm.
> >
> > Vivek
> >
> > On Mon, Apr 15, 2013 at 01:34:24PM -0400, Vivek Goyal wrote:
> > > On Mon, Apr 15, 2013 at 12:35:52PM -0400, Vivek Goyal wrote:
> > >
> > > [..]
> > > > > My first guess would be that mmap_sem is held during exec, s
2d452c5707fe321208bcbcd
Author: Michel Lespinasse
Date: Fri Feb 22 16:32:37 2013 -0800

    mm: introduce mm_populate() for populating new vmas

I have locked down /sbin/kexec. And I get the following traceback after a while.

Thanks
Vivek

[ 93.130001] INFO: rcu_sched self-detected stall on CPU
[ 93.131007] INFO: rcu_sched detected stalls on CPUs/tasks: { 2} (detected by 3,
t=6
mory at exec() time.

My patches were working fine till 3.9-rc4 and suddenly things broke down
in 3.9-rc5.

Whenever I try to exec() a process with memory locked down, my bash
session hangs and after a while I get the following warning.

login: [ 174.669002] INFO: rcu_sched self-detected stall on CPU { 2} (t=6
jiffies g=2580 c=2579 q=1085)
[ 174.669002] Pid: 4894, comm: kexec Not tainted 3.9.0-rc6+ #243
[ 174.669002] Call Trace:
[ 174.669002] [] rcu_check_callbacks+0x21a/0x760
[ 174.669002] [] ? acct_account_cputime+0x1c