This article throws some light onto things: https://lwn.net/Articles/518953/
"Second, the greater the number of idle CPUs, the more work RCU must do when forcing quiescent states. Yes, the busier the system, the less work RCU needs to do! The reason for the extra work is that RCU is not permitted to disturb idle CPUs for energy-efficiency reasons. RCU must therefore probe a per-CPU data structure to read out idleness state during each grace period, likely incurring a cache miss on each such probe." Just to add, I running the VM with say 4 CPUs, all are which are idle. In my experiments on 3.19 and 4.2, kernels, kvm is not being used on the host, so we have QEMU emulating N CPUs with just 1 host CPU. Plus a loaded host means that this single CPU is busy and we have potentially large latencies serving the N virtual CPUs in the VM. I think that's part of the issue; large latencies from the host with a N-to-1 virt to host mapping meaning that we are tripping the RCU grace periods. To try and help RCU kthreads from suffering from delays, I added the following kernel parameters to the VM: rcu_nocb_poll rcutree.kthread_prio=90 rcuperf.verbose=1 I was able to run an 8 CPU VM without any RCU issues with the host CPU being hammered to death with stress-ng. I also then cranked down the RCU stall grace period to just 5 seconds to see how easy I can trip the issue with this more extreme setting using: echo 5 > /sys/module/rcupdate/parameters/rcu_cpu_stall_timeout and again, no RCU issues. @Martin, can you try using the following kernel parameters on the VM and see if this helps: rcu_nocb_poll rcutree.kthread_prio=90 rcuperf.verbose=1 -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. 
https://bugs.launchpad.net/bugs/1531768

Title:
  [arm64] lockups some time after booting

Status in Auto Package Testing:
  Triaged
Status in linux package in Ubuntu:
  Confirmed

Bug description:
  I created an 8 CPU arm64 instance on Canonical's Scalingstack (which I want to use for armhf autopkgtesting in LXD). I started with wily as that has lxd available (it's not yet available in trusty nor the PPA for arm64).

  However, pretty much any LXD task that I do (I haven't tried much else) on this machine takes unbearably long. A simple "lxc profile set default raw.lxc lxc.seccomp=" or "lxc list" takes several minutes.

  I see tons of

  [ 1020.971955] rcu_sched kthread starved for 6000 jiffies! g1095 c1094 f0x0
  [ 1121.166926] INFO: task fsnotify_mark:69 blocked for more than 120 seconds.

  in dmesg (the attached apport info has the complete dmesg).

  ProblemType: Bug
  DistroRelease: Ubuntu 15.10
  Package: linux-image-4.2.0-22-generic 4.2.0-22.27
  ProcVersionSignature: User Name 4.2.0-22.27-generic 4.2.6
  Uname: Linux 4.2.0-22-generic aarch64
  AlsaDevices:
   total 0
   crw-rw---- 1 root audio 116,  1 Jan  7 09:18 seq
   crw-rw---- 1 root audio 116, 33 Jan  7 09:18 timer
  AplayDevices: Error: [Errno 2] No such file or directory: 'aplay'
  ApportVersion: 2.19.1-0ubuntu5
  Architecture: arm64
  ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord'
  AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
  CRDA: N/A
  Date: Thu Jan  7 09:24:01 2016
  IwConfig:
   eth0      no wireless extensions.
   lo        no wireless extensions.
   lxcbr0    no wireless extensions.
  Lspci:
   00:00.0 Host bridge [0600]: Red Hat, Inc. Device [1b36:0008]
    Subsystem: Red Hat, Inc Device [1af4:1100]
    Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
    Status: Cap- 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
  Lsusb: Error: command ['lsusb'] failed with exit code 1: unable to initialize libusb: -99
  PciMultimedia:

  ProcEnviron:
   TERM=screen
   PATH=(custom, no user)
   XDG_RUNTIME_DIR=<set>
   LANG=en_US.UTF-8
   SHELL=/bin/bash
  ProcFB:

  ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-4.2.0-22-generic root=LABEL=cloudimg-rootfs earlyprintk
  RelatedPackageVersions:
   linux-restricted-modules-4.2.0-22-generic N/A
   linux-backports-modules-4.2.0-22-generic  N/A
   linux-firmware                            1.149.3
  RfKill: Error: [Errno 2] No such file or directory: 'rfkill'
  SourcePackage: linux
  UdevLog: Error: [Errno 2] No such file or directory: '/var/log/udev'
  UpgradeStatus: No upgrade log present (probably fresh install)

