Package: src:linux
Followup-For: Bug #883413

Hi Ben,

Unfortunately I can still reproduce this problem on 4.15-rc8 from experimental. The cmdline for this boot was:

BOOT_IMAGE=/boot/vmlinuz-4.15.0-rc8-amd64 root=/dev/mapper/vg_tarquin-rootfs ro intel_iommu=on vsyscall=emulate scsi_mod.use_blk_mq=Y dm_mod.use_blk_mq=Y intel_pstate=passive i915.disable_display=Y i915.enable_gvt=Y apparmor=0 systemd.unified_cgroup_hierarchy=1 console=ttyS1,115200n8 console=tty0

This triggers with DefaultMemoryAccounting=yes enabled in /etc/systemd/system.conf, and NUT seems to regularly be involved in the crash on my system. Sadly the systemd unit is very simple indeed, and because my UPS is network-connected I'm not even doing dodgy things like USB from within NUT.

Quite how the kernel thinks that nut-server.service is using 16 ZiB of memory is beyond me; presumably this is a slightly negative 64-bit int being cast to unsigned.

The following also feels like a smoking gun:

[ 2982.158622] percpu ref (css_release) <= 0 (-197) after switching to atomic

The kernel log is:

[ 2611.549862] WARNING: CPU: 0 PID: 20830 at /build/linux-b8fmzT/linux-4.15~rc8/mm/page_counter.c:27 page_counter_cancel+0x17/0x20
[ 2611.561360] Modules linked in: binfmt_misc fuse vhost_net vhost tap tun devlink bridge 8021q garp mrp stp llc nls_ascii nls_cp437 vfat fat intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel i915 kvm ast irqbypass crct10dif_pclmul crc32_pclmul ttm drm_kms_helper ghash_clmulni_intel intel_cstate sg efi_pstore mei_me intel_uncore iTCO_wdt evdev iTCO_vendor_support intel_rapl_perf efivars pcspkr drm mei cdc_acm intel_pch_thermal shpchp joydev ie31200_edac video acpi_power_meter button acpi_pad nfsd nfs_acl lockd grace auth_rpcgss ipmi_si ipmi_devintf sunrpc ipmi_msghandler efivarfs ip_tables x_tables autofs4 ext4 crc16 mbcache jbd2 crc32c_generic fscrypto ecb dm_mod ses enclosure scsi_transport_sas sd_mod hid_generic usbhid hid xhci_pci xhci_hcd ahci crc32c_intel ixgbe libahci igb i2c_algo_bit
[ 2611.633015] aesni_intel aes_x86_64 dca ptp usbcore megaraid_sas crypto_simd libata cryptd glue_helper i2c_i801 pps_core usb_common mdio scsi_mod fan thermal
[ 2611.647163] CPU: 0 PID: 20830 Comm: check_ups Not tainted 4.15.0-rc8-amd64 #1 Debian 4.15~rc8-1~exp1
[ 2611.656338] Hardware name: Supermicro Super Server/X11SSH-F, BIOS 2.0c 10/06/2017
[ 2611.663857] RIP: 0010:page_counter_cancel+0x17/0x20
[ 2611.668765] RSP: 0018:ffffa74c8433fc70 EFLAGS: 00010097
[ 2611.674017] RAX: 0000000000000000 RBX: ffff8bc863c0b4c0 RCX: 0000000000000000
[ 2611.681186] RDX: 00003b83ba4109d0 RSI: 0000000000000001 RDI: ffff8bc863c0b4c0
[ 2611.688370] RBP: 0000000000000001 R08: ffff8bc8c50da8a0 R09: 0000000000000001
[ 2611.695556] R10: ffffa74c8433fd48 R11: 0000000001000000 R12: ffff8bc863c0b400
[ 2611.702740] R13: ffff8bc89c092800 R14: ffff8bc8a1270e10 R15: ffff8bc76955ec30
[ 2611.709924] FS: 00007f0669316fc0(0000) GS:ffff8bc8c5000000(0000) knlGS:0000000000000000
[ 2611.718063] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2611.723853] CR2: 00007f0668550930 CR3: 000000075ce30005 CR4: 00000000003626f0
[ 2611.731036] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 2611.738218] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 2611.745397] Call Trace:
[ 2611.747881] page_counter_uncharge+0x1d/0x30
[ 2611.752195] drain_stock.isra.37+0x32/0xa0
[ 2611.756327] refill_stock+0x41/0x70
[ 2611.759855] __sk_mem_reduce_allocated+0x83/0xd0
[ 2611.764508] tcp_write_queue_purge+0x1a7/0x1d0
[ 2611.768990] tcp_v4_destroy_sock+0x3f/0x180
[ 2611.773208] tcp_v6_destroy_sock+0xe/0x20
[ 2611.777257] inet_csk_destroy_sock+0x47/0x100
[ 2611.781650] tcp_rcv_state_process+0x980/0xe20
[ 2611.786130] ? tcp_v6_do_rcv+0x1a7/0x3e0
[ 2611.790090] tcp_v6_do_rcv+0x1a7/0x3e0
[ 2611.793880] __release_sock+0x76/0xc0
[ 2611.797581] release_sock+0x2b/0x90
[ 2611.801107] tcp_close+0x165/0x3f0
[ 2611.804547] inet_release+0x36/0x60
[ 2611.808075] sock_release+0x1a/0x70
[ 2611.811601] sock_close+0xe/0x20
[ 2611.814861] __fput+0xd5/0x210
[ 2611.819465] task_work_run+0x84/0xa0
[ 2611.824577] exit_to_usermode_loop+0xb9/0xc0
[ 2611.830383] syscall_return_slowpath+0x88/0x90
[ 2611.836364] system_call_fast_compare_end+0x73/0x75
[ 2611.842741] RIP: 0033:0x7f0668ac8d84
[ 2611.847774] RSP: 002b:00007ffe23f9c7b8 EFLAGS: 00000246 ORIG_RAX: 0000000000000003
[ 2611.856787] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 00007f0668ac8d84
[ 2611.865332] RDX: 0000000000001fff RSI: 00007ffe23f9c800 RDI: 0000000000000000
[ 2611.873833] RBP: 0000000000000006 R08: 0000000000000000 R09: 0000000000000000
[ 2611.882405] R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffe23f9e800
[ 2611.890813] R13: 00007ffe23f9c800 R14: 0000000000002000 R15: 0000000000000000
[ 2611.899185] Code: e8 39 b5 eb ff e9 49 ff ff ff 90 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00 48 89 f0 48 f7 d8 f0 48 0f c1 07 48 39 f0 78 02 f3 c3 <0f> ff c3 66 0f 1f 44 00 00 0f 1f 44 00 00 eb 19 48 89 f0 f0 48
[ 2611.920537] ---[ end trace 306225c4342d4340 ]---
[ 2981.898837] upsd invoked oom-killer: gfp_mask=0x14000c0(GFP_KERNEL), nodemask=(null), order=0, oom_score_adj=0
[ 2981.909192] upsd cpuset=/ mems_allowed=0
[ 2981.913519] CPU: 0 PID: 3295 Comm: upsd Tainted: G W 4.15.0-rc8-amd64 #1 Debian 4.15~rc8-1~exp1
[ 2981.923783] Hardware name: Supermicro Super Server/X11SSH-F, BIOS 2.0c 10/06/2017
[ 2981.931647] Call Trace:
[ 2981.934647] dump_stack+0x5c/0x85
[ 2981.938305] dump_header+0x6b/0x289
[ 2981.942379] oom_kill_process+0x228/0x430
[ 2981.947113] out_of_memory+0x2ab/0x4b0
[ 2981.951949] mem_cgroup_out_of_memory+0x49/0x80
[ 2981.957643] mem_cgroup_oom_synchronize+0x2ed/0x320
[ 2981.963664] ? get_mem_cgroup_from_mm+0x90/0x90
[ 2981.969334] pagefault_out_of_memory+0x32/0x77
[ 2981.974906] __do_page_fault+0x4a7/0x4e0
[ 2981.979879] ? page_fault+0x36/0x60
[ 2981.984384] page_fault+0x4c/0x60
[ 2981.988804] RIP: 0033:0x7f084e1fbca0
[ 2981.993483] RSP: 002b:00007ffd0a3bf0c8 EFLAGS: 00010202
[ 2981.993517] Task in /system.slice/nut-server.service killed as a result of limit of /system.slice/nut-server.service
[ 2982.011567] memory: usage 18446744073709550932kB, limit 9007199254740988kB, failcnt 33
[ 2982.020705] memory+swap: usage 0kB, limit 9007199254740988kB, failcnt 0
[ 2982.028532] kmem: usage 0kB, limit 9007199254740988kB, failcnt 0
[ 2982.035766] Memory cgroup stats for /system.slice/nut-server.service: cache:0KB rss:0KB rss_huge:0KB shmem:0KB mapped_file:0KB dirty:0KB writeback:0KB inactive_anon:0KB active_anon:0KB inactive_file:0KB active_file:0KB unevictable:0KB
[ 2982.059152] [ pid ] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name
[ 2982.069091] [ 3295] 114 3295 14245 0 131072 106 0 upsd
[ 2982.078534] Memory cgroup out of memory: Kill process 3295 (upsd) score 0 or sacrifice child
[ 2982.088491] Killed process 3295 (upsd) total-vm:56980kB, anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
[ 2982.099113] oom_reaper: reaped process 3295 (upsd), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
[ 2982.152146] ------------[ cut here ]------------
[ 2982.158622] percpu ref (css_release) <= 0 (-197) after switching to atomic
[ 2982.158641] WARNING: CPU: 0 PID: 7 at /build/linux-b8fmzT/linux-4.15~rc8/lib/percpu-refcount.c:155 percpu_ref_switch_to_atomic_rcu+0xf6/0x100
[ 2982.183896] Modules linked in: binfmt_misc fuse vhost_net vhost tap tun devlink bridge 8021q garp mrp stp llc nls_ascii nls_cp437 vfat fat intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel i915 kvm ast irqbypass crct10dif_pclmul crc32_pclmul ttm drm_kms_helper ghash_clmulni_intel intel_cstate sg efi_pstore mei_me intel_uncore iTCO_wdt evdev iTCO_vendor_support intel_rapl_perf efivars pcspkr drm mei cdc_acm intel_pch_thermal shpchp joydev ie31200_edac video acpi_power_meter button acpi_pad nfsd nfs_acl lockd grace auth_rpcgss ipmi_si ipmi_devintf sunrpc ipmi_msghandler efivarfs ip_tables x_tables autofs4 ext4 crc16 mbcache jbd2 crc32c_generic fscrypto ecb dm_mod ses enclosure scsi_transport_sas sd_mod hid_generic usbhid hid xhci_pci xhci_hcd ahci crc32c_intel ixgbe libahci igb i2c_algo_bit
[ 2982.268964] aesni_intel aes_x86_64 dca ptp usbcore megaraid_sas crypto_simd libata cryptd glue_helper i2c_i801 pps_core usb_common mdio scsi_mod fan thermal
[ 2982.287046] CPU: 0 PID: 7 Comm: ksoftirqd/0 Tainted: G W 4.15.0-rc8-amd64 #1 Debian 4.15~rc8-1~exp1
[ 2982.299367] Hardware name: Supermicro Super Server/X11SSH-F, BIOS 2.0c 10/06/2017
[ 2982.308886] RIP: 0010:percpu_ref_switch_to_atomic_rcu+0xf6/0x100
[ 2982.316899] RSP: 0018:ffffa74c831a7df8 EFLAGS: 00010282
[ 2982.324118] RAX: 0000000000000000 RBX: 7fffffffffffff3e RCX: ffffffffa064d748
[ 2982.333350] RDX: 0000000000000001 RSI: 0000000000000096 RDI: 0000000000000283
[ 2982.342431] RBP: ffff8bc863c0b438 R08: 0000000000000462 R09: ffffffffa0b98160
[ 2982.351496] R10: 0000000000000000 R11: 0000000000000000 R12: 00003b83b9e11040
[ 2982.360536] R13: ffffffffa071a5e0 R14: 7fffffffffffffff R15: 0000000000000202
[ 2982.369682] FS: 0000000000000000(0000) GS:ffff8bc8c5000000(0000) knlGS:0000000000000000
[ 2982.379801] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2982.387446] CR2: 00005592b84ecd50 CR3: 000000072c20a005 CR4: 00000000003626f0
[ 2982.396512] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 2982.405282] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 2982.414337] Call Trace:
[ 2982.418723] rcu_process_callbacks+0x1af/0x4c0
[ 2982.425118] ? sort_range+0x20/0x20
[ 2982.430531] __do_softirq+0xd9/0x2a9
[ 2982.436043] ? sort_range+0x20/0x20
[ 2982.441441] run_ksoftirqd+0x25/0x40
[ 2982.446928] smpboot_thread_fn+0xdf/0x150
[ 2982.452865] kthread+0x111/0x130
[ 2982.458012] ? kthread_create_worker_on_cpu+0x70/0x70
[ 2982.464998] ret_from_fork+0x32/0x40
[ 2982.470463] Code: 89 df ff 55 e8 eb c6 80 3d 86 15 d7 00 00 75 8a 48 8b 55 d8 48 8b 75 e8 48 c7 c7 28 18 45 a0 c6 05 6e 15 d7 00 01 e8 3a 62 cf ff <0f> ff e9 68 ff ff ff 0f 1f 00 41 54 55 49 89 f4 53 48 89 fb 48
[ 2982.493194] ---[ end trace 306225c4342d4341 ]---

Best regards,
Chris

-- System Information:
Debian Release: buster/sid
  APT prefers unstable-debug
  APT policy: (500, 'unstable-debug'), (500, 'testing-debug'), (500, 'testing'), (100, 'unstable'), (1, 'experimental')
Architecture: amd64 (x86_64)

Kernel: Linux 4.15.0-rc8-amd64 (SMP w/8 CPU cores)
Locale: LANG=en_GB.UTF-8, LC_CTYPE=en_GB.UTF-8 (charmap=UTF-8), LANGUAGE=en_GB:en (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
Init: systemd (via /run/systemd/system)
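
Addendum: as a rough sanity check of the signed-to-unsigned theory, the "memory: usage" figure from the OOM report in the kernel log can be reinterpreted as a two's-complement 64-bit value. This is only a sketch: the helper name as_signed64 is made up for illustration, the two constants are copied from the log, and the 4 KiB page size is an assumption about this amd64 kernel, not something the log states.

```python
# Values copied from the "memory: usage ... limit ..." line of the OOM report.
USAGE_KB = 18446744073709550932
LIMIT_KB = 9007199254740988

# Assumption: 4 KiB pages, so one page is 4 kB in the kernel's report.
PAGE_KB = 4


def as_signed64(value):
    """Interpret an unsigned 64-bit integer as a two's-complement signed one."""
    value &= (1 << 64) - 1
    return value - (1 << 64) if value >= (1 << 63) else value


usage_signed_kb = as_signed64(USAGE_KB)

# The reported usage is only just below 2**64, i.e. a small negative number.
print("usage as signed kB:", usage_signed_kb)
print("usage as signed pages (assuming 4 KiB pages):", usage_signed_kb // PAGE_KB)

# The limit matches LONG_MAX bytes rounded down to whole 4 KiB pages,
# which would be consistent with "no limit configured".
print("limit equals LONG_MAX in whole pages:",
      LIMIT_KB == ((2**63 - 1) // 4096) * PAGE_KB)
```

Under those assumptions the usage works out to -684 kB, or -171 pages, which fits a counter that has been uncharged slightly more than it was charged rather than any real 16 ZiB of use. It says nothing about where the extra uncharge comes from.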