Nathan Lynch <nath...@linux.ibm.com> writes:
> Kip Warner <k...@thevertigo.com> writes:
>> Dec 25 06:52:52 romulus-server kernel: [28835.277591] BUG: Unable to
>> handle kernel data access on write at 0x132b47d38499fd58
>> Dec 25 06:52:52 romulus-server kernel: [28835.277624] Faulting
>> instruction address: 0xc0000000004d0434
>> Dec 25 06:52:52 romulus-server kernel: [28835.277636] Oops: Kernel access
>> of bad area, sig: 11 [#150]
>> Dec 25 06:52:52 romulus-server kernel: [28835.277656] LE PAGE_SIZE=64K
>> MMU=Radix SMP NR_CPUS=2048 NUMA PowerNV
>> Dec 25 06:52:52 romulus-server kernel: [28835.277669] Modules linked in:
>> veth nft_masq zfs(PO) zunicode(PO) zzstd(O) zlua(O) zcommon(PO) znvpair(PO)
>> zavl(PO) icp(PO) spl(O) vhost_vsock vmw_vsock_virtio_transport_common vhost
>> vhost_iotlb vsock xt_CHECKSUM nft_chain_nat xt_MASQUERADE nf_nat
>> nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nft_counter xt_tcpudp nft_compat
>> bridge stp llc nf_tables nfnetlink binfmt_misc dm_multipath scsi_dh_rdac
>> scsi_dh_emc scsi_dh_alua joydev input_leds ipmi_powernv mac_hid ipmi_devintf
>> ipmi_msghandler ofpart cmdlinepart at24 powernv_flash mtd uio_pdrv_genirq
>> opal_prd uio ibmpowernv vmx_crypto sch_fq_codel jc42 ip_tables x_tables
>> autofs4 xfs btrfs blake2b_generic raid10 raid456 async_raid6_recov
>> async_memcpy async_pq async_xor async_tx xor hid_generic usbhid hid raid6_pq
>> libcrc32c raid1 raid0 multipath linear nouveau ses enclosure
>> scsi_transport_sas ast drm_vram_helper i2c_algo_bit drm_ttm_helper ttm
>> drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops cec rc_core drm
>> crct10dif_vpmsum
>> Dec 25 06:52:52 romulus-server kernel: [28835.277776] crc32c_vpmsum
>> xhci_pci tg3 aacraid xhci_pci_renesas drm_panel_orientation_quirks
>> Dec 25 06:52:52 romulus-server kernel: [28835.277918] CPU: 26 PID: 144937
>> Comm: postgres Tainted: P D O 5.11.0-41-generic #45-Ubuntu
>> Dec 25 06:52:52 romulus-server kernel: [28835.277943] NIP:
>> c0000000004d0434 LR: c0000000004d032c CTR: c0000000010a90e0
>> Dec 25 06:52:52 romulus-server kernel: [28835.277975] REGS:
>> c000000056b9f6b0 TRAP: 0380 Tainted: P D O
>> (5.11.0-41-generic)
>> Dec 25 06:52:52 romulus-server kernel: [28835.278008] MSR:
>> 9000000000009033 <SF,HV,EE,ME,IR,DR,RI,LE> CR: 88002281 XER: 0000008c
>> Dec 25 06:52:52 romulus-server kernel: [28835.278050] CFAR:
>> c0000000004d041c IRQMASK: 0
>> Dec 25 06:52:52 romulus-server kernel: [28835.278050] GPR00:
>> c0000000004d032c c000000056b9f950 c000000002409a00 0000000000000000
>> Dec 25 06:52:52 romulus-server kernel: [28835.278050] GPR04:
>> 0000000000400cc0 0000000000000097 ffffffffffffffff c000000ffda9d0d0
>> Dec 25 06:52:52 romulus-server kernel: [28835.278050] GPR08:
>> 0000000ffbd90000 132b47d38499fce8 0000000000000070 d4ff277338704e25
>> Dec 25 06:52:52 romulus-server kernel: [28835.278050] GPR12:
>> 0000000000002000 c000000ffffd2c00 0000000000000000 c000000116c512d0
>> Dec 25 06:52:52 romulus-server kernel: [28835.278050] GPR16:
>> 0000000000000154 c000000116c51570 c000000056b9fc88 0000000000000154
>> Dec 25 06:52:52 romulus-server kernel: [28835.278050] GPR20:
>> 0000000000000000 0000000000000000 0000000000000000 0000000000000000
>> Dec 25 06:52:52 romulus-server kernel: [28835.278050] GPR24:
>> c000000000ecccc0 0000000000000001 c0000000024588fc c000000000ec9954
>> Dec 25 06:52:52 romulus-server kernel: [28835.278050] GPR28:
>> ffffffffffffffff c00000001d597e40 0000000000400cc0 c000000003018880
>> Dec 25 06:52:52 romulus-server kernel: [28835.278213] NIP
>> [c0000000004d0434] kmem_cache_alloc_node+0x1d4/0x490
>> Dec 25 06:52:52 romulus-server kernel: [28835.278237] LR
>> [c0000000004d032c] kmem_cache_alloc_node+0xcc/0x490
>> Dec 25 06:52:52 romulus-server kernel: [28835.278268] Call Trace:
>> Dec 25 06:52:52 romulus-server kernel: [28835.278283] [c000000056b9f950]
>> [c0000000004d032c] kmem_cache_alloc_node+0xcc/0x490 (unreliable)
>> Dec 25 06:52:52 romulus-server kernel: [28835.278328] [c000000056b9f9c0]
>> [c000000000ec9954] __alloc_skb+0x74/0x2d0
>> Dec 25 06:52:52 romulus-server kernel: [28835.278369] [c000000056b9fa20]
>> [c000000000ecccc0] alloc_skb_with_frags+0x70/0x2e0
>> Dec 25 06:52:52 romulus-server kernel: [28835.278403] [c000000056b9faa0]
>> [c000000000ec0f38] sock_alloc_send_pskb+0x1d8/0x200
>> Dec 25 06:52:52 romulus-server kernel: [28835.278436] [c000000056b9fb10]
>> [c0000000010a93a8] unix_stream_sendmsg+0x2c8/0x710
>> Dec 25 06:52:52 romulus-server kernel: [28835.278471] [c000000056b9fc10]
>> [c000000000eb64e0] sock_sendmsg+0x80/0xb0
>> Dec 25 06:52:52 romulus-server kernel: [28835.278494] [c000000056b9fc40]
>> [c000000000ebab88] __sys_sendto+0xf8/0x1a0
>> Dec 25 06:52:52 romulus-server kernel: [28835.278526] [c000000056b9fd90]
>> [c000000000ebaca0] sys_send+0x30/0x40
>> Dec 25 06:52:52 romulus-server kernel: [28835.278558] [c000000056b9fdb0]
>> [c000000000036ffc] system_call_exception+0x10c/0x230
>> Dec 25 06:52:52 romulus-server kernel: [28835.278601] [c000000056b9fe10]
>> [c00000000000d374] system_call_vectored_common+0xf4/0x26c
>> Dec 25 06:52:52 romulus-server kernel: [28835.278634] --- interrupt: 3000
>> at 0x7ec638a194f4
>> Dec 25 06:52:52 romulus-server kernel: [28835.278654] NIP:
>> 00007ec638a194f4 LR: 0000000000000000 CTR: 0000000000000000
>> Dec 25 06:52:52 romulus-server kernel: [28835.278685] REGS:
>> c000000056b9fe80 TRAP: 3000 Tainted: P D O
>> (5.11.0-41-generic)
>> Dec 25 06:52:52 romulus-server kernel: [28835.278719] MSR:
>> 900000000280f033 <SF,HV,VEC,VSX,EE,PR,FP,ME,IR,DR,RI,LE> CR: 48008281 XER:
>> 00000000
>> Dec 25 06:52:52 romulus-server kernel: [28835.278766] IRQMASK: 0
>> Dec 25 06:52:52 romulus-server kernel: [28835.278766] GPR00:
>> 000000000000014e 00007fffe99c1800 00007ec638a47f00 0000000000000009
>> Dec 25 06:52:52 romulus-server kernel: [28835.278766] GPR04:
>> 00000043809d1148 0000000000000154 0000000000000000 0000000000001ae8
>> Dec 25 06:52:52 romulus-server kernel: [28835.278766] GPR08:
>> 0000004362347d00 0000000000000000 0000000000000000 0000000000000000
>> Dec 25 06:52:52 romulus-server kernel: [28835.278766] GPR12:
>> 0000000000000000 00007ec6348e0890 0000000000000000 ffffffffffffffff
>> Dec 25 06:52:52 romulus-server kernel: [28835.278766] GPR16:
>> 0000000000000000 000000436233f7a0 0000000000000001 0000000000000000
>> Dec 25 06:52:52 romulus-server kernel: [28835.278766] GPR20:
>> 00007fffe99c18ac 0000004362344f48 0000000000000004 00007fffe99c18b0
>> Dec 25 06:52:52 romulus-server kernel: [28835.278766] GPR24:
>> 0000000006000001 0000000000000000 0000000000000154 00000043809d1148
>> Dec 25 06:52:52 romulus-server kernel: [28835.278766] GPR28:
>> 0000000000000000 00007ec6348d9938 00000043809ceb00 000000000000000b
>> Dec 25 06:52:52 romulus-server kernel: [28835.278992] NIP
>> [00007ec638a194f4] 0x7ec638a194f4
>> Dec 25 06:52:52 romulus-server kernel: [28835.279020] LR
>> [0000000000000000] 0x0
>> Dec 25 06:52:52 romulus-server kernel: [28835.279038] --- interrupt: 3000
>> Dec 25 06:52:52 romulus-server kernel: [28835.279054] Instruction dump:
>> Dec 25 06:52:52 romulus-server kernel: [28835.279072] f9210020 41820098
>> 2e1cffff 3b200001 2c2a0000 41820088 41920010 894a0007
>> Dec 25 06:52:52 romulus-server kernel: [28835.279110] 7c1c5000 40820078
>> 815f0028 e97f00b8 <7ce9502a> 7c095214 886d0988 9b2d0988
>> Dec 25 06:52:52 romulus-server kernel: [28835.279141] ---[ end trace
>> fe7ee98d0b7beb6a ]---
>
> Perhaps slab corruption, but the 'D' taint flag (TAINT_DIE) means the
> kernel oopsed at least once before this. Probably best to look at that
> one first.
You also have the 'P' taint, which means a proprietary module is loaded, so we (upstream) can't really help with that. You're better off reporting it to your distro.

If it's easily reproducible, you could boot with slub_debug=FZP and see if that catches the slab corruption earlier. That might help us identify the actual problem.

cheers
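
For anyone reading along who doesn't have the letters memorised, here is a rough Python sketch that spells out the taint flags from the oops ("P D O") and the slub_debug=FZP option letters. The tables are only a subset, written from the kernel's tainted-kernels and SLUB documentation as I understand them, and the names TAINT_LETTERS, SLUB_DEBUG_LETTERS and decode() are made up for illustration, not taken from any kernel tool.

```python
# Subset of kernel taint letters (assumed from the kernel's
# tainted-kernels documentation; not a complete list).
TAINT_LETTERS = {
    "P": "proprietary module was loaded",
    "F": "module was force-loaded",
    "D": "kernel died recently (an earlier oops or BUG)",
    "W": "kernel issued a warning",
    "O": "out-of-tree module was loaded",
    "E": "unsigned module was loaded",
}

# slub_debug option letters (assumed from the kernel's SLUB documentation).
SLUB_DEBUG_LETTERS = {
    "F": "sanity checks",
    "Z": "red zoning",
    "P": "poisoning",
    "U": "user tracking (alloc/free owner)",
    "T": "trace",
}


def decode(flags, table):
    """Return (letter, meaning) pairs for each non-space letter in flags."""
    return [
        (c, table.get(c, "not in this sketch's table"))
        for c in flags
        if not c.isspace()
    ]


if __name__ == "__main__":
    for letter, meaning in decode("P D O", TAINT_LETTERS):
        print(f"taint {letter}: {meaning}")
    for letter, meaning in decode("FZP", SLUB_DEBUG_LETTERS):
        print(f"slub_debug {letter}: {meaning}")
```

Note that 'P' means different things in the two contexts (proprietary module vs. poisoning), which is why the sketch keeps separate tables.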