Dear Nicholas,

Thank you for the quick response.


Am 06.07.20 um 02:41 schrieb Nicholas Piggin:
Excerpts from Paul Menzel's message of July 5, 2020 8:30 pm:

Am 05.07.20 um 11:22 schrieb Paul Menzel:

With an IBM S822LC with Ubuntu 20.04, after updating to Firefox 78.0,
using Firefox seems to hang the system. This happened with self-built
Linux 5.7-rc5+ and now with 5.8-rc3+.

(At least I believe the Firefox update is causing this.)

Log in is impossible, and using the Serial over LAN over IPMI shows the
messages below.

[ 2620.579187] watchdog: BUG: soft lockup - CPU#125 stuck for 22s!
[swapper/125:0]
[ 2620.579378] Modules linked in: tcp_diag inet_diag unix_diag
xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4
xt_tcpudp ip6table_mangle ip6table_nat iptable_mangle iptable_nat
nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables nfnetlink
ip6table_filter ip6_tables iptable_filter bridge stp llc overlay xfs
kvm_hv kvm joydev binfmt_misc uas usb_storage vmx_crypto ofpart
cmdlinepart bnx2x powernv_flash mtd mdio crct10dif_vpmsum at24
ibmpowernv ipmi_powernv ipmi_devintf powernv_rng ipmi_msghandler
opal_prd sch_fq_codel parport_pc nfsd ppdev lp auth_rpcgss nfs_acl
parport lockd grace sunrpc ip_tables x_tables autofs4 btrfs
blake2b_generic libcrc32c xor zstd_compress raid6_pq input_leds
mac_hid hid_generic ast drm_vram_helper drm_ttm_helper i2c_algo_bit
ttm drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm
drm_panel_orientation_quirks ahci libahci usbhid hid crc32c_vpmsum
uio_pdrv_genirq uio
[ 2620.579537] CPU: 125 PID: 0 Comm: swapper/125 Tainted: G      D
W    L    5.8.0-rc3+ #1
[ 2620.579552] NIP:  c0000000010dad38 LR: c0000000010dad30 CTR:
c000000000237830
[ 2620.579568] REGS: c00000ffcb8c7600 TRAP: 0900   Tainted: G      D
W    L     (5.8.0-rc3+)
[ 2620.579582] MSR:  9000000000009033 <SF,HV,EE,ME,IR,DR,RI,LE>  CR:
44004228  XER: 00000000
[ 2620.579599] CFAR: c0000000010dad44 IRQMASK: 0 [ 2620.579599] GPR00:
c00000000023718c c00000ffcb8c7890 c000000001f9a900 0000000000000000 [
2620.579599] GPR04: c000000001fce438 0000000000000078 000000010008c1f2
0000000000000000 [ 2620.579599] GPR08: 000000ffd96a0000
0000000080000087 0000000000000000 c000000001fd25e0 [ 2620.579599]
GPR12: 0000000000004400 c00000ffff72f680 c000000001ea36d8
c00000ffcb859800 [ 2620.579599] GPR16: c00000000166c880
c0000000016f8e00 000000000000000a c00000ffcb859800 [ 2620.579599]
GPR20: 0000000000000100 c00000000166c918 c000000001fd21e8
c00000ffcb859800 [ 2620.579599] GPR24: 000000ffd96a0000
c000000001d44b80 c000000001d53780 0000000000000008 [ 2620.579599]
GPR28: c000000001fd21e0 0000000000000001 0000000000000000
c000000001d44b80 [ 2620.579711] NIP [c0000000010dad38]
_raw_spin_lock_irqsave+0x98/0x120
[ 2620.579724] LR [c0000000010dad30] _raw_spin_lock_irqsave+0x90/0x120
[ 2620.579737] Call Trace:
[ 2620.579746] [c00000ffcb8c7890] [c0000000013c84a0]
ncsi_ops+0x209f50/0x2dc1d8 (unreliable)
[ 2620.579763] [c00000ffcb8c78d0] [c00000000023718c] rcu_core+0xfc/0x7a0
[ 2620.579777] [c00000ffcb8c7970] [c0000000010db81c]
__do_softirq+0x17c/0x534
[ 2620.579791] [c00000ffcb8c7aa0] [c0000000001786f4] irq_exit+0xd4/0x130
[ 2620.579805] [c00000ffcb8c7ad0] [c000000000025eec]
timer_interrupt+0x13c/0x370
[ 2620.579821] [c00000ffcb8c7b40] [c0000000000165c0]
replay_soft_interrupts+0x320/0x3f0
[ 2620.579837] [c00000ffcb8c7d30] [c0000000000166d8]
arch_local_irq_restore+0x48/0xa0
[ 2620.579853] [c00000ffcb8c7d50] [c000000000de2fe0]
cpuidle_enter_state+0x100/0x780

[snip]

I have to warm reset the system to get it working again.

I am unable to reproduce this with Ubuntu’s Linux

Okay, not sure what that would be from, looks like RCU perhaps. Anyway
if it comes up again, let us know.

Ah, it’s a different trace. I think it’s just an effect of the first error (as below), as some CPUs lock up. I wasn’t able to capture the start of the trace above. In the attachment for the hang *below* you can also see

[ 664.705193] watchdog: BUG: soft lockup - CPU#134 stuck for 26s! [swapper/134:0]

after the first Oops.

With Linux 5.8-rc3+, I got now the beginning of the Linux messages.

[  572.253008] Oops: Exception in kernel mode, sig: 5 [#1]
[  572.253198] LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA PowerNV
[  572.253232] Modules linked in: tcp_diag inet_diag unix_diag xt_CHECKSUM 
xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp ip6table_mangle 
ip6table_nat iptable_mangle iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 
nf_defrag_ipv4 nf_tables nfnetlink ip6table_filter ip6_tables iptable_filter 
bridge stp llc overlay xfs kvm_hv kvm binfmt_misc joydev uas usb_storage 
vmx_crypto bnx2x crct10dif_vpmsum ofpart cmdlinepart powernv_flash mtd mdio 
ibmpowernv at24 ipmi_powernv ipmi_devintf ipmi_msghandler opal_prd powernv_rng 
sch_fq_codel parport_pc ppdev lp nfsd parport auth_rpcgss nfs_acl lockd grace 
sunrpc ip_tables x_tables autofs4 btrfs blake2b_generic libcrc32c xor 
zstd_compress raid6_pq input_leds mac_hid hid_generic ast drm_vram_helper 
drm_ttm_helper i2c_algo_bit ttm drm_kms_helper syscopyarea sysfillrect 
sysimgblt fb_sys_fops drm ahci drm_panel_orientation_quirks libahci usbhid hid 
crc32c_vpmsum uio_pdrv_genirq uio
[  572.253639] CPU: 4 PID: 6728 Comm: Web Content Not tainted 5.8.0-rc3+ #1
[  572.253659] NIP:  c00000000000ff5c LR: c00000000001a8f8 CTR: c0000000001d5f00
[  572.253835] REGS: c000007f31f0f420 TRAP: 1500   Not tainted  (5.8.0-rc3+)
[  572.253854] MSR:  900000000290b033 <SF,HV,VEC,VSX,EE,FP,ME,IR,DR,RI,LE>  CR: 
28c48482  XER: 20000000
[  572.253888] CFAR: c00000000000fecc IRQMASK: 1
[  572.253888] GPR00: c00000000001b228 c000007f31f0f6b0 c000000001f9a900 
c000007f351544d0
[  572.253888] GPR04: 0000000000000000 c000007f31f0fe90 c000007f351544f0 
c000007f32e522b0
[  572.253888] GPR08: 0000000000000000 0000000000002000 9000000000009033 
c000007fbcd85800
[  572.253888] GPR12: 0000000000008800 c000007fffffb680 0000000000000005 
0000000000000004
[  572.253888] GPR16: c000007f35153800 c000007f35154130 0000000000000005 
0000000000000001
[  572.253888] GPR20: 0000000000000024 c000007f32e51e68 c000007f35154028 
0000007fd8da0000
[  572.253888] GPR24: 0000007fd8da0000 c000007f351544d0 c000007e9a4024d0 
c000000001665f18
[  572.253888] GPR28: c000007f351544d0 c000007f35153800 900000000290f033 
c000007f35153800
[  572.254079] NIP [c00000000000ff5c] save_fpu+0xa8/0x2ac
[  572.254098] LR [c00000000001a8f8] __giveup_fpu+0x28/0x80
[  572.254114] Call Trace:
[  572.254128] [c000007f31f0f6b0] [c000007f35153980] 0xc000007f35153980 
(unreliable)
[  572.254156] [c000007f31f0f6e0] [c00000000001b228] giveup_all+0x128/0x150
[  572.254327] [c000007f31f0f710] [c00000000001c124] __switch_to+0x104/0x490
[  572.254352] [c000007f31f0f770] [c0000000010d2e34] __schedule+0x2e4/0xa10
[  572.254374] [c000007f31f0f840] [c0000000010d35d4] schedule+0x74/0x140
[  572.254397] [c000007f31f0f870] [c0000000010d9478] 
schedule_timeout+0x358/0x5d0
[  572.254424] [c000007f31f0f980] [c0000000010d5638] 
wait_for_completion+0xc8/0x210
[  572.254451] [c000007f31f0fa00] [c000000000608ed4] do_coredump+0x3a4/0xd60
[  572.254625] [c000007f31f0fba0] [c00000000018d1cc] get_signal+0x1dc/0xd00
[  572.254648] [c000007f31f0fcc0] [c00000000001f088] 
do_notify_resume+0x158/0x450
[  572.254672] [c000007f31f0fda0] [c000000000037d04] 
interrupt_exit_user_prepare+0x1c4/0x230
[  572.254699] [c000007f31f0fe20] [c00000000000f2b4] interrupt_return+0x14/0x1c0
[  572.254720] Instruction dump:
[  572.254882] dae60170 db060180 db260190 db4601a0 db6601b0 db8601c0 dba601d0 
dbc601e0
[  572.254912] dbe601f0 48000204 38800000 f0000250 <7c062798> f0000250 38800010 
f0210a50
[  572.254946] ---[ end trace ba4452ee5c77d58e ]---

Please find all the messages attached.

"Oops: Exception in kernel mode, sig: 5 [#1]"

Unfortunately it's a very poor error message. I think it is a 0x1500
exception triggering in the kernel FP register saving. Do you have the
CONFIG_PPC_DENORMALISATION config option set?

Yes, as it’s set in the Ubuntu Linux kernel configuration, I have it set too.

    $ grep DENORMALI /boot/config-*
    /boot/config-4.15.0-23-generic:CONFIG_PPC_DENORMALISATION=y
    /boot/config-5.4.0-40-generic:CONFIG_PPC_DENORMALISATION=y
    /boot/config-5.7.0-rc5+:CONFIG_PPC_DENORMALISATION=y
    /boot/config-5.8.0-rc3+:CONFIG_PPC_DENORMALISATION=y


Kind regards,

Paul

Reply via email to