Hi,

Recently I encountered a kernel panic reporting "Unrecoverable FP Unavailable Exception 800 at c00000000009e308". The panic log is attached at the end of this mail.

As far as I know, this exception is raised when a hardware floating-point instruction is executed while the FPU is disabled (MSR[FP] = 0). If the instruction came from kernel space, the kernel treats the exception as unrecoverable and panics.

*Here is the investigation I have done.*

First I checked the MSR: MSR[PR] = 0 and MSR[FP] = 0, so the system does match the panic condition. Since MSR[PR] = 0, the instruction appears to have come from the kernel. The kernel does not normally do floating-point calculations, so I was curious about the code that triggered the exception. According to the backtrace, it should be the "update_min_vruntime" function, but I did not see any floating-point operation in that function. I then disassembled vmlinux, located the code for that function, and matched it against the instruction dump:

c00000000009e2b8 <.update_min_vruntime>:
...
c00000000009e2d8: e9 1f 00 20 ld r8,32(r31)
c00000000009e2dc: 2f a9 00 00 cmpdi cr7,r9,0
c00000000009e2e0: 41 9e 00 68 beq cr7,c00000000009e348 <.update_min_vruntime+0x90>
c00000000009e2e4: e9 5f 00 30 ld r10,48(r31)
c00000000009e2e8: e9 29 00 50 ld r9,80(r9)
c00000000009e2ec: 2f aa 00 00 cmpdi cr7,r10,0
c00000000009e2f0: 41 9e 00 10 beq cr7,c00000000009e300 <.update_min_vruntime+0x48>
c00000000009e2f4: e9 4a 00 40 ld r10,64(r10)
c00000000009e2f8: 7c e9 50 51 subf. r7,r9,r10
c00000000009e2fc: 41 80 00 24 blt c00000000009e320 <.update_min_vruntime+0x68>
c00000000009e300: 7c e8 48 51 subf. r7,r8,r9
c00000000009e304: 40 81 00 28 ble c00000000009e32c <.update_min_vruntime+0x74>
c00000000009e308: f9 3f 00 20 std r9,32(r31)
c00000000009e30c: 38 21 00 80 addi r1,r1,128
c00000000009e310: e8 01 00 10 ld r0,16(r1)
c00000000009e314: eb e1 ff f8 ld r31,-8(r1)

The offending instruction is:

c00000000009e308: f9 3f 00 20 std r9,32(r31)

This is an integer doubleword store and has nothing to do with floating point, so I cannot see why it would trigger this exception. Does anyone have an idea what could cause this? I would appreciate any reply.

*Panic log*

...
Linux version 4.1.21 (ryan@ubuntu) (gcc version 5.2.0) #22 SMP PREEMPT Wed Oct 28 10:04:32 CST 2020
...
<1>Kernel command line: ramdisk_size=0x700000 root=/dev/ram rw init=/init mem=3840M reserve=256M@3840M console=ttyS0,115200 crashkernel=128M@32M bportals=s1 qportals=s1
...
<0>linux-kernel-bde (16258): Allocating DMA memory using method dmaalloc=0
<0>linux-kernel-bde (16258): _use_dma_mapping:1 _dma_vbase:c000000060000000 _dma_pbase:60000000 _cpu_pbase:60000000 allocated:2000000 dmaalloc:0
<0>linux-kernel-bde (16247): _interrupt_connect d 0
<0>linux-kernel-bde (16247): connect primary isr
<0>linux-kernel-bde (16247): _interrupt_connect(3514):device# = 0, irq_flags = 128, irq = 41
<1>device eth0.4092 entered promiscuous mode
<1>Unrecoverable FP Unavailable Exception 800 at c00000000009e308
<0>Oops: Unrecoverable FP Unavailable Exception, sig: 6 [#1]
<0>PREEMPT SMP NR_CPUS=4 CoreNet Generic
<0>Modules linked in: linux_user_bde(PO) linux_kernel_bde(PO) dma2(O) dma(O) watchdog(O) ttyVS(O) gpiodev(O) lbdev(O) spid(O) block2mtd mpc85xx_edac edac_core sch_fq_codel uio_seville(O) loop [last unloaded: linux_kernel_bde]
<1>CPU: 1 PID: 7 Comm: rcu_preempt Tainted: P O 4.1.21 #22
<1>task: c0000000e11a4680 ti: c0000000e11d8000 task.ti: c0000000e11d8000
<0>NIP: c00000000009e308 LR: c00000000009eda4 CTR: c0000000000a2de8
<0>REGS: c0000000e11db4d0 TRAP: 0800 Tainted: P O (4.1.21)
<0>MSR: 0000000080029000 <CE,EE,ME> CR: 44a44242 XER: 00000000
<0>SOFTE: 0
<0>GPR00: c00000000009eda4 c0000000e11db750 c000000001763800 c0000000efe476a0
<0>GPR04: c0000000e11a4680 c0000000efe4fea0 c0000000efe47fa0 c000000001643800
<0>GPR08: 000006b94a32fd58 000006b949bb61f8 0000000000000000 c0000000e11f0000
<0>GPR12: 0000000044a44244 c00000000fffe6c0 0000000000000000 0000000000000000
<0>GPR16: c0000000016a9fa0 c0000000016aa108 00000000000000fa 0000000000000001
<0>GPR20: c00000000176d578 0000000000000000 0000000000000001 0000000000000000
<0>GPR24: 0000000000000001 c000000000b08a18 0000000000000000 c0000000efe47640
<0>NIP [c00000000009e308] .update_min_vruntime+0x50/0xa4
<0>LR [c00000000009eda4] .update_curr+0x80/0x1ec
<0>Call Trace:
<0>[c0000000e11db750] [c0000000e1004560] 0xc0000000e1004560 (unreliable)
<0>[c0000000e11db7d0] [c00000000009eda4] .update_curr+0x80/0x1ec
<0>[c0000000e11db870] [c0000000000a2e80] .dequeue_task_fair+0x98/0xaf0
<0>[c0000000e11db960] [c00000000009376c] .dequeue_task+0x68/0x88
<0>[c0000000e11db9f0] [c000000000ae8f88] .__schedule+0x2f4/0x7b4
<0>[c0000000e11dbaa0] [c000000000ae9484] .schedule+0x3c/0xa8
<0>[c0000000e11dbb20] [c000000000aecc98] .schedule_timeout+0x150/0x2d0
<0>[c0000000e11dbc00] [c0000000000cdbb0] .rcu_gp_kthread+0x6c4/0xad4
<0>[c0000000e11dbd30] [c000000000088aac] .kthread+0x10c/0x12c
<0>[c0000000e11dbe30] [c0000000000009b0] .ret_from_kernel_thread+0x58/0xa8
<0>Instruction dump:
<0>e91f0020 2fa90000 419e0068 e95f0030 e9290050 2faa0000 419e0010 e94a0040
<0>7ce95051 41800024 7ce84851 40810028 <f93f0020> 38210080 e8010010 ebe1fff8
<1>---[ end trace bc398b62ecbb6901 ]---
<0>
<1>note: rcu_preempt[7] exited with preempt_count 2

Thanks,
Ryan
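
For reference, the two checks described in this mail (reading the MSR flags and decoding the instruction word at the NIP) can be reproduced with a short script. This is only a minimal sketch in Python, not kernel code. The bit positions are the Book E MSR bits as I understand them (CM=31, CE=17, EE=15, PR=14, FP=13, ME=12, counting from the least significant bit), and the helper names decode_msr and decode_ds_form are made up for illustration.

```python
# Assumed Book E MSR bit positions (LSB = bit 0); treat these as an assumption.
MSR_BITS = {"CM": 31, "CE": 17, "EE": 15, "PR": 14, "FP": 13, "ME": 12}


def decode_msr(msr):
    """Return the names of the known MSR bits that are set in `msr`."""
    return [name for name, bit in MSR_BITS.items() if msr & (1 << bit)]


def decode_ds_form(word):
    """Split a 32-bit PowerPC DS-form instruction word into its fields.

    Returns (primary opcode, RS, RA, signed byte displacement, XO).
    """
    opcode = (word >> 26) & 0x3F   # top 6 bits
    rs = (word >> 21) & 0x1F       # source register
    ra = (word >> 16) & 0x1F       # base register
    disp = word & 0xFFFC           # DS field already shifted left by 2
    if disp & 0x8000:              # sign-extend the 16-bit displacement
        disp -= 0x10000
    xo = word & 0x3                # extended opcode (low 2 bits)
    return opcode, rs, ra, disp, xo


if __name__ == "__main__":
    flags = decode_msr(0x0000000080029000)   # MSR value from the oops
    print("MSR flags set:", flags)
    print("PR set:", "PR" in flags, "FP set:", "FP" in flags)

    fields = decode_ds_form(0xF93F0020)      # word marked <f93f0020> in the dump
    print("opcode=%d rs=r%d ra=r%d disp=%d xo=%d" % fields)
```

Under these assumptions, PR and FP both come out clear in the saved MSR, and the faulting word decodes to primary opcode 62 with XO 0, RS = r9, RA = r31, displacement 32, which is std r9,32(r31): an integer store, consistent with the disassembly.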