Collected more data points on this issue.

1. Tried offlining CPUs. We found that the crash typically occurred on
CPU:40, so I offlined CPU40 and repeated the test. The test seemed to
make progress but then panicked on a different CPU. I tried offlining
several more CPUs, but the crash just moves on to other CPUs.

2. Changed the I/O scheduler from CFQ to NOOP. This made no difference
either: the crash was seen on CPU:44, and offlining CPU44 yielded the
same results.
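
For anyone wanting to repeat the two steps above, here is a minimal
sketch of how they can be scripted through the standard Linux sysfs
control files. The helper names and the sysfs_root parameter are
hypothetical (not part of the original test procedure); the sysfs paths
are the usual kernel ones, and writing to them under /sys requires root.

```python
from pathlib import Path


def set_cpu_online(cpu: int, online: bool, sysfs_root: str = "/sys") -> Path:
    """Write 1 (online) or 0 (offline) to the CPU hotplug control file.

    Hypothetical helper; returns the path that was written.
    """
    path = Path(sysfs_root) / "devices" / "system" / "cpu" / f"cpu{cpu}" / "online"
    path.write_text("1" if online else "0")
    return path


def set_io_scheduler(device: str, scheduler: str, sysfs_root: str = "/sys") -> Path:
    """Select the I/O scheduler for a block device such as "sda".

    Hypothetical helper; the kernel rejects scheduler names it does not offer.
    """
    path = Path(sysfs_root) / "block" / device / "queue" / "scheduler"
    path.write_text(scheduler)
    return path


# Example (as root), mirroring the steps described above:
#   set_cpu_online(40, False)        # take CPU40 offline
#   set_io_scheduler("sda", "noop")  # switch sda from cfq to noop
```

The sysfs_root argument exists only so the helpers can be exercised
against a scratch directory instead of the live /sys tree.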

Panics seem to happen either in the scheduler or in ext4 code (note that
we are running stress on SDA). According to Cavium eng this could be
due to a bad L2 cache or memory. While tailing /var/log/syslog and
/var/log/kern.log during the tests I did see messages like
this:

Jun 12 14:57:55 seuss ipmievd: Voltage sensor CPU_VTT_DDR02 Upper Non-critical going high Asserted (Reading 0.77 > Threshold 0.77 Volts)
Jun 12 14:57:56 seuss ipmievd: Voltage sensor CPU_VTT_DDR02 Upper Non-critical going high Deasserted (Reading 0.76 > Threshold 0.77 Volts)
Jun 12 14:57:57 seuss ipmievd: Voltage sensor CPU_VTT_DDR13 Upper Non-critical going high Deasserted (Reading 0.76 > Threshold 0.77 Volts)
Jun 12 14:57:58 seuss ipmievd: Voltage sensor CPU_VTT_DDR13 Upper Non-critical going high Asserted (Reading 0.77 > Threshold 0.77 Volts)
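
To see how often each sensor flaps during a stress run, the messages
above can be tallied with a short script. This is only a sketch: the
regular expression is written against the four lines quoted here
(assuming each event has been joined onto a single line), and the names
EVENT_RE, tally_events and SAMPLE are made up for illustration.

```python
import re
from collections import Counter

# Pattern derived from the ipmievd lines quoted above; other ipmievd
# message formats are not handled.
EVENT_RE = re.compile(
    r"ipmievd: Voltage sensor (?P<sensor>\S+) (?P<threshold>.+?) "
    r"going (?P<direction>high|low) (?P<state>Asserted|Deasserted) "
    r"\(Reading (?P<reading>[\d.]+) [<>] Threshold (?P<limit>[\d.]+) Volts\)"
)


def tally_events(lines):
    """Count (sensor, state) pairs over an iterable of syslog lines."""
    counts = Counter()
    for line in lines:
        match = EVENT_RE.search(line)
        if match:
            counts[(match["sensor"], match["state"])] += 1
    return counts


# The four events from this comment, one per line.
SAMPLE = [
    "Jun 12 14:57:55 seuss ipmievd: Voltage sensor CPU_VTT_DDR02 Upper Non-critical going high Asserted (Reading 0.77 > Threshold 0.77 Volts)",
    "Jun 12 14:57:56 seuss ipmievd: Voltage sensor CPU_VTT_DDR02 Upper Non-critical going high Deasserted (Reading 0.76 > Threshold 0.77 Volts)",
    "Jun 12 14:57:57 seuss ipmievd: Voltage sensor CPU_VTT_DDR13 Upper Non-critical going high Deasserted (Reading 0.76 > Threshold 0.77 Volts)",
    "Jun 12 14:57:58 seuss ipmievd: Voltage sensor CPU_VTT_DDR13 Upper Non-critical going high Asserted (Reading 0.77 > Threshold 0.77 Volts)",
]

for (sensor, state), n in sorted(tally_events(SAMPLE).items()):
    print(f"{sensor} {state}: {n}")
```

Feeding it the whole of /var/log/syslog from a test run, rather than
just SAMPLE, would show whether the voltage events cluster around the
panics.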

We have other CRB1S systems that function as expected, and the stress-ng
tests do not cause any panics on them. I am tempted to consider this a
hardware issue with this particular CRB1S.

** Changed in: linux (Ubuntu Bionic)
       Status: Incomplete => Won't Fix

** Changed in: linux (Ubuntu Artful)
       Status: Confirmed => Won't Fix

** Changed in: linux (Ubuntu)
       Status: Incomplete => Won't Fix

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1754053

Title:
  oops in set_next_entity / ipmi_msghandler

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1754053/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
