Public bug reported:
Jul 20 14:40:23 anonster kernel: [ 1716.692818] mlx5_core 0000:03:00.0:
assert_var[0] 0xffffffff
Jul 20 14:40:23 anonster kernel: [ 1716.698541] mlx5_core 0000:03:00.0:
assert_var[1] 0xffffffff
Jul 20 14:40:23 anonster kernel: [ 1716.704240] mlx5_core 0000:03:00.0:
assert_var[2] 0xffffffff
Jul 20 14:40:23 anonster kernel: [ 1716.709945] mlx5_core 0000:03:00.0:
assert_var[3] 0xffffffff
Jul 20 14:40:23 anonster kernel: [ 1716.715641] mlx5_core 0000:03:00.0:
assert_var[4] 0xffffffff
Jul 20 14:40:23 anonster kernel: [ 1716.721343] mlx5_core 0000:03:00.0:
assert_exit_ptr 0xffffffff
Jul 20 14:40:23 anonster kernel: [ 1716.727214] mlx5_core 0000:03:00.0:
assert_callra 0xffffffff
Jul 20 14:40:23 anonster kernel: [ 1716.732917] mlx5_core 0000:03:00.0: fw_ver
65535.65535.65535
Jul 20 14:40:23 anonster kernel: [ 1716.738617] mlx5_core 0000:03:00.0: hw_id
0xffffffff
Jul 20 14:40:23 anonster kernel: [ 1716.743620] mlx5_core 0000:03:00.0:
irisc_index 255
Jul 20 14:40:23 anonster kernel: [ 1716.748530] mlx5_core 0000:03:00.0: synd
0xff: unrecognized error
Jul 20 14:40:23 anonster kernel: [ 1716.754662] mlx5_core 0000:03:00.0:
ext_synd 0xffff
Jul 20 14:40:23 anonster kernel: [ 1716.759578] mlx5_core 0000:03:00.0: raw
fw_ver 0xffffffff
Jul 20 14:40:23 anonster kernel: [ 1716.765038] WARNING: CPU: 0 PID: 0 at
/build/linux-hwe-EPHQQp/linux-hwe-4.15.0/kernel/time/timer.c:898
mod_timer+0x3e4/0x400
Jul 20 14:40:23 anonster kernel: [ 1716.765039] Modules linked in: binfmt_misc
lkp_Ubuntu_4_15_0_142_146_generic_78(OEK) bonding nls_iso8859_1 xfs
edac_mce_amd ipmi_ssif kvm_amd hpilo kvm
i2c_piix4 irqbypass ipmi_si
Jul 20 14:40:23 anonster kernel: [ 1716.765051] mlx5_core 0000:03:00.0:
health_care:194:(pid 29045): handling bad device here
Jul 20 14:40:23 anonster kernel: [ 1716.765052] ipmi_devintf ipmi_msghandler
shpchp acpi_power_meter
Jul 20 14:40:23 anonster kernel: [ 1716.765057] mlx5_core 0000:03:00.0:
mlx5_handle_bad_state:152:(pid 29045): Expected to see disabled NIC but it is
has invalid value 3
Jul 20 14:40:23 anonster kernel: [ 1716.765058] k10temp mac_hid ib_iser
Jul 20 14:40:23 anonster kernel: [ 1716.765062] mlx5_core 0000:03:00.0:
mlx5_pci_err_detected was called
Jul 20 14:40:23 anonster kernel: [ 1716.765063] rdma_cm iw_cm ib_cm
Jul 20 14:40:23 anonster kernel: [ 1716.765067] mlx5_core 0000:03:00.0:
mlx5_enter_error_state:121:(pid 29045): start
Jul 20 14:40:23 anonster kernel: [ 1716.765067] ib_core iscsi_tcp libiscsi_tcp
libiscsi scsi_transport_iscsi autofs4 btrfs zstd_compress raid10 raid456
async_raid6_recov async_memcpy
async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear
bcache ses enclosure crct10dif_pclmul crc32_pclmul mgag200 ghash_clmulni_intel
pcbc ttm drm_kms_helper aesni_intel
mlx5_core syscopyarea sysfillrect igb sysimgblt aes_x86_64 fb_sys_fops
crypto_simd glue_helper mlxfw dca nvme cryptd drm devlink i2c_algo_bit smartpqi
nvme_core ptp scsi_transport_sas
pps_core wmi
Jul 20 14:40:23 anonster kernel: [ 1716.772598] CPU: 0 PID: 0 Comm: swapper/0
Tainted: G OE K 4.15.0-142-generic #146~16.04.1-Ubuntu
Jul 20 14:40:23 anonster kernel: [ 1716.772598] Hardware name: HPE ProLiant
DL325 Gen10 Plus/ProLiant DL325 Gen10 Plus, BIOS A43 05/11/2020
Jul 20 14:40:23 anonster kernel: [ 1716.772600] RIP: 0010:mod_timer+0x3e4/0x400
Jul 20 14:40:23 anonster kernel: [ 1716.772601] RSP: 0018:ffff91e55e603e30
EFLAGS: 00010093
Jul 20 14:40:23 anonster kernel: [ 1716.772603] RAX: 0000000100056792 RBX:
00000001000567c4 RCX: 000000010005678a
Jul 20 14:40:23 anonster kernel: [ 1716.772603] RDX: 000000010005678c RSI:
ffff91e55e603e48 RDI: ffff91e55e61a700
Jul 20 14:40:23 anonster kernel: [ 1716.772604] RBP: ffff91e55e603e80 R08:
ffff91e55e010800 R09: ffff91e55dc01ff0
Jul 20 14:40:23 anonster kernel: [ 1716.772605] R10: 0000000000000000 R11:
0000000000000040 R12: ffff91e54bb4d8d8
Jul 20 14:40:23 anonster kernel: [ 1716.772606] R13: ffff91e54bb4d8d8 R14:
ffff91e55e61a700 R15: ffff91e54bb4d8d8
Jul 20 14:40:23 anonster kernel: [ 1716.772607] FS: 0000000000000000(0000)
GS:ffff91e55e600000(0000) knlGS:0000000000000000
Jul 20 14:40:23 anonster kernel: [ 1716.772607] CS: 0010 DS: 0000 ES: 0000
CR0: 0000000080050033
Jul 20 14:40:23 anonster kernel: [ 1716.772608] CR2: 00007fd20bd2e000 CR3:
0000000816294000 CR4: 0000000000340ef0
Jul 20 14:40:23 anonster kernel: [ 1716.772609] Call Trace:
Jul 20 14:40:23 anonster kernel: [ 1716.772611] <IRQ>
Jul 20 14:40:23 anonster kernel: [ 1716.772617] ?
fbcon_add_cursor_timer+0xc0/0xc0
Jul 20 14:40:23 anonster kernel: [ 1716.772620] cursor_timer_handler+0x45/0x50
Jul 20 14:40:23 anonster kernel: [ 1716.772622] mlx5_core 0000:03:00.0:
mlx5_enter_error_state:128:(pid 29045): end
Jul 20 14:40:23 anonster kernel: [ 1716.779975] call_timer_fn+0x32/0x140
Jul 20 14:40:23 anonster kernel: [ 1716.779976] run_timer_softirq+0x1e9/0x430
Jul 20 14:40:23 anonster kernel: [ 1716.779978] ? ktime_get+0x3e/0xb0
Jul 20 14:40:23 anonster kernel: [ 1716.779981] ? lapic_next_event+0x20/0x30
Jul 20 14:40:23 anonster kernel: [ 1716.779985] __do_softirq+0xf5/0x2a8
Jul 20 14:40:23 anonster kernel: [ 1716.779988] irq_exit+0xca/0xd0
Jul 20 14:40:23 anonster kernel: [ 1716.779989]
smp_apic_timer_interrupt+0x79/0x150
Jul 20 14:40:23 anonster kernel: [ 1716.779990] apic_timer_interrupt+0x90/0xa0
Jul 20 14:40:23 anonster kernel: [ 1716.779991] </IRQ>
Jul 20 14:40:23 anonster kernel: [ 1716.779994] RIP:
0010:cpuidle_enter_state+0xa7/0x300
Jul 20 14:40:23 anonster kernel: [ 1716.779995] RSP: 0018:ffffffff9c803e08
EFLAGS: 00000246 ORIG_RAX: ffffffffffffff11
Jul 20 14:40:23 anonster kernel: [ 1716.779996] RAX: ffff91e55e621900 RBX:
0000000000000002 RCX: 000000000000001f
Jul 20 14:40:23 anonster kernel: [ 1716.779997] RDX: 0000000000000000 RSI:
0000000028133c6f RDI: 0000000000000000
Jul 20 14:40:23 anonster kernel: [ 1716.779997] RBP: ffffffff9c803e40 R08:
ffffffe48aae298f R09: 0000000000000008
Jul 20 14:40:23 anonster kernel: [ 1716.779998] R10: ffffffff9c803dd8 R11:
0000000000002c8b R12: 0000000000000002
Jul 20 14:40:23 anonster kernel: [ 1716.779998] R13: ffff91e54d043800 R14:
ffffffff9c981c98 R15: 0000018fb282ae03
Jul 20 14:40:23 anonster kernel: [ 1716.780000] ?
cpuidle_enter_state+0x96/0x300
Jul 20 14:40:23 anonster kernel: [ 1716.780002] cpuidle_enter+0x17/0x20
Jul 20 14:40:23 anonster kernel: [ 1716.780004] call_cpuidle+0x23/0x40
Jul 20 14:40:23 anonster kernel: [ 1716.780006] do_idle+0x197/0x200
Jul 20 14:40:23 anonster kernel: [ 1716.780007] cpu_startup_entry+0x73/0x80
Jul 20 14:40:23 anonster kernel: [ 1716.780010] rest_init+0xaa/0xb0
Jul 20 14:40:23 anonster kernel: [ 1716.780013] start_kernel+0x4fa/0x51e
Jul 20 14:40:23 anonster kernel: [ 1716.780015]
x86_64_start_reservations+0x24/0x26
Jul 20 14:40:23 anonster kernel: [ 1716.780016] x86_64_start_kernel+0x74/0x77
Jul 20 14:40:23 anonster kernel: [ 1716.780019] secondary_startup_64+0xa5/0xb0
Jul 20 14:40:23 anonster kernel: [ 1716.780020] Code: b1 fc ff ff 49 89 46 10
48 89 45 c0 e9 a4 fc ff ff 0f 0b 45 8b 7c 24 20 e9 5d fd ff ff 49 89 55 10 45
8b 7c 24 20 e9 4f fd ff ff <0f> 0b e9 a4 fc ff ff 49 89 46 10 e9 9b fc ff ff e8
97 f9 f7 ff
Jul 20 14:40:23 anonster kernel: [ 1716.780035] ---[ end trace 3e92c45954bacae0
]---
Jul 20 14:40:24 anonster kernel: [ 1717.204835] mlx5_core 0000:03:00.1:
assert_var[0] 0xffffffff
Jul 20 14:40:24 anonster kernel: [ 1717.210539] mlx5_core 0000:03:00.1:
assert_var[1] 0xffffffff
Jul 20 14:40:24 anonster kernel: [ 1717.216242] mlx5_core 0000:03:00.1:
assert_var[2] 0xffffffff
Jul 20 14:40:24 anonster kernel: [ 1717.221940] mlx5_core 0000:03:00.1:
assert_var[3] 0xffffffff
Jul 20 14:40:24 anonster kernel: [ 1717.227645] mlx5_core 0000:03:00.1:
assert_var[4] 0xffffffff
Jul 20 14:40:24 anonster kernel: [ 1717.233342] mlx5_core 0000:03:00.1:
assert_exit_ptr 0xffffffff
Jul 20 14:40:24 anonster kernel: [ 1717.239218] mlx5_core 0000:03:00.1:
assert_callra 0xffffffff
Jul 20 14:40:24 anonster kernel: [ 1717.244917] mlx5_core 0000:03:00.1: fw_ver
65535.65535.65535
Jul 20 14:40:24 anonster kernel: [ 1717.250617] mlx5_core 0000:03:00.1: hw_id
0xffffffff
Jul 20 14:40:24 anonster kernel: [ 1717.255615] mlx5_core 0000:03:00.1:
irisc_index 255
Jul 20 14:40:24 anonster kernel: [ 1717.260533] mlx5_core 0000:03:00.1: synd
0xff: unrecognized error
Jul 20 14:40:24 anonster kernel: [ 1717.266666] mlx5_core 0000:03:00.1:
ext_synd 0xffff
Jul 20 14:40:24 anonster kernel: [ 1717.271584] mlx5_core 0000:03:00.1: raw
fw_ver 0xffffffff
Jul 20 14:40:24 anonster kernel: [ 1717.277053] mlx5_core 0000:03:00.1:
health_care:194:(pid 16512): handling bad device here
Jul 20 14:40:24 anonster kernel: [ 1717.277057] mlx5_core 0000:03:00.1:
mlx5_handle_bad_state:152:(pid 16512): Expected to see disabled NIC but it is
has invalid value 3
Jul 20 14:40:24 anonster kernel: [ 1717.277060] mlx5_core 0000:03:00.1:
mlx5_pci_err_detected was called
Jul 20 14:40:24 anonster kernel: [ 1717.277063] mlx5_core 0000:03:00.1:
mlx5_enter_error_state:121:(pid 16512): start
Jul 20 14:40:24 anonster kernel: [ 1717.284625] mlx5_core 0000:03:00.1:
mlx5_enter_error_state:128:(pid 16512): end
Jul 20 14:40:24 anonster kernel: [ 1717.300353] mlx5_core 0000:03:00.0:
mlx5_wait_for_vf_pages:576:(pid 29045): Skipping wait for vf pages stage
Jul 20 14:40:24 anonster kernel: [ 1717.321544] mlx5_core 0000:03:00.0 ens2f0:
mlx5e_get_link_ksettings: query port ptys failed: -5
Jul 20 14:40:24 anonster kernel: [ 1717.330315] mlx5_core 0000:03:00.0 ens2f0:
speed changed to 0 for port ens2f0
Jul 20 14:40:24 anonster kernel: [ 1717.337814] mlx5_core 0000:03:00.1 ens2f1:
mlx5e_get_link_ksettings: query port ptys failed: -5
Jul 20 14:40:24 anonster kernel: [ 1717.346576] mlx5_core 0000:03:00.1 ens2f1:
speed changed to 0 for port ens2f1
Jul 20 14:40:24 anonster kernel: [ 1717.354089] mlx5_core 0000:03:00.1:
mlx5_wait_for_vf_pages:576:(pid 16512): Skipping wait for vf pages stage
Jul 20 14:40:24 anonster kernel: [ 1717.360907] bond0: link status definitely
down for interface ens2f0, disabling it
Jul 20 14:40:24 anonster kernel: [ 1717.360946] bond0: link status definitely
down for interface ens2f1, disabling it
Jul 20 14:41:25 anonster kernel: [ 1778.646176] mlx5_core 0000:03:00.0: health
recovery flow aborted since the nic state is invalid
Jul 20 14:41:25 anonster kernel: [ 1778.646180] mlx5_core 0000:03:00.1: health
recovery flow aborted since the nic state is invalid
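
For triage, a minimal sketch (not part of the original report) that scans a log like the one above and lists the PCI functions whose dumped health-buffer hex fields are entirely 0xff bytes. The function name `all_ones_devices` and the field list in the regular expression are illustrative choices taken from the lines shown here, not an mlx5 API. Reading all-ones values as a sign that the device stopped answering register reads is an assumption; the log itself does not confirm the cause.

```python
import re

# One mlx5 health-buffer field after unwrapping, for example:
#   "mlx5_core 0000:03:00.0: hw_id 0xffffffff"
# The field names are the hex-valued ones that appear in this report.
FIELD_RE = re.compile(
    r"mlx5_core (?P<dev>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]): "
    r"(?P<name>assert_var\[\d\]|assert_exit_ptr|assert_callra|hw_id|ext_synd|raw fw_ver) "
    r"(?P<value>0x[0-9a-f]+)"
)


def all_ones_devices(log_text):
    """Return sorted PCI addresses whose dumped hex fields are all 'f' digits."""
    # The mail wraps long log records, so collapse all whitespace first.
    joined = " ".join(log_text.split())
    fields = {}
    for match in FIELD_RE.finditer(joined):
        fields.setdefault(match.group("dev"), []).append(match.group("value"))
    # Assumption: a device is suspicious only if every captured field is all ones.
    return sorted(
        dev
        for dev, values in fields.items()
        if all(set(value[2:]) == {"f"} for value in values)
    )
```

Passing the kernel log text from this report to `all_ones_devices` would be expected to flag both functions of the adapter, since every hex field dumped for them is all ones.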
== ApportVersion =================================
2.20.1-0ubuntu2.30
== Architecture =================================
amd64
== Date =================================
Tue Jul 20 16:52:44 2021
== Dependencies =================================
adduser 3.113+nmu3ubuntu4
apt 1.2.35
apt-utils 1.2.35
busybox-initramfs 1:1.22.0-15ubuntu1.4
coreutils 8.25-2ubuntu3~16.04
cpio 2.11+dfsg-5ubuntu1.1
debconf 1.5.58ubuntu2
debconf-i18n 1.5.58ubuntu2
debianutils 4.7
dpkg 1.18.4ubuntu1.7+ppa1 [origin: LP-PPA-canonical-is-sa-launchpad]
e2fslibs 1.42.13-1ubuntu1.2
e2fsprogs 1.42.13-1ubuntu1.2
gcc-5-base 5.4.0-6ubuntu1~16.04.12
gcc-6-base 6.0.1-0ubuntu1
gnupg 1.4.20-1ubuntu3.3
gpgv 1.4.20-1ubuntu3.3
init-system-helpers 1.29ubuntu4
initramfs-tools 0.122ubuntu8.17
initramfs-tools-bin 0.122ubuntu8.17
initramfs-tools-core 0.122ubuntu8.17
initscripts 2.88dsf-59.3ubuntu2
insserv 1.14.0-5ubuntu3
klibc-utils 2.0.4-8ubuntu1.16.04.4
kmod 22-1ubuntu5.2
libacl1 2.2.52-3
libapt-inst2.0 1.2.35
libapt-pkg5.0 1.2.35
libattr1 1:2.4.47-2
libaudit-common 1:2.4.5-1ubuntu2.1
libaudit1 1:2.4.5-1ubuntu2.1
libblkid1 2.27.1-6ubuntu3.10
libbz2-1.0 1.0.6-8ubuntu0.2
libc6 2.23-0ubuntu11.3
libcomerr2 1.42.13-1ubuntu1.2
libdb5.3 5.3.28-11ubuntu0.2
libfdisk1 2.27.1-6ubuntu3.10
libgcc1 1:6.0.1-0ubuntu1
libgcrypt20 1.6.5-2ubuntu0.6
libgpg-error0 1.21-2ubuntu1
libgpm2 1.20.4-6.1
libklibc 2.0.4-8ubuntu1.16.04.4
libkmod2 22-1ubuntu5.2
liblocale-gettext-perl 1.07-1build1
liblz4-1 0.0~r131-2ubuntu2
liblzma5 5.1.1alpha+20120614-2ubuntu2
libmount1 2.27.1-6ubuntu3.10
libncurses5 6.0+20160213-1ubuntu1
libncursesw5 6.0+20160213-1ubuntu1
libpam-modules 1.1.8-3.2ubuntu2.3
libpam-modules-bin 1.1.8-3.2ubuntu2.3
libpam0g 1.1.8-3.2ubuntu2.3
libpcre3 2:8.38-3.1
libprocps4 2:3.3.10-4ubuntu2.5
libreadline6 6.3-8ubuntu2
libselinux1 2.4-3build2
libsemanage-common 2.3-1build3
libsemanage1 2.3-1build3
libsepol1 2.4-2
libsmartcols1 2.27.1-6ubuntu3.10
libss2 1.42.13-1ubuntu1.2
libstdc++6 5.4.0-6ubuntu1~16.04.12
libsystemd0 229-4ubuntu21.31
libtext-charwidth-perl 0.04-7build5
libtext-iconv-perl 1.7-5build4
libtext-wrapi18n-perl 0.06-7.1
libtinfo5 6.0+20160213-1ubuntu1
libudev1 229-4ubuntu21.31
libusb-0.1-4 2:0.1.12-28
libustr-1.0-1 1.0.4-5
libuuid1 2.27.1-6ubuntu3.10
libzstd1 1.3.1+dfsg-1~ubuntu0.16.04.1
linux-base 4.5ubuntu1.2~16.04.1
linux-modules-4.15.0-142-generic 4.15.0-142.146~16.04.1
lsb-base 9.20160110ubuntu0.2
mount 2.27.1-6ubuntu3.10
multiarch-support 2.23-0ubuntu11.3
passwd 1:4.2-3.1ubuntu5.4
perl-base 5.22.1-9ubuntu0.9
procps 2:3.3.10-4ubuntu2.5
psmisc 22.21-2.1ubuntu0.1
readline-common 6.3-8ubuntu2
sensible-utils 0.0.9ubuntu0.16.04.1
sysv-rc 2.88dsf-59.3ubuntu2
sysvinit-utils 2.88dsf-59.3ubuntu2
tar 1.28-2.1ubuntu0.2
ubuntu-keyring 2012.05.19.1
udev 229-4ubuntu21.31
util-linux 2.27.1-6ubuntu3.10
uuid-runtime 2.27.1-6ubuntu3.10
zlib1g 1:1.2.8.dfsg-2ubuntu4.3
== DistroRelease =================================
Ubuntu 16.04
== NonfreeKernelModules =================================
lkp_Ubuntu_4_15_0_142_146_generic_78
== Package =================================
linux-image-4.15.0-142-generic 4.15.0-142.146~16.04.1
== PackageArchitecture =================================
amd64
== ProblemType =================================
Bug
== ProcCpuinfoMinimal =================================
processor : 15
vendor_id : AuthenticAMD
cpu family : 23
model : 49
model name : AMD EPYC 7262 8-Core Processor
stepping : 0
microcode : 0x8301038
cpu MHz : 1795.684
cache size : 512 KB
physical id : 0
siblings : 16
core id : 28
cpu cores : 8
apicid : 57
initial apicid : 57
fpu : yes
fpu_exception : yes
cpuid level : 16
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov
pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb
rdtscp lm constant_tsc rep_good nopl xtopology nonstop_tsc cpuid extd_apicid
aperfmperf pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes
xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a
misalignsse 3dnowprefetch osvw ibs skinit wdt tce topoext perfctr_core
perfctr_nb bpext perfctr_llc mwaitx cpb cat_l3 cdp_l3 hw_pstate ssbd ibrs ibpb
stibp vmmcall fsgsbase bmi1 avx2 smep bmi2 cqm rdt_a rdseed adx smap clflushopt
clwb sha_ni xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total
cqm_mbm_local clzero irperf xsaveerptr arat npt lbrv svm_lock nrip_save
tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic
v_vmsave_vmload vgif umip rdpid overflow_recov succor smca
bugs : sysret_ss_attrs spectre_v1 spectre_v2 spec_store_bypass
bogomips : 6387.44
TLB size : 3072 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 48 bits physical, 48 bits virtual
power management: ts ttp tm hwpstate cpb eff_freq_ro [13] [14]
== ProcEnviron =================================
TERM=xterm-256color
PATH=(custom, no user)
XDG_RUNTIME_DIR=<set>
LANG=en_US.UTF-8
SHELL=/bin/bash
== ProcVersionSignature =================================
Ubuntu 4.15.0-142.146~16.04.1-generic 4.15.18
== SourcePackage =================================
linux-signed-hwe
== Tags =================================
xenial third-party-packages
== Uname =================================
Linux 4.15.0-142-generic x86_64
== UpgradeStatus =================================
No upgrade log present (probably fresh install)
** Affects: linux (Ubuntu)
Importance: Undecided
Status: New
https://bugs.launchpad.net/bugs/1936958
Title:
mlx5_core crash, taking down a bond
Status in linux package in Ubuntu:
New