Public bug reported:
When running the ADT tests on a POWER box (ppc64el), the bpf tests crash the
kernel as follows:
[ 2745.079592] BUG: Unable to handle kernel instruction fetch (NULL pointer?)
[ 2745.079808] Faulting instruction address: 0x00000000
[ 2745.079824] Oops: Kernel access of bad area, sig: 11 [#1]
[ 2745.079993] LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA PowerNV
[ 2745.080011] Modules linked in: af_packet_diag tcp_diag udp_diag raw_diag
inet_diag binfmt_misc dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua joydev
input_leds mac_hid ofpart cmdlinepart powernv_flash mtd ibmpowernv at24
uio_pdrv_genirq uio ipmi_powernv ipmi_devintf ipmi_msghandler opal_prd
powernv_rng vmx_crypto sch_fq_codel ip_tables x_tables autofs4 btrfs
zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor
async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear hid_generic usbhid
hid ast drm_vram_helper i2c_algo_bit ttm drm_kms_helper syscopyarea sysfillrect
sysimgblt fb_sys_fops crct10dif_vpmsum crc32c_vpmsum drm tg3 ahci libahci
drm_panel_orientation_quirks [last unloaded: notifier_error_inject]
[ 2745.080195] CPU: 0 PID: 1111366 Comm: reuseport_bpf_c Not tainted
5.4.0-7-generic #8
[ 2745.080214] NIP: 0000000000000000 LR: c000000000ce8710 CTR: 0000000000000000
[ 2745.080233] REGS: c0000007ff6eb550 TRAP: 0400 Not tainted
(5.4.0-7-generic)
[ 2745.080250] MSR: 9000000040009033 <SF,HV,EE,ME,IR,DR,RI,LE> CR: 24002282
XER: 20000000
[ 2745.080272] CFAR: c00000000000de44 IRQMASK: 0
[ 2745.080272] GPR00: c000000000d67c9c c0000007ff6eb7e0 c000000001a5bf00
c0000004258e10e0
[ 2745.080272] GPR04: c008000002830038 c0000004258e10e0 0000000000000028
000000000000e3c2
[ 2745.080272] GPR08: 0000000000000000 0000000000000000 0000000000000000
0000000000000000
[ 2745.080272] GPR12: 0000000000000000 c000000001cf0000 0000000000000000
0000000000000001
[ 2745.080272] GPR16: 00000000000022b8 000000000100007f 000000000000e3c2
000000000100007f
[ 2745.080272] GPR20: c00000000198c100 0000000000000000 0000000000000000
00000000000022b8
[ 2745.080272] GPR24: 0000000000000000 0000000000000028 0000000000000080
000000000100007f
[ 2745.080272] GPR28: c008000002830000 0000000018ed5e01 c0000004258e10e0
c00000075f0ff000
[ 2745.080409] NIP [0000000000000000] 0x0
[ 2745.080423] LR [c000000000ce8710] reuseport_select_sock+0x100/0x400
[ 2745.080439] Call Trace:
[ 2745.080448] [c0000007ff6eb7e0] [c0000007ff6eb8a0] 0xc0000007ff6eb8a0
(unreliable)
[ 2745.080469] [c0000007ff6eb880] [c000000000d67c9c]
inet_lhash2_lookup+0x1ec/0x220
[ 2745.080490] [c0000007ff6eb900] [c000000000d6849c]
__inet_lookup_listener+0x1ec/0x1f0
[ 2745.080509] [c0000007ff6eb9d0] [c000000000d96608] tcp_v4_rcv+0x6e8/0xe70
[ 2745.080527] [c0000007ff6ebb00] [c000000000d5a480]
ip_protocol_deliver_rcu+0x60/0x2b0
[ 2745.080547] [c0000007ff6ebb50] [c000000000d5a740]
ip_local_deliver_finish+0x70/0x90
[ 2745.080566] [c0000007ff6ebb70] [c000000000d5a7ec] ip_local_deliver+0x8c/0x140
[ 2745.080585] [c0000007ff6ebbe0] [c000000000d59aec] ip_rcv_finish+0xbc/0xf0
[ 2745.080602] [c0000007ff6ebc20] [c000000000d5a9a0] ip_rcv+0x100/0x110
[ 2745.080619] [c0000007ff6ebca0] [c000000000cab220]
__netif_receive_skb_one_core+0x70/0xb0
[ 2745.080638] [c0000007ff6ebce0] [c000000000cac4f0] process_backlog+0xd0/0x230
[ 2745.080657] [c0000007ff6ebd50] [c000000000cadc68] net_rx_action+0x1e8/0x520
[ 2745.080674] [c0000007ff6ebe70] [c000000000ee2a7c] __do_softirq+0x15c/0x3b8
[ 2745.080692] [c0000007ff6ebf90] [c000000000030678] call_do_softirq+0x14/0x24
[ 2745.080709] [c00000070656f7c0] [c00000000001bf58]
do_softirq_own_stack+0x38/0x50
[ 2745.080729] [c00000070656f7e0] [c000000000143d60] do_softirq.part.0+0x80/0xb0
[ 2745.080914] [c00000070656f810] [c000000000143e54]
__local_bh_enable_ip+0xc4/0xf0
[ 2745.080933] [c00000070656f830] [c000000000d5f8fc]
ip_finish_output2+0x1fc/0x740
[ 2745.080953] [c00000070656f8d0] [c000000000d61fe4] ip_output+0xd4/0x190
[ 2745.080971] [c00000070656f960] [c000000000d61444] ip_local_out+0x64/0x90
[ 2745.080988] [c00000070656f9a0] [c000000000d61838] __ip_queue_xmit+0x168/0x4d0
[ 2745.081007] [c00000070656fa30] [c000000000d90a3c] ip_queue_xmit+0x1c/0x30
[ 2745.081024] [c00000070656fa50] [c000000000d887e4]
__tcp_transmit_skb+0x574/0xda0
[ 2745.081044] [c00000070656fb00] [c000000000d89a88] tcp_connect+0x4b8/0x600
[ 2745.081060] [c00000070656fbb0] [c000000000d93148] tcp_v4_connect+0x478/0x5b0
[ 2745.082755] [c00000070656fc40] [c000000000db876c]
__inet_stream_connect+0x12c/0x4c0
[ 2745.084563] [c00000070656fcf0] [c000000000db8b5c]
inet_stream_connect+0x5c/0x90
[ 2745.085528] [c00000070656fd30] [c000000000c75dec] __sys_connect+0x11c/0x160
[ 2745.086424] [c00000070656fe00] [c000000000c75e58] sys_connect+0x28/0x40
[ 2745.087343] [c00000070656fe20] [c00000000000b278] system_call+0x5c/0x68
[ 2745.089157] Instruction dump:
[ 2745.089169] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
XXXXXXXX
[ 2745.090048] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
XXXXXXXX
[ 2745.096394] ---[ end trace d347ca85a257c66f ]---
[ 2745.208020]
[ 2746.208219] Kernel panic - not syncing: Aiee, killing interrupt handler!
[ 2746.316857] Rebooting in 10 seconds..
[ 2796.226294116,5] OPAL: Reboot request...
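In the oops above, NIP is zero while LR points into reuseport_select_sock, which suggests a call through a NULL function pointer made from that function (or from something inlined into it). As a rough aid for triaging logs like this one, here is a minimal Python sketch that pulls the symbol+offset frames out of an oops and checks for a zero NIP. The helper names (extract_frames, is_null_call) are made up for illustration and are not part of any kernel tooling; the sample is a short excerpt of the trace above.

```python
import re

# Matches "symbol+0xoff/0xsize" tokens as printed in the call trace.
# Works on the whole text, so frames wrapped onto a second line still match.
FRAME_RE = re.compile(r"\b([A-Za-z_][\w.]*)\+0x([0-9a-f]+)/0x([0-9a-f]+)")


def extract_frames(oops_text):
    """Return (symbol, offset, size) tuples in the order they appear."""
    return [
        (m.group(1), int(m.group(2), 16), int(m.group(3), 16))
        for m in FRAME_RE.finditer(oops_text)
    ]


def is_null_call(oops_text):
    """True if the NIP register is all zeroes (instruction fetch from 0)."""
    m = re.search(r"NIP:\s+([0-9a-f]{16})", oops_text)
    return bool(m) and int(m.group(1), 16) == 0


# Hypothetical usage on an excerpt of the log in this report.
sample = """\
[ 2745.080214] NIP: 0000000000000000 LR: c000000000ce8710 CTR: 0000000000000000
[ 2745.080423] LR [c000000000ce8710] reuseport_select_sock+0x100/0x400
[ 2745.080469] [c0000007ff6eb880] [c000000000d67c9c]
inet_lhash2_lookup+0x1ec/0x220
"""
frames = extract_frames(sample)
print(is_null_call(sample), frames[0])
```

The first frame returned is the LR entry, i.e. the return address of the call that faulted, which is usually the place to start reading the disassembly.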
The final ADT test output recorded was:
17:03:13 DEBUG| [stdout] # ---- IPv6 TCP ----
17:03:13 DEBUG| [stdout] # Testing EBPF mod 10...
17:03:13 DEBUG| [stdout] # Socket 0: 0
17:03:13 DEBUG| [stdout] # Socket 1: 1
17:03:13 DEBUG| [stdout] # Socket 2: 2
17:03:13 DEBUG| [stdout] # Socket 3: 3
... etc ...
17:03:13 DEBUG| [stdout] # Socket 4: 4
17:03:13 DEBUG| [stdout] # Socket 5: 5
17:03:13 DEBUG| [stdout] # Socket 9: 19
17:03:13 DEBUG| [stdout] # Reprograming, testing mod 5...
17:03:13 DEBUG| [stdout] # Socket 0: 0
...
17:03:13 DEBUG| [stdout] # Socket 3: 18
17:03:13 DEBUG| [stdout] # Socket 4: 19
...
17:03:13 DEBUG| [stdout] # Testing CBPF mod 10...
17:03:13 DEBUG| [stdout] # Socket 0: 0
...
17:03:13 DEBUG| [stdout] # Reprograming, testing mod 5...
17:03:13 DEBUG| [stdout] # Socket 0: 0
17:03:13 DEBUG| [stdout] # Socket 1: 1
...
17:03:13 DEBUG| [stdout] # Socket 4: 19
17:03:13 DEBUG| [stdout] # Testing too many filters...
17:03:13 DEBUG| [stdout] # Testing filters on non-SO_REUSEPORT socket...
17:03:13 DEBUG| [stdout] # ---- IPv6 TCP w/ mapped IPv4 ----
17:03:13 DEBUG| [stdout] # Testing EBPF mod 10...
17:03:13 DEBUG| [stdout] # Socket 0: 0
17:03:13 DEBUG| [stdout] # Socket 1: 1
...
17:03:13 DEBUG| [stdout] # Reprograming, testing mod 5...
17:03:13 DEBUG| [stdout] # Socket 0: 0
17:03:13 DEBUG| [stdout] # Socket 1: 1
...
17:03:13 DEBUG| [stdout] # Testing CBPF mod 10...
17:03:13 DEBUG| [stdout] # Socket 0: 0
17:03:13 DEBUG| [stdout] # Socket 1: 1
...
17:03:13 DEBUG| [stdout] # Reprograming, testing mod 5...
17:03:13 DEBUG| [stdout] # Socket 0: 0
17:03:13 DEBUG| [stdout] # Socket 1: 1
...
17:03:13 DEBUG| [stdout] # Testing filter add without bind...
17:03:13 DEBUG| [stdout] # SUCCESS
17:03:13 DEBUG| [stdout] ok 1 selftests: net: reuseport_bpf
17:03:13 DEBUG| [stdout] # selftests: net: reuseport_bpf_cpu
17:03:13 DEBUG| [stdout] # ---- IPv4 UDP ----
17:03:13 DEBUG| [stdout] # send cpu 0, receive socket 0
17:03:13 DEBUG| [stdout] # send cpu 1, receive socket 1
...
17:03:13 DEBUG| [stdout] # send cpu 125, receive socket 125
17:03:13 DEBUG| [stdout] # send cpu 127, receive socket 127
17:03:13 DEBUG| [stdout] # ---- IPv4 TCP ----
[ end of output as the machine panics ]
So the panic occurred at or shortly after the start of the IPv4 TCP test. I'll
re-run this with the IPMI tool attached to the console to see how far the test
got before the kernel panicked.
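For anyone comparing several captured ADT logs, the "how far did it get" question can be answered mechanically. The following is a small Python sketch, assuming the log keeps the "DEBUG| [stdout] # ..." line format shown above; last_progress is a hypothetical helper name, not an existing ADT tool.

```python
import re

# Only lines of the form 'DEBUG| [stdout] # <text>' carry selftest progress.
LINE_RE = re.compile(r"DEBUG\| \[stdout\] # (.*)$")


def last_progress(log_text):
    """Return (last selftest started, last '----' section header seen)."""
    test = section = None
    for line in log_text.splitlines():
        m = LINE_RE.search(line)
        if not m:
            continue
        body = m.group(1)
        if body.startswith("selftests: "):
            # A new selftest begins; forget the previous test's section.
            test, section = body[len("selftests: "):], None
        elif body.startswith("----"):
            section = body.strip("- ")
    return test, section


# Hypothetical usage on the tail of the output quoted in this report.
sample = """\
17:03:13 DEBUG| [stdout] ok 1 selftests: net: reuseport_bpf
17:03:13 DEBUG| [stdout] # selftests: net: reuseport_bpf_cpu
17:03:13 DEBUG| [stdout] # ---- IPv4 UDP ----
17:03:13 DEBUG| [stdout] # send cpu 0, receive socket 0
17:03:13 DEBUG| [stdout] # ---- IPv4 TCP ----
"""
print(last_progress(sample))
```

On the excerpt above this reports the reuseport_bpf_cpu selftest in its IPv4 TCP section, which is consistent with the oops: the faulting task is "reuseport_bpf_c" (the kernel truncates task names to 15 characters) and the call trace goes through tcp_v4_connect.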
** Affects: linux (Ubuntu)
Importance: High
Assignee: Colin Ian King (colin-king)
Status: In Progress
** Changed in: linux (Ubuntu)
Assignee: (unassigned) => Colin Ian King (colin-king)
** Changed in: linux (Ubuntu)
Status: New => In Progress
** Changed in: linux (Ubuntu)
Importance: Undecided => High
--
https://bugs.launchpad.net/bugs/1855151
Title:
adt bpf tests crash 5.4.0-7 on ppc64el on power box
Status in linux package in Ubuntu:
In Progress