------- Comment From kla...@br.ibm.com 2018-04-24 10:42 EDT-------
(In reply to comment #166)
> this point, I don't see a connection to KVM or even Ubuntu vs. Pegas. This
> appears to be something that will happen in any distro that has the right
> vintage of qla2xxx driver. Not sure why we think this did not happen on
> Ubuntu -13 kernel - I see nothing in the diffs of qla2xxx that would affect
> this.

What is puzzling is that we had multiple reproductions (with guests,
without guests) prior to Dwip's patch.

With Dwip's patch, which arguably does not cover all scenarios, we
didn't get any repro.

Without Dwip's patch, but with 165988 reverted, so far it looks clean
as well. I couldn't find a definitive link between reverting 165988 and
running clean for this testcase, other than speculating that on this
system "cpu_present_mask" differs from "cpu_possible_mask", or that
there are other changes in how IRQs are distributed that I'm not
seeing.

I'd feel more comfortable having another system that reproduces the
original issue, where we can debug/experiment better, while allowing
boslcp3 to continue the test run with the current kernel (it is the
closest thing we have to what will appear in GA anyway, I think).

------- Comment From dnban...@us.ibm.com 2018-04-24 10:50 EDT-------
When we first started looking at this bug, the captured issue seemed
slightly different - a case where an skb allocation was failing due to
what appeared to be a corrupted slab cache.

Thereafter we got a series of kworker-thread-related failures, which
have been diagnosed above. However, while looking at those crashes
I chanced upon an instance that seems more related to the corrupted
slab cache.

I decided to dig a bit to see whether those instances are clearly
correlated, or whether there are other things we need to be aware of.

There is a crash from sometime on Friday, April 20 (likely on the
wrk_dbg kernel that just had the debug-object configuration turned on).

The stack trace was a little different, and it worked with the stock
kernel ...

KERNEL: /usr/lib/debug/boot/vmlinux-4.15.0-15-generic
DUMPFILE: dump.201804201534  [PARTIAL DUMP]
CPUS: 160
DATE: Fri Apr 20 15:33:10 2018
UPTIME: 00:03:51
LOAD AVERAGE: 3.22, 0.76, 0.25
TASKS: 1791
NODENAME: boslcp3
RELEASE: 4.15.0-15-generic
VERSION: #16-Ubuntu SMP Wed Apr 4 13:57:51 UTC 2018
MACHINE: ppc64le  (2134 Mhz)
MEMORY: 128 GB
PANIC: "Unable to handle kernel paging request for data at address 0x26eed6a1145b0a2a"
PID: 5874
COMMAND: "systemd-udevd"
TASK: c000000fe6482e00  [THREAD_INFO: c000000fe9394000]
CPU: 80
STATE: TASK_RUNNING (PANIC)

crash> bt
PID: 5874   TASK: c000000fe6482e00  CPU: 80  COMMAND: "systemd-udevd"
#0 [c000000fe93975a0] crash_kexec at c0000000001e3950
#1 [c000000fe93975e0] oops_end at c000000000025888
#2 [c000000fe9397660] bad_page_fault at c00000000006a900
#3 [c000000fe93976d0] slb_miss_bad_addr at c000000000027764
#4 [c000000fe93976f0] bad_addr_slb at c000000000008a1c
Data SLB Access [380] exception frame:
R0:  c000000000389874    R1:  c000000fe93979e0    R2:  c0000000016eb400
R3:  0000000000000001    R4:  00e608fe511d18e9    R5:  00000000000003cc
^^^^^^BAD^^^^^^^^^^^^
R6:  0000000000000001    R7:  00000000000003cb    R8:  e608fe511d1854c2
^^^^^^BAD^^^^^^^^^^^^
R9:  0000000000000000    R10: 0000000000000000    R11: 00000000000000f1
R12: 0000000000002000    R13: c000000007a57000    R14: c000000fdc28f080
R15: 0000000000000000    R16: 0000000000000001    R17: c000000fae88d800
R18: 0000000000000002    R19: 0000000000000000    R20: 0000000000000000
R21: 0000000000000000    R22: 0000000000000000    R23: c000000001621200
R24: e6eef6af4c054c2b    R25: c000200e585e4601    R26: 26eed6a1145b0a2a
^^^^^^^BAD^^^^^^^^^^^
^^^^^^^ptr^^^^^^^^^^^    ^^^^freelist_ptr^^^^^
R27: c000000000b32514    R28: c000000ff901ee00    R29: 00000000014000c0
^^^^kmem_cache^^^^^^     ^^^^gfpflags^^^^^^^^^
R30: c000200e585e4601    R31: c000000ff901ee00
^^^object^^^^^^^^^^^
NIP: c0000000003899a0    MSR: 9000000000009033    OR3: c000000000016e1c
CTR: 0000000000000000    LR:  c00000000038998c    XER: 0000000000000000
CCR: 0000000028002808    MQ:  0000000000000001    DAR: 26eed6a1145b0a2a
DSISR: 0000000000000000     Syscall Result: 0000000000000000
#5 [c000000fe93979e0] kmem_cache_alloc at c0000000003899a0
[Link Register] [c000000fe93979e0] kmem_cache_alloc at c00000000038998c (unreliable)
#6 [c000000fe9397a40] skb_clone at c000000000b32514
#7 [c000000fe9397a70] netlink_broadcast_filtered at c000000000ba84a0
#8 [c000000fe9397b30] netlink_sendmsg at c000000000babae4
#9 [c000000fe9397bc0] sock_sendmsg at c000000000b1ec64
#10 [c000000fe9397bf0] ___sys_sendmsg at c000000000b20abc
#11 [c000000fe9397d90] __sys_sendmsg at c000000000b221ec
#12 [c000000fe9397e30] system_call at c00000000000b184
System Call [c00] exception frame:
R0:  0000000000000155    R1:  00007fffe05d2e20    R2:  00007b8f72337f00
R3:  000000000000000e    R4:  00007fffe05d2ec8    R5:  0000000000000000
R6:  0000000000000000    R7:  00000000000000be    R8:  000000005bd10000
R9:  0000000000000000    R10: 0000000000000000    R11: 0000000000000000
R12: 0000000000000000    R13: 00007b8f723cdaa0    R14: 00000f943ed71fd0
R15: 00000f943ed71cc0    R16: 00000f943ed71fa0    R17: 00000f942bd402c4
R18: 0000000000000003    R19: 0000000000000004    R20: 00007fffe05d3170
R21: 00007fffe05d3688    R22: 00007fffe05d3a88    R23: 0000000000000000
R24: 00000f943ed7a4b0    R25: 0000000000000000    R26: 000000005bd1e995
R27: 00000f943ed7a4b0    R28: 00000f943ed79b00    R29: 00000000000000c9
R30: 00000f943ed71cc0    R31: 0000000000000000
NIP: 00007b8f7230a940    MSR: 900000000000f033    OR3: 000000000000000e
CTR: 0000000000000000    LR:  00000f942bcc4650    XER: 0000000000000000
CCR: 0000000048002402    MQ:  0000000000000001    DAR: 00000f943f026c78
DSISR: 000000000a000000     Syscall Result: 0000000000000000

Here is the relevant part of kmem_cache_alloc:

0xc0000000003896f4 <kmem_cache_alloc+0x34>:     mr      r29,r4
/build/linux-QzAGR9/linux-4.15.0/mm/slab.h: 414
0xc0000000003896f8 <kmem_cache_alloc+0x38>:     addis   r9,r2,4
0xc0000000003896fc <kmem_cache_alloc+0x3c>:     addi    r9,r9,-32684
/build/linux-QzAGR9/linux-4.15.0/mm/slub.c: 2737
0xc000000000389700 <kmem_cache_alloc+0x40>:     mr      r28,r3 <--- kmem_cache
/build/linux-QzAGR9/linux-4.15.0/mm/slab.h: 414
0xc000000000389704 <kmem_cache_alloc+0x44>:     lwz     r31,0(r9)
0xc000000000389708 <kmem_cache_alloc+0x48>:     and     r31,r31,r4
/build/linux-QzAGR9/linux-4.15.0/mm/slab.h: 419
0xc00000000038970c <kmem_cache_alloc+0x4c>:     andis.  r9,r31,64
0xc000000000389710 <kmem_cache_alloc+0x50>:     bne     0xc000000000389870 <kmem_cache_alloc+0x1b0>
/build/linux-QzAGR9/linux-4.15.0/arch/powerpc/include/asm/jump_label.h: 24
0xc000000000389714 <kmem_cache_alloc+0x54>:     nop
/build/linux-QzAGR9/linux-4.15.0/mm/slab.h: 428
0xc000000000389718 <kmem_cache_alloc+0x58>:     mr      r31,r28
/build/linux-QzAGR9/linux-4.15.0/mm/slub.c: 2652
0xc00000000038971c <kmem_cache_alloc+0x5c>:     cmpdi   cr7,r31,0
0xc000000000389720 <kmem_cache_alloc+0x60>:     beq     cr7,0xc0000000003899f0 <kmem_cache_alloc+0x330>
0xc000000000389724 <kmem_cache_alloc+0x64>:     std     r24,32(r1)
0xc000000000389728 <kmem_cache_alloc+0x68>:     std     r25,40(r1)
/build/linux-QzAGR9/linux-4.15.0/mm/slub.c: 2666
0xc00000000038972c <kmem_cache_alloc+0x6c>:     ld      r9,0(r31)
0xc000000000389730 <kmem_cache_alloc+0x70>:     ld      r8,48(r13)
0xc000000000389734 <kmem_cache_alloc+0x74>:     addi    r9,r9,8
/build/linux-QzAGR9/linux-4.15.0/include/linux/compiler.h: 183
0xc000000000389738 <kmem_cache_alloc+0x78>:     ldx     r5,r9,r8
/build/linux-QzAGR9/linux-4.15.0/mm/slub.c: 2667
0xc00000000038973c <kmem_cache_alloc+0x7c>:     ld      r10,48(r13)
0xc000000000389740 <kmem_cache_alloc+0x80>:     ld      r9,0(r31)

2667                 c = raw_cpu_ptr(s->cpu_slab);

On ppc64, raw_cpu_ptr() adds local_paca->data_offset (the paca pointer
lives in r13) to the per-cpu variable's address.

The data_offset:

struct paca_struct {
lppaca_ptr = 0xc000000007833c00,
paca_index = 0x50,
lock_token = 0x8000,
kernel_toc = 0xc0000000016eb400,
kernelbase = 0xc000000000000000,
kernel_msr = 0xb000000000001033,
emergency_sp = 0xc000000007be4000,
data_offset = 0x200e5f6e0000,
..

crash> struct kmem_cache.cpu_slab c000000ff901ee00
cpu_slab = 0xc0000000011d9d40 <mach_powernv+296>

Adding that to the data_offset, we get the kmem_cache_cpu:

crash> rd c000200e608b9d40
c000200e608b9d40:  26eed6a1145b0a2a                    *.[....&
^^^PROBLEM^^^^^^

PANIC: "Unable to handle kernel paging request for data at address 0x26eed6a1145b0a2a"

R24: e6eef6af4c054c2b    R25: c000200e585e4601    R26: 26eed6a1145b0a2a

crash> rd c000200e608b9d40 16 <<<--- kmem_cache_cpu
c000200e608b9d40:  26eed6a1145b0a2a 00000000000003cc   *.[....&........
c000200e608b9d50:  c00a000803961780 0000000000000000   ................
c000200e608b9d60:  c000200e585af200 00000000000000a5   ..ZX. ..........
c000200e608b9d70:  c00a000803961680 0000000000000000   ................

crash> page 0xc00a000803961780
struct page {
flags = 0x81ffffc00000100,
{
mapping = 0x0,
s_mem = 0x0,
compound_mapcount = {
counter = 0x0
}
},
{
index = 0xc000200e585e9c00,
freelist = 0xc000200e585e9c00
},
{
counters = 0x8100007b,
{
{
_mapcount = {
counter = 0x8100007b
},
active = 0x8100007b,
{
inuse = 0x7b,
objects = 0x100,
frozen = 0x1
},
units = 0x8100007b
},
_refcount = {
counter = 0x1
}
}
},
{
lru = {
next = 0x5deadbeef0000100,
prev = 0x5deadbeef0000200
},
pgmap = 0x5deadbeef0000100,
{
next = 0x5deadbeef0000100,
pages = 0xf0000200,
pobjects = 0x5deadbee
},
callback_head = {
next = 0x5deadbeef0000100,
func = 0x5deadbeef0000200
},
{
compound_head = 0x5deadbeef0000100,
compound_dtor = 0xf0000200,
compound_order = 0x5deadbee
}
},
{
private = 0xc000000ff901ee00,
ptl = {
{
rlock = {
raw_lock = {
slock = 0xf901ee00
}
}
}
},
slab_cache = 0xc000000ff901ee00 <<<--- Correct cache
},
mem_cgroup = 0x0
}

Back to kmem_cache_alloc:
0xc000000000389744 <kmem_cache_alloc+0x84>:     add     r7,r9,r10
/build/linux-QzAGR9/linux-4.15.0/mm/slub.c: 2688
0xc000000000389748 <kmem_cache_alloc+0x88>:     ldx     r30,r9,r10 <<<--- r30 = object

2688         object = c->freelist;

/build/linux-QzAGR9/linux-4.15.0/mm/slub.c: 2689
0xc00000000038974c <kmem_cache_alloc+0x8c>:     ld      r9,16(r7)

2689         page = c->page;

/build/linux-QzAGR9/linux-4.15.0/mm/slub.c: 2690
0xc000000000389758 <kmem_cache_alloc+0x98>:     beq     cr5,0xc0000000003897e0 <kmem_cache_alloc+0x120>
/build/linux-QzAGR9/linux-4.15.0/mm/slub.c: 2353
0xc00000000038975c <kmem_cache_alloc+0x9c>:     beq     cr7,0xc0000000003897e0 <kmem_cache_alloc+0x120>
/build/linux-QzAGR9/linux-4.15.0/mm/slub.c: 269
0xc000000000389760 <kmem_cache_alloc+0xa0>:     lwa     r9,32(r31) <<<-- r9 = s->offset
crash> struct kmem_cache.offset
struct kmem_cache {
[0x20] int offset;
}

/build/linux-QzAGR9/linux-4.15.0/mm/slub.c: 253
0xc000000000389764 <kmem_cache_alloc+0xa4>:     ld      r24,320(r31) <<<-- r24 = s->random
crash> struct kmem_cache.random
struct kmem_cache {
[0x140] unsigned long random;
}

/build/linux-QzAGR9/linux-4.15.0/mm/slub.c: 269
0xc000000000389768 <kmem_cache_alloc+0xa8>:     add     r25,r30,r9
return freelist_dereference(s, object + s->offset)

/build/linux-QzAGR9/linux-4.15.0/mm/slub.c: 253
0xc00000000038997c <kmem_cache_alloc+0x2bc>:    xor     r26,r25,r24 <<<-- r26 has freelist_ptr
/build/linux-QzAGR9/linux-4.15.0/mm/slub.c: 2710
0xc000000000389980 <kmem_cache_alloc+0x2c0>:    std     r26,0(r9)
0xc000000000389984 <kmem_cache_alloc+0x2c4>:    std     r5,0(r10)
0xc000000000389988 <kmem_cache_alloc+0x2c8>:    bl      0xc000000000016e08 <arch_local_irq_restore+0x8>
0xc00000000038998c <kmem_cache_alloc+0x2cc>:    nop
/build/linux-QzAGR9/linux-4.15.0/mm/slub.c: 274

prefetch_freepointer:
        if (object)
                prefetch(freelist_dereference(s, object + s->offset));
freelist_dereference:
        return freelist_ptr(s, (void *)*(unsigned long *)(ptr_addr), (unsigned long)ptr_addr)
freelist_ptr:
        return (void *)((unsigned long)ptr ^ s->random ^ ptr_addr)

0xc000000000389990 <kmem_cache_alloc+0x2d0>:    cmpld   cr7,r25,r24
0xc000000000389994 <kmem_cache_alloc+0x2d4>:    beq     cr7,0xc0000000003899bc <kmem_cache_alloc+0x2fc>
/build/linux-QzAGR9/linux-4.15.0/mm/slub.c: 275
0xc000000000389998 <kmem_cache_alloc+0x2d8>:    lwa     r9,32(r31) <<-- r9 = s->offset
/build/linux-QzAGR9/linux-4.15.0/mm/slub.c: 253
0xc00000000038999c <kmem_cache_alloc+0x2dc>:    ld      r8,320(r31) <<-- r8 = s->random
0xc0000000003899a0 <kmem_cache_alloc+0x2e0>:    ldx     r10,r26,r9 <<<<----  CRASH!!!

crash> struct kmem_cache c000000ff901ee00
struct kmem_cache {
cpu_slab = 0xc0000000011d9d40 <mach_powernv+296>,
flags = 0x0,
min_partial = 0x5,
size = 0x100,
object_size = 0x100,
offset = 0x0,
cpu_partial = 0xd,
oo = {
x = 0x100
},
max = {
x = 0x100
},
min = {
x = 0x100
},
allocflags = 0x0,
refcount = 0x11,
ctor = 0x0,
inuse = 0x100,
align = 0x8,
reserved = 0x0,
red_left_pad = 0x0,
name = 0xc000000000f90a10 "kmalloc-256",
list = {
next = 0xc000000ff901f068,
prev = 0xc000000ff901ec68
},
kobj = {
name = 0xc000000ff0a00120 ":0000256",
entry = {
next = 0xc000000ff901f080,
prev = 0xc000000ff901ec80
},
parent = 0xc000000ff0a107f8,
kset = 0xc000000ff0a107e0,
ktype = 0xc0000000015b6500 <slab_ktype>,
sd = 0xc000000fe1cdde10,
kref = {
refcount = {
refs = {
counter = 0x2
}
}
},
...
}

What happened here is the following:

netlink_sendmsg
-> skb_clone
-> kmem_cache_alloc, using the kmalloc-256 cache:
R28: c000000ff901ee00

The relevant data
==================
R24: e6eef6af4c054c2b    R25: c000200e585e4601    R26: 26eed6a1145b0a2a
^^^^^^^BAD^^^^^^^^^^^
^^^^^^^ptr^^^^^^^^^^^    ^^^^freelist_ptr^^^^^
R27: c000000000b32514    R28: c000000ff901ee00    R29: 00000000014000c0
^^^^kmem_cache^^^^^^
R30: c000200e585e4601    R31: c000000ff901ee00
^^^object^^^^^^^^^^^

The relevant code (at crash time) - just including skeleton code
================================

c = raw_cpu_ptr(s->cpu_slab);
...
object = c->freelist;
page = c->page;
...
void *next_object = get_freepointer_safe(s, object);
...
this_cpu_cmpxchg_double( s->cpu_slab->freelist, s->cpu_slab->tid,
object, tid, next_object, next_tid(tid)))) {
...
prefetch_freepointer(s, next_object);

Basically, the netlink code is cloning an skb, which requests an skb
header from the 256-byte kmalloc cache (c000000ff901ee00). There we get
the per-cpu kmem_cache_cpu for this CPU (0x50) via s->cpu_slab.

We retrieve the freelist pointer into object, along with the page:
object = c000200e585e4601

Thereafter, we dereference the current object/freepointer to get the
next object (next_object). Prefetching this leads to a problem: the
current freelist_ptr is 26eed6a1145b0a2a, which we try to dereference,
and we crash.

PANIC: "Unable to handle kernel paging request for data at address 0x26eed6a1145b0a2a"

It is to be noted that object = c000200e585e4601 looks strange: it is
not even 8-byte aligned. It almost feels like someone incremented that
location (while it was on the freelist? or could it have been freed
that way?).

The issue is that the freelist got corrupted and we crash trying
to access it.

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1762844

Title:
  ISST-LTE:KVM:Ubuntu1804:BostonLC:boslcp3: Host crashed & enters into
  xmon after moving to 4.15.0-15.16 kernel

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu-power-systems/+bug/1762844/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs