On Sun, Sep 04, 2016 at 11:19:16AM +0300, Andriy Gapon wrote:

> On 01/09/2016 15:13, Slawa Olhovchenkov wrote:
> > DMAR: Found table at 0x79b32798
> > x2APIC available but disabled by DMAR table
> > Event timer "LAPIC" quality 600
> > LAPIC: ipi_wait() us multiplier 1 (r 116268019 tsc 2200043851)
> > ACPI APIC Table: <ALASKA A M I >
> > Package ID shift: 5
> > L3 cache ID shift: 5
> > L2 cache ID shift: 1
> > L1 cache ID shift: 1
> > Core ID shift: 1
> > kernel trap 12 with interrupts disabled
> >
> >
> > Fatal trap 12: page fault while in kernel mode
> > cpuid = 0; apic id = ff
> > fault virtual address   = 0x0
> > fault code              = supervisor read data, page not present
> > instruction pointer     = 0x20:0xffffffff80537e74
> > stack pointer           = 0x28:0xffffffff814b4a60
> > frame pointer           = 0x28:0xffffffff814b4a70
> > code segment            = base 0x0, limit 0xfffff, type 0x1b
> >                         = DPL 0, pres 1, long 1, def32 0, gran 1
> > processor eflags        = resume, IOPL = 0
> > current process         = 0 ()
> > trap number             = 12
> > panic: page fault
> > cpuid = 0
> > KDB: stack backtrace:
> > #0 0xffffffff805272e7 at kdb_backtrace+0x67
> > #1 0xffffffff804dd662 at vpanic+0x182
> > #2 0xffffffff804dd4d3 at panic+0x43
> > #3 0xffffffff807a3791 at trap_fatal+0x351
> > #4 0xffffffff807a3983 at trap_pfault+0x1e3
> > #5 0xffffffff807a2f0c at trap+0x26c
> > #6 0xffffffff80787ca1 at calltrap+0x8
> > #7 0xffffffff8083b52a at topo_probe+0x61a
>
> Interesting. Could you please do 'list *topo_probe+0x61a' in kgdb, so that I

(kgdb) list *topo_probe+0x61a
0xffffffff8083b52a is in topo_probe (/usr/src/sys/x86/x86/mp_x86.c:540).
535                                 topo_layers[layer].subtype);
536                     }
537             }
538
539             parent = &topo_root;
540             for (layer = 0; layer < nlayers; ++layer) {
541                     node_id = boot_cpu_id >> topo_layers[layer].id_shift;
542                     node = topo_find_node_by_hwid(parent, node_id,
543                         topo_layers[layer].type,
544                         topo_layers[layer].subtype);
Current language:  auto; currently minimal

> can see what code is being executed when the trap happens? Also, disassembly
> of the function could be useful as well.

(kgdb) x/40i *topo_probe+0x600
0xffffffff8083b510 <topo_probe+1536>:   and    $0xf8,%al
0xffffffff8083b512 <topo_probe+1538>:   movslq -0x4(%r12),%rcx
0xffffffff8083b517 <topo_probe+1543>:   mov    %rbx,%rdi
0xffffffff8083b51a <topo_probe+1546>:   callq  0xffffffff80537e30 <topo_find_node_by_hwid>
0xffffffff8083b51f <topo_probe+1551>:   mov    %rax,%rbx
0xffffffff8083b522 <topo_probe+1554>:   mov    %rbx,%rdi
0xffffffff8083b525 <topo_probe+1557>:   callq  0xffffffff80537e70 <topo_promote_child>
0xffffffff8083b52a <topo_probe+1562>:   add    $0xc,%r12
0xffffffff8083b52e <topo_probe+1566>:   dec    %r14d
0xffffffff8083b531 <topo_probe+1569>:   jne    0xffffffff8083b500 <topo_probe+1520>
0xffffffff8083b533 <topo_probe+1571>:   movb   $0x1,0xffffffff80dfa664
0xffffffff8083b53b <topo_probe+1579>:   add    $0x68,%rsp
0xffffffff8083b53f <topo_probe+1583>:   pop    %rbx
0xffffffff8083b540 <topo_probe+1584>:   pop    %r12
0xffffffff8083b542 <topo_probe+1586>:   pop    %r13
0xffffffff8083b544 <topo_probe+1588>:   pop    %r14
0xffffffff8083b546 <topo_probe+1590>:   pop    %r15
0xffffffff8083b548 <topo_probe+1592>:   pop    %rbp
0xffffffff8083b549 <topo_probe+1593>:   retq
0xffffffff8083b54a <topo_probe+1594>:   nopw   0x0(%rax,%rax,1)

> Wait...
> Kostik, I see one strange thing which is common to both successful and
> unsuccessful configurations. All "SMP: Added CPU..." lines have "AP" in them.

for #1..#23 no line 'SMP: AP CPU #0 Launched!'
> It seems like the platform does not explicitly tell which CPU is the BSP,
> see cpu_add() function. This can break quite a few assumptions. And I am not
> even sure how the successful scenario works.

# mptable
===============================================================================

MPTable

-------------------------------------------------------------------------------

MP Floating Pointer Structure:

  location:                     BIOS
  physical address:             0x000fd050
  signature:                    '_MP_'
  length:                       16 bytes
  version:                      1.4
  checksum:                     0x27
  mode:                         Virtual Wire

-------------------------------------------------------------------------------

MP Config Table Header:

  physical address:             0x000fcaa0
  signature:                    'PCMP'
  base table length:            1228
  version:                      1.4
  checksum:                     0x95
  OEM ID:                       'A M I'
  Product ID:                   'ALASKA'
  OEM table pointer:            0x00000000
  OEM table size:               0
  entry count:                  112
  local APIC address:           0xfee00000
  extended table length:        220
  extended table checksum:      72

-------------------------------------------------------------------------------

MP Config Base Table Entries:

--
Processors:     APIC ID Version State           Family  Model   Step    Flags
                 0       0x15    BSP, usable     6       15      1       0xbfebfbff
                 2       0x15    AP, usable      6       15      1       0xbfebfbff
                 4       0x15    AP, usable      6       15      1       0xbfebfbff
                 6       0x15    AP, usable      6       15      1       0xbfebfbff
                 8       0x15    AP, usable      6       15      1       0xbfebfbff
                10       0x15    AP, usable      6       15      1       0xbfebfbff
                16       0x15    AP, usable      6       15      1       0xbfebfbff
                18       0x15    AP, usable      6       15      1       0xbfebfbff
                20       0x15    AP, usable      6       15      1       0xbfebfbff
                22       0x15    AP, usable      6       15      1       0xbfebfbff
                24       0x15    AP, usable      6       15      1       0xbfebfbff
                26       0x15    AP, usable      6       15      1       0xbfebfbff
                32       0x15    AP, usable      6       15      1       0xbfebfbff
                34       0x15    AP, usable      6       15      1       0xbfebfbff
                36       0x15    AP, usable      6       15      1       0xbfebfbff
                38       0x15    AP, usable      6       15      1       0xbfebfbff
                40       0x15    AP, usable      6       15      1       0xbfebfbff
                42       0x15    AP, usable      6       15      1       0xbfebfbff
                48       0x15    AP, usable      6       15      1       0xbfebfbff
                50       0x15    AP, usable      6       15      1       0xbfebfbff
                52       0x15    AP, usable      6       15      1       0xbfebfbff
                54       0x15    AP, usable      6       15      1       0xbfebfbff
                56       0x15    AP, usable      6       15      1       0xbfebfbff
                58       0x15    AP, usable      6       15      1       0xbfebfbff

> Ah...
> I see that there is backup code in cpu_mp_start() where boot_cpu_id is
> set based on the current CPU's Local APIC ID. I suspect then that this
> information is incorrect in the failing case.
>
> Slawa,
> my guess can be checked by adding a printf to cpu_mp_start() right after
> the boot_cpu_id assignment.

The system is now in early production and I can't reboot it often.

> > #8 0xffffffff8078fe81 at cpu_mp_start+0x1b1
> > #9 0xffffffff805382ca at mp_start+0x3a
> > #10 0xffffffff80465cd8 at mi_startup+0x118
> > #11 0xffffffff8028dfac at btext+0x2c
> > Uptime: 1s
>
> --
> Andriy Gapon