> On Jan 22, 2019, at 10:43, stan <stanl-fedorau...@vfemail.net> wrote:
> 
> On Mon, 21 Jan 2019 18:48:04 -0500
> Nate Pearlstein <darkna...@gmail.com> wrote:
> 
>> I normally run w/o quiet and rhgb anyway.  I added earlyprintk=vga
>> and it’s clear the system panics early.  I tried adding
>> boot_delay=500 and also boot_delay=10 to try to capture the spew with
>> my phone camera capturing at 60fps.  Only leaving off boot_delay can
>> I see the panic but the output is coming faster than 60fps.
>> 
>> From what I can piece together without using a serial console and
>> capturing from another host:
>> 
>> kernel BUG at mm/page_alloc.c:791!
>> Invalid opcode: 0000 [#10 SMP PTI] (not sure about this too jumbled)
>> I can’t really see the stack trace either
>> __free_page_ok
>> free_all_bootmem
>> mem_init
>> start_kernel
>> secondary_startup_64
>> [1.860030] free_one_page RIP: 0010:free_one_page
>> [1.863221] Code: 08 0e 03 00 0f 0b 48 89 da be 0c 00 00 00 4c 89 ff
>> e8 56 02 00 e9 9c fb ff ff 48 c7 c6 08 86 0d 92 4c 89 f7 e8 e2 0d 03
>> 00 <0f> ob 48 c6 30 86 0d 92 48 89 df e8 d1 0d 03 00 0f 0b 31 d2 e9
>> [1.872806] RSP: 0000:ffffffff92203e20 EFLAGS: 00010046
>> .
>> .
>> [1.923827] Kernel panic - not syncing
> 
> Samuel might be able to decipher this, but I have an off the wall idea.
> Kernels get bigger with each release.  I wonder if there is a memory
> problem, that the earlier kernels don't trigger, but the larger kernels
> do.  Run a memory test?
> 
> The other thing to try is re-installing the kernel.  A really long
> shot, but worth a try.
> 
> And maybe it is a kernel bug.  The line you are referring to is
>    VM_BUG_ON_PAGE(bad_range(zone, page), page);
> and it occurs when trying to deallocate a page.
> 
> static inline void __free_one_page(struct page *page,
>        unsigned long pfn,
>        struct zone *zone, unsigned int order,
>        int migratetype)
> {
> 
> I interpret the errors as saying that the kernel is trying to
> deallocate a page, and the CPU receives a 0000 opcode.  That would be
> an error.  But is it coming from the kernel, or is the kernel reading a
> bad location?
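
A note on the "invalid opcode: 0000" line, as far as I understand it: the 0000 is the trap's error code, not an opcode the CPU fetched. On x86, BUG() compiles to the ud2 instruction (bytes 0f 0b), and the oops marks the byte at RIP with angle brackets in the Code: line. So an invalid-opcode trap here is the deliberate result of the kernel's own assertion rather than a sign of the CPU reading a bad location. The Python sketch below checks that against the serial capture further down. The helper name faulting_bytes is my own, and only the tail of the Code: line is pasted in.

```python
import re

def faulting_bytes(code_line, count=2):
    """Return `count` bytes starting at the byte marked <xx> in an oops Code: line.

    Returns None if no byte is marked. This is a hypothetical helper for
    illustration, not part of any kernel tooling.
    """
    tokens = code_line.split()
    for i, tok in enumerate(tokens):
        m = re.fullmatch(r"<([0-9a-fA-F]{2})>", tok)
        if m:
            # The marked byte plus the following (count - 1) bytes.
            rest = tokens[i + 1 : i + count]
            return bytes.fromhex(m.group(1) + "".join(rest))
    return None

# Tail of the Code: line from the serial capture (4.20.3-200.fc29.x86_64).
code = "e8 e2 15 03 00 <0f> 0b 48 c7 c6 40 f8 0d 89"

UD2 = bytes([0x0F, 0x0B])  # x86 ud2 instruction, used by BUG()
is_ud2 = faulting_bytes(code) == UD2
print("faulting instruction is ud2:", is_ud2)
```

If the marked bytes are 0f 0b, the trap came from a BUG()/VM_BUG_ON assertion, and the interesting part is the condition that was asserted, not the opcode itself.
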
> 
> I think it has to be something about your hardware, because if the
> kernel was actually having trouble deallocating pages for all boots,
> this would be a well known problem.  Maybe you have hit a corner case.
> You could open a bugzilla, but it will be difficult for someone to fix
> this without your hardware to replicate the crash or the complete crash
> output.
> 
> The 4.20 kernel series is not far away from coming to stable.  You
> could either grab one from koji,
> https://koji.fedoraproject.org/koji/packageinfo?packageID=8
> or use an older kernel until it is released.  It might fix the issue as
> a side effect of other changes.

The list ate my earlier reply; it was too long because it included the entire console output.

OK, I broke out the old Keyspan serial adapter and captured the console:


This is on 4.20.3-200.fc29.x86_64

Probing EDD (edd=off to disable)... ok
[    0.000000] microcode: microcode updated early to revision 0x1f, date = 2018-05-08
[    0.000000] Linux version 4.20.3-200.fc29.x86_64 (mockbu...@bkernel04.phx2.fedoraproject.org) (gcc version 8.2.1 20181215 (Red Hat 8.2.1-6) (GCC)) #1 SMP Thu Jan 17 15:19:35 UTC 2019
[    0.000000] Command line: BOOT_IMAGE=/vmlinuz-4.20.3-200.fc29.x86_64 root=UUID=f16fcae3-fe27-4314-afa3-42deec5f378c ro rootflags=subvol=btrfsroot1 rd.driver.blacklist=nouveau rd.lvm=0 rd.dm=0 rd.md.uuid=bfe3028c:482de62e:e81670f7:c1d008bf rd.luks.uuid=luks-c680a803-db0f-423b-8fb4-5ac67c7e141b rd.luks.allow-discards=luks-c680a803-db0f-423b-8fb4-5ac67c7e141b rd.md.uuid=bdc6f872:46939f0e:7e802862:035eb1f1 rd.luks.uuid=luks-ab55575a-5048-445b-830a-3cdcb78222b6 rd.luks.allow-discards=luks-ab55575a-5048-445b-830a-3cdcb78222b6 rd.luks.uuid=luks-2c2515c2-e6a3-438c-9514-0aa9ddd2a1b3 rd.luks.allow-discards=luks-2c2515c2-e6a3-438c-9514-0aa9ddd2a1b3 vconsole.keymap=us crashkernel=128M usbcore.autosuspend=-1 console=tty0 console=ttyS0,115200
...
[    2.073219] page 0xc24000 outside node 1 zone Normal [ 0x100000 - 0xc24000 ]
[    2.080074] page:fffff8ebf0900000 count:0 mapcount:0 mapping:0000000000000000 index:0x0
[    2.088036] flags: 0x57fffe00000000()
[    2.091671] raw: 0057fffe00000000 fffff8ebf0900008 fffff8ebf0900008 0000000000000000
[    2.099373] raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
[    2.107075] page dumped because: VM_BUG_ON_PAGE(bad_range(zone, page))
[    2.113573] ------------[ cut here ]------------
[    2.118154] kernel BUG at mm/page_alloc.c:798!
[    2.122570] invalid opcode: 0000 [#1] SMP PTI
[    2.126895] CPU: 0 PID: 0 Comm: swapper Not tainted 4.20.3-200.fc29.x86_64 #1
[    2.133991] Hardware name: Dell Inc. Precision WorkStation T7500  /06FW8P, BIOS A17 03/11/2018
[    2.142563] RIP: 0010:free_one_page+0x50e/0x540
[    2.147060] Code: 08 16 03 00 0f 0b 48 89 da be 0c 00 00 00 4c 89 ff e8 56 07 02 00 e9 9c fb ff ff 48 c7 c6 18 f8 0d 89 4c 89 f7 e8 e2 15 03 00 <0f> 0b 48 c7 c6 40 f8 0d 89 48 89 df e8 d1 15 03 00 0f 0b 31 d2 e9
[    2.165754] RSP: 0000:ffffffff89203e20 EFLAGS: 00010046
[    2.170946] RAX: 000000000000003a RBX: 0000000000000400 RCX: ffffffff89254668
[    2.178043] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000046
[    2.185140] RBP: 0000000000c24000 R08: 6d75642065676170 R09: 6163656220646570
[    2.192236] R10: 7375616365622064 R11: 55425f4d56203a65 R12: 000000000000000a
[    2.199334] R13: 00000000000003ff R14: fffff8ebf0900000 R15: ffffa04423fd5d00
[    2.206430] FS:  0000000000000000(0000) GS:ffffa04ff3400000(0000) knlGS:0000000000000000
[    2.214480] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    2.220191] CR2: ffffa04e49e01000 CR3: 000000164920a001 CR4: 00000000000206b0
[    2.227288] Call Trace:
[    2.229715]  __free_pages_ok+0x15c/0x440
[    2.233610]  memblock_free_all+0x127/0x192
[    2.237676]  mem_init+0x1b/0xb9
[    2.240792]  start_kernel+0x293/0x528
[    2.244427]  secondary_startup_64+0xa4/0xb0
[    2.248579] Modules linked in:
[    2.251625] ---[ end trace ffc919177d0487be ]---
[    2.256195] RIP: 0010:free_one_page+0x50e/0x540
[    2.260695] Code: 08 16 03 00 0f 0b 48 89 da be 0c 00 00 00 4c 89 ff e8 56 07 02 00 e9 9c fb ff ff 48 c7 c6 18 f8 0d 89 4c 89 f7 e8 e2 15 03 00 <0f> 0b 48 c7 c6 40 f8 0d 89 48 89 df e8 d1 15 03 00 0f 0b 31 d2 e9
[    2.279388] RSP: 0000:ffffffff89203e20 EFLAGS: 00010046
[    2.284581] RAX: 000000000000003a RBX: 0000000000000400 RCX: ffffffff89254668
[    2.291678] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000046
[    2.298775] RBP: 0000000000c24000 R08: 6d75642065676170 R09: 6163656220646570
[    2.305872] R10: 7375616365622064 R11: 55425f4d56203a65 R12: 000000000000000a
[    2.312968] R13: 00000000000003ff R14: fffff8ebf0900000 R15: ffffa04423fd5d00
[    2.320066] FS:  0000000000000000(0000) GS:ffffa04ff3400000(0000) knlGS:0000000000000000
[    2.328114] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    2.333826] CR2: ffffa04e49e01000 CR3: 000000164920a001 CR4: 00000000000206b0
[    2.340924] Kernel panic - not syncing: Attempted to kill the idle task!
[    2.347618] ---[ end Kernel panic - not syncing: Attempted to kill the idle task! ]
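
The line that actually explains the BUG is the "page 0xc24000 outside node 1 zone Normal" message just before the page dump. As a rough sketch (the regex and the check_bad_range name are my own, and I am assuming the printed range is half-open, start <= pfn < end, which matches my reading of zone_spans_pfn() in the 4.20 source), this Python snippet pulls the numbers out of that line and shows where the pfn sits relative to the zone:

```python
import re

# Matches the message format seen in the log above; the pattern is my own.
BAD_RANGE = re.compile(
    r"page (0x[0-9a-f]+) outside node (\d+) zone (\w+) "
    r"\[ (0x[0-9a-f]+) - (0x[0-9a-f]+) \]"
)

def check_bad_range(line):
    """Parse a 'page ... outside node ... zone ...' line (hypothetical helper)."""
    m = BAD_RANGE.search(line)
    if m is None:
        return None
    pfn, start, end = (int(m.group(i), 16) for i in (1, 4, 5))
    return {
        "node": int(m.group(2)),
        "zone": m.group(3),
        "pfn": pfn,
        # Assumption: the zone spans [start, end), i.e. end is exclusive.
        "in_zone": start <= pfn < end,
        "pages_past_end": pfn - end,
    }

line = ("[    2.073219] page 0xc24000 outside node 1 zone Normal "
        "[ 0x100000 - 0xc24000 ]")
info = check_bad_range(line)
print(info)
```

Under that assumption the pfn being freed equals the zone's end, so it is the first page frame past node 1's Normal zone rather than some random address. That looks more like a boundary problem in how the firmware memory map and the zone layout line up on this machine than like bad RAM, though I can't confirm that from the log alone.
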
_______________________________________________
users mailing list -- users@lists.fedoraproject.org
To unsubscribe send an email to users-le...@lists.fedoraproject.org
Fedora Code of Conduct: https://getfedora.org/code-of-conduct.html
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: 
https://lists.fedoraproject.org/archives/list/users@lists.fedoraproject.org