Hello! I've upgraded the kernel from 2.6.10 to 2.6.11 (both from Fedora Core 3 RPMs) on one of my servers, based on a Tyan S2882 board with two Opteron 248's and two 1Gb modules inserted into the slots next to *second* CPU (they just happened to be there). And the kernel didn't booted. I've also tried to compile the kernel myself -- with same result.
I'll skip the whole detective story and will describe just the net result. It looks like the bug is somewhere in the NUMA handling code, because: a) The kernel boots fine with numa=off b) The kernel started working even without numa=off after I've moved the DRAM from the slots bound to CPU#2 to the slots bound to CPU#1. c) The output with kernel 2.6.10 (which worked fine even with DRAM in slots #2) was as follows: Bootdata ok (command line is ro root=/dev/hda1 selinux=0) Linux version 2.6.10-1.741_FC3smp ([EMAIL PROTECTED]) (gcc version 3.4.2 20041017 (Red Hat 3.4.2-6.fc3)) #1 SMP Thu Jan 13 16:58:29 EST 2005 BIOS-provided physical RAM map: BIOS-e820: 0000000000000000 - 000000000009fc00 (usable) BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved) BIOS-e820: 00000000000e0000 - 0000000000100000 (reserved) BIOS-e820: 0000000000100000 - 000000007fff0000 (usable) BIOS-e820: 000000007fff0000 - 000000007ffff000 (ACPI data) BIOS-e820: 000000007ffff000 - 0000000080000000 (ACPI NVS) BIOS-e820: 00000000ff780000 - 0000000100000000 (reserved) Scanning NUMA topology in Northbridge 24 Number of nodes 2 (10010) Skipping disabled node 0 Node 1 MemBase 0000000000000000 Limit 000000007fff0000 Using node hash shift of 24 Bootmem setup node 1 0000000000000000-000000007fff0000 No mptable found. ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled) Processor #0 15:5 APIC version 16 ... --------------------------------------------------- Kernel 2.6.11 with numa=off: Linux version 2.6.11-1.7_FC3smp ([EMAIL PROTECTED]) (gcc version 3.4.3 20050227 (Red Hat 3.4.3-22)) #1 SMP Sat Mar 19 20:16:43 EST 2005 BIOS-provided physical RAM map: BIOS-e820: 0000000000000000 - 000000000009fc00 (usable) BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved) BIOS-e820: 00000000000e0000 - 0000000000100000 (reserved) BIOS-e820: 0000000000100000 - 000000007fff0000 (usable) BIOS-e820: 000000007fff0000 - 000000007ffff000 (ACPI data) BIOS-e820: 000000007ffff000 - 0000000080000000 (ACPI NVS) BIOS-e820: 00000000ff780000 - 0000000100000000 (reserved) NUMA turned off Faking a node at 0000000000000000-000000007fff0000 Bootmem setup node 0 0000000000000000-000000007fff0000 ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled) Processor #0 15:5 APIC version 16 ... --------------------------------------------------- Kernel 2.6.11 with enabled numa, crashing (transcribed off the screen with the earlyprintk=vga option): Linux version 2.6.11-1.7_FC3smp ([EMAIL PROTECTED]) (gcc version 3.4.3 20050227 (Red Hat 3.4.3-22)) #1 SMP Sat Mar 19 20:16:43 EST 2005 BIOS-provided physical RAM map: BIOS-e820: 0000000000000000 - 000000000009fc00 (usable) BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved) BIOS-e820: 00000000000e0000 - 0000000000100000 (reserved) BIOS-e820: 0000000000100000 - 000000007fff0000 (usable) BIOS-e820: 000000007fff0000 - 000000007ffff000 (ACPI data) BIOS-e820: 000000007ffff000 - 0000000080000000 (ACPI NVS) BIOS-e820: 00000000ff780000 - 0000000100000000 (reserved) kernel direct mapping tables upto ffff810100000000 @ 8000-d000 Scanning NUMA topology in Northbridge 24 Number of nodes 2 Skipping disabled node 0 Node 1 MemBase 0000000000000000-000000007fff0000 Using node hash shift of 24 Bootmem setup node 1 0000000000000000-000000007fff0000 PANIC: early exception rip ffffffff8020427b error 0 cr2 0 ... The address ffffffff8020427b is: ffffffff80204210 T find_next_zero_bit ffffffff802042a0 T find_next_bit Is this information enough for catching the bug? I'll do my best to gather more info although it's a production server. -- Greetings, Andrew P.S. Please Cc: to me. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/