On Wednesday, July 21, 2010 12:44:49 pm Markus Gebert wrote:
>
> On 21.07.2010, at 14:36, Andriy Gapon wrote:
>
> > on 21/07/2010 15:25 Markus Gebert said the following:
> >> On 21.07.2010, at 10:33, Andriy Gapon wrote:
> >>
> >>> on 21/07/2010 03:57 Markus Gebert said the following:
> >>>> Another thing though: Today I compared verbose boot output from
> >>>> 8-stable and the current box. I saw that the ioapic sets up IRQ
> >>>> routing differently on these two systems although the hardware is
> >>>> the same. This seemed not so interesting at first, but then I
> >>>> noticed that 8-stable sets up two routes (to lapic0 and lapic2, or
> >>>> sometimes lapic3) for IRQ58 (mpt0), while current only uses one
> >>>> route (to lapic0).
> >>> My understanding is that it's not "two routes", but re-routing.
> >>> During early boot all interrupts are bound to BSP; later, when APs
> >>> become online, the interrupts are re-distributed among available
> >>> CPUs.
> >>
> >> I guess you're right, misinterpretation on my side. Thanks for
> >> clarifying this.
> >>
> >>
> >> Now being aware of this, it seems to me that in the
> >> machdep.lapic_allclocks=0 case, there might just be more interrupts
> >> to be assigned/routed due to "more clocks being used". If that's
> >> true, maybe it's just "luck" that in this case the mpt interrupt gets
> >> assigned to lapic0/cpu0 and the box runs fine. I'm just guessing
> >> though, since I have no clue how interrupts are assigned to lapics
> >> exactly (round-robin? some logic?).
> >
> > Yes, round-robin, for interrupts that are not explicitly bound to
> > specific CPUs.
> > The process is deterministic, but hard to predict indeed.
>
> I see.
>
>
> >>>> I used 'cpuset -c -l 0 -x 58' in an attempt to make my 8-stable box
> >>>> behave like the one running current. Indeed, this seems to have
> >>>> changed IRQ58 to be routed to lapic0 only. And the box was running
> >>>> for hours without showing the symptoms.
> >>>>
> >>>> I just checked boot verbose output of my 8-stable box again (booted
> >>>> with machdep.lapic_allclocks=0 as mentioned above). And now it
> >>>> seems to have set up IRQ routes just like the current box (one
> >>>> route for IRQ58 to lapic0).
> >>> Not sure how to interpret this properly. One possibility is a
> >>> hardware problem where the interrupt message route between ioapic2
> >>> and the CPU to which lapic3 belongs is flaky. Perhaps, this might be
> >>> a FreeBSD problem: it could be that the system somehow tells us not
> >>> to set up such routes, but we don't listen. But this is far-fetched.
> >>
> >>
> >> I'm not sure either. If my "theory" above proved to be true, it would
> >> have been just luck that 6.x and 7.x (and current) run just fine on
> >> the X4100M2. A (short) test on Ubuntu didn't trigger the problem, so
> >> the Linux kernel is either lucky too by selecting an interrupt route
> >> that is "not flaky", or there's indeed some way to figure out not to
> >> use some lapics for some interrupts. Or we didn't test Linux
> >> thoroughly enough.
> >
> > Yep, it would be interesting to see how interrupts were distributed
> > among CPUs on that Linux.
>
>
> Well I can't provide this kind of information about _that_ Ubuntu Linux
> right now, because it was wiped from the second test machine to test
> current.
> But we have a few productive X4100M2 running Debian and there it looks
> like this:
>
> ----
> # uname -a
> Linux XX 2.6.26-2-amd64 #1 SMP Tue Mar 9 22:29:32 UTC 2010 x86_64 GNU/Linux
> # cat /proc/interrupts
>             CPU0       CPU1       CPU2       CPU3
>    0:         36          0          0          1   IO-APIC-edge      timer
>    1:          0          0          0          2   IO-APIC-edge      i8042
>    7:          1          0          0          0   IO-APIC-edge
>    8:          0          0          0          1   IO-APIC-edge      rtc0
>    9:          0          0          0          0   IO-APIC-fasteoi   acpi
>   12:          0          0          0          4   IO-APIC-edge      i8042
>   14:          0          0          0         74   IO-APIC-edge      ide0
>   21:          0          0          0          2   IO-APIC-fasteoi   ehci_hcd:usb2
>   22:          0          0          1         31   IO-APIC-fasteoi   ohci_hcd:usb1
>   56:      52836  302759221        129      50868   IO-APIC-fasteoi   eth2
>   57:     288921 1070387307        225      98210   IO-APIC-fasteoi   eth3
> 1271:      92146   45282139          9       4885   PCI-MSI-edge      ioc0
>  NMI:          0          0          0          0   Non-maskable interrupts
>  LOC:  258132347  312890202  166484456  147070084   Local timer interrupts
>  RES:  118623017   84540907  100591028  107693244   Rescheduling interrupts
>  CAL:     108384      89281     110429     104206   function call interrupts
>  TLB:   14719843   24105630   12456528   18955140   TLB shootdowns
>  TRM:          0          0          0          0   Thermal event interrupts
>  THR:          0          0          0          0   Threshold APIC interrupts
>  SPU:          0          0          0          0   Spurious interrupts
>  ERR:          1
> ----
>
> Not sure how to interpret this. At first sight no IRQ58, but I guess they
> might be using MSI for mpt, which might avoid the problem entirely.

Yes, the FreeBSD mpt(4) driver should also use MSI by default unless you have
disabled it for some reason. Also, Linux will dynamically reshuffle IRQs among
CPUs based on load, so the I/O APIC/MSI -> CPU routing is more dynamic in that
case.

--
John Baldwin
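
For anyone who wants to check this kind of /proc/interrupts output without reading it by eye, here is a minimal Python sketch. It is an illustration only and was not used in this thread. The names parse_interrupts and busiest_cpu are made up for the example, and SAMPLE is just a few rows copied from the Debian output quoted above. It reports which CPU took most of each interrupt and whether the line is delivered via MSI.

```python
# Illustrative sketch: summarize Linux /proc/interrupts-style text.
# parse_interrupts() and busiest_cpu() are hypothetical helper names.

SAMPLE = """\
           CPU0       CPU1       CPU2       CPU3
 56:      52836  302759221        129      50868   IO-APIC-fasteoi   eth2
 57:     288921 1070387307        225      98210   IO-APIC-fasteoi   eth3
1271:     92146   45282139          9       4885   PCI-MSI-edge      ioc0
ERR:          1
"""


def parse_interrupts(text):
    """Return {label: (per_cpu_counts, description)} for complete rows."""
    lines = text.splitlines()
    ncpu = len(lines[0].split())          # header row lists CPU0..CPUn
    rows = {}
    for line in lines[1:]:
        if ":" not in line:
            continue
        label, rest = line.split(":", 1)  # split on the first colon only
        fields = rest.split()
        counts = []
        for field in fields[:ncpu]:
            if not field.isdigit():
                break
            counts.append(int(field))
        if len(counts) != ncpu:           # e.g. the single-value ERR row
            continue
        rows[label.strip()] = (counts, " ".join(fields[ncpu:]))
    return rows


def busiest_cpu(counts):
    """Index of the CPU that handled the most interrupts for one row."""
    return max(range(len(counts)), key=lambda i: counts[i])


rows = parse_interrupts(SAMPLE)
for label, (counts, desc) in rows.items():
    kind = "MSI" if "MSI" in desc else "I/O APIC or other"
    print(f"{label}: mostly CPU{busiest_cpu(counts)}, {kind} ({desc})")
```

To run it against a live Linux box, replace SAMPLE with the contents of /proc/interrupts. In the excerpt, ioc0 shows up as PCI-MSI-edge, which fits the guess that mpt is using MSI there.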