On Wed, Jun 13, 2007 at 11:59:25AM +0300, Avi Kivity wrote:
> Luca Tettamanti wrote:
> >On Mon, Jun 11, 2007 at 10:44:45AM +0300, Avi Kivity wrote:
> >
> >>Luca wrote:
> >>
> >>>>I've managed to reproduce this on kvm-21 (it takes many boots for this
> >>>>to happen, but it does eventually).
> >>>>
> >>>Hum, any clue on the cause?
> >>>
> >>From what I've seen, it's the new Linux clocksource code.
> >>
> >>
> >>>Should I test older versions?
> >>>
> >>They're unlikely to be better. Instead, it would be best to see what
> >>the guest is doing.
> >>
> >
> >RCU is not working. Network initialization hangs because it happens to
> >be the first RCU user.
> >The guest is stuck waiting for RCU synchronization:
> >
> >[ 4.992207] [<c04321c5>] synchronize_rcu+0x4e/0x80
> >[ 4.994379] [<c0431db5>] wakeme_after_rcu+0x0/0x8
> >[ 4.996521] [<c0599ad1>] synchronize_net+0x64/0x8c
> >[ 4.998678] [<c05d70e0>] inet_register_protosw+0xef/0x151
> >[ 5.000984] [<c072d79e>] inet_init+0x1cd/0x498
> >
> >wait_for_completion() in synchronize_rcu() calls schedule() and the
> >completion is never signaled (wakeme_after_rcu is never called).
> >The completion AFAICS would be signaled via rcu_process_callbacks(),
> >which is called in tasklet context.
> >Scheduler and completion are working fine since they're used in other
> >parts of the kernel without problems.
> >
> >To recap:
> >
> >i686 F7 kernel: always works.
> >
> >i586 F7 kernel: sometimes hangs due to RCU problems. When it does work
> >it's because the LAPIC is disabled on boot:
> >
> >Using local APIC timer interrupts.
> >calibrating APIC timer ...
> >... lapic delta = 25745109
> >... PM timer delta = 0
> >..... delta 25745109
> >..... mult: 1105912110
> >..... calibration result: 4119217
> >..... CPU clock speed is 8794.0417 MHz.
> >..... host bus clock speed is 4119.0217 MHz.
> >... verify APIC timer
> >... jiffies delta = 103
> >APIC timer disabled due to verification failure.
> >
> >When it doesn't work LAPIC passes the test:
> >
> >[ 1.304717] Using local APIC timer interrupts.
> >[ 1.304719] calibrating APIC timer ...
> >[ 1.718823] ... lapic delta = 25251444
> >[ 1.720582] ... PM timer delta = 0
> >[ 1.722219] ..... delta 25251444
> >[ 1.723827] ..... mult: 1084706136
> >[ 1.725470] ..... calibration result: 4040231
> >[ 1.727374] ..... CPU clock speed is 8625.0780 MHz.
> >[ 1.729396] ..... host bus clock speed is 4040.0231 MHz.
> >[ 1.731540] ... verify APIC timer
> >[ 2.158342] ... jiffies delta = 102
> >[ 2.160035] ... jiffies result ok
> >
> >i586 F7 kernel, with 'nolapic': always works.
> >
>
> Can you check which .config option causes it (a special type of
> bisecting...)?
>
> This looks likely based on your findings:
>
> -CONFIG_X86_ALIGNMENT_16=y
> +CONFIG_X86_GOOD_APIC=y
>  CONFIG_X86_INTEL_USERCOPY=y
> +CONFIG_X86_USE_PPRO_CHECKSUM=y
> +CONFIG_X86_TSC=y
>
> I expect it's not directly related to i586 vs i686.
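A side note on the two calibration logs quoted above. The numbers are consistent with "calibration result = lapic delta * 16 / 100" and with a verification window of 100 +/- 2 jiffies, which would explain why a delta of 103 disables the timer while 102 passes. The sketch below only redoes that arithmetic in plain C. HZ=1000 and the divisor of 16 are my assumptions (neither is printed in the logs), and the helper names are made up for illustration, they are not kernel functions.

```c
/* Assumed values, not taken from the logs: HZ=1000, APIC divisor 16,
 * calibration window of HZ/10 ticks. */
#define HZ              1000L
#define APIC_DIVISOR    16L
#define LAPIC_CAL_LOOPS (HZ / 10)

/* Hypothetical helper: timer counts per jiffy, derived from the
 * "lapic delta" measured over the calibration window. */
static long calibration_result(long lapic_delta)
{
	return lapic_delta * APIC_DIVISOR / LAPIC_CAL_LOOPS;
}

/* Hypothetical helper: the verification passes only if jiffies
 * advanced by LAPIC_CAL_LOOPS, give or take 2. */
static int jiffies_delta_ok(long deltaj)
{
	return deltaj >= LAPIC_CAL_LOOPS - 2 && deltaj <= LAPIC_CAL_LOOPS + 2;
}
```

Feeding it the logged values, calibration_result(25745109) gives 4119217 and calibration_result(25251444) gives 4040231, matching both logs; jiffies_delta_ok(103) fails and jiffies_delta_ok(102) passes. So under these assumptions the "working" boots are the ones that land just outside the tolerance.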
And the winner is... CONFIG_X86_GOOD_APIC ;-)

I think that !GOOD_APIC is a workaround for "11AP erratum in Pentium
Processor Specification Update" aka read-before-write bug.

The config symbol is used in include/asm-i386/apic.h:

#ifdef CONFIG_X86_GOOD_APIC
# define FORCE_READ_AROUND_WRITE 0
# define apic_read_around(x)
# define apic_write_around(x,y) apic_write((x),(y))
#else
# define FORCE_READ_AROUND_WRITE 1
# define apic_read_around(x) apic_read(x)
# define apic_write_around(x,y) apic_write_atomic((x),(y))
#endif

With GOOD_APIC apic_read_around is a nop, while apic_write_around is a
normal write. With !GOOD_APIC apic_write_around writes to the APIC reg
using xchg.

With !GOOD_APIC and this patch:

--- include/asm-i386/apic.h~	2007-04-26 05:08:32.000000000 +0200
+++ include/asm-i386/apic.h	2007-06-13 22:35:00.000000000 +0200
@@ -56,7 +56,8 @@
 static __inline fastcall void native_apic_write_atomic(unsigned long reg,
 						       unsigned long v)
 {
-	xchg((volatile unsigned long *)(APIC_BASE+reg), v);
+//	xchg((volatile unsigned long *)(APIC_BASE+reg), v);
+	*((volatile unsigned long *)(APIC_BASE+reg)) = v;
 }
 
 static __inline fastcall unsigned long native_apic_read(unsigned long reg)

the kernel boots fine.

Luca
-- 
"Sei l'unica donna della mia vita". (Adamo)
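P.S. For anyone who wants to look at the two write paths outside the kernel, here is a rough userspace sketch. It is not the kernel code: fake_apic is a hypothetical array standing in for the memory-mapped APIC page, the function names are invented, and __atomic_exchange_n is the GCC/Clang builtin I'm using as a stand-in for the kernel's xchg(). The point is only that both paths leave the same value in the register; the exchange additionally reads the old value back, i.e. it is a read-modify-write access rather than a plain store. My guess, based solely on the result above, is that this different access type is what trips things up in the guest.

```c
#include <stdint.h>

/* Hypothetical stand-in for the memory-mapped APIC register page. */
static volatile uint32_t fake_apic[1024];

/* Plain store: the shape of apic_write(), and of the patched
 * native_apic_write_atomic(). */
static void write_plain(unsigned int reg, uint32_t v)
{
	fake_apic[reg / 4] = v;
}

/* Exchange: the shape of the unpatched native_apic_write_atomic().
 * Stores v and returns the previous contents, so the access is a
 * read-modify-write instead of a simple write. */
static uint32_t write_xchg(unsigned int reg, uint32_t v)
{
	return __atomic_exchange_n(&fake_apic[reg / 4], v, __ATOMIC_SEQ_CST);
}
```

On real memory the two are indistinguishable apart from the returned old value; on an emulated device every access has to be decoded and replayed by the hypervisor, so the two forms need not be handled by the same code.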