On Fri, Jan 09, 2009 at 01:57:30PM +0100, Alexander Graf wrote: > Sheng Yang wrote: > > On Friday 09 January 2009 00:36:06 Alexander Graf wrote: > > > >> While booting Linux in VMware ESX, I encountered a strange effect > >> in the in-kernel lapic implementation: time went backwards! > >> > >> While this should never occur, because of that the while loop that > >> is done after the invalid calculations caused my host system to hang. > >> > >> In order to make debugging easier, let's replace this as suggested > >> with a modulo function and not run into the danger of looping forever. > >> > >> To replace the nice hint this bug gave me that the values are broken, > >> I added a printk message so people encountering this can at least > >> see that something is fishy. > >> > >> Of course, the real issue needs to be fixed as well! I'm open to ideas > >> why now < last_update! > >> > >> (Thanks to Kevin for his help in debugging this) > >> > >> Signed-off-by: Alexander Graf <ag...@suse.de> > >> Signed-off-by: Kevin Wolf <kw...@suse.de> > >> --- > >> > > > > Hi Alexander > > > > I'm a little suspect here: > > > > > >> if (unlikely(ktime_to_ns(now) <= > >> ktime_to_ns(apic->timer.last_update))) { > >> /* Wrap around */ > >> passed = ktime_add(( { > >> (ktime_t) { > >> .tv64 = KTIME_MAX - > >> (apic->timer.last_update).tv64}; } > >> ), now); > >> apic_debug("time elapsed\n"); > >> } else > >> passed = ktime_sub(now, apic->timer.last_update); > >> > > > > And now apic timer base is hr_timer with CLOCK_MONOTONIC, and get_time() is > > really ktime_get() which is almost impossible to wrap around. If it's > > overflow, at least we need a warning. I think this piece of code due to > > clock > > source change. > > > > So I doubt: due to some reason, now <= apic->timer.last_update, which cause > > a > > big wrap around operation. > > > > And the most suspect: > > > > > >> void kvm_apic_timer_intr_post(struct kvm_vcpu *vcpu, int vec) > >> { > >> struct kvm_lapic *apic = vcpu->arch.apic; > >> > >> if (apic && apic_lvt_vector(apic, APIC_LVTT) == vec) > >> apic->timer.last_update = ktime_add_ns( > >> apic->timer.last_update, > >> apic->timer.period); > >> } > >> > > > > Not sure what's happening, have you tried this? (In fact, I am little > > willing > > to replace all apic->timer.dev.base->get_time() with more explicit > > ktime_get() > > as in pit.) > > > > Yes, this code is the culprit. Using that patch of yours now is always > > last_update. I'd still like to see that while loop go away ;-).
(I would say all in one mail, but it always comes a little later... :) ) I think at least part of comment for the while loop should be kept... -- regards Yang, Sheng |Intel Opensource Technology Center -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html