On Fri, Jan 09, 2009 at 01:57:30PM +0100, Alexander Graf wrote:
> Sheng Yang wrote:
> > On Friday 09 January 2009 00:36:06 Alexander Graf wrote:
> >   
> >> While booting Linux in VMware ESX, I encountered a strange effect
> >> in the in-kernel lapic implementation: time went backwards!
> >>
> >> While this should never occur, because of that the while loop that
> >> is done after the invalid calculations caused my host system to hang.
> >>
> >> In order to make debugging easier, let's replace this as suggested
> >> with a modulo function and not run into the danger of looping forever.
> >>
> >> To replace the nice hint this bug gave me that the values are broken,
> >> I added a printk message so people encountering this can at least
> >> see that something is fishy.
> >>
> >> Of course, the real issue needs to be fixed as well! I'm open to ideas
> >> why now < last_update!
> >>
> >> (Thanks to Kevin for his help in debugging this)
> >>
> >> Signed-off-by: Alexander Graf <ag...@suse.de>
> >> Signed-off-by: Kevin Wolf <kw...@suse.de>
> >> ---
> >>     
> >
> > Hi Alexander
> >
> > I'm a little suspect here:
> >
> >   
> >>    if (unlikely(ktime_to_ns(now) <=
> >>            ktime_to_ns(apic->timer.last_update))) {
> >>            /* Wrap around */
> >>            passed = ktime_add(( {
> >>                                (ktime_t) {
> >>                                .tv64 = KTIME_MAX -
> >>                                (apic->timer.last_update).tv64}; }
> >>                               ), now);
> >>            apic_debug("time elapsed\n");
> >>    } else
> >>            passed = ktime_sub(now, apic->timer.last_update);
> >>     
> >
> > And now apic timer base is hr_timer with CLOCK_MONOTONIC, and get_time() is 
> > really ktime_get() which is almost impossible to wrap around. If it's 
> > overflow, at least we need a warning. I think this piece of code due to 
> > clock 
> > source change.
> >
> > So I doubt: due to some reason, now <= apic->timer.last_update, which cause 
> > a 
> > big wrap around operation.
> >
> > And the most suspect:
> >
> >   
> >> void kvm_apic_timer_intr_post(struct kvm_vcpu *vcpu, int vec)
> >> {
> >>    struct kvm_lapic *apic = vcpu->arch.apic;
> >>
> >>    if (apic && apic_lvt_vector(apic, APIC_LVTT) == vec)
> >>            apic->timer.last_update = ktime_add_ns(
> >>                            apic->timer.last_update,
> >>                            apic->timer.period);
> >> }
> >>     
> >
> > Not sure what's happening, have you tried this? (In fact, I am little 
> > willing 
> > to replace all apic->timer.dev.base->get_time() with more explicit 
> > ktime_get() 
> > as in pit.)
> >   
> 
> Yes, this code is the culprit. Using that patch of yours now is always >
> last_update. I'd still like to see that while loop go away ;-).

(I would say all in one mail, but it always comes a little later... :) )

I think at least part of comment for the while loop should be kept...

-- 
regards
Yang, Sheng     |Intel Opensource Technology Center
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Reply via email to