Re: [Qemu-devel] Migrating decrementer

David Gibson Wed, 24 Feb 2016 16:19:27 -0800

On Wed, Feb 24, 2016 at 01:31:05PM +0100, Juan Quintela wrote:
> David Gibson <da...@gibson.dropbear.id.au> wrote:
> > On Tue, Feb 23, 2016 at 09:27:09PM +0000, Mark Cave-Ayland wrote:
> >> On 03/02/16 04:59, David Gibson wrote:
> >> 
> >> >> Going back to your earlier email you suggested that the host timebase is
> >> >> always continuously running, even when the guest is paused. But then
> >> >> on resuming the guest the timebase must jump in the guest regardless?
> >> >>
> >> >> If this is the case then this is the big difference between TCG and KVM
> >> >> guests: TCG timebase is derived from the virtual clock which solves the
> >> >> problem of paused guests during migration. For example with the existing
> >> >> migration code, what would happen if you did a migration with the guest
> >> >> paused on KVM? The offset would surely be wrong as it was calculated at
> >> >> the end of migration.
> >> > 
> >> > So there are two different cases to consider here.  One is when the
> >> > guest is paused incidentally, such as during migration, the other is
> >> > when the guest is explicitly paused.
> >> > 
> >> > In the first case the timebase absolutely should keep running (or
> >> > appear to do so), since it's the primary source of real time for the
> >> > guest.
> >> 
> >> I'm not sure I understand this, since if the guest is paused either
> >> deliberately or incidentally during migration then isn't the timebase
> >> also frozen? Or is it external to the CPU?
> >
> > I don't really understand the question.  Migration has no equivalent
> > in real hardware, so there's no "real" behaviour to mimic.  If we
> > freeze the TB during migration, then the guest's clock will get out of
> > sync with wall clock time, and in a production environment that's
> > really bad.  So no, we absolutely must not freeze the TB during
> > migration.
> >
> > When the guest has been explicitly paused, there's a case to be made
> > either way.
> 
> If this is the case, can't we just change the device to just read the
> clock from the host at device instantiation and call it a day?


That's not quite enough, because although the timebase advances in
real time, it will have an offset from realtime that varies boot to
boot.

> (* Notice that I haven't seen the previous discussion *)
> 
> On migration, having a post-load function that just loads the right
> value for that device should work.  Or if we want to make it work for
> pause/cont, we should have a notifier to be run each time "cont" is
> issued, and put a callback there?

Right.  This is basically what we already do on pseries: in pre_save
we store both the timebase value and the current real time.  In
post_load we again check the real time, look at the difference from
the value in the migration stream and advance the TB to match.
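
To make the arithmetic concrete, here is a rough model of that scheme.
It is only a sketch, not the actual pseries code: the names
(TimebaseState, pre_save, post_load, tb_offset) and the idea that the
guest TB is "host ticks + offset" are assumptions for illustration, and
it assumes source and destination host clocks are synchronised.

```python
from dataclasses import dataclass

NANOSECONDS_PER_SECOND = 1_000_000_000


@dataclass
class TimebaseState:
    """Hypothetical record carried in the migration stream."""
    guest_timebase: int   # guest TB value (ticks) when the state was saved
    time_of_day_ns: int   # source host real time (ns) at the same moment


def pre_save(host_ticks, tb_offset, host_realtime_ns):
    """Capture the guest-visible TB together with the current real time."""
    return TimebaseState(guest_timebase=host_ticks + tb_offset,
                         time_of_day_ns=host_realtime_ns)


def post_load(state, dest_host_ticks, dest_realtime_ns, tb_freq_hz):
    """Return the offset to apply on the destination host.

    The guest TB is advanced by the real time that passed between save
    and load, so the guest's clock stays in step with wall clock time.
    """
    # Assumption: treat a destination clock that reads earlier than the
    # source as "no time elapsed" rather than moving the TB backwards.
    elapsed_ns = max(0, dest_realtime_ns - state.time_of_day_ns)
    elapsed_ticks = elapsed_ns * tb_freq_hz // NANOSECONDS_PER_SECOND
    guest_tb_now = state.guest_timebase + elapsed_ticks
    # The destination host TB has its own arbitrary value, so what we
    # actually store is the difference between the two.
    return guest_tb_now - dest_host_ticks
```

With a 512 MHz timebase and a migration that takes two seconds of real
time, post_load() yields an offset that makes the guest see its saved TB
plus 2 * 512000000 ticks, whatever the destination host's own TB
happens to read.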

> Or am I missing something important?
> 
> >
> >> > In the second case, it's a bit unclear what the right thing to do is.
> >> > Keeping the tb running means accurate realtime, but stopping it is
> >> > often better for debugging, which is one of the main reasons to
> >> > explicitly pause.
> >> > 
> >> > I believe spapr on KVM HV will keep the TB going, but the TSC on x86
> >> > will be stopped.
> >> 
> >> Is this from a guest-centric view, i.e. if I pause a VM and wait 20 mins
> >> then when the guest resumes the timebase will jump forward by 20 mins
> >> worth of ticks?
> >
> > Yes, that's correct.
> 
> I.e. my proposal fixes this?
> If you want to make it really, really "classy", you can look at the mess
> we have on x86 to introduce ticks "slewly" for Windows 95 (and XP?)
> during a while, but I don't think that solution would work for 20mins of
> ticks :p
> 
> Later, Juan.
> 

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson
