On Wed, Feb 24, 2016 at 01:31:05PM +0100, Juan Quintela wrote:
> David Gibson <da...@gibson.dropbear.id.au> wrote:
> > On Tue, Feb 23, 2016 at 09:27:09PM +0000, Mark Cave-Ayland wrote:
> >> On 03/02/16 04:59, David Gibson wrote:
> >>
> >> >> Going back to your earlier email you suggested that the host timebase is
> >> >> always continuously running, even when the guest is paused. But then
> >> >> resuming the guest then the timebase must jump in the guest regardless?
> >> >>
> >> >> If this is the case then this is the big difference between TCG and KVM
> >> >> guests: TCG timebase is derived from the virtual clock which solves the
> >> >> problem of paused guests during migration. For example with the existing
> >> >> migration code, what would happen if you did a migration with the guest
> >> >> paused on KVM? The offset would surely be wrong as it was calculated at
> >> >> the end of migration.
> >> >
> >> > So there are two different cases to consider here. One is when the
> >> > guest is paused incidentally, such as during migration, the other is
> >> > when the guest is explicitly paused.
> >> >
> >> > In the first case the timebase absolutely should keep running (or
> >> > appear to do so), since it's the primary source of real time for the
> >> > guest.
> >>
> >> I'm not sure I understand this, since if the guest is paused either
> >> deliberately or incidentally during migration then isn't the timebase
> >> also frozen? Or is it external to the CPU?
> >
> > I don't really understand the question. Migration has no equivalent
> > in real hardware, so there's no "real" behaviour to mimic. If we
> > freeze the TB during migration, then the guest's clock will get out of
> > sync with wall clock time, and in a production environment that's
> > really bad. So no, we absolutely must not freeze the TB during
> > migration.
> >
> > When the guest has been explicitly paused, there's a case to be made
> > either way.
>
> If this is the case, can't we just change the device to just read the
> clock from the host at device instantiation and call it a day?
That's not quite enough, because although the timebase advances in
real time, it will have an offset from realtime that varies boot to
boot.

> (* Notice that I haven't seen the previous discussion *)
>
> On migration, having a post-load function that just loads the right
> value for that device should work. Or if we want to make it work for
> pause/cont, we should have a notifier to be run each time "cont" is
> issued, and put a callback there?

Right. This is basically what we already do on pseries: in pre_save
we store both the timebase value and the current real time. In
post_load we again check the real time, look at the difference from
the value in the migration stream and advance the TB to match.

> Or I am missing something important?
>
> >
> >> > In the second case, it's a bit unclear what the right thing to do is.
> >> > Keeping the tb running means accurate realtime, but stopping it is
> >> > often better for debugging, which is one of the main reasons to
> >> > explicitly pause.
> >> >
> >> > I believe spapr on KVM HV will keep the TB going, but the TSC on x86
> >> > will be stopped.
> >>
> >> Is this from a guest-centric view, i.e. if I pause a VM and wait 20 mins
> >> then when the guest resumes the timebase will jump forward by 20 mins
> >> worth of ticks?
> >
> > Yes, that's correct.
>
> I.e. my proposal fixes this?
> If you want to make it really, really "classy", you can look at the mess
> we have on x86 to introduce ticks "slewly" for Windows 95 (and XP?)
> during a while, but I don't think that solution would work for 20mins of
> ticks :p
>
> Later, Juan.
>

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson
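
P.S. For anyone following along, the pre_save/post_load adjustment
described above can be sketched roughly as follows. This is only an
illustration of the arithmetic, written in Python rather than QEMU's C.
The names (TimebaseSnapshot, pre_save, post_load as plain functions),
the 512 MHz tick rate and the way host time is passed in are all
assumptions made for the example, not the actual spapr code. It also
assumes the source and destination hosts have reasonably synchronised
clocks.

```python
from dataclasses import dataclass

NANOSECONDS_PER_SECOND = 1_000_000_000
TB_FREQ_HZ = 512_000_000  # assumed timebase frequency, for illustration only


@dataclass
class TimebaseSnapshot:
    """Hypothetical migration-stream record; not QEMU's real layout."""
    guest_timebase: int  # guest-visible TB ticks at save time
    host_time_ns: int    # host real time at save time, in nanoseconds


def pre_save(guest_timebase: int, host_time_ns: int) -> TimebaseSnapshot:
    """Record the TB value together with the real time it was sampled at."""
    return TimebaseSnapshot(guest_timebase, host_time_ns)


def post_load(snap: TimebaseSnapshot, host_time_ns: int) -> int:
    """Return the TB value the guest should see on the destination.

    The real time that passed between save and load is converted to
    ticks and added to the saved TB. A negative difference (destination
    clock behind the source) is clamped to zero so the TB never goes
    backwards.
    """
    elapsed_ns = max(0, host_time_ns - snap.host_time_ns)
    elapsed_ticks = elapsed_ns * TB_FREQ_HZ // NANOSECONDS_PER_SECOND
    return snap.guest_timebase + elapsed_ticks


# Example: the guest is saved at t = 5s and resumed at t = 7s.
snap = pre_save(guest_timebase=1_000_000, host_time_ns=5 * NANOSECONDS_PER_SECOND)
resumed_tb = post_load(snap, host_time_ns=7 * NANOSECONDS_PER_SECOND)
print(resumed_tb - snap.guest_timebase)  # ticks added for 2 seconds of downtime
```

With the assumed 512 MHz rate, two seconds of downtime advances the TB
by 1,024,000,000 ticks. The same calculation would apply to a "cont"
notifier if we decided the TB should also track real time across an
explicit pause.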