On 29.11.18 16:32, Thomas Huth wrote:
> On 2018-11-28 10:59, David Hildenbrand wrote:
>> Just like on other architectures, we should stop the clock while the guest
>> is not running. This is already properly done for TCG. Right now, doing an
>> offline migration (stop, migrate, cont) can easily trigger stalls in the
>> guest.
>>
>> Even doing a
>>     (hmp) stop
>>     ... wait 2 minutes ...
>>     (hmp) cont
>> will already trigger stalls.
>>
>> So whenever the guest stops, backup the KVM TOD. When continuing to run
>> the guest, restore the KVM TOD.
>>
>> One special case is starting a simple VM: Reading the TOD from KVM to
>> stop it right away until the guest is actually started means that the
>> time of any simple VM will already differ to the host time. We can
>> simply leave the TOD running and the guest won't be able to recognize
>> it.
>>
>> For migration, we actually want to keep the TOD stopped until really
>> starting the guest. To be able to catch most errors, we should however
>> try to set the TOD in addition to simply storing it. So we can still
>> catch basic migration problems.
>>
>> If anything goes wrong while backing up/restoring the TOD, we have to
>> ignore it (but print a warning). This is then basically a fallback to
>> old behavior (TOD remains running).
>>
>> I tested this very basically with an initrd:
>> 1. Start a simple VM. Observed that the TOD is kept running. Old
>>    behavior.
>> 2. Ordinary live migration. Observed that the TOD is temporarily
>>    stopped on the destination when setting the new value and
>>    correctly started when finally starting the guest.
>> 3. Offline live migration. (stop, migrate, cont). Observed that the
>>    TOD will be stopped on the source with the "stop" command. On the
>>    destination, the TOD is temporarily stopped when setting the new
>>    value and correctly started when finally starting the guest via
>>    "cont".
>> 4. Simple stop/cont correctly stops/starts the TOD.
>>    (multiple stops or conts in a row have no effect, so works as expected)
>>
>> In the future, we might want to send the guest a special kind of time sync
>> interrupt under some conditions, so it can synchronize its tod to the
>> host tod. This is interesting for migration scenarios but also when we
>> get time sync interrupts ourselves. This however will most probably have
>> to be handled in KVM (e.g. when the tods differ too much) and is not
>> desired e.g. when debugging the guest. (single stepping should not
>> result in permanent time syncs). I consider something like that an add-on
>> on top of this basic "don't break the guest" handling.
>>
>> Signed-off-by: David Hildenbrand <da...@redhat.com>
>> ---
>>
>> v1 -> v2:
>> - Add time sync idea suggested by Christian to description
>> - Drop one unnecessary "return"
>> - Register in realize() and not in instance_init()
>>
>>
>>  hw/s390x/tod-kvm.c     | 91 +++++++++++++++++++++++++++++++++++++++++-
>>  hw/s390x/tod.c         |  5 +++
>>  include/hw/s390x/tod.h |  8 +++-
>>  3 files changed, 101 insertions(+), 3 deletions(-)
>>
>> diff --git a/hw/s390x/tod-kvm.c b/hw/s390x/tod-kvm.c
>> index df564ab89c..87717d0be1 100644
>> --- a/hw/s390x/tod-kvm.c
>> +++ b/hw/s390x/tod-kvm.c
>> @@ -10,10 +10,11 @@
>>
>>  #include "qemu/osdep.h"
>>  #include "qapi/error.h"
>> +#include "sysemu/sysemu.h"
>>  #include "hw/s390x/tod.h"
>>  #include "kvm_s390x.h"
>>
>> -static void kvm_s390_tod_get(const S390TODState *td, S390TOD *tod, Error **errp)
>> +static void kvm_s390_get_tod_raw(S390TOD *tod, Error **errp)
>>  {
>>      int r;
>>
>> @@ -27,7 +28,17 @@ static void kvm_s390_tod_get(const S390TODState *td, S390TOD *tod, Error **errp)
>>      }
>>  }
>>
>> -static void kvm_s390_tod_set(S390TODState *td, const S390TOD *tod, Error **errp)
>> +static void kvm_s390_tod_get(const S390TODState *td, S390TOD *tod, Error **errp)
>> +{
>> +    if (td->stopped) {
>> +        *tod = td->base;
>> +        return;
>> +    }
>> +
>> +    kvm_s390_get_tod_raw(tod, errp);
>> +}
>> +
>> +static void kvm_s390_set_tod_raw(const S390TOD *tod, Error **errp)
>>  {
>>      int r;
>>
>> @@ -41,18 +52,94 @@ static void kvm_s390_tod_set(S390TODState *td, const S390TOD *tod, Error **errp)
>>      }
>>  }
>>
>> +static void kvm_s390_tod_set(S390TODState *td, const S390TOD *tod, Error **errp)
>> +{
>> +    Error *local_err = NULL;
>> +
>> +    /*
>> +     * Somebody (e.g. migration) set the TOD. We'll store it into KVM to
>> +     * properly detect errors now but take a look at the runstate to decide
>> +     * whether really to keep the tod running. E.g. during migration, this
>> +     * is the point where we want to stop the initially running TOD to fire
>> +     * it back up when actually starting the migrated guest.
>> +     */
>> +    kvm_s390_set_tod_raw(tod, &local_err);
>> +    if (local_err) {
>> +        error_propagate(errp, local_err);
>> +        return;
>> +    }
>> +
>> +    if (runstate_is_running()) {
>> +        td->stopped = false;
>> +    } else {
>> +        td->stopped = true;
>> +        td->base = *tod;
>> +    }
>> +}
>> +
>> +static void kvm_s390_tod_vm_state_change(void *opaque, int running,
>> +                                         RunState state)
>> +{
>> +    S390TODState *td = opaque;
>> +    Error *local_err = NULL;
>> +
>> +    if (running && td->stopped) {
>> +        /* Set the old TOD when running the VM - start the TOD clock. */
>> +        kvm_s390_set_tod_raw(&td->base, &local_err);
>> +        if (local_err) {
>> +            warn_report_err(local_err);
>> +        }
>> +        /* Treat errors like the TOD was running all the time. */
>> +        td->stopped = false;
>> +    } else if (!running && !td->stopped) {
>> +        /* Store the TOD when stopping the VM - stop the TOD clock. */
>> +        kvm_s390_get_tod_raw(&td->base, &local_err);
>> +        if (local_err) {
>> +            /* Keep the TOD running in case we could not back it up. */
>> +            warn_report_err(local_err);
>> +        } else {
>> +            td->stopped = true;
>> +        }
>> +    }
>> +}
>> +
>> +static void kvm_s390_tod_realize(S390TODState *td, Error **errp)
>> +{
>> +    /*
>> +     * We need to know when the VM gets started/stopped to start/stop the TOD.
>> +     * As we can never have more than one TOD instance (and that will never be
>> +     * removed), registering here and never unregistering is good enough.
>> +     */
>> +    qemu_add_vm_change_state_handler(kvm_s390_tod_vm_state_change, td);
>> +}
>> +
>>  static void kvm_s390_tod_class_init(ObjectClass *oc, void *data)
>>  {
>>      S390TODClass *tdc = S390_TOD_CLASS(oc);
>>
>> +    tdc->realize = kvm_s390_tod_realize;
>>      tdc->get = kvm_s390_tod_get;
>>      tdc->set = kvm_s390_tod_set;
>>  }
>>
>> +static void kvm_s390_tod_init(Object *obj)
>> +{
>> +    S390TODState *td = S390_TOD(obj);
>> +
>> +    /*
>> +     * The TOD is initially running (value stored in KVM). Avoid needless
>> +     * loading/storing of the TOD when starting a simple VM, so let it
>> +     * run although the (never started) VM is stopped. For migration, we
>> +     * will properly set the TOD later.
>> +     */
>> +    td->stopped = false;
>> +}
>> +
>>  static TypeInfo kvm_s390_tod_info = {
>>      .name = TYPE_KVM_S390_TOD,
>>      .parent = TYPE_S390_TOD,
>>      .instance_size = sizeof(S390TODState),
>> +    .instance_init = kvm_s390_tod_init,
>>      .class_init = kvm_s390_tod_class_init,
>>      .class_size = sizeof(S390TODClass),
>>  };
>> diff --git a/hw/s390x/tod.c b/hw/s390x/tod.c
>> index 1c63f411e6..82ea6554ba 100644
>> --- a/hw/s390x/tod.c
>> +++ b/hw/s390x/tod.c
>> @@ -97,9 +97,14 @@ static SaveVMHandlers savevm_tod = {
>>  static void s390_tod_realize(DeviceState *dev, Error **errp)
>>  {
>>      S390TODState *td = S390_TOD(dev);
>> +    S390TODClass *tdc = S390_TOD_GET_CLASS(td);
>>
>>      /* Legacy migration interface */
>>      register_savevm_live(NULL, "todclock", 0, 1, &savevm_tod, td);
>> +
>> +    if (tdc->realize) {
>> +        tdc->realize(td, errp);
>> +    }
>>  }
>
> I think the more usual way to deal with this, is to use
> device_class_set_parent_realize() in the child's class init function,
> and then to call ->parent_realize() from the child's realize function.
>
> I guess it doesn't really matter in the end which way we do it here, but
> it would be nice to be consistent with the other devices, so I'd prefer
> to use device_class_set_parent_realize() for this here, too.
>
> Apart from that, the patch looks fine to me.
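
The chaining Thomas describes can be sketched outside QEMU roughly as follows. This is a minimal standalone model, not QEMU code: the types `Device` and `DeviceClass`, the helper `set_parent_realize()` and the `failed` flag are hypothetical stand-ins for QEMU's `DeviceState`, `DeviceClass`, `device_class_set_parent_realize()` and `Error **errp`. In QEMU the saved parent pointer usually lives in the child's own class struct; it is kept in the one class struct here only to keep the sketch short.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for QEMU's device types; not the real definitions. */
typedef struct Device Device;
typedef void (*RealizeFn)(Device *dev, bool *failed);

typedef struct DeviceClass {
    RealizeFn realize;         /* hook the device core would call */
    RealizeFn parent_realize;  /* parent's hook, saved by the child class */
} DeviceClass;

struct Device {
    DeviceClass *klass;
    int parent_realized;       /* counters so the call order can be observed */
    int child_realized;
};

/*
 * Models the idea behind device_class_set_parent_realize(): remember the
 * parent's realize hook, then install the child's hook in its place.
 */
static void set_parent_realize(DeviceClass *dc, RealizeFn child,
                               RealizeFn *saved)
{
    *saved = dc->realize;
    dc->realize = child;
}

/* Parent realize: e.g. the place where common TOD setup would happen. */
static void parent_realize(Device *dev, bool *failed)
{
    (void)failed;
    dev->parent_realized++;
}

/* Child realize: chain to the parent first, then do the child-only work. */
static void child_realize(Device *dev, bool *failed)
{
    assert(dev->klass->parent_realize != NULL);
    dev->klass->parent_realize(dev, failed);
    if (*failed) {
        return;                /* do not continue after a parent error */
    }
    dev->child_realized++;
}

/* Child class init: the only place that wires the two hooks together. */
static void child_class_init(DeviceClass *dc)
{
    set_parent_realize(dc, child_realize, &dc->parent_realize);
}
```

Compared with the optional `tdc->realize` callback in the patch, the parent needs no knowledge of its children in this arrangement; the child decides when the parent's realize runs.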
Alright, I will resend tomorrow with your suggested change. Thanks!

>
>  Thomas
>

--

Thanks,

David / dhildenb
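
The stop/cont behaviour laid out in the patch description can be modelled in a few lines of standalone C. This is a simplified sketch under stated assumptions, not the patch itself: `fake_kvm_tod`, `TodModel`, `host_time_passes()` and `guest_drift_over_pause()` are hypothetical names, the TOD is reduced to a single 64-bit counter, and all KVM error handling is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the TOD kept in KVM; it follows host time. */
static uint64_t fake_kvm_tod;

typedef struct TodModel {
    bool stopped;    /* mirrors td->stopped in the patch */
    uint64_t base;   /* mirrors td->base: value backed up when stopping */
} TodModel;

/* Host time keeps advancing whether or not the guest runs. */
static void host_time_passes(uint64_t ticks)
{
    fake_kvm_tod += ticks;
}

/* While stopped, report the backed-up value instead of the running clock. */
static uint64_t tod_get(const TodModel *td)
{
    return td->stopped ? td->base : fake_kvm_tod;
}

/* Same two transitions as the VM state change handler in the patch. */
static void vm_state_change(TodModel *td, bool running)
{
    if (running && td->stopped) {
        fake_kvm_tod = td->base;   /* restore: the clock resumes from base */
        td->stopped = false;
    } else if (!running && !td->stopped) {
        td->base = fake_kvm_tod;   /* back up: the clock appears frozen */
        td->stopped = true;
    }
    /* Repeated stops or conts fall through and change nothing. */
}

/* How far the guest-visible TOD moved across one stop/wait/cont cycle. */
static uint64_t guest_drift_over_pause(uint64_t pause_ticks)
{
    TodModel td = { .stopped = false, .base = 0 };
    uint64_t before = tod_get(&td);

    vm_state_change(&td, false);
    assert(td.stopped);
    host_time_passes(pause_ticks);
    vm_state_change(&td, true);
    return tod_get(&td) - before;
}
```

In this model the guest-visible clock does not move across the pause, which is the property that avoids the stalls mentioned in the description.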