When OpenNebula creates a checkpoint file as part of either a onevm migrate or a onevm suspend, which libvirt function does it call to do the checkpoint?
We are seeing issues on our new Ivy Bridge hardware. Sometimes, during a (non-live) migration, the clock gets confused in such a way that when the virtual machine starts from the checkpoint file it hangs, and the kvm process uses 100% of CPU for a day or more before it usually resolves itself. In some cases we see the clock jump very far into the future (the year 2598), which in itself can confuse a Linux VM enough to hang it.

Any clues on what OpenNebula/libvirt are doing under the covers?

Is there any reason to suspect that on Ivy Bridge hardware, where some 60 different CPU frequencies are available for CPU scaling, the rapidly fluctuating clock speeds might get us into trouble, i.e. suspending the machine at one clock frequency and bringing it back at a different one?

Does anyone have experience migrating between hardware generations (Ivy Bridge -> Westmere and vice versa)?

Finally, has anyone run a successful combination of kernel 3.10 or greater and RHEL 6/CentOS 6/Scientific Linux 6? In particular, do the stock versions of libvirt and qemu-kvm play nicely with the 3.10 kernel? The 2.6.32 kernel that comes with RHEL 6/CentOS 6/Scientific Linux 6 is just not up to dealing with virtualization on Ivy Bridge machines, and it has some trouble on Sandy Bridge too.

Thanks,
Steve Timm

------------------------------------------------------------------
Steven C. Timm, Ph.D  (630) 840-8525
[email protected]  http://home.fnal.gov/~timm/
Fermilab Scientific Computing Division, Scientific Computing Services Quad.
Grid and Cloud Services Dept., Associate Dept. Head for Cloud Computing
