Hi,

Thanks a lot for your answer.

[...]
* Why does the VM have to be recreated? The disk image lies on shared
storage (RBD) and should only be started on another host, not
recreated.

Any other process will try to contact the failing host, so the only
possible path is to recreate the VM. Note that these operations are
agnostic of the underlying infrastructure, so they should work on RBD
or on simple storage shared through SSH cp's.

That said, it seems that we need to modify the Ceph datastore to
check whether the volume exists before trying to create a new one, so
that this use case is fully supported.

http://dev.opennebula.org/issues/2324
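As a rough sketch of what such a check might look like (assuming a shell-based datastore driver, which is how OpenNebula's Ceph drivers are written; the `rbd info`/`rbd create` calls are the real CLI, but the `ensure_image` wrapper and its name are purely illustrative):

```shell
# Hypothetical sketch of the idempotent-create logic issue #2324 asks for:
# only create the RBD image if it does not already exist, so that a
# recreated VM can reuse the volume left behind by the failed one.
ensure_image() {
    pool="$1"; image="$2"; size_mb="$3"

    # "rbd info" exits non-zero if the image is missing.
    if rbd info "$pool/$image" >/dev/null 2>&1; then
        echo "reusing existing image $pool/$image"
    else
        rbd create "$pool/$image" --size "$size_mb"
    fi
}
```

With that guard in place, "onevm delete 66 --recreate" would no longer abort with "rbd image one-5-66-0 already exists".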

Yes, that's a good idea. ONE should check this... If you can tell me where I need to look for this in the source, maybe I would be able to contribute to this.

* The VM now has the state "FAILED". How is the VM supposed to be
recovered?

You can try delete --recreate.

I did a new test: I disabled the default FT hook and powered off one of the KVM hosts while one VM was running on it. The VM then went into the UNKNOWN state.

In this state we can only issue "onevm boot 66", which tries to start the VM on the failed node again, or "onevm delete 66 --recreate", which would delete the VM and recreate it (but fails with "rbd image one-5-66-0 already exists"). Neither command is able to start the same VM on another host in the cluster, and I can't find any other command to get this VM up and running again.

The best thing would be to simply reschedule the VM so that it can be started on one of the remaining hosts. Of course, ticket #2324 needs to be done first so that the existing volume on the shared storage (RBD in my case) is reused. What do you think about this?
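For reference, the two recovery attempts described above could be wrapped like this (a hypothetical sketch: the `onevm` subcommands are the real CLI discussed in this thread, but the `recover_vm` helper and its fallback logic are only illustrative):

```shell
# Hypothetical helper around the only two commands available for a VM
# in the UNKNOWN state (VM id 66 in the test above).
recover_vm() {
    vmid="$1"

    # "onevm boot" only retries the VM on the *failed* host, so it
    # helps only if that host comes back:
    onevm boot "$vmid" && return 0

    # Fall back to delete --recreate; on a Ceph datastore this
    # currently fails with "rbd image ... already exists" until
    # issue #2324 is addressed:
    onevm delete "$vmid" --recreate
}
```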

Cheers,
Tobias

_______________________________________________
Users mailing list
Users@lists.opennebula.org
http://lists.opennebula.org/listinfo.cgi/users-opennebula.org