On 04/04/2014 03:21 PM, Koen Vanoppen wrote:
So... Is a fully automatic migration of the VM to another hypervisor
possible in case the storage connection fails?
How can we make this happen? For the moment, when we tested this
situation, the VMs stayed in the paused state.
(Test situation:

  * Unplug the 2 fibre cables from the hypervisor
  * VMs go into the paused state
  * VMs stay paused until the failure is resolved

)

The KVM team advised that this would be an unsafe migration. IIRC, I/O can be stuck at the kernel level, with writes still pending to the storage, and those writes would cause corruption if the storage recovers while the VM is already running on another machine.
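
The rule discussed in this thread can be sketched as a small decision function. This is only a toy model under stated assumptions, not oVirt engine code: the names VmState, Action, Vm and decide are hypothetical, and "source_host_confirmed_down" stands for either a successful power-management fence or a manual confirmation that the source host was rebooted.

```python
from dataclasses import dataclass
from enum import Enum


class VmState(Enum):
    UP = "up"
    PAUSED_IO_ERROR = "paused_io_error"


class Action(Enum):
    NONE = "none"
    KEEP_PAUSED = "keep_paused"
    RESTART_ELSEWHERE = "restart_elsewhere"


@dataclass
class Vm:
    name: str
    state: VmState
    highly_available: bool


def decide(vm: Vm, source_host_confirmed_down: bool) -> Action:
    """Hypothetical policy: what to do with a VM on a host that lost storage."""
    # A VM that is not paused on an I/O error needs no special handling here.
    if vm.state is not VmState.PAUSED_IO_ERROR:
        return Action.NONE
    # Writes may still be queued in the source host's kernel. Only once that
    # host is known to be down (fenced or manually confirmed as rebooted) can
    # those writes no longer reach the storage, so an HA VM may run elsewhere.
    if vm.highly_available and source_host_confirmed_down:
        return Action.RESTART_ELSEWHERE
    # Otherwise leave the VM paused until the storage path is restored.
    return Action.KEEP_PAUSED
```

In this model the test case from the question (both fibre paths unplugged, host still powered on and not confirmed down) maps to KEEP_PAUSED, which matches the observed behaviour.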



They only resumed when we restored the fiber connection to the
hypervisor...

Kind Regards,

Koen



2014-04-04 13:52 GMT+02:00 Koen Vanoppen <[email protected]>:

    2014-04-03 16:53 GMT+02:00 Koen Vanoppen <[email protected]>:

        ---------- Forwarded message ----------
        From: "Doron Fediuck" <[email protected]>
        Date: Apr 3, 2014 4:51 PM
        Subject: Re: [Users] HA
        To: "Koen Vanoppen" <[email protected]>
        Cc: "Omer Frenkel" <[email protected]>, <[email protected]>,
        "Federico Simoncelli" <[email protected]>, "Allon Mureinik"
        <[email protected]>



        ----- Original Message -----
         > From: "Koen Vanoppen" <[email protected]>
         > To: "Omer Frenkel" <[email protected]>, [email protected]
         > Sent: Wednesday, April 2, 2014 4:17:36 PM
         > Subject: Re: [Users] HA
         >
         > Yes, indeed. I meant not-operational. Sorry.
         > So, if I understand this correctly: if we ever end up in a
         > situation where we lose both storage connections on our
         > hypervisor, we will have to manually restore the connections first?
         >
         > And thanks for the tip for speeding things up :-).
         >
         > Kind regards,
         >
         > Koen
         >
         >
         > 2014-04-02 15:14 GMT+02:00 Omer Frenkel <[email protected]>:
         >
         >
         >
         >
         >
         > ----- Original Message -----
         > > From: "Koen Vanoppen" <[email protected]>
         > > To: [email protected]
         > > Sent: Wednesday, April 2, 2014 4:07:19 PM
         > > Subject: [Users] HA
         > >
         > > Dear All,
         > >
         > > During our acceptance testing, we discovered something. (Document will
         > > follow.)
         > > When we disable one fiber path, no problem: multipath finds its way and
         > > no pings are lost.
         > > BUT when we disabled both fiber paths (so one of the storage domains is
         > > gone on this host, but still available on the other host), the VMs go into
         > > paused mode... It chooses a new SPM (can we speed this up?), puts the host
         > > in non-responsive (can we speed this up? more important) and the VMs stay
         > > in paused mode... I would expect that they would be migrated (yes, HA is
         >
         > I guess you mean the host moves to not-operational (in contrast to
         > non-responsive)?
         > If so, the engine will not migrate VMs that are paused due to an I/O error,
         > because of the data corruption risk.
         >
         > To speed this up you can look at the storage domain monitoring timeout:
         > engine-config --get StorageDomainFailureTimeoutInMinutes
         >
         >
         > > enabled) to the other host and rebooted there... Any solution? We are
         > > still using oVirt 3.3.1, but we are planning an upgrade to 3.4 after the
         > > Easter holiday.
         > >
         > > Kind Regards,
         > >
         > > Koen
         > >

        Hi Koen,
        Resuming from paused due to I/O issues is supported (adding
        relevant folks).
        Regardless, if you did not define power management, you should
        manually confirm that the source host was rebooted in order for
        migration to proceed. Otherwise we risk a split-brain scenario.

        Doron





_______________________________________________
Users mailing list
[email protected]
http://lists.ovirt.org/mailman/listinfo/users

