Should I understand, then, that fencing is not mandatory, just advised, and that the rest of the cluster sees no downtime while a manual action is pending to confirm that a host has been rebooted so the SPM can move to another live host? Perhaps the only effect is not being able to add new hosts and connect to storage, or something similar?

Fernando

On 17/04/2017 10:06, Nir Soffer wrote:
On Mon, Apr 17, 2017 at 8:24 AM Konstantin Raskoshnyi <[email protected] <mailto:[email protected]>> wrote:

    But actually, it didn't work well. After the main SPM host went
    down I see this:
    Screen Shot 2017-04-16 at 10.22.00 PM.png

    2017-04-17 05:23:15,554Z ERROR
    [org.ovirt.engine.core.vdsbroker.irsbroker.IrsProxy]
    (DefaultQuartzScheduler5) [4dcc033d-26bf-49bb-bfaa-03a970dbbec1]
    SPM Init: could not find reported vds or not up - pool: 'STG'
    vds_spm_id: '1'
    2017-04-17 05:23:15,567Z INFO
     [org.ovirt.engine.core.vdsbroker.irsbroker.IrsProxy]
    (DefaultQuartzScheduler5) [4dcc033d-26bf-49bb-bfaa-03a970dbbec1]
    SPM selection - vds seems as spm 'tank5'
    2017-04-17 05:23:15,567Z WARN
     [org.ovirt.engine.core.vdsbroker.irsbroker.IrsProxy]
    (DefaultQuartzScheduler5) [4dcc033d-26bf-49bb-bfaa-03a970dbbec1]
    spm vds is non responsive, stopping spm selection.

    So that means it is only possible to automatically switch the
    SPM host if the BMC is up?


BMC?

If your SPM is not responsive, the system will try to fence it. Did you
configure power management for all hosts? Did you check that it
works? How did you simulate a non-responsive host?

If power management is not configured or fails, the system cannot
move the SPM to another host, unless you manually confirm that the
SPM host was rebooted.
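
The rule described above can be sketched as a small decision function. This is only an illustration of the behaviour discussed in this thread, not oVirt engine code; the names SpmHostState and can_move_spm are hypothetical and do not exist in oVirt.

```python
from dataclasses import dataclass


@dataclass
class SpmHostState:
    """Hypothetical snapshot of what the engine knows about the SPM host."""
    responsive: bool
    power_mgmt_configured: bool
    fence_succeeded: bool
    manually_confirmed_rebooted: bool


def can_move_spm(host: SpmHostState) -> bool:
    """Return True when the SPM role may be started on another host."""
    if host.responsive:
        # The SPM host is still up, so there is nothing to fail over.
        return False
    if host.power_mgmt_configured and host.fence_succeeded:
        # Fencing proved the host is down, so moving the SPM is safe.
        return True
    # Without a successful fence, only a manual confirmation unblocks it.
    return host.manually_confirmed_rebooted
```

With no power management configured, the function returns False until manually_confirmed_rebooted is set, which matches the "stopping spm selection" message in the log quoted below.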

Nir


    Thanks

    On Sun, Apr 16, 2017 at 8:29 PM, Konstantin Raskoshnyi
    <[email protected] <mailto:[email protected]>> wrote:

        Oh, fence agent works fine if I select ilo4,
        Thank you for your help!

        On Sun, Apr 16, 2017 at 8:22 PM Dan Yasny <[email protected]
        <mailto:[email protected]>> wrote:

            On Sun, Apr 16, 2017 at 11:19 PM, Konstantin Raskoshnyi
            <[email protected] <mailto:[email protected]>> wrote:

                Makes sense.
                I was trying to set it up, but it doesn't work with our
                staging hardware.
                We have old ilo100 hardware; I'll try again.
                Thanks!


            It is absolutely necessary for any HA to work properly.
            There's of course the "confirm host has been shutdown"
            option, which serves as an override for the fence command,
            but it's manual

                On Sun, Apr 16, 2017 at 8:18 PM Dan Yasny
                <[email protected] <mailto:[email protected]>> wrote:

                    On Sun, Apr 16, 2017 at 11:15 PM, Konstantin
                    Raskoshnyi <[email protected]
                    <mailto:[email protected]>> wrote:

                        Fence agent under each node?


                    When you configure a host, there's the power
                    management tab, where you need to enter the bmc
                    details for the host. If you don't have fencing
                    enabled, how do you expect the system to make sure
                    a host running a service is actually down (and it
                    is safe to start HA services elsewhere), and not,
                    for example, just unreachable by the engine? How
                    do you avoid a split brain -> SBA?


                        On Sun, Apr 16, 2017 at 8:14 PM Dan Yasny
                        <[email protected] <mailto:[email protected]>>
                        wrote:

                            On Sun, Apr 16, 2017 at 11:13 PM,
                            Konstantin Raskoshnyi <[email protected]
                            <mailto:[email protected]>> wrote:

                                "Corner cases"?
                                I tried to simulate a crash of the SPM
                                server and oVirt kept trying to
                                re-establish the connection to the
                                failed node.


                            Did you configure fencing?



                                On Sun, Apr 16, 2017 at 8:10 PM Dan
                                Yasny <[email protected]
                                <mailto:[email protected]>> wrote:

                                    On Sun, Apr 16, 2017 at 7:29 AM,
                                    Nir Soffer <[email protected]
                                    <mailto:[email protected]>> wrote:

                                        On Sun, Apr 16, 2017 at 2:05
                                        PM Dan Yasny
                                        <[email protected]
                                        <mailto:[email protected]>> wrote:



                                            On Apr 16, 2017 7:01 AM,
                                            "Nir Soffer"
                                            <[email protected]
                                            <mailto:[email protected]>>
                                            wrote:

                                                On Sun, Apr 16, 2017
                                                at 4:17 AM Dan Yasny
                                                <[email protected]
                                                <mailto:[email protected]>>
                                                wrote:

                                                    When you set up a
                                                    storage domain,
                                                    you need to
                                                    specify a host to
                                                    perform the
                                                    initial storage
                                                    operations, but
                                                    once the SD is
                                                    defined, its
                                                    details are in the
                                                    engine database,
                                                    and all the hosts
                                                    get connected to
                                                    it directly. If
                                                    the first host you
                                                    used to define the
                                                    SD goes down, all
                                                    other hosts will
                                                    still remain
                                                    connected and
                                                    work. SPM is an HA
                                                    service, and if
                                                    the current SPM
                                                    host goes down,
                                                    SPM gets started
                                                    on another host in
                                                    the DC. In short,
                                                    unless your actual
                                                    NFS exporting host
                                                    goes down, there
                                                    is no outage.


                                                There is no storage
                                                outage, but if you
                                                shut down the SPM host,
                                                the SPM role will not
                                                move to a new host until
                                                the SPM host is online
                                                again, or you confirm
                                                manually that the SPM
                                                host was rebooted.


                                            In a properly configured
                                            setup the SBA should take
                                            care of that. That's the
                                            whole point of HA services


                                        In some cases like power loss
                                        or hardware failure, there is
                                        no way to start
                                        the spm host, and the system
                                        cannot recover automatically.


                                    There are always corner cases, no
                                    doubt. But in a normal situation,
                                    where an SPM host goes down
                                    because of a hardware failure, it
                                    gets fenced, other hosts contend
                                    for SPM and start it. No surprises
                                    there.


                                        Nir



                                                Nir


                                                    On Sat, Apr 15,
                                                    2017 at 1:53 PM,
                                                    Konstantin
                                                    Raskoshnyi
                                                    <[email protected]
                                                    <mailto:[email protected]>>
                                                    wrote:

                                                        Hi Fernando,
                                                        I see each host has a direct NFS mount,
                                                        but yes, if the main host to which I
                                                        connected the NFS storage goes down, the
                                                        storage becomes unavailable and all VMs
                                                        are down.


                                                        On Sat, Apr 15, 2017 at 10:37 AM
                                                        FERNANDO FREDIANI
                                                        <[email protected]
                                                        <mailto:[email protected]>>
                                                        wrote:

                                                            Hello Konstantin.

                                                            It doesn't make much sense to make a whole
                                                            cluster depend on a single host. From what I
                                                            know, every host talks directly to the NFS
                                                            storage array or whatever other shared storage
                                                            you have. Have you tested whether that host
                                                            going down affects the other one when NFS is
                                                            mounted directly from an NFS storage array?

                                                            Fernando

                                                            2017-04-15 12:42 GMT-03:00
                                                            Konstantin Raskoshnyi
                                                            <[email protected]
                                                            <mailto:[email protected]>>:

                                                                In oVirt you have to attach storage through a
                                                                specific host. If that host goes down, the
                                                                storage is not available.


                                                                On Sat, Apr 15, 2017 at 7:31 AM
                                                                FERNANDO FREDIANI
                                                                <[email protected]
                                                                <mailto:[email protected]>>
                                                                wrote:

                                                                    Well, don't make it go through host1; dedicate
                                                                    a storage server to running NFS and make both
                                                                    hosts connect to it.
                                                                    In my view NFS is much easier to manage than
                                                                    any other type of storage, especially FC and
                                                                    iSCSI, and performance is pretty much the same,
                                                                    so you won't get better results, other than in
                                                                    management, by going to another type.

                                                                    Fernando

                                                                    2017-04-15 5:25 GMT-03:00
                                                                    Konstantin Raskoshnyi
                                                                    <[email protected]
                                                                    <mailto:[email protected]>>:

                                                                        Hi guys,

                                                                        I have one NFS storage; it's connected through
                                                                        host1. host2 also has access to it, and I can
                                                                        easily migrate VMs between them.

                                                                        The question is: if host1 is down, all
                                                                        infrastructure is down, since all traffic goes
                                                                        through host1. Is there any way in oVirt to use
                                                                        redundant storage?

                                                                        Only GlusterFS?

                                                                        Thanks


                                                                        
                                                                        _______________________________________________
                                                                        Users mailing list
                                                                        [email protected]
                                                                        <mailto:[email protected]>
                                                                        http://lists.ovirt.org/mailman/listinfo/users



                                                        
