On 21/05/2013, at 4:19 PM, Nikita Michalko <[email protected]> wrote:

> 
> 
> On Tuesday, 21 May 2013 00:00:03, DaveW wrote:
>> We are running heartbeat 2.1.3 on CentOS 5.4.  Last Monday AM, I
> 
> - Man, so OLD! Any chance of updating to the latest version?

In haresources mode, there is very little difference between 2.1.3 and the latest version.

> 
> 
> Nikita Michalko    
> 
>> received a call while getting ready for work.  Our high availability
>> server was not responding.  The previous Saturday, our I.T. admins had
>> re-configured the network to expand IP address ranges on some subnets.
>> For whatever reason, this action caused our main server (in a two-node
>> HA configuration) to lose its virtual interface, rendering our
>> high-availability server unavailable.
>> 
>> The network worked fine; the nodes could ping each other based on their
>> normal IPs and they could ping the ping node, but the virtual IP (the
>> one we REALLY care about) was ignored.  Nothing in the logs, no errors,
>> nothing.   Just an unresponsive virtual server.  A manual fail-over
>> brought it back quickly as the backup took over.  I.T. had done their
>> work on Sat and, had I checked our server on Sunday, I would have found
>> it "unreachable" with a normal ping.
>> 
>> When my colleague called me, I asked him what "ifconfig" looked like.
>> He described three interfaces; eth0, eth1 and lo; no eth0:0. I had him
>> initiate the manual fail-over.
>> 
>> After poring over the logs, unable to find anything that indicated a
>> problem, I tried to simulate the problem with "ifconfig eth0:0 down".
>> Sure enough, no fail-over, no errors, nothing; just (once again) an
>> unresponsive server.  "ifconfig eth0:0 <IP_ADDRESS> up" brought it right
>> back (I tried this last Saturday, BTW, when no one was working).  It
>> seems that heartbeat (ipfail?) creates this virtual interface when it
>> starts, then forgets about it.  I presume that the assumption is that if
>> eth0 remains intact, eth0:0 will remain intact, as well.
>> 
>> Am I missing something in the configuration settings or docs?  I find
>> nothing about configuring the backup node to monitor the virtual
>> address, just the other node (which has a different IP and kept working
>> after the network changes).  I am about to set up a service to monitor
>> the virtual IP, but I wanted to check with the list, first, to see if
>> there's already been something built in that I have not configured
>> correctly.  I have used main.company.com and backup.company.com as the
>> two hostnames of the nodes.  Both systems have these names in an
>> /etc/hosts file, along with the hostname and IP of the virtual server
>> and the ping node.
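
A monitoring service of the kind described above could start from something
like the following. This is only a minimal sketch, not anything built into
heartbeat: it assumes Python 3.7+ and the iproute "ip" command are available
on the node, and the function names address_present and virtual_ip_is_up are
made up for illustration. It checks whether the virtual IP is still
configured on a local interface and exits non-zero when it is not; what to do
about that (alerting, a manual fail-over, re-adding the address) is left open.

```python
import subprocess
import sys

# Virtual IP taken from the IPaddr:: entry in the haresources file quoted in this thread.
VIRTUAL_IP = "129.196.140.14"


def address_present(ip_addr_output, address):
    """Return True if `address` is listed as an inet address in the
    text produced by `ip -o -4 addr show`."""
    for line in ip_addr_output.splitlines():
        fields = line.split()
        # A typical line: "2: eth0 inet 129.196.140.14/24 brd ... secondary eth0:0"
        if "inet" in fields:
            idx = fields.index("inet")
            if idx + 1 < len(fields):
                # Strip the prefix length and compare the whole address,
                # so 129.196.140.1 does not match 129.196.140.14.
                if fields[idx + 1].split("/")[0] == address:
                    return True
    return False


def virtual_ip_is_up(address=VIRTUAL_IP):
    """Ask the kernel for its IPv4 addresses and look for the virtual IP."""
    output = subprocess.run(
        ["ip", "-o", "-4", "addr", "show"],
        capture_output=True, text=True, check=True,
    ).stdout
    return address_present(output, address)


if __name__ == "__main__":
    # Exit status 0 if the virtual IP is configured locally, 1 otherwise.
    sys.exit(0 if virtual_ip_is_up() else 1)
```

Run from cron or a small loop on whichever node currently holds the
resources, the exit status can drive whatever alert or recovery step you
prefer.
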
>> 
>> My configuration:
>> 
>> /etc/ha.d/ha.cf:
>> 
>> debugfile /var/log/ha-debug
>> logfile    /var/log/ha-log
>> logfacility    local0
>> keepalive 2
>> deadtime 10
>> warntime 3
>> initdead 120
>> udpport    694
>> baud    9600
>> serial    /dev/ttyS0
>> ucast eth1 10.0.0.1
>> ucast eth1 10.0.0.2
>> auto_failback off
>> node main.company.com backup.company.com
>> ping 129.196.140.130
>> respawn hacluster /usr/lib/heartbeat/ipfail
>> deadping 10
>> 
>> /etc/ha.d/haresources
>> 
>> main.company.com drbddisk::drbd_resource_0
>> Filesystem::/dev/drbd0::/usr0::ext3 mysql IPaddr::129.196.140.14 httpd
>> smb MailTo::root
>> 
>> 
>> 
>> 
>> _______________________________________________
>> Linux-HA mailing list
>> [email protected]
>> http://lists.linux-ha.org/mailman/listinfo/linux-ha
>> See also: http://linux-ha.org/ReportingProblems
>> 
