On Thu, Jan 28, 2010 at 3:51 PM, Eric Blau <[email protected]> wrote:
> Hi Linux HA list,
>
> I'm having this same problem that was reported previously with 2 servers
> paired up that are not communicating with each other.  Each shows the other
> as offline in crm_mon.  I'm running Linux HA 2.1.4 in CRM mode.  I see these
> messages in the log file:
>
> cib[20479]: 2010/01/28_08:58:25 info: write_cib_contents: Wrote version
> 0.601.1 of the CIB to disk (digest: 22cd418a378a5ee22c1cc6347fa69817)
> cib[18546]: 2010/01/28_08:58:25 WARN: cib_peer_callback: Discarding
> cib_apply_diff message (732b9) from so1b: not in our membership
> cib[18546]: 2010/01/28_08:58:25 WARN: cib_peer_callback: Discarding
> cib_apply_diff message (732bb) from so1b: not in our membership
> cib[20479]: 2010/01/28_08:58:25 info: retrieveCib: Reading cluster
> configuration from: /var/lib/heartbeat/crm/cib.xml (digest:
> /var/lib/heartbeat/crm/cib.xml.sig)
> cib[20479]: 2010/01/28_08:58:25 info: retrieveCib: Reading cluster
> configuration from: /var/lib/heartbeat/crm/cib.xml.last (digest:
> /var/lib/heartbeat/crm/cib.xml.sig.last)
> cib[18546]: 2010/01/28_08:58:26 WARN: cib_peer_callback: Discarding
> cib_apply_diff message (732c7) from so1b: not in our membership
>
> Each server appears to be rejecting the other from membership.  They were
> working fine and arbitrating an IPaddr2 resource before a split brain
> occurred.  After the split brain recovered, these errors started appearing.
> I've verified with tcpdump that heartbeat connectivity is intact.
>
> Any ideas?

Basically, you need to upgrade to a recent version of Pacemaker.
Heartbeat 2.1.4 is old enough to be hitting the bug I mentioned in the
July 2009 message quoted below.
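
To see how often a node is discarding updates from its peer, the warnings
quoted above can be counted per peer. This is only a minimal sketch in
Python, not part of Heartbeat or Pacemaker: the function name
count_discarded and the regular expression are assumptions based solely on
the log lines shown in this thread, and other versions may word the
message differently.

```python
import re
from collections import Counter

# Assumed message shape, taken from the warnings quoted in this thread:
# "... Discarding cib_apply_diff message (732b9) from so1b: not in our membership"
DISCARD_RE = re.compile(
    r"Discarding (\S+) message \((\w+)\) from ([^\s:]+): not in our membership"
)


def count_discarded(log_text):
    """Return a Counter mapping peer name -> number of discarded messages."""
    # Mail clients and pagers wrap long log lines, so collapse all
    # whitespace (including newlines) into single spaces before matching.
    flat = " ".join(log_text.split())
    return Counter(match.group(3) for match in DISCARD_RE.finditer(flat))


sample = """cib[18546]: 2010/01/28_08:58:25 WARN: cib_peer_callback: Discarding
cib_apply_diff message (732b9) from so1b: not in our membership
cib[20479]: 2010/01/28_08:58:25 info: retrieveCib: Reading cluster
cib[18546]: 2010/01/28_08:58:26 WARN: cib_peer_callback: Discarding
cib_apply_diff message (732c7) from so1b: not in our membership"""

for peer, count in count_discarded(sample).items():
    print(peer, count)
```

A steadily growing count for the peer while tcpdump shows heartbeat
traffic arriving matches the symptom described in this thread.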

>
> Thanks in advance for any help!
>
> Regards,
> Eric Blau
>
>
> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]] On Behalf Of Andrew Beekhof
> Sent: Friday, 24 July, 2009 08:39
> To: General Linux-HA mailing list
> Subject: Re: [Linux-HA] Node ha2 is not sync with node ha1
>
> What version are you using?
> There was a bug like this, but it was fixed a long time ago.
>
> On Wed, Jul 22, 2009 at 10:02 AM, Ahmed Munir<[email protected]>
> wrote:
>> Hi all,
>> I hope you are all well. I have two machines with Linux HA and OpenSIPs
>> installed, configured in an active-active setup. Machine 1, named ha1,
>> is assigned virtual IP 192.168.0.184, and machine 2, named ha2, is
>> assigned virtual IP 192.168.0.185.
>>
>> The integration between HA and OpenSIPs is working fine. For example, if
>> I stop the HA service on ha1, its resources are taken over by ha2, and
>> when ha1 comes back online it takes its resources back from ha2, and
>> vice versa.
>>
>> If I turn off the ha1 machine, its resources are taken over by ha2, and
>> when ha1 comes back online it takes its resources back from ha2, which
>> works fine. But when I turn off the ha2 machine, its resources are taken
>> over by ha1, and when ha2 comes back online and I check its status with
>> the crm_mon command, it shows an odd status, as listed below.
>>
>> On the ha1 machine:
>>
>> Node: ha1 (e651c120-b9a1-489a-baf7-caf0028ad540): online
>> Node: ha2 (70503c2e-bb4a-48f8-aab3-53696656a4d0): offline
>>
>> IPaddr_1           (heartbeat::ocf:IPaddr):        Started ha1
>> IPaddr_2           (heartbeat::ocf:IPaddr):        Started ha1
>> OpenSips_1      (heartbeat::ocf:OpenSips):      Started ha1
>> OpenSips_2      (heartbeat::ocf:OpenSips):      Started ha1
>>
>> On the ha2 machine:
>>
>> Node: ha1 (e651c120-b9a1-489a-baf7-caf0028ad540): offline
>> Node: ha2 (70503c2e-bb4a-48f8-aab3-53696656a4d0): online
>>
>> IPaddr_1           (heartbeat::ocf:IPaddr):        Started ha2
>> IPaddr_2           (heartbeat::ocf:IPaddr):        Started ha2
>> OpenSips_1      (heartbeat::ocf:OpenSips):      Started ha2
>> OpenSips_2      (heartbeat::ocf:OpenSips):      Started ha2
>>
>> Or, sometimes, on the ha2 machine:
>>
>> Node: ha1 (e651c120-b9a1-489a-baf7-caf0028ad540): online
>> Node: ha2 (70503c2e-bb4a-48f8-aab3-53696656a4d0): offline
>>
>> IPaddr_1           (heartbeat::ocf:IPaddr):        Started ha1
>> IPaddr_2           (heartbeat::ocf:IPaddr):        Started ha1
>> OpenSips_1      (heartbeat::ocf:OpenSips):      Started ha1
>> OpenSips_2      (heartbeat::ocf:OpenSips):      Started ha1
>>
>> After that I checked the logs and found the errors listed below:
>>
>> Jul 22 14:12:06 ha1 cib: [9978]: WARN: cib_peer_callback: Discarding
>> cib_apply_diff message (3a9) from ha2: not in our membership
>> Jul 22 14:12:06 ha1 cib: [9978]: WARN: cib_peer_callback: Discarding
>> cib_apply_diff message (3aa) from ha2: not in our membership
>> Jul 22 14:12:06 ha1 cib: [9978]: WARN: cib_peer_callback: Discarding
>> cib_apply_diff message (3ab) from ha2: not in our membership
>> Jul 22 14:12:06 ha1 cib: [9978]: WARN: cib_peer_callback: Discarding
>> cib_apply_diff message (3ac) from ha2: not in our membership
>> Jul 22 14:12:06 ha1 cib: [9978]: WARN: cib_peer_callback: Discarding
>> cib_apply_diff message (3ad) from ha2: not in our membership
>> Jul 22 14:12:07 ha1 cib: [9978]: WARN: cib_peer_callback: Discarding
>> cib_apply_diff message (3b0) from ha2: not in our membership
>> Jul 22 14:12:07 ha1 ccm: [9977]: ERROR: llm_set_uptime: Negative uptime
>> -1778384896 for node 0 [ha1]
>> Jul 22 14:12:07 ha1 ccm: [9977]: ERROR: llm_set_uptime: Negative uptime
>> -1879048192 for node 1 [ha2]
>>
>> I have configured the same settings on both machines, but I don't know
>> why I'm getting these errors.
>>
>> I am also attaching cib.xml, OpenSips (the resource file I created for
>> OpenSIPs), ha.cf and the log files. Please have a look and let me know
>> as soon as possible.
>>
>>
>> --
>> Regards,
>>
>> Ahmed Munir
>>
>> _______________________________________________
>> Linux-HA mailing list
>> [email protected]
>> http://lists.linux-ha.org/mailman/listinfo/linux-ha
>> See also: http://linux-ha.org/ReportingProblems
>>