Hi Takatoshi,

I have restored PostgreSQL to run without corosync, so I cannot send you the crm_mon output right now.

What I can tell for sure:

- The RA never promoted any of the nodes, no matter what the status was. It also did not promote a node when it was the only one.
- I believe the issue is in the comparison of the xlogs. How could I troubleshoot that? I see from the logs that crm NEVER tried to invoke pgsql with "promote".
- I previously tried the crm_mon -A option, but there was never a "pgsql-data-status" attribute. The other attributes were there, including HS:alone.
- In the corosync log the only relevant RA message I see is "Master is not exist." I never saw a message like "My data is out-of-date".

Thank you!
Attila

-----Original Message-----
From: Takatoshi MATSUO [mailto:matsuo....@gmail.com]
Sent: 2011. november 25. 8:56
To: The Pacemaker cluster resource manager
Subject: Re: [Pacemaker] Postgresql streaming replication failover - RA needed

Hi Attila

2011/11/24 Attila Megyeri <amegy...@minerva-soft.com>:
> Hi Takatoshi, All,
>
> Thanks for your reply.
> I see that you have invested significant effort in the development of the RA.
> I spent the last day trying to set up the RA, but without much success.
>
> My infrastructure is very similar to yours, except for the fact that
> currently I am testing with a single network adapter.
>
> Replication works nicely when I start the databases manually, not using
> corosync.
>
> When I try to start using corosync, I see that the ping resources start
> normally, but the msPostgresql starts on both nodes in slave mode, and I see
> "HS:alone"

Seeing "HS:alone" is normal.
The RA compares xlog locations and promotes the PostgreSQL instance that has the newest data.

> In the Wiki you state that if I start on a single node only, PSQL should
> start in Master mode (PRI), but this is not the case.

If the data is old, the node can't be master.
To become master, a node needs pgsql-data-status="LATEST" or "STREAMING|SYNC".
Please check it using "crm_mon -A".
Also, becoming a master from the stopped state takes a few minutes because the RA compares xlog locations on monitor.

> The recovery.conf file is created immediately, and from the logs I see no
> attempt at all to promote the node.
> In the postgres logs I see that node1, which is supposed to be a master,
> tries to connect to the vip-rep IP address, which is NOT brought up, because
> it depends on the Master role...
>
> Do you have any idea?

Please check the HA log.
My RA outputs "My data is out-of-date. status=********" to the log if the data is old.

Regards,
Takatoshi MATSUO

_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
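
Note on troubleshooting the xlog comparison discussed in this thread: one way to cross-check by hand is to collect the xlog location each node reports and see which one is furthest ahead. The following is only a minimal sketch, not the RA's own code. It assumes the locations are in the textual "hi/lo" hexadecimal form that PostgreSQL prints (for example "0/3000060"); the function names and the sample values are hypothetical.

```python
from typing import Dict, Tuple


def parse_xlog_location(loc: str) -> Tuple[int, int]:
    """Split a location such as '0/3000060' into two integers.

    Both halves are hexadecimal, so they must not be compared as strings.
    """
    hi, lo = loc.strip().split("/")
    return int(hi, 16), int(lo, 16)


def newest_node(locations: Dict[str, str]) -> str:
    """Return the name of the node whose xlog location is furthest ahead."""
    return max(locations, key=lambda node: parse_xlog_location(locations[node]))


if __name__ == "__main__":
    # Hypothetical values; replace with what each node actually reports.
    reported = {"node1": "0/3000060", "node2": "0/2FFFF00"}
    print(newest_node(reported))
```

Comparing the parsed (hi, lo) pairs as tuples avoids the mistake of comparing the raw strings, where "0/2FFFF00" would sort after "0/3000060" only by accident of length and character order.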