Hi Steven

I made a patch as a trial:
https://github.com/t-matsuo/resource-agents/commit/bd3b587c6665c4f5eba0491b91f83965e601bb6b#heartbeat/pgsql
This patch never shows "STREAMING|POTENTIAL".

Thanks,
Takatoshi MATSUO

2013/4/4 Takatoshi MATSUO <matsuo....@gmail.com>:
> Hi Steven
>
> Sorry for the late reply.
>
> 2013/3/29 Steven Bambling <smbambl...@arin.net>:
>> Takatoshi/Rainer, thanks so much for the quick responses and clarification.
>>
>> In response to rep_mode being set to sync:
>>
>> If the master is running the monitor check as often as every 1s, then it is
>> updating the nodes with the "new" master preference whenever the current
>> synchronous replica couldn't be reached and the postgres service has
>> selected the next node in the synchronous_standby_names list to perform
>> the synchronous transaction with.
>>
>> If you are doing multiple transactions a second, then doesn't it become
>> possible for the postgres service to switch its synchronous replication
>> replica (from node2 to node3, for instance) and potentially fail (though I
>> think the risk seems small) before the monitor function is invoked to
>> update the master preference?
>>
>> In this case you've committed a transaction (or several) and reported back
>> to your application that it was successful, but when the new master is
>> promoted it doesn't have the committed transactions, because they are
>> located on the other replica (and the failed master). Basically you've
>> lost these transactions even though they were reported successful.
>
> Yes!
> I didn't consider this situation.
>
>> The only way I can see around this would be to compare the current xlog
>> locations on each of the remaining replicas, then promote the one that
>> meets your business needs:
>>
>> 1. If you need greater data consistency:
>>    - promote the node with the furthest log location, even if the
>>      transactions haven't been replayed and there is some "recovery"
>>      period.
>>
>> 2. If you need greater "up time":
>>    - promote the node with the furthest log location, taking the replay
>>      lag into account, or
>>    - promote the node with the furthest (or near-furthest) log location
>>      and the LEAST replay lag.
>
> How do slaves get "up time"?
> I think slaves can't know the replay lag.
>
>> Does this even seem possible with a resource agent, or is my thinking
>> totally off?
>
> Methods 1 and 2 may cause data loss.
> If you can accept data loss, use "rep_mode=async".
> It's about the same as method 1.
>
> To avoid switching the synchronous replication replica, and the data loss
> that can come with it, how about setting just one node in
> "synchronous_standby_names"?
>
> Thanks,
> Takatoshi MATSUO
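Steven's methods 1 and 2 above both come down to comparing WAL positions
across the surviving standbys from outside PostgreSQL. A minimal sketch of
that comparison, assuming PostgreSQL 9.1-era function names, passwordless
psql access for the postgres user, and placeholder host names node2/node3
(this is not part of the pgsql RA):

#!/bin/bash
# Hypothetical helper: compare received WAL positions across the
# surviving standbys and print the furthest-ahead node (method 1).
# Host names and the postgres user are placeholder assumptions.

STANDBYS="node2 node3"

# Convert an xlog location such as "0/3000278" into a comparable integer.
xlog_to_int() {
    local logid=${1%/*} recoff=${1#*/}
    echo $(( (16#$logid << 32) | 16#$recoff ))
}

best_node="" best_val=-1
for node in $STANDBYS; do
    # WAL received by this standby (method 1); use
    # pg_last_xlog_replay_location() instead to weigh replay lag (method 2).
    loc=$(psql -h "$node" -U postgres -Atc \
          "SELECT pg_last_xlog_receive_location()") || continue
    [ -n "$loc" ] || continue
    val=$(xlog_to_int "$loc")
    if (( val > best_val )); then
        best_val=$val
        best_node=$node
    fi
done

echo "promotion candidate: $best_node"

Swapping in pg_last_xlog_replay_location() gives the replayed position
instead, which is the quantity method 2 needs in order to weigh replay lag.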
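Takatoshi's closing suggestion amounts to listing exactly one standby, so
the master never silently switches its synchronous partner behind the
cluster manager's back; if that standby is lost, commits wait rather than
being acknowledged against a different replica. A configuration sketch,
where the application_name 'node2' is a placeholder, and noting that with
rep_mode=sync the pgsql RA may manage this parameter itself:

# postgresql.conf (sketch): pin synchronous replication to one standby.
# "node2" is a placeholder application_name taken from that standby's
# primary_conninfo; with a single entry there is no fallback partner
# for the master to switch to.
synchronous_standby_names = 'node2'
synchronous_commit = on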