On Mon, Dec 12, 2011 at 5:32 AM, Takatoshi MATSUO <matsuo....@gmail.com> wrote:

> Hello
>
> 2011/12/12 Serge Dubrouski <serge...@gmail.com>:
> >
> > On Thu, Dec 8, 2011 at 10:34 PM, Takatoshi MATSUO <matsuo....@gmail.com>
> > wrote:
> >>
> >> Hi Attila
> >>
> >> 2011/12/8 Attila Megyeri <amegy...@minerva-soft.com>:
> >> > Hi Takatoshi,
> >> >
> >> > One strange thing I noticed and could probably be improved.
> >> > When there is data inconsistency, I have the following node properties:
> >> >
> >> > * Node psql2:
> >> >     + default_ping_set : 100
> >> >     + master-postgresql:1 : -INFINITY
> >> >     + pgsql-data-status : DISCONNECT
> >> >     + pgsql-status : HS:alone
> >> > * Node psql1:
> >> >     + default_ping_set : 100
> >> >     + master-postgresql:0 : 1000
> >> >     + master-postgresql:1 : -INFINITY
> >> >     + pgsql-data-status : LATEST
> >> >     + pgsql-master-baseline : 58:000000004B000020
> >> >     + pgsql-status : PRI
> >> >
> >> > This is fine, and understandable - but I can see this only if I do a
> >> > crm_mon -A.
> >> >
> >> > My problem is, that CRM shows the following:
> >> >
> >> > Master/Slave Set: db-ms-psql [postgresql]
> >> >     Masters: [ psql1 ]
> >> >     Slaves: [ psql2 ]
> >> >
> >> > So if I monitor the system from crm_mon, HAWK or other tools - I have no
> >> > indication at all that the slave is running in an inconsistent mode.
> >> >
> >> > I would expect the RA to stop the psql2 node in such cases, because:
> >> > - It is running, but has non-up-to-date data, therefore no one will use
> >> >   it (the slave IP points to the master as well, which is good)
> >> > - In CRM status everything looks perfect, even though it is NOT perfect
> >> >   and admin intervention is required.
> >> >
> >> > Shouldn't the disconnected PSQL server be stopped instead?
> >>
> >> hmm..
> >> It's not better to stop PGSQL server.
> >> RA cannot know whether PGSQL is disconnected because of
> >> data-inconsistent or network-down or
> >> starting-up and so on.
> >
> > Why does it matter? If the state is degraded and inconsistent and there is
> > no way to fix it from inside of the RA, RA should probably stop it.
>
> In this case, HS's data may be consistent but Primary doesn't have enough
> wals or
> HS doesn't have enough wal-archives to be replication-mode.
> Unfortunately this RA doesn't calculate the number of wals.
>

Honestly I don't know how to better handle this. Pacemaker doesn't have a
concept of degraded node state.

> > Let's say that there is pgpool running in front of the cluster, keeping an
> > inconsistent node up would lead to the routing SQL queries to it and
> > possibly getting wrong results.
> >
>
> It doesn't happen in my sample configuration.
> vip-slave is up at master when slave is not "HS:sync".
>

So you have a VIP for each slave node?

> >>
> >> How about using dummy RA such as vip-slave?
> >> -------------------------------------------
> >> primitive runningSlaveOK ocf:heartbeat:Dummy
> >> .....(snip)
> >>
> >> location rsc_location-dummy runningSlaveOK \
> >>     rule 200: pgsql-status eq "HS:sync"
> >> -------------------------------------------
> >
> > That probably fixes visibility issue. What about notifications on DISCONNECT
> > state? How administrator would know that cluster is inconsistent? May be the
> > better option in this case would be collocating MailTo resource with
> > "HS:alone"?
>
> Yes, it's good idea if you want to receive notifications.
>
>
> Regards,
> Takatoshi MATSUO
>
> _______________________________________________
> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
>

--
Serge Dubrouski.
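
The visibility problem raised in this thread (the inconsistent slave only shows up in the node attributes printed by crm_mon -A, not in the Master/Slave summary) could also be worked around outside the RA with a small check script. The following is only a rough sketch, not part of the pgsql RA: the function names and the embedded sample text are made up for illustration, the sample is copied from the attribute listing quoted above, and it assumes the "* Node ..." / "+ name : value" layout shown there.

```python
import re

# Sample text copied from the node attribute listing quoted in this thread.
SAMPLE = """\
* Node psql2:
    + default_ping_set : 100
    + master-postgresql:1 : -INFINITY
    + pgsql-data-status : DISCONNECT
    + pgsql-status : HS:alone
* Node psql1:
    + default_ping_set : 100
    + master-postgresql:0 : 1000
    + master-postgresql:1 : -INFINITY
    + pgsql-data-status : LATEST
    + pgsql-master-baseline : 58:000000004B000020
    + pgsql-status : PRI
"""


def parse_node_attributes(text):
    """Turn a node attribute listing into {node: {attribute: value}}.

    Hypothetical helper; assumes the layout shown in SAMPLE.
    """
    nodes = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        node = re.match(r"^\* Node (\S+):$", line)
        if node:
            current = nodes.setdefault(node.group(1), {})
            continue
        attr = re.match(r"^\+ (\S+)\s*:\s*(.+)$", line)
        if attr and current is not None:
            current[attr.group(1)] = attr.group(2).strip()
    return nodes


def inconsistent_nodes(nodes):
    """Return nodes reporting DISCONNECT data status or HS:alone status."""
    flagged = []
    for name, attrs in nodes.items():
        if (attrs.get("pgsql-data-status") == "DISCONNECT"
                or attrs.get("pgsql-status") == "HS:alone"):
            flagged.append(name)
    return sorted(flagged)


if __name__ == "__main__":
    for name in inconsistent_nodes(parse_node_attributes(SAMPLE)):
        print("node %s needs admin attention" % name)
```

In practice the text would come from a one-shot crm_mon -A run rather than an embedded sample, and the result could feed whatever alerting is already in place. The attribute values checked here (DISCONNECT, HS:alone) are the ones quoted in this thread; other states of the RA are not covered.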