Hi, I'm currently building a two-node, DRBD-backed PostgreSQL setup on Debian Wheezy, and I'm testing how Pacemaker reacts to specific failure scenarios.
One test is currently driving me crazy: when I manually stop PostgreSQL through pg_ctl, or just kill the master process to simulate a crash, the pgsql resource agent correctly detects the error and restarts PostgreSQL. The problem arises when I later run 'crm resource cleanup pgsql' to delete the failcount and the failed actions: the pgsql resource then shows up as Stopped, although in reality it is still running fine. The same thing happens when I delete the failcount separately and then do the cleanup.

The cause seems to be that pgsql_monitor runs into a timeout:

Feb 21 12:47:59 vm-db-01 crmd: [6494]: WARN: cib_action_update: rsc_op 44: pgsql_monitor_30000 on vm-db-01 timed out

After the timeout, pgsql is restarted, and the interesting thing is that I can then delete the failed action resulting from the timeout without a problem.

Does anyone have an idea what the problem could be in this case?

Best regards
Lukas

-- 
Adfinis SyGroup AG
Lukas Grossar, System Engineer
Keltenstrasse 98 | CH-3018 Bern
Tel. 031 550 31 11 | Direkt 031 550 31 06
_______________________________________________ Linux-HA mailing list [email protected] http://lists.linux-ha.org/mailman/listinfo/linux-ha See also: http://linux-ha.org/ReportingProblems
