[Linux-HA] pgsql resource agent in status "Stopped" after crm resource cleanup

Lukas Grossar Fri, 21 Feb 2014 03:56:33 -0800

Hi

I'm currently building a two-node DRBD-backed PostgreSQL cluster on
Debian Wheezy and I'm testing how Pacemaker reacts to specific failure
scenarios.


One thing I tested currently drives me crazy. When I manually stop
PostgreSQL through pg_ctl, or just kill the master process to simulate
a crash, the pgsql resource agent correctly detects the error and
restarts PostgreSQL.

The problem arises when I later call 'crm resource cleanup pgsql' to
delete the failcount and the failed tasks: the pgsql resource shows up
as Stopped, but in reality it is still running fine. I have the same
problem when I delete the failcount separately and then do the cleanup.
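
A rough sketch of the sequence described above, for anyone who wants to
reproduce it. The resource name comes from the post and the node name
from the log line; the PGDATA path, the "fast" shutdown mode, the
crm_mon status checks and the dry-run wrapper are assumptions added for
illustration, not the exact commands that were run. By default the
script only prints the commands.

```shell
#!/bin/sh
# Reproduction sketch. DRY_RUN defaults to 1, so commands are only printed.
RSC=pgsql                          # resource name used in the post
NODE=vm-db-01                      # node name taken from the log line
PGDATA=${PGDATA:-/path/to/pgdata}  # placeholder data directory (assumption)

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

reproduce() {
    # 1. Stop PostgreSQL behind Pacemaker's back to simulate a failure.
    run pg_ctl -D "$PGDATA" -m fast stop
    # 2. Check the status once; on a real cluster, wait here until the
    #    monitor operation has detected the failure and restarted pgsql.
    run crm_mon -1
    # 3. Delete the failcount separately, then clean up the resource.
    run crm resource failcount "$RSC" delete "$NODE"
    run crm resource cleanup "$RSC"
    # 4. Check the status again; this is where pgsql shows up as Stopped.
    run crm_mon -1
}

reproduce
```

Running it with DRY_RUN=0 on a test node executes the commands for
real. On Debian, pg_ctl is usually not on the PATH and has to be run as
the postgres user, so step 1 will need adjusting.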

The problem seems to be that pgsql_monitor runs into a timeout:
Feb 21 12:47:59 vm-db-01 crmd: [6494]: WARN: cib_action_update:
rsc_op 44: pgsql_monitor_30000 on vm-db-01 timed out

After the timeout pgsql is being restarted, and the interesting thing
is that I can delete the failed action from the timeout without a
problem.

Does anyone have an idea what the problem could be in this case?

Best regards
Lukas

-- 
Adfinis SyGroup AG
Lukas Grossar, System Engineer

Keltenstrasse 98 | CH-3018 Bern
Tel. 031 550 31 11 | Direkt 031 550 31 06


_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
