Hi, On Tue, Nov 08, 2011 at 04:46:53PM +0200, Matti Linnanvuori wrote: > Andreas Kurz: > > full logs around this restart would be interesting ... you tested all > > your own OCF scripts with ocf-tester and they are sane? > > Yes, I tested all of our own OCF scripts and they seem to work most of the > time. > > The following is the tail of file /var/log/corosync.log after crm resource > restart DB-daemon: > > Nov 08 16:40:26 iac-pgis cib: [2095]: info: write_cib_contents: Wrote version > 0.127.0 of the CIB to disk (digest: 695ae492c8ab41162865fad817e1a748) > Nov 08 16:40:26 iac-pgis cib: [2095]: info: retrieveCib: Reading cluster > configuration from: /var/lib/heartbeat/crm/cib.5GMRoq (digest: > /var/lib/heartbeat/crm/cib.xGG5ep) > Nov 08 16:41:03 iac-pgis lrmd: [2561]: info: perform_op:2884: operation > monitor[26] on ocf::statusmgr::statusmgr-daemon for client 2564, its > parameters: CRM_meta_name=[monitor] crm_feature_set=[3.0.2] > CRM_meta_timeout=[20000] CRM_meta_interval=[5000] > statusmgr_binary=[/etc/init.d/iac-statusmgr] for rsc is already running. > Nov 08 16:41:03 iac-pgis lrmd: [2561]: info: perform_op:2894: postponing all > ops on resource statusmgr-daemon by 1000 ms > Nov 08 16:41:04 iac-pgis lrmd: [2561]: info: perform_op:2884: operation > monitor[1424] on ocf::statusmgr::statusmgr-daemon for client 2564, its > parameters: CRM_meta_name=[monitor] crm_feature_set=[3.0.2] > CRM_meta_timeout=[20000] CRM_meta_interval=[5000] > statusmgr_binary=[/etc/init.d/iac-statusmgr] for rsc is already running. > Nov 08 16:41:04 iac-pgis lrmd: [2561]: info: perform_op:2894: postponing all > ops on resource statusmgr-daemon by 1000 ms > Nov 08 16:41:09 iac-pgis lrmd: [2561]: info: perform_op:2884: operation > monitor[26] on ocf::statusmgr::statusmgr-daemon for client 2564, its > parameters: CRM_meta_name=[monitor] crm_feature_set=[3.0.2] > CRM_meta_timeout=[20000] CRM_meta_interval=[5000] > statusmgr_binary=[/etc/init.d/iac-statusmgr] for rsc is already running. > Nov 08 16:41:09 iac-pgis lrmd: [2561]: info: perform_op:2894: postponing all > ops on resource statusmgr-daemon by 1000 ms > Nov 08 16:41:10 iac-pgis lrmd: [2561]: info: perform_op:2884: operation > monitor[1424] on ocf::statusmgr::statusmgr-daemon for client 2564, its > parameters: CRM_meta_name=[monitor] crm_feature_set=[3.0.2] > CRM_meta_timeout=[20000] CRM_meta_interval=[5000] > statusmgr_binary=[/etc/init.d/iac-statusmgr] for rsc is already running. > Nov 08 16:41:10 iac-pgis lrmd: [2561]: info: perform_op:2894: postponing all > ops on resource statusmgr-daemon by 1000 ms > Nov 08 16:41:30 iac-pgis cib: [2560]: info: log_data_element: cib:diff: - > <cib admin_epoch="0" epoch="127" num_updates="1" > > Nov 08 16:41:30 iac-pgis cib: [2560]: info: log_data_element: cib:diff: - > <configuration > > Nov 08 16:41:30 iac-pgis cib: [2560]: info: log_data_element: cib:diff: - > <resources > > Nov 08 16:41:30 iac-pgis cib: [2560]: info: log_data_element: cib:diff: - > <primitive id="DB-daemon" > > Nov 08 16:41:30 iac-pgis cib: [2560]: info: log_data_element: cib:diff: - > <meta_attributes id="DB-daemon-meta_attributes" > > Nov 08 16:41:30 iac-pgis cib: [2560]: info: log_data_element: cib:diff: - > <nvpair value="Started" id="DB-daemon-meta_attributes-target-role" /> > Nov 08 16:41:30 iac-pgis cib: [2560]: info: log_data_element: cib:diff: - > </meta_attributes> > Nov 08 16:41:30 iac-pgis cib: [2560]: info: log_data_element: cib:diff: - > </primitive> > Nov 08 16:41:30 iac-pgis cib: [2560]: info: log_data_element: cib:diff: - > </resources> > Nov 08 16:41:30 iac-pgis cib: [2560]: info: log_data_element: cib:diff: - > </configuration> > Nov 08 16:41:30 iac-pgis cib: [2560]: info: log_data_element: cib:diff: - > </cib> > Nov 08 16:41:30 iac-pgis cib: [2560]: info: log_data_element: cib:diff: + > <cib admin_epoch="0" epoch="128" num_updates="1" > > Nov 08 16:41:30 iac-pgis cib: [2560]: info: log_data_element: cib:diff: + > <configuration > > Nov 08 16:41:30 iac-pgis cib: [2560]: info: log_data_element: cib:diff: + > <resources > > Nov 08 16:41:30 iac-pgis cib: [2560]: info: log_data_element: cib:diff: + > <primitive id="DB-daemon" > > Nov 08 16:41:30 iac-pgis cib: [2560]: info: log_data_element: cib:diff: + > <meta_attributes id="DB-daemon-meta_attributes" > > Nov 08 16:41:30 iac-pgis cib: [2560]: info: log_data_element: cib:diff: + > <nvpair value="Stopped" id="DB-daemon-meta_attributes-target-role" /> > Nov 08 16:41:30 iac-pgis cib: [2560]: info: log_data_element: cib:diff: + > </meta_attributes> > Nov 08 16:41:30 iac-pgis cib: [2560]: info: log_data_element: cib:diff: + > </primitive> > Nov 08 16:41:30 iac-pgis cib: [2560]: info: log_data_element: cib:diff: + > </resources> > Nov 08 16:41:30 iac-pgis cib: [2560]: info: log_data_element: cib:diff: + > </configuration> > Nov 08 16:41:30 iac-pgis cib: [2560]: info: log_data_element: cib:diff: + > </cib> > Nov 08 16:41:30 iac-pgis cib: [2560]: info: cib_process_request: Operation > complete: op cib_replace for section resources (origin=local/cibadmin/2, > version=0.128.1): ok (rc=0) > Nov 08 16:41:30 iac-pgis cib: [4026]: info: write_cib_contents: Archived > previous version as /var/lib/heartbeat/crm/cib-26.raw > Nov 08 16:41:30 iac-pgis cib: [4026]: info: write_cib_contents: Wrote version > 0.128.0 of the CIB to disk (digest: 50a169997d0d9f6f7de07e3e38eec71f) > Nov 08 16:41:30 iac-pgis cib: [4026]: info: retrieveCib: Reading cluster > configuration from: /var/lib/heartbeat/crm/cib.BBnEZ8 (digest: > /var/lib/heartbeat/crm/cib.txMEqQ) > Nov 08 16:41:30 iac-pgis cib: [2560]: info: log_data_element: cib:diff: - > <cib admin_epoch="0" epoch="128" num_updates="1" > > Nov 08 16:41:30 iac-pgis cib: [2560]: info: log_data_element: cib:diff: - > <configuration > > Nov 08 16:41:30 iac-pgis cib: [2560]: info: log_data_element: cib:diff: - > <resources > > Nov 08 16:41:30 iac-pgis cib: [2560]: info: log_data_element: cib:diff: - > <primitive id="DB-daemon" > > Nov 08 16:41:30 iac-pgis cib: [2560]: info: log_data_element: cib:diff: - > <meta_attributes id="DB-daemon-meta_attributes" > > Nov 08 16:41:30 iac-pgis cib: [2560]: info: log_data_element: cib:diff: - > <nvpair value="Stopped" id="DB-daemon-meta_attributes-target-role" /> > Nov 08 16:41:30 iac-pgis cib: [2560]: info: log_data_element: cib:diff: - > </meta_attributes> > Nov 08 16:41:30 iac-pgis cib: [2560]: info: log_data_element: cib:diff: - > </primitive> > Nov 08 16:41:30 iac-pgis cib: [2560]: info: log_data_element: cib:diff: - > </resources> > Nov 08 16:41:30 iac-pgis cib: [2560]: info: log_data_element: cib:diff: - > </configuration> > Nov 08 16:41:30 iac-pgis cib: [2560]: info: log_data_element: cib:diff: - > </cib> > Nov 08 16:41:30 iac-pgis cib: [2560]: info: log_data_element: cib:diff: + > <cib admin_epoch="0" epoch="129" num_updates="1" > > Nov 08 16:41:30 iac-pgis cib: [2560]: info: log_data_element: cib:diff: + > <configuration > > Nov 08 16:41:30 iac-pgis cib: [2560]: info: log_data_element: cib:diff: + > <resources > > Nov 08 16:41:30 iac-pgis cib: [2560]: info: log_data_element: cib:diff: + > <primitive id="DB-daemon" > > Nov 08 16:41:30 iac-pgis cib: [2560]: info: log_data_element: cib:diff: + > <meta_attributes id="DB-daemon-meta_attributes" > > Nov 08 16:41:30 iac-pgis cib: [2560]: info: log_data_element: cib:diff: + > <nvpair value="Started" id="DB-daemon-meta_attributes-target-role" /> > Nov 08 16:41:30 iac-pgis cib: [2560]: info: log_data_element: cib:diff: + > </meta_attributes> > Nov 08 16:41:30 iac-pgis cib: [2560]: info: log_data_element: cib:diff: + > </primitive> > Nov 08 16:41:30 iac-pgis cib: [2560]: info: log_data_element: cib:diff: + > </resources> > Nov 08 16:41:30 iac-pgis cib: [2560]: info: log_data_element: cib:diff: + > </configuration> > Nov 08 16:41:30 iac-pgis cib: [2560]: info: log_data_element: cib:diff: + > </cib> > Nov 08 16:41:30 iac-pgis cib: [2560]: info: cib_process_request: Operation > complete: op cib_replace for section resources (origin=local/cibadmin/2, > version=0.129.1): ok (rc=0) > Nov 08 16:41:30 iac-pgis cib: [4057]: info: write_cib_contents: Archived > previous version as /var/lib/heartbeat/crm/cib-27.raw > Nov 08 16:41:30 iac-pgis cib: [4057]: info: write_cib_contents: Wrote version > 0.129.0 of the CIB to disk (digest: 7853e43088a01f3af1d92260934bef45) > Nov 08 16:41:30 iac-pgis cib: [4057]: info: retrieveCib: Reading cluster > configuration from: /var/lib/heartbeat/crm/cib.M8mMG9 (digest: > /var/lib/heartbeat/crm/cib.JkJUOR) > Nov 08 16:42:52 iac-pgis lrmd: [2561]: info: perform_op:2884: operation > monitor[1372] on ocf::statusmgr::statusmgr-daemon for client 2564, its > parameters: CRM_meta_name=[monitor] crm_feature_set=[3.0.2] > CRM_meta_timeout=[20000] CRM_meta_interval=[5000] > statusmgr_binary=[/etc/init.d/iac-statusmgr] for rsc is already running. > Nov 08 16:42:52 iac-pgis lrmd: [2561]: info: perform_op:2894: postponing all > ops on resource statusmgr-daemon by 1000 ms > Nov 08 16:42:57 iac-pgis lrmd: [2561]: info: perform_op:2884: operation > monitor[1372] on ocf::statusmgr::statusmgr-daemon for client 2564, its > parameters: CRM_meta_name=[monitor] crm_feature_set=[3.0.2] > CRM_meta_timeout=[20000] CRM_meta_interval=[5000] > statusmgr_binary=[/etc/init.d/iac-statusmgr] for rsc is already running. > Nov 08 16:42:57 iac-pgis lrmd: [2561]: info: perform_op:2894: postponing all > ops on resource statusmgr-daemon by 1000 ms > Nov 08 16:43:02 iac-pgis lrmd: [2561]: info: perform_op:2884: operation > monitor[1372] on ocf::statusmgr::statusmgr-daemon for client 2564, its > parameters: CRM_meta_name=[monitor] crm_feature_set=[3.0.2] > CRM_meta_timeout=[20000] CRM_meta_interval=[5000] > statusmgr_binary=[/etc/init.d/iac-statusmgr] for rsc is already running. > Nov 08 16:43:02 iac-pgis lrmd: [2561]: info: perform_op:2894: postponing all > ops on resource statusmgr-daemon by 1000 ms > Nov 08 16:43:07 iac-pgis lrmd: [2561]: info: perform_op:2884: operation > monitor[1372] on ocf::statusmgr::statusmgr-daemon for client 2564, its > parameters: CRM_meta_name=[monitor] crm_feature_set=[3.0.2] > CRM_meta_timeout=[20000] CRM_meta_interval=[5000] > statusmgr_binary=[/etc/init.d/iac-statusmgr] for rsc is already running. > Nov 08 16:43:07 iac-pgis lrmd: [2561]: info: perform_op:2894: postponing all > ops on resource statusmgr-daemon by 1000 ms > Nov 08 16:43:12 iac-pgis lrmd: [2561]: info: perform_op:2884: operation > monitor[1372] on ocf::statusmgr::statusmgr-daemon for client 2564, its > parameters: CRM_meta_name=[monitor] crm_feature_set=[3.0.2] > CRM_meta_timeout=[20000] CRM_meta_interval=[5000] > statusmgr_binary=[/etc/init.d/iac-statusmgr] for rsc is already running. > Nov 08 16:43:12 iac-pgis lrmd: [2561]: info: perform_op:2894: postponing all > ops on resource statusmgr-daemon by 1000 ms > Nov 08 16:43:38 iac-pgis lrmd: [2561]: info: perform_op:2884: operation > monitor[1372] on ocf::statusmgr::statusmgr-daemon for client 2564, its > parameters: CRM_meta_name=[monitor] crm_feature_set=[3.0.2] > CRM_meta_timeout=[20000] CRM_meta_interval=[5000] > statusmgr_binary=[/etc/init.d/iac-statusmgr] for rsc is already running. > Nov 08 16:43:38 iac-pgis lrmd: [2561]: info: perform_op:2894: postponing all > ops on resource statusmgr-daemon by 1000 ms
How long does the monitor take? I didn't see your configuration, but if it takes longer than the interval you set for monitor, this looks like exactly that case. Thanks, Dejan > Matti Linnanvuori > > > _______________________________________________ > Pacemaker mailing list: Pacemaker@oss.clusterlabs.org > http://oss.clusterlabs.org/mailman/listinfo/pacemaker > > Project Home: http://www.clusterlabs.org > Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf > Bugs: > http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker _______________________________________________ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker