Thanks, Andreas -- I'm not familiar with "maintenance-mode", so that is very good to know.
Before I go research your suggestion, is the basic idea that you can enable maintenance mode from ANY node (or only from the node with the failed action?), restart the pacemaker/corosync services on ALL nodes (or, again, only on the one with the failed action?), all without any cluster service interruption, and then disable maintenance mode once the "Failed Actions" have been cleared?

>
>Message: 3
>Date: Wed, 11 Apr 2012 00:12:10 +0200
>From: Andreas Kurz <andr...@hastexo.com>
>To: pacemaker@oss.clusterlabs.org
>Subject: Re: [Pacemaker] DRBD Split-brain (recovered), but still showing "Failed Actions"
>Message-ID: <4f84b03a.4030...@hastexo.com>
>Content-Type: text/plain; charset="iso-8859-1"
>
>On 04/10/2012 05:43 PM, Reid, Mike wrote:
>> Thank you for the suggestion, Andreas. Unfortunately, that does not
>> appear to have cleaned up the Failed Actions either:
>>
>>> crm resource cleanup msDRBD
>>
>> Cleaning up resDRBD:0 on hostname2
>> Cleaning up resDRBD:1 on hostname2
>> Cleaning up resDRBD:0 on hostname1
>> Cleaning up resDRBD:1 on hostname1
>>
>>> crm_mon -1
>>
>> [...]
>> Failed actions:
>> resDRBD:1_promote_0 (node=hostname2, call=530, rc=-2, status=Timed Out): unknown exec error
>>
>>
>> Are there any other options that do not involve a failover + restart?
>
>If you switch your cluster into maintenance mode ...
>
>crm configure property maintenance-mode=true
>
>... you can stop pacemaker and even corosync without interrupting your
>services ... don't forget to disable it again after restart.
>
>Regards,
>Andreas
>
>--
>Need help with Pacemaker?
>http://www.hastexo.com/now
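
For reference, the sequence described in the quoted reply could be sketched roughly as below. This is only an illustration under stated assumptions: the `service pacemaker` / `service corosync` commands assume init scripts with those names (on some installs Pacemaker runs inside corosync and only corosync is restarted), the `run` and `maintenance_restart` helpers are hypothetical names, and the sketch acts on the local node only, since whether one node or all nodes need the restart is exactly the open question above. It prints the commands by default and only executes them when `RUN=1` is set.

```shell
#!/bin/sh
# Dry-run by default: print each command. Set RUN=1 to actually execute.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "+ $*"
    fi
}

maintenance_restart() {
    # From the quoted reply: put the whole cluster into maintenance mode
    # so resources are left running but unmanaged.
    run crm configure property maintenance-mode=true || return 1

    # Assumption: the cluster stack is controlled by init scripts named
    # "pacemaker" and "corosync" on this node.
    run service pacemaker stop
    run service corosync stop
    run service corosync start
    run service pacemaker start

    # Check whether the "Failed actions" entry is still listed.
    run crm_mon -1

    # From the quoted reply: do not forget to disable maintenance mode again.
    run crm configure property maintenance-mode=false
}

maintenance_restart
```

Running it without `RUN=1` first gives a chance to review the exact commands before anything touches the cluster.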