On 03/18/2014 09:04 PM, Andrew Beekhof wrote:
> Riiiight, so this is the story:
>
> Mar 08 08:43:22 [9934] lorien       crmd:     info: do_dc_takeover:   Taking 
> over DC status for this partition
> Mar 08 08:43:22 [9934] lorien       crmd:   notice: tengine_stonith_notify:   
> Peer gandalf was terminated (st_notify_fence) by mordor for gandalf: OK 
> (ref=10d27664-33ed-43e0-a5bd-7d0ef850eb05) by client crmd.31561
> Mar 08 08:43:22 [9934] lorien       crmd:   notice: tengine_stonith_notify:   
> Notified CMAN that 'gandalf' is now fenced
> Mar 08 08:43:22 [9934] lorien       crmd:   notice: tengine_stonith_notify:   
> Target may have been our leader gandalf (recorded: <unset>)
> Mar 08 09:13:52 [9934] lorien       crmd:     info: do_dc_takeover:   Taking 
> over DC status for this partition
> Mar 08 09:13:52 [9934] lorien       crmd:   notice: do_dc_takeover:   Marking 
> gandalf, target of a previous stonith action, as clean
>
> In tengine_stonith_notify() we potentially add things to stonith_cleanup_list 
> and then in do_dc_takeover() we check the stonith_cleanup_list and mark any 
> nodes in it as clean.
>
> As you can see above, the stonith notification comes just after the call to 
> do_dc_takeover().
> In the version you have there is some dodgy code in tengine_stonith_notify() 
> which incorrectly adds gandalf to stonith_cleanup_list, causing Pacemaker to 
> (incorrectly) erase its status section at 9:13:52 when another election 
> occurs.
>
> This was fixed during the RC-phase of Pacemaker-1.1.10:
>
>   https://github.com/beekhof/pacemaker/commit/f30e1e43
>
> I don't believe I quite understood the severity of that fix at the time 
> (otherwise I'd have made more noise about it).
>
> Since you're on CentOS 6.4, there should already be updated packages that 
> include this fix.

Andrew: thanks again for taking the time to check this case. We will be 
updating to 1.1.10 as soon as possible. Hugs!
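
For anyone reading this thread in the archive, here is a toy replay of the
sequence Andrew describes, in Python. It is only a sketch and not Pacemaker
code: the function names mirror the ones in the log, but the class, the
`guard_when_dc` switch and the exact condition are assumptions made for
illustration. In particular, the guard is a hypothetical stand-in for
whatever commit f30e1e43 really changes.

```python
class ToyCrmd:
    """Toy stand-in for the crmd state discussed above; not Pacemaker code."""

    def __init__(self, guard_when_dc):
        # Hypothetical switch: True models "some fix is in place".
        self.guard_when_dc = guard_when_dc
        self.am_i_dc = False
        self.recorded_dc = None            # shows up as "<unset>" in the log
        self.stonith_cleanup_list = []
        self.status_erased = []            # nodes whose status section was wiped

    def do_dc_takeover(self):
        # On takeover, every queued node is marked clean (status erased).
        self.am_i_dc = True
        for node in self.stonith_cleanup_list:
            self.status_erased.append(node)
        self.stonith_cleanup_list = []

    def tengine_stonith_notify(self, target):
        # "Target may have been our leader": no DC recorded, or DC == target.
        maybe_leader = self.recorded_dc is None or self.recorded_dc == target
        if self.guard_when_dc and self.am_i_dc:
            return                         # assumed guard: we already took over
        if maybe_leader:
            self.stonith_cleanup_list.append(target)


def replay(guard_when_dc):
    """Replay the three relevant log events in the order they appear."""
    crmd = ToyCrmd(guard_when_dc)
    crmd.do_dc_takeover()                   # Mar 08 08:43:22, list still empty
    crmd.tengine_stonith_notify("gandalf")  # Mar 08 08:43:22, just afterwards
    crmd.do_dc_takeover()                   # Mar 08 09:13:52, next election
    return crmd.status_erased


if __name__ == "__main__":
    print("without guard:", replay(False))
    print("with guard:   ", replay(True))
```

The point of the sketch is the ordering: the notification lands after the
first takeover, so the queued entry sits unused until the next election,
by which time the node may well be back in the cluster.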



_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
