On 5 Nov 2013, at 2:22 am, Vladislav Bogdanov <bub...@hoster-ok.com> wrote:
> Hi Andrew, David, all,
>
> Just found an interesting fact, don't know whether it is a bug or not.
>
> When doing "service pacemaker stop" on a node which has a drbd resource
> promoted, that resource does not promote on another node, and the
> promote operation times out.
>
> This is related to the drbd fence integration with pacemaker and to an
> insufficient default (recommended) promote timeout for the drbd
> resource.
>
> crm-fence-peer.sh places its constraint into the cib one second after
> the promote operation times out (the promote op has a 90s timeout, and
> crm-fence-peer.sh uses that value as its own timeout, and fully
> utilizes it if it cannot say for sure that the peer node is in a "sane"
> state - online or cleanly offline).
>
> It seems like increasing the promote op timeout helps, but I'd expect
> that to complete almost immediately, instead of waiting an extra 90
> seconds for nothing.
>
> Looking at the crm-fence-peer.sh script, it would determine the peer
> state as offline immediately if the node state (all of)
> * doesn't contain the "expected" tag or has it set to "down"
> * has the "in_ccm" tag set to false
> * has the "crmd" tag set to anything except "online"
>
> On the other hand, crmd sets "expected" = "down" only after fencing is
> complete (probably the same for "in_ccm"?). Shouldn't it do the same (or
> maybe just remove that tag) if a clean shutdown is about to complete?

That would make sense.
Are you using the plugin, cman or corosync 2?

> Or maybe it is possible to provide some different hint for
> crm-fence-peer.sh?
>
> Another option (actually a hack) would be to delay shutdown between
> resources stop and processes stop (so the drbd handler on the other
> node determines the peer is still online, and places the constraint
> immediately), but that is very fragile.
>
> pacemaker is a one-week-old merge of the clusterlabs and beekhof
> masters, drbd is 8.4.4. All runs on corosync2 (2.3.1) with libqb 0.16
> on CentOS6.
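
For reference, the three node state conditions quoted above can be
written down as a small predicate. This is only a sketch of the logic as
described in the quoted message, not the actual crm-fence-peer.sh code
(which is a shell script). The function name, the dict representation of
the node_state attributes, the handling of a missing "crmd" attribute and
the "member" value in the usage example are assumptions made for
illustration.

```python
def peer_is_cleanly_offline(node_state: dict) -> bool:
    """Return True only if all three quoted conditions hold.

    node_state is a hypothetical dict holding the attributes of one
    <node_state> entry from the CIB status section, as strings.
    """
    # "expected" is absent, or present and set to "down"
    expected_down = node_state.get("expected", "down") == "down"
    # "in_ccm" is set to false
    not_in_ccm = node_state.get("in_ccm") == "false"
    # "crmd" is anything except "online"
    # (assumption: a missing attribute also counts as not online)
    crmd_not_online = node_state.get("crmd") != "online"
    return expected_down and not_in_ccm and crmd_not_online


if __name__ == "__main__":
    # Peer stopped cleanly, but "expected" has not been reset to "down":
    # the predicate is False, which matches the reported 90s wait.
    after_clean_stop = {"crmd": "offline", "in_ccm": "false", "expected": "member"}
    print(peer_is_cleanly_offline(after_clean_stop))
```

With "expected" removed or set to "down" in that same state, the
predicate becomes True, which is the change to crmd behaviour proposed
in the quoted message.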
>
> Vladislav
>
> _______________________________________________
> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org