Hello Jan,

I'm using corosync+pacemaker on SLES 11 SP1 and this is a critical system. I don't think I'll get authorization to upgrade it, but I would like to know whether there is any known bug related to this issue in my current corosync release.
Thanks
Emmanuel

2014-04-30 17:07 GMT+02:00 Jan Friesse <jfrie...@redhat.com>:

> Emmanuel,
>
> emmanuel segura wrote:
> > Hello Jan,
> >
> > Thanks for the explanation, but I saw this in my log.
> >
> > ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
> >
> > corosync [TOTEM ] Process pause detected for 577 ms, flushing membership messages.
> > corosync [TOTEM ] Process pause detected for 538 ms, flushing membership messages.
> > corosync [TOTEM ] A processor failed, forming new configuration.
> > corosync [CLM ] CLM CONFIGURATION CHANGE
> > corosync [CLM ] New Configuration:
> > corosync [CLM ] r(0) ip(10.xxx.xxx.xxx)
> > corosync [CLM ] Members Left:
> > corosync [CLM ] r(0) ip(10.xxx.xxx.xxx)
> > corosync [CLM ] Members Joined:
> > corosync [pcmk ] notice: pcmk_peer_update: Transitional membership event on ring 6904: memb=1, new=0, lost=1
> > corosync [pcmk ] info: pcmk_peer_update: memb: node01 891257354
> > corosync [pcmk ] info: pcmk_peer_update: lost: node02 874480
> >
> > ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
> >
> > When this happens, does corosync need to retransmit the token?
> > From what I understood, the token needs to be retransmitted, but in my
> > case a new configuration was formed.
> >
> > This is my corosync version:
> >
> > corosync-1.3.3-0.3.1
>
> 1.3.3 has been unsupported for ages. Please upgrade to the newest 1.4.6
> (if you are using cman) or 2.3.3 (if you are not using cman). Also please
> change your pacemaker to not use the plugin (upgrading to 2.3.3 will solve
> it automatically, because plugins are no longer supported in corosync 2.x).
>
> Regards,
> Honza
>
> > Thanks
> >
> > 2014-04-30 9:42 GMT+02:00 Jan Friesse <jfrie...@redhat.com>:
> >
> >> Emmanuel,
> >> there is no need to trigger fencing on "Process pause detected...".
> >>
> >> Also, fencing is not triggered if membership didn't change. So let's
> >> say the token was lost but during gather state all nodes replied; then
> >> there is no change of membership and no need to fence.
> >>
> >> I believe your situation was:
> >> - one node is a little overloaded
> >> - token lost
> >> - overload over
> >> - gather state
> >> - every node is alive
> >> -> no fencing
> >>
> >> Regards,
> >> Honza
> >>
> >> emmanuel segura wrote:
> >>> Hello Jan,
> >>>
> >>> Forget the last mail:
> >>>
> >>> Hello Jan,
> >>>
> >>> I found this problem in two HP blade systems and the strange thing is
> >>> that fencing was not triggered :(, but it's enabled
> >>>
> >>> 2014-04-25 18:36 GMT+02:00 emmanuel segura <emi2f...@gmail.com>:
> >>>
> >>>> Hello Jan,
> >>>>
> >>>> I found this problem in two HP blade systems and the strange thing is
> >>>> that fencing was triggered :(
> >>>>
> >>>> 2014-04-25 9:27 GMT+02:00 Jan Friesse <jfrie...@redhat.com>:
> >>>>
> >>>>> Emanuel,
> >>>>>
> >>>>> emmanuel segura wrote:
> >>>>>> Hello List,
> >>>>>>
> >>>>>> I have these two lines in my cluster logs. Can somebody help me
> >>>>>> understand what they mean?
> >>>>>>
> >>>>>> ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
> >>>>>>
> >>>>>> corosync [TOTEM ] Process pause detected for 577 ms, flushing membership messages.
> >>>>>> corosync [TOTEM ] Process pause detected for 538 ms, flushing membership messages.
> >>>>>
> >>>>> Corosync internally checks the gap between member join messages. If
> >>>>> such a gap is > token/2, it means that corosync was not scheduled to
> >>>>> run by the kernel for too long, and it should discard membership
> >>>>> messages.
> >>>>>
> >>>>> The original intent was to detect a paused process. If a pause is
> >>>>> detected, it's better to discard old membership messages and initiate
> >>>>> a new query than to send an outdated view.
> >>>>>
> >>>>> So there are various reasons why this is triggered, but today it's
> >>>>> usually a VM with an overloaded host machine.
> >>>>>
> >>>>>> corosync [TOTEM ] A processor failed, forming new configuration.
> >>>>>>
> >>>>>> ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
> >>>>>>
> >>>>>> I know the "corosync [TOTEM ] A processor failed, forming new
> >>>>>> configuration" message appears when the token packet is definitely
> >>>>>> lost.
> >>>>>>
> >>>>>> Thanks
> >>>>>
> >>>>> Regards,
> >>>>> Honza
> >>>>
> >>>> --
> >>>> this is my life and I live it for as long as God wills

--
this is my life and I live it for as long as God wills
_______________________________________________ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
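
To make the two rules from Jan's replies concrete, here is a minimal sketch in Python. It is not corosync's code: the function names are hypothetical, and the token timeout of 1000 ms is an assumption (check the `token` setting in your own corosync.conf). It only restates what the thread says: a gap between member join messages greater than token/2 is treated as a process pause, and fencing only comes into play when the membership actually changed.

```python
import re

# Matches the TOTEM log line quoted in the thread.
PAUSE_RE = re.compile(r"Process pause detected for (\d+) ms")


def reported_pauses(log_text):
    """Return the pause durations (in ms) found in a corosync log excerpt."""
    return [int(ms) for ms in PAUSE_RE.findall(log_text)]


def pause_detected(gap_ms, token_ms):
    """Rule described in the thread: a gap greater than token/2 is a pause."""
    return gap_ms > token_ms / 2


def lost_members(before, after):
    """Members present before the gather state but missing afterwards."""
    return set(before) - set(after)


if __name__ == "__main__":
    log = (
        "corosync [TOTEM ] Process pause detected for 577 ms, "
        "flushing membership messages.\n"
        "corosync [TOTEM ] Process pause detected for 538 ms, "
        "flushing membership messages.\n"
    )
    token_ms = 1000  # ASSUMPTION: not stated in the thread

    for gap in reported_pauses(log):
        print(gap, "ms exceeds token/2:", pause_detected(gap, token_ms))

    # Every node answered during gather state: nothing lost, nothing to fence.
    print(lost_members({"node01", "node02"}, {"node01", "node02"}))
    # node02 did not answer: membership changed, so fencing becomes relevant.
    print(lost_members({"node01", "node02"}, {"node01"}))
```

Under the assumed 1000 ms token, both reported pauses (577 ms and 538 ms) are above token/2, which would match the "flushing membership messages" lines. The second log excerpt, where node02 is listed as lost, corresponds to the case where the membership did change.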