Kostiantyn,
> One more thing to clarify.
> You said "rebind can be avoided" - what does it mean?

By that I mean that as long as you don't shut down the interface, everything
will work as expected. Shutting down an interface is an administrator's
decision; the system doesn't do it automagically :)

Regards,
  Honza

>
> Thank you,
> Kostya
>
> On Wed, Jan 14, 2015 at 1:31 PM, Kostiantyn Ponomarenko <
> konstantin.ponomare...@gmail.com> wrote:
>
>> Thank you. Now I am aware of it.
>>
>> Thank you,
>> Kostya
>>
>> On Wed, Jan 14, 2015 at 12:59 PM, Jan Friesse <jfrie...@redhat.com> wrote:
>>
>>> Kostiantyn,
>>>
>>>> Honza,
>>>>
>>>> Thank you for helping me.
>>>> So, there is no defined behavior in case one of the interfaces is not in
>>>> the system?
>>>
>>> You are right. There is no defined behavior.
>>>
>>> Regards,
>>> Honza
>>>
>>>>
>>>> Thank you,
>>>> Kostya
>>>>
>>>> On Tue, Jan 13, 2015 at 12:01 PM, Jan Friesse <jfrie...@redhat.com> wrote:
>>>>
>>>>> Kostiantyn,
>>>>>
>>>>>> According to https://access.redhat.com/solutions/638843 , the
>>>>>> interface that is defined in corosync.conf must be present in the
>>>>>> system (see at the bottom of the article, section "ROOT CAUSE").
>>>>>> To confirm that, I made a couple of tests.
>>>>>>
>>>>>> Here is a part of the corosync.conf file (in a free-write form) (also
>>>>>> attached the original config file):
>>>>>> ===============================
>>>>>> rrp_mode: passive
>>>>>> ring0_addr is defined in corosync.conf
>>>>>> ring1_addr is defined in corosync.conf
>>>>>> ===============================
>>>>>>
>>>>>> -------------------------------
>>>>>> Two-node cluster
>>>>>> -------------------------------
>>>>>>
>>>>>> Test #1:
>>>>>> --------------------------------------------------
>>>>>> IP for ring0 is not defined in the system:
>>>>>> --------------------------------------------------
>>>>>> Start Corosync simultaneously on both nodes.
>>>>>> Corosync fails to start.
>>>>>> From the logs:
>>>>>> Jan 08 09:43:56 [2992] A6-402-2 corosync error [MAIN ] parse error in config: No interfaces defined
>>>>>> Jan 08 09:43:56 [2992] A6-402-2 corosync error [MAIN ] Corosync Cluster Engine exiting with status 8 at main.c:1343.
>>>>>> Result: Corosync and Pacemaker are not running.
>>>>>>
>>>>>> Test #2:
>>>>>> --------------------------------------------------
>>>>>> IP for ring1 is not defined in the system:
>>>>>> --------------------------------------------------
>>>>>> Start Corosync simultaneously on both nodes.
>>>>>> Corosync starts.
>>>>>> Start Pacemaker simultaneously on both nodes.
>>>>>> Pacemaker fails to start.
>>>>>> From the logs, the last writes from "corosync":
>>>>>> Jan 8 16:31:29 daemon.err<27> corosync[3728]: [TOTEM ] Marking ringid 0 interface 169.254.1.3 FAULTY
>>>>>> Jan 8 16:31:30 daemon.notice<29> corosync[3728]: [TOTEM ] Automatically recovered ring 0
>>>>>> Result: Corosync and Pacemaker are not running.
>>>>>>
>>>>>> Test #3:
>>>>>>
>>>>>> "rrp_mode: active" leads to the same result, except that the Corosync and
>>>>>> Pacemaker init scripts return status "running".
>>>>>> But still "vim /var/log/cluster/corosync.log" shows a lot of errors like:
>>>>>> Jan 08 16:30:47 [4067] A6-402-1 cib: error: pcmk_cpg_dispatch: Connection to the CPG API failed: Library error (2)
>>>>>>
>>>>>> Result: Corosync and Pacemaker show their statuses as "running", but
>>>>>> "crm_mon" cannot connect to the cluster database. And half of
>>>>>> Pacemaker's services are not running (including the Cluster Information
>>>>>> Base (CIB)).
>>>>>>
>>>>>> -------------------------------
>>>>>> For a single node mode
>>>>>> -------------------------------
>>>>>>
>>>>>> IP for ring0 is not defined in the system:
>>>>>>
>>>>>> Corosync fails to start.
>>>>>>
>>>>>> IP for ring1 is not defined in the system:
>>>>>>
>>>>>> Corosync and Pacemaker are started.
>>>>>>
>>>>>> It is possible that the configuration will be applied successfully (50%),
>>>>>> and it is possible that the cluster is not running any resources,
>>>>>> and it is possible that the node cannot be put in standby mode (shows:
>>>>>> communication error),
>>>>>> and it is possible that the cluster is running all resources, but the
>>>>>> applied configuration is not guaranteed to be fully loaded (some rules
>>>>>> can be missed).
>>>>>>
>>>>>> -------------------------------
>>>>>> Conclusions:
>>>>>> -------------------------------
>>>>>>
>>>>>> It is possible that in some rare cases (see comments to the bug) the
>>>>>> cluster will work, but in that case its working state is unstable and the
>>>>>> cluster can stop working at any moment.
>>>>>>
>>>>>> So, is it correct? Do my assumptions make any sense? I didn't find any
>>>>>> other explanation on the net ... .
>>>>>
>>>>> Corosync needs all interfaces during start and runtime. This doesn't
>>>>> mean they must be connected (this would make corosync unusable for
>>>>> physical NIC/switch or cable failure), but they must be up and have
>>>>> the correct IP.
>>>>>
>>>>> When this is not the case, corosync rebinds to localhost and weird
>>>>> things happen. Removal of this rebinding is a long-standing TODO, but
>>>>> there are still more important bugs (especially because rebind can be
>>>>> avoided).
>>>>>
>>>>> Regards,
>>>>> Honza
>>>>>
>>>>>>
>>>>>> Thank you,
>>>>>> Kostya
>>>>>>
>>>>>> On Fri, Jan 9, 2015 at 11:10 AM, Kostiantyn Ponomarenko <
>>>>>> konstantin.ponomare...@gmail.com> wrote:
>>>>>>
>>>>>>> Hi guys,
>>>>>>>
>>>>>>> Corosync fails to start if there is no such network interface configured
>>>>>>> in the system.
>>>>>>> Even with "rrp_mode: passive" the problem is the same when at least one
>>>>>>> network interface is not configured in the system.
>>>>>>>
>>>>>>> Is this the expected behavior?
>>>>>>> I thought that when you use redundant rings, it is enough to have at least
>>>>>>> one NIC configured in the system. Am I wrong?
>>>>>>>
>>>>>>> Thank you,
>>>>>>> Kostya
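
To make the "must be up and have the correct IP" requirement from the thread concrete, here is a minimal pre-flight sketch in Python. It is not part of corosync. It collects the ringX_addr entries from corosync.conf-style text and reports every ring for which none of the listed addresses is assigned on the local host. The helper names (ring_addresses, is_assigned_locally, rings_without_local_address) are hypothetical, and the bind-based check assumes the addresses are IP literals and that non-local binds are not enabled on the host.

```python
import re
import socket

# Matches lines such as "ring0_addr: 169.254.1.3" inside a nodelist.
RING_ADDR = re.compile(r"^\s*(ring\d+_addr)\s*:\s*(\S+)\s*$")


def ring_addresses(conf_text):
    """Return (key, address) pairs for every ringX_addr line in the text."""
    pairs = []
    for line in conf_text.splitlines():
        if line.strip().startswith("#"):
            continue  # skip comment lines
        match = RING_ADDR.match(line)
        if match:
            pairs.append((match.group(1), match.group(2)))
    return pairs


def is_assigned_locally(address):
    """Assumption: an address is local if a UDP socket can bind to it."""
    family = socket.AF_INET6 if ":" in address else socket.AF_INET
    with socket.socket(family, socket.SOCK_DGRAM) as sock:
        try:
            sock.bind((address, 0))  # port 0: let the kernel pick a port
        except OSError:
            return False
    return True


def rings_without_local_address(conf_text, check=is_assigned_locally):
    """Return the ring keys for which no listed address passes `check`."""
    rings = {}
    for key, address in ring_addresses(conf_text):
        rings.setdefault(key, []).append(address)
    return sorted(
        key for key, addresses in rings.items()
        if not any(check(address) for address in addresses)
    )
```

One way to use it is to read the config (usually /etc/corosync/corosync.conf) before starting corosync and refuse to start if rings_without_local_address() returns a non-empty list. The `check` parameter exists only so the address test can be swapped out, for example in a dry run on another machine.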

_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org