Great, it works! Thank you.

It would be extremely helpful if this information were included in the default corosync.conf as comments:
- that totem.interface may be omitted, and is even better omitted, when using UDPU;
- that the quorum section must not be empty, and that the default quorum.provider could be corosync_votequorum (but not empty).
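For illustration, a corosync.conf with such comments might look roughly like the sketch below. This is only an assumption about how the file could read, not the shipped default: the cluster_name value is hypothetical, node1 to node3 are the host names used in this thread, and each name has to resolve to the node's real interface address rather than to a loopback entry in /etc/hosts.

```
totem {
    version: 2
    # Hypothetical cluster name, pick your own.
    cluster_name: mycluster
    transport: udpu
    # No interface { } section: with udpu it may be omitted,
    # and corosync finds the local address from the nodelist.
}

nodelist {
    node {
        ring0_addr: node1
    }
    node {
        ring0_addr: node2
    }
    node {
        ring0_addr: node3
    }
}

quorum {
    # This section must not be empty.
    provider: corosync_votequorum
}
```

With a nodelist like this I did not need to set quorum.expected_votes; the "nodelist or quorum.expected_votes must be configured" error went away once the names resolved correctly.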
It would help novices install and launch corosync right away.

On Fri, Jan 16, 2015 at 7:31 PM, Jan Friesse <jfrie...@redhat.com> wrote:

> Dmitry Koterov wrote:
>
>>>> such messages (for now). But, anyway, DNS names in ringX_addr seem not
>>>> working, and no relevant messages are in default logs. Maybe add some
>>>> validations for ringX_addr?
>>>>
>>>> I'm having resolvable DNS names:
>>>>
>>>> root@node1:/etc/corosync# ping -c1 -W100 node1 | grep from
>>>> 64 bytes from node1 (127.0.1.1): icmp_seq=1 ttl=64 time=0.039 ms
>>>
>>> This is problem. Resolving node1 to localhost (127.0.0.1) is simply
>>> wrong. Names you want to use in corosync.conf should resolve to
>>> interface address. I believe other nodes has similar setting (so node2
>>> resolved on node2 is again 127.0.0.1)
>>
>> Wow! What a shame! How could I miss it... So you're absolutely right,
>> thanks: that was the cause, an entry in /etc/hosts. On some machines I
>> removed it manually, but on others - didn't. Now I do it automatically
>> by sed -i -r "/^.*[[:space:]]$host([[:space:]]|\$)/d" /etc/hosts in the
>> initialization script.
>>
>> I apologize for the mess.
>>
>> So now I have only one place in corosync.conf where I need to specify a
>> plain IP address for UDPu: totem.interface.bindnetaddr. If I specify
>> 0.0.0.0 there, I'm having a message "Service engine 'corosync_quorum'
>> failed to load for reason 'configuration error: nodelist or
>> quorum.expected_votes must be configured!'" in the logs (BTW it does not
>> say that I mistaked in bindnetaddr). Is there a way to completely untie
>> from IP addresses?
>
> You can just remove whole interface section completely. Corosync will find
> correct address from nodelist.
>
> Regards,
> Honza
>
>>> Please try to fix this problem first and let's see if this will solve
>>> issue you are hitting.
>>>
>>> Regards,
>>> Honza
>>>
>>>> root@node1:/etc/corosync# ping -c1 -W100 node2 | grep from
>>>> 64 bytes from node2 (188.166.54.190): icmp_seq=1 ttl=55 time=88.3 ms
>>>>
>>>> root@node1:/etc/corosync# ping -c1 -W100 node3 | grep from
>>>> 64 bytes from node3 (128.199.116.218): icmp_seq=1 ttl=51 time=252 ms
>>>>
>>>> With corosync.conf below, nothing works:
>>>> ...
>>>> nodelist {
>>>>   node {
>>>>     ring0_addr: node1
>>>>   }
>>>>   node {
>>>>     ring0_addr: node2
>>>>   }
>>>>   node {
>>>>     ring0_addr: node3
>>>>   }
>>>> }
>>>> ...
>>>> Jan 14 10:47:44 node1 corosync[15061]: [MAIN ] Corosync Cluster Engine ('2.3.3'): started and ready to provide service.
>>>> Jan 14 10:47:44 node1 corosync[15061]: [MAIN ] Corosync built-in features: dbus testagents rdma watchdog augeas pie relro bindnow
>>>> Jan 14 10:47:44 node1 corosync[15062]: [TOTEM ] Initializing transport (UDP/IP Unicast).
>>>> Jan 14 10:47:44 node1 corosync[15062]: [TOTEM ] Initializing transmit/receive security (NSS) crypto: aes256 hash: sha1
>>>> Jan 14 10:47:44 node1 corosync[15062]: [TOTEM ] The network interface [a.b.c.d] is now up.
>>>> Jan 14 10:47:44 node1 corosync[15062]: [SERV ] Service engine loaded: corosync configuration map access [0]
>>>> Jan 14 10:47:44 node1 corosync[15062]: [QB ] server name: cmap
>>>> Jan 14 10:47:44 node1 corosync[15062]: [SERV ] Service engine loaded: corosync configuration service [1]
>>>> Jan 14 10:47:44 node1 corosync[15062]: [QB ] server name: cfg
>>>> Jan 14 10:47:44 node1 corosync[15062]: [SERV ] Service engine loaded: corosync cluster closed process group service v1.01 [2]
>>>> Jan 14 10:47:44 node1 corosync[15062]: [QB ] server name: cpg
>>>> Jan 14 10:47:44 node1 corosync[15062]: [SERV ] Service engine loaded: corosync profile loading service [4]
>>>> Jan 14 10:47:44 node1 corosync[15062]: [WD ] No Watchdog, try modprobe <a watchdog>
>>>> Jan 14 10:47:44 node1 corosync[15062]: [WD ] no resources configured.
>>>> Jan 14 10:47:44 node1 corosync[15062]: [SERV ] Service engine loaded: corosync watchdog service [7]
>>>> Jan 14 10:47:44 node1 corosync[15062]: [QUORUM] Using quorum provider corosync_votequorum
>>>> Jan 14 10:47:44 node1 corosync[15062]: [QUORUM] Quorum provider: corosync_votequorum failed to initialize.
>>>> Jan 14 10:47:44 node1 corosync[15062]: [SERV ] Service engine 'corosync_quorum' failed to load for reason 'configuration error: nodelist or quorum.expected_votes must be configured!'
>>>> Jan 14 10:47:44 node1 corosync[15062]: [MAIN ] Corosync Cluster Engine exiting with status 20 at service.c:356.
>>>>
>>>> But with IP addresses specified in ringX_addr, everything works:
>>>> ...
>>>> nodelist {
>>>>   node {
>>>>     ring0_addr: 104.236.71.79
>>>>   }
>>>>   node {
>>>>     ring0_addr: 188.166.54.190
>>>>   }
>>>>   node {
>>>>     ring0_addr: 128.199.116.218
>>>>   }
>>>> }
>>>> ...
>>>> Jan 14 10:48:28 node1 corosync[15155]: [MAIN ] Corosync Cluster Engine ('2.3.3'): started and ready to provide service.
>>>> Jan 14 10:48:28 node1 corosync[15155]: [MAIN ] Corosync built-in features: dbus testagents rdma watchdog augeas pie relro bindnow
>>>> Jan 14 10:48:28 node1 corosync[15156]: [TOTEM ] Initializing transport (UDP/IP Unicast).
>>>> Jan 14 10:48:28 node1 corosync[15156]: [TOTEM ] Initializing transmit/receive security (NSS) crypto: aes256 hash: sha1
>>>> Jan 14 10:48:28 node1 corosync[15156]: [TOTEM ] The network interface [a.b.c.d] is now up.
>>>> Jan 14 10:48:28 node1 corosync[15156]: [SERV ] Service engine loaded: corosync configuration map access [0]
>>>> Jan 14 10:48:28 node1 corosync[15156]: [QB ] server name: cmap
>>>> Jan 14 10:48:28 node1 corosync[15156]: [SERV ] Service engine loaded: corosync configuration service [1]
>>>> Jan 14 10:48:28 node1 corosync[15156]: [QB ] server name: cfg
>>>> Jan 14 10:48:28 node1 corosync[15156]: [SERV ] Service engine loaded: corosync cluster closed process group service v1.01 [2]
>>>> Jan 14 10:48:28 node1 corosync[15156]: [QB ] server name: cpg
>>>> Jan 14 10:48:28 node1 corosync[15156]: [SERV ] Service engine loaded: corosync profile loading service [4]
>>>> Jan 14 10:48:28 node1 corosync[15156]: [WD ] No Watchdog, try modprobe <a watchdog>
>>>> Jan 14 10:48:28 node1 corosync[15156]: [WD ] no resources configured.
>>>> Jan 14 10:48:28 node1 corosync[15156]: [SERV ] Service engine loaded: corosync watchdog service [7]
>>>> Jan 14 10:48:28 node1 corosync[15156]: [QUORUM] Using quorum provider corosync_votequorum
>>>> Jan 14 10:48:28 node1 corosync[15156]: [SERV ] Service engine loaded: corosync vote quorum service v1.0 [5]
>>>> Jan 14 10:48:28 node1 corosync[15156]: [QB ] server name: votequorum
>>>> Jan 14 10:48:28 node1 corosync[15156]: [SERV ] Service engine loaded: corosync cluster quorum service v0.1 [3]
>>>> Jan 14 10:48:28 node1 corosync[15156]: [QB ] server name: quorum
>>>> Jan 14 10:48:28 node1 corosync[15156]: [TOTEM ] adding new UDPU member {a.b.c.d}
>>>> Jan 14 10:48:28 node1 corosync[15156]: [TOTEM ] adding new UDPU member {e.f.g.h}
>>>> Jan 14 10:48:28 node1 corosync[15156]: [TOTEM ] adding new UDPU member {i.j.k.l}
>>>> Jan 14 10:48:28 node1 corosync[15156]: [TOTEM ] A new membership (m.n.o.p:80) was formed. Members joined: 1760315215
>>>> Jan 14 10:48:28 node1 corosync[15156]: [QUORUM] Members[1]: 1760315215
>>>> Jan 14 10:48:28 node1 corosync[15156]: [MAIN ] Completed service synchronization, ready to provide service.
>>>>
>>>> On Mon, Jan 5, 2015 at 6:45 PM, Jan Friesse <jfrie...@redhat.com> wrote:
>>>>
>>>>> Dmitry,
>>>>>
>>>>>> Sure, in logs I see "adding new UDPU member {IP_ADDRESS}" (so DNS names
>>>>>> are definitely resolved), but in practice the cluster does not work, as I
>>>>>> said above. So validations of ringX_addr in corosync.conf would be very
>>>>>> helpful in corosync.
>>>>>
>>>>> that's weird. Because as long as DNS is resolved, corosync works only
>>>>> with IP. This means, code path is exactly same with IP or with DNS. Do
>>>>> you have logs from corosync?
>>>>>
>>>>> Honza
>>>>>
>>>>>> On Fri, Jan 2, 2015 at 2:49 PM, Jan Friesse <jfrie...@redhat.com> wrote:
>>>>>>
>>>>>>> Dmitry,
>>>>>>>
>>>>>>>> No, I meant that if you pass a domain name in ring0_addr, there are no
>>>>>>>> errors in logs, corosync even seems to find nodes (based on its logs), and
>>>>>>>> crm_node -l shows them, but in practice nothing really works. A verbose
>>>>>>>> error message would be very helpful in such case.
>>>>>>>
>>>>>>> This sounds weird. Are you sure that DNS names really maps to correct IP
>>>>>>> address? In logs there should be something like "adding new UDPU member
>>>>>>> {IP_ADDRESS}".
>>>>>>>
>>>>>>> Regards,
>>>>>>> Honza
>>>>>>>
>>>>>>>> On Tuesday, December 30, 2014, Daniel Dehennin <daniel.dehen...@baby-gnu.org> wrote:
>>>>>>>>
>>>>>>>>> Dmitry Koterov <dmitry.kote...@gmail.com> writes:
>>>>>>>>>
>>>>>>>>>> Oh, seems I've found the solution! At least two mistakes was in my
>>>>>>>>>> corosync.conf (BTW logs did not say about any errors, so my conclusion is
>>>>>>>>>> based on my experiments only).
>>>>>>>>>>
>>>>>>>>>> 1. nodelist.node MUST contain only IP addresses. No hostnames! They simply
>>>>>>>>>> do not work, "crm status" shows no nodes. And no warnings are in logs
>>>>>>>>>> regarding this.
>>>>>>>>>
>>>>>>>>> You can add name like this:
>>>>>>>>>
>>>>>>>>> nodelist {
>>>>>>>>>   node {
>>>>>>>>>     ring0_addr: <public-ip-address-of-the-first-machine>
>>>>>>>>>     name: node1
>>>>>>>>>   }
>>>>>>>>>   node {
>>>>>>>>>     ring0_addr: <public-ip-address-of-the-second-machine>
>>>>>>>>>     name: node2
>>>>>>>>>   }
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> I used it on Ubuntu Trusty with udpu.
>>>>>>>>>
>>>>>>>>> Regards.
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Daniel Dehennin
>>>>>>>>> Get my GPG key: gpg --recv-keys 0xCC1E9E5B7A6FE2DF
>>>>>>>>> Fingerprint: 3E69 014E 5C23 50E8 9ED6 2AAD CC1E 9E5B 7A6F E2DF
>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
>>>>>>>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>>>>>>>>
>>>>>>>> Project Home: http://www.clusterlabs.org
>>>>>>>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>>>>>>>> Bugs: http://bugs.clusterlabs.org
>>>
>>> _______________________________________________
>>> discuss mailing list
>>> disc...@corosync.org
>>> http://lists.corosync.org/mailman/listinfo/discuss