> > such messages (for now). But, anyway, DNS names in ringX_addr do not
> > seem to work, and no relevant messages appear in the default logs.
> > Maybe add some validation for ringX_addr?
> >
> > I have resolvable DNS names:
> >
> > root@node1:/etc/corosync# ping -c1 -W100 node1 | grep from
> > 64 bytes from node1 (127.0.1.1): icmp_seq=1 ttl=64 time=0.039 ms
>
> This is the problem. Resolving node1 to localhost (127.0.1.1) is simply
> wrong. The names you want to use in corosync.conf should resolve to the
> interface address. I believe the other nodes have a similar setting (so
> node2 resolved on node2 is again 127.0.1.1).
Wow, what a shame! How could I have missed it... You're absolutely right,
thanks: that was the cause, an entry in /etc/hosts. On some machines I had
removed it manually, but on others I hadn't. Now I remove it automatically
in the initialization script:

sed -i -r "/^.*[[:space:]]$host([[:space:]]|\$)/d" /etc/hosts

I apologize for the mess.

So now there is only one place in corosync.conf where I still have to
specify a plain IP address for UDPu: totem.interface.bindnetaddr. If I
specify 0.0.0.0 there, I get the message "Service engine 'corosync_quorum'
failed to load for reason 'configuration error: nodelist or
quorum.expected_votes must be configured!'" in the logs (by the way, it
does not say that the mistake is in bindnetaddr). Is there a way to
completely untie the configuration from IP addresses?
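For the record, this is roughly what /etc/hosts looks like on each node
after the cleanup (a sketch: I am assuming the first address of the working
nodelist below is node1's own public address; the essential point is that a
node's own hostname must no longer map to a 127.x.x.x address):

127.0.0.1       localhost
# the harmful line removed by the sed above looked like: 127.0.1.1  node1
104.236.71.79   node1
188.166.54.190  node2
128.199.116.218 node3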
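As for bindnetaddr: as far as I understand from the corosync.conf(5) man
page, it takes either the interface's own address or the network address of
that interface (host bits zeroed), not 0.0.0.0. A minimal sketch of the
totem section on node1, assuming its public interface has a /24 netmask:

totem {
    version: 2
    transport: udpu
    interface {
        ringnumber: 0
        # network address of node1's public interface: 104.236.71.79
        # with the host bits zeroed out under a /24 netmask; the
        # interface's own IP would also work, but 0.0.0.0 does not
        bindnetaddr: 104.236.71.0
    }
}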
> Please try to fix this problem first and let's see if this solves the
> issue you are hitting.
>
> Regards,
> Honza
>
> > root@node1:/etc/corosync# ping -c1 -W100 node2 | grep from
> > 64 bytes from node2 (188.166.54.190): icmp_seq=1 ttl=55 time=88.3 ms
> >
> > root@node1:/etc/corosync# ping -c1 -W100 node3 | grep from
> > 64 bytes from node3 (128.199.116.218): icmp_seq=1 ttl=51 time=252 ms
> >
> > With the corosync.conf below, nothing works:
> > ...
> > nodelist {
> >     node {
> >         ring0_addr: node1
> >     }
> >     node {
> >         ring0_addr: node2
> >     }
> >     node {
> >         ring0_addr: node3
> >     }
> > }
> > ...
> > Jan 14 10:47:44 node1 corosync[15061]: [MAIN ] Corosync Cluster Engine ('2.3.3'): started and ready to provide service.
> > Jan 14 10:47:44 node1 corosync[15061]: [MAIN ] Corosync built-in features: dbus testagents rdma watchdog augeas pie relro bindnow
> > Jan 14 10:47:44 node1 corosync[15062]: [TOTEM ] Initializing transport (UDP/IP Unicast).
> > Jan 14 10:47:44 node1 corosync[15062]: [TOTEM ] Initializing transmit/receive security (NSS) crypto: aes256 hash: sha1
> > Jan 14 10:47:44 node1 corosync[15062]: [TOTEM ] The network interface [a.b.c.d] is now up.
> > Jan 14 10:47:44 node1 corosync[15062]: [SERV ] Service engine loaded: corosync configuration map access [0]
> > Jan 14 10:47:44 node1 corosync[15062]: [QB ] server name: cmap
> > Jan 14 10:47:44 node1 corosync[15062]: [SERV ] Service engine loaded: corosync configuration service [1]
> > Jan 14 10:47:44 node1 corosync[15062]: [QB ] server name: cfg
> > Jan 14 10:47:44 node1 corosync[15062]: [SERV ] Service engine loaded: corosync cluster closed process group service v1.01 [2]
> > Jan 14 10:47:44 node1 corosync[15062]: [QB ] server name: cpg
> > Jan 14 10:47:44 node1 corosync[15062]: [SERV ] Service engine loaded: corosync profile loading service [4]
> > Jan 14 10:47:44 node1 corosync[15062]: [WD ] No Watchdog, try modprobe <a watchdog>
> > Jan 14 10:47:44 node1 corosync[15062]: [WD ] no resources configured.
> > Jan 14 10:47:44 node1 corosync[15062]: [SERV ] Service engine loaded: corosync watchdog service [7]
> > Jan 14 10:47:44 node1 corosync[15062]: [QUORUM] Using quorum provider corosync_votequorum
> > Jan 14 10:47:44 node1 corosync[15062]: [QUORUM] Quorum provider: corosync_votequorum failed to initialize.
> > Jan 14 10:47:44 node1 corosync[15062]: [SERV ] Service engine 'corosync_quorum' failed to load for reason 'configuration error: nodelist or quorum.expected_votes must be configured!'
> > Jan 14 10:47:44 node1 corosync[15062]: [MAIN ] Corosync Cluster Engine exiting with status 20 at service.c:356.
> >
> > But with IP addresses specified in ringX_addr, everything works:
> > ...
> > nodelist {
> >     node {
> >         ring0_addr: 104.236.71.79
> >     }
> >     node {
> >         ring0_addr: 188.166.54.190
> >     }
> >     node {
> >         ring0_addr: 128.199.116.218
> >     }
> > }
> > ...
> > Jan 14 10:48:28 node1 corosync[15155]: [MAIN ] Corosync Cluster Engine ('2.3.3'): started and ready to provide service.
> > Jan 14 10:48:28 node1 corosync[15155]: [MAIN ] Corosync built-in features: dbus testagents rdma watchdog augeas pie relro bindnow
> > Jan 14 10:48:28 node1 corosync[15156]: [TOTEM ] Initializing transport (UDP/IP Unicast).
> > Jan 14 10:48:28 node1 corosync[15156]: [TOTEM ] Initializing transmit/receive security (NSS) crypto: aes256 hash: sha1
> > Jan 14 10:48:28 node1 corosync[15156]: [TOTEM ] The network interface [a.b.c.d] is now up.
> > Jan 14 10:48:28 node1 corosync[15156]: [SERV ] Service engine loaded: corosync configuration map access [0]
> > Jan 14 10:48:28 node1 corosync[15156]: [QB ] server name: cmap
> > Jan 14 10:48:28 node1 corosync[15156]: [SERV ] Service engine loaded: corosync configuration service [1]
> > Jan 14 10:48:28 node1 corosync[15156]: [QB ] server name: cfg
> > Jan 14 10:48:28 node1 corosync[15156]: [SERV ] Service engine loaded: corosync cluster closed process group service v1.01 [2]
> > Jan 14 10:48:28 node1 corosync[15156]: [QB ] server name: cpg
> > Jan 14 10:48:28 node1 corosync[15156]: [SERV ] Service engine loaded: corosync profile loading service [4]
> > Jan 14 10:48:28 node1 corosync[15156]: [WD ] No Watchdog, try modprobe <a watchdog>
> > Jan 14 10:48:28 node1 corosync[15156]: [WD ] no resources configured.
> > Jan 14 10:48:28 node1 corosync[15156]: [SERV ] Service engine loaded: corosync watchdog service [7]
> > Jan 14 10:48:28 node1 corosync[15156]: [QUORUM] Using quorum provider corosync_votequorum
> > Jan 14 10:48:28 node1 corosync[15156]: [SERV ] Service engine loaded: corosync vote quorum service v1.0 [5]
> > Jan 14 10:48:28 node1 corosync[15156]: [QB ] server name: votequorum
> > Jan 14 10:48:28 node1 corosync[15156]: [SERV ] Service engine loaded: corosync cluster quorum service v0.1 [3]
> > Jan 14 10:48:28 node1 corosync[15156]: [QB ] server name: quorum
> > Jan 14 10:48:28 node1 corosync[15156]: [TOTEM ] adding new UDPU member {a.b.c.d}
> > Jan 14 10:48:28 node1 corosync[15156]: [TOTEM ] adding new UDPU member {e.f.g.h}
> > Jan 14 10:48:28 node1 corosync[15156]: [TOTEM ] adding new UDPU member {i.j.k.l}
> > Jan 14 10:48:28 node1 corosync[15156]: [TOTEM ] A new membership (m.n.o.p:80) was formed. Members joined: 1760315215
> > Jan 14 10:48:28 node1 corosync[15156]: [QUORUM] Members[1]: 1760315215
> > Jan 14 10:48:28 node1 corosync[15156]: [MAIN ] Completed service synchronization, ready to provide service.
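(For completeness, the working variant above corresponds to a corosync.conf
roughly like the following on node1. This is a sketch: the crypto settings
are taken from the log lines above, the quorum section is my assumption
based on the "nodelist or quorum.expected_votes" error, and bindnetaddr has
to differ per node.)

totem {
    version: 2
    transport: udpu
    crypto_cipher: aes256
    crypto_hash: sha1
    interface {
        ringnumber: 0
        bindnetaddr: 104.236.71.79   # node1's own public address
    }
}

nodelist {
    node {
        ring0_addr: 104.236.71.79
    }
    node {
        ring0_addr: 188.166.54.190
    }
    node {
        ring0_addr: 128.199.116.218
    }
}

quorum {
    # with a populated nodelist, votequorum derives expected_votes from
    # the number of node entries; without one, quorum.expected_votes must
    # be set explicitly, which is what the failing log above complains about
    provider: corosync_votequorum
}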
> > On Mon, Jan 5, 2015 at 6:45 PM, Jan Friesse <jfrie...@redhat.com> wrote:
> >
> >> Dmitry,
> >>
> >>> Sure, in the logs I see "adding new UDPU member {IP_ADDRESS}" (so the
> >>> DNS names are definitely resolved), but in practice the cluster does
> >>> not work, as I said above. So validation of ringX_addr values would
> >>> be very helpful in corosync.
> >>
> >> That's weird, because as long as the DNS name is resolved, corosync
> >> works only with the IP. This means the code path is exactly the same
> >> with an IP as with a DNS name. Do you have logs from corosync?
> >>
> >> Honza
> >>
> >>> On Fri, Jan 2, 2015 at 2:49 PM, Jan Friesse <jfrie...@redhat.com> wrote:
> >>>
> >>>> Dmitry,
> >>>>
> >>>>> No, I meant that if you pass a domain name in ring0_addr, there are
> >>>>> no errors in the logs, corosync even seems to find the nodes (based
> >>>>> on its logs), and crm_node -l shows them, but in practice nothing
> >>>>> really works. A verbose error message would be very helpful in such
> >>>>> a case.
> >>>>
> >>>> This sounds weird. Are you sure that the DNS names really map to the
> >>>> correct IP addresses? In the logs there should be something like
> >>>> "adding new UDPU member {IP_ADDRESS}".
> >>>>
> >>>> Regards,
> >>>> Honza
> >>>>
> >>>>> On Tuesday, December 30, 2014, Daniel Dehennin
> >>>>> <daniel.dehen...@baby-gnu.org> wrote:
> >>>>>
> >>>>>> Dmitry Koterov <dmitry.kote...@gmail.com> writes:
> >>>>>>
> >>>>>>> Oh, it seems I've found the solution! At least two mistakes were
> >>>>>>> in my corosync.conf (BTW the logs did not report any errors, so
> >>>>>>> my conclusion is based on my experiments only).
> >>>>>>>
> >>>>>>> 1. nodelist.node MUST contain only IP addresses. No hostnames!
> >>>>>>> They simply do not work, "crm status" shows no nodes. And no
> >>>>>>> warnings about this appear in the logs.
> >>>>>>
> >>>>>> You can add a name like this:
> >>>>>>
> >>>>>> nodelist {
> >>>>>>     node {
> >>>>>>         ring0_addr: <public-ip-address-of-the-first-machine>
> >>>>>>         name: node1
> >>>>>>     }
> >>>>>>     node {
> >>>>>>         ring0_addr: <public-ip-address-of-the-second-machine>
> >>>>>>         name: node2
> >>>>>>     }
> >>>>>> }
> >>>>>>
> >>>>>> I used it on Ubuntu Trusty with udpu.
> >>>>>>
> >>>>>> Regards.
> >>>>>>
> >>>>>> --
> >>>>>> Daniel Dehennin
> >>>>>> Get my GPG key: gpg --recv-keys 0xCC1E9E5B7A6FE2DF
> >>>>>> Fingerprint: 3E69 014E 5C23 50E8 9ED6 2AAD CC1E 9E5B 7A6F E2DF
_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org