Hello Bernardo, I don't know if this is the problem, but try this option:
clear_node_high_bit

    This configuration option is optional and is only relevant when no
    nodeid is specified. Some openais clients require a signed 32-bit
    nodeid that is greater than zero; however, by default openais uses
    all 32 bits of the IPv4 address space when generating a nodeid. Set
    this option to yes to force the high bit to be zero and therefore
    ensure the nodeid is a positive signed 32-bit integer.

    WARNING: The cluster's behavior is undefined if this option is
    enabled on only a subset of the cluster (for example during a
    rolling upgrade).

Thanks

2013/6/27 Bernardo Cabezas Serra <bcabe...@apsl.net>

> Hello,
>
> Our cluster was working OK on the corosync stack, with corosync 2.3.0
> and pacemaker 1.1.8.
>
> After upgrading (full versions and configs below), we began to have
> problems with node names.
> It's a two-node cluster, with node names "turifel" (DC) and "selavi".
>
> When selavi joins the cluster, we get this warning in the selavi log:
>
> -----
> Jun 27 11:54:29 selavi attrd[11998]: notice: corosync_node_name:
> Unable to get node name for nodeid 168385827
> Jun 27 11:54:29 selavi attrd[11998]: notice: get_node_name: Defaulting
> to uname -n for the local corosync node name
> -----
>
> This is OK, and it also happened with version 1.1.8.
>
> At the corosync level, all seems OK:
> ----
> Jun 27 11:51:18 turifel corosync[6725]: [TOTEM ] A processor joined or
> left the membership and a new membership (10.9.93.35:1184) was formed.
> Jun 27 11:51:18 turifel corosync[6725]: [QUORUM] Members[2]: 168385827
> 168385835
> Jun 27 11:51:18 turifel corosync[6725]: [MAIN ] Completed service
> synchronization, ready to provide service.
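As a side note on where those member IDs come from: when no nodeid is configured, corosync derives it from the ring0 IPv4 address read as a 32-bit integer, which is why the quoted [QUORUM] line shows 168385827 and 168385835 for the 10.9.93.x addresses. A small Python sketch of that mapping and of what clear_node_high_bit does (the helper names are mine, not corosync's API):

```python
import socket
import struct

def default_nodeid(ipv4: str) -> int:
    # With no explicit nodeid configured, corosync derives the nodeid
    # from the ring0 IPv4 address read as a 32-bit big-endian integer.
    return struct.unpack(">I", socket.inet_aton(ipv4))[0]

def clear_high_bit(nodeid: int) -> int:
    # What "clear_node_high_bit: yes" does: force bit 31 to zero so the
    # result is always a positive signed 32-bit integer.
    return nodeid & 0x7FFFFFFF

print(default_nodeid("10.9.93.35"))   # 168385827, a nodeid in the logs
print(default_nodeid("10.9.93.43"))   # 168385835, the other member
# For 10.x.x.x addresses bit 31 is already zero, so the option leaves
# these nodeids unchanged; it only changes nodeids for addresses whose
# first octet is >= 128.
print(clear_high_bit(default_nodeid("192.168.1.1")))  # 1084752129
```

Worth noticing: for a 10.0.0.0/8 network like this one, the option is effectively a no-op, which fits the "I don't know if this is the problem" caveat above.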
> Jun 27 11:51:18 turifel crmd[19526]: notice: crm_update_peer_state:
> pcmk_quorum_notification: Node selavi[168385827] - state is now member
> (was lost)
> -------
>
> But when starting pacemaker on selavi (the new node), the turifel log
> shows this:
>
> ----
> Jun 27 11:54:28 turifel crmd[19526]: notice: do_state_transition:
> State transition S_IDLE -> S_INTEGRATION [ input=I_NODE_JOIN
> cause=C_FSA_INTERNAL origin=peer_update_callback ]
> Jun 27 11:54:28 turifel crmd[19526]: warning: crm_get_peer: Node
> 'selavi' and 'selavi' share the same cluster nodeid: 168385827
> Jun 27 11:54:28 turifel crmd[19526]: warning: crmd_cs_dispatch:
> Recieving messages from a node we think is dead: selavi[0]
> Jun 27 11:54:29 turifel crmd[19526]: warning: crm_get_peer: Node
> 'selavi' and 'selavi' share the same cluster nodeid: 168385827
> Jun 27 11:54:29 turifel crmd[19526]: warning: do_state_transition: Only
> 1 of 2 cluster nodes are eligible to run resources - continue 0
> Jun 27 11:54:29 turifel attrd[19524]: notice: attrd_local_callback:
> Sending full refresh (origin=crmd)
> ----
>
> And selavi remains in the pending state. Sometimes turifel (DC) fences
> selavi, but other times it remains pending forever.
>
> On the turifel node, all resources give warnings like this one:
> warning: custom_action: Action p_drbd_ha0:0_monitor_0 on selavi is
> unrunnable (pending)
>
> On both nodes, uname -n and crm_node -n give the correct node names
> (selavi and turifel respectively).
>
> Do you think it's a configuration problem?
>
> Below I give information about versions and configurations.
>
> Best regards,
> Bernardo.
>
> -----
> Versions (git/hg compiled versions):
>
> corosync: 2.3.0.66-615d
> pacemaker: 1.1.9-61e4b8f
> cluster-glue: 1.0.11
> libqb: 0.14.4.43-bb4c3
> resource-agents: 3.9.5.98-3b051
> crmsh: 1.2.5
>
> The cluster also has drbd, dlm and gfs2, but I think those versions are
> irrelevant here.
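For what it's worth, the "Unable to get node name for nodeid" notices and the crm_get_peer nodeid confusion quoted above are often avoided by giving each node an explicit nodeid and name in corosync.conf instead of relying on the address-derived defaults. A hypothetical nodelist stanza for this cluster (the address-to-name pairing is decoded from the nodeids in the quoted logs, so treat it as an assumption and double-check before using):

```
nodelist {
    node {
        ring0_addr: 10.9.93.35
        nodeid: 1
        name: selavi
    }
    node {
        ring0_addr: 10.9.93.43
        nodeid: 2
        name: turifel
    }
}
```

With explicit small nodeids, clear_node_high_bit becomes moot as well.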
>
> --------
> Output of pacemaker configuration:
>
> ./configure --prefix=/opt/ha --without-cman \
>     --without-heartbeat --with-corosync \
>     --enable-fatal-warnings=no --with-lcrso-dir=/opt/ha/libexec/lcrso
>
> pacemaker configuration:
> Version = 1.1.9 (Build: 61e4b8f)
> Features = generated-manpages ascii-docs ncurses libqb-logging
> libqb-ipc lha-fencing upstart nagios corosync-native snmp libesmtp
>
> Prefix = /opt/ha
> Executables = /opt/ha/sbin
> Man pages = /opt/ha/share/man
> Libraries = /opt/ha/lib
> Header files = /opt/ha/include
> Arch-independent files = /opt/ha/share
> State information = /opt/ha/var
> System configuration = /opt/ha/etc
> Corosync Plugins = /opt/ha/lib
>
> Use system LTDL = yes
>
> HA group name = haclient
> HA user name = hacluster
>
> CFLAGS = -I/opt/ha/include -I/opt/ha/include
> -I/opt/ha/include/heartbeat -I/opt/ha/include -I/opt/ha/include
> -ggdb -fgnu89-inline -fstack-protector-all -Wall -Waggregate-return
> -Wbad-function-cast -Wcast-align -Wdeclaration-after-statement
> -Wendif-labels -Wfloat-equal -Wformat=2 -Wformat-security
> -Wformat-nonliteral -Wmissing-prototypes -Wmissing-declarations
> -Wnested-externs -Wno-long-long -Wno-strict-aliasing
> -Wunused-but-set-variable -Wpointer-arith -Wstrict-prototypes
> -Wwrite-strings
> Libraries = -lgnutls -lcorosync_common -lplumb -lpils
> -lqb -lbz2 -lxslt -lxml2 -lc -luuid -lpam -lrt -ldl -lglib-2.0 -lltdl
> -L/opt/ha/lib -lqb -ldl -lrt -lpthread
> Stack Libraries = -L/opt/ha/lib -lqb -ldl -lrt -lpthread
> -L/opt/ha/lib -lcpg -L/opt/ha/lib -lcfg -L/opt/ha/lib -lcmap
> -L/opt/ha/lib -lquorum
>
> ----
> Corosync config:
>
> totem {
>     version: 2
>     crypto_cipher: none
>     crypto_hash: none
>     cluster_name: fiestaha
>     interface {
>         ringnumber: 0
>         ttl: 1
>         bindnetaddr: 10.9.93.0
>         mcastaddr: 226.94.1.1
>         mcastport: 5405
>     }
> }
>
> logging {
>     fileline: off
>     to_stderr: yes
>     to_logfile: no
>     to_syslog: yes
>     syslog_facility: local7
>     debug: off
>     timestamp: on
>     logger_subsys {
>         subsys: QUORUM
>         debug: off
>     }
> }
>
> quorum {
>     provider: corosync_votequorum
>     expected_votes: 2
>     two_node: 1
>     wait_for_all: 0
> }
>
> --
> APSL
> *Bernardo Cabezas Serra*
> *Systems Manager*
> Camí Vell de Bunyola 37, esc. A, local 7
> 07009 Polígono de Son Castelló, Palma
> Mail: bcabe...@apsl.net
> Skype: bernat.cabezas
> Tel: 971439771
>
> _______________________________________________
> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org

--
this is my life and I live it as long as God wills