Could be a bug in 2.1.3; I'd suggest upgrading to Pacemaker.
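For reference, the membership settings being discussed would be an ha.cf fragment along these lines (a sketch only — directive spellings are worth double-checking against your heartbeat version's ha.cf documentation, and nothing here is taken from the poster's actual config):

```
# /etc/ha.d/ha.cf -- illustrative fragment, not the poster's config
autojoin any          # accept nodes that are not pre-listed in ha.cf
uuidfrom nodename     # derive the node UUID from the hostname, so a
                      # rebuilt node with the same name keeps the same
                      # cluster identity instead of needing its UUID
                      # copied over by hand
```

With `uuidfrom nodename`, manually copying the dead node's UUID onto the replacement (as described below) should be unnecessary, which may sidestep the "not in our membership" discards.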

On Fri, Jul 24, 2009 at 5:14 PM, Jiayin Mao<[email protected]> wrote:
> There are two nodes in the cluster. I shut down haproxy1, then started a new
> node, named it haproxy1, and set its UUID to match the dead haproxy1's.
> Here's the log from haproxy2, the surviving node:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> cib[2924]: 2009/07/24_10:55:25 info: cib_ccm_msg_callback: LOST:
> haproxy1.b.cel.brightcove.com
> ccm[2923]: 2009/07/24_10:55:25 info: Break tie for 2 nodes cluster
> cib[2924]: 2009/07/24_10:55:25 info: cib_ccm_msg_callback: PEER:
> haproxy2.b.cel.brightcove.com
> crmd[2928]: 2009/07/24_10:55:25 info: ccm_event_detail: NEW MEMBERSHIP:
> trans=3, nodes=1, new=0, lost=1 n_idx=0, new_idx=1, old_idx=3
> crmd[2928]: 2009/07/24_10:55:25 info: ccm_event_detail:         CURRENT:
> haproxy2.b.cel.brightcove.com [nodeid=1, born=3]
> crmd[2928]: 2009/07/24_10:55:25 info: ccm_event_detail:         LOST:
> haproxy1.b.cel.brightcove.com [nodeid=0, born=2]
> crmd[2928]: 2009/07/24_10:55:25 info: do_state_transition: State transition
> S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS cause=C_IPC_MESSAGE
> origin=route_message ]
> heartbeat[2869]: 2009/07/24_10:55:56 WARN: node
> haproxy1.b.cel.brightcove.com: is dead
> heartbeat[2869]: 2009/07/24_10:55:56 info: Link
> haproxy1.b.cel.brightcove.com:eth0 dead.
> crmd[2928]: 2009/07/24_10:55:56 notice: crmd_ha_status_callback: Status
> update: Node haproxy1.b.cel.brightcove.com now has status [dead]
> logd[2847]: 2009/07/24_10:56:13 info: time_longclock: clock_t wrapped around
> (uptime).
> logd[2853]: 2009/07/24_10:56:13 info: time_longclock: clock_t wrapped around
> (uptime).
> ccm[2923]: 2009/07/24_10:56:13 info: time_longclock: clock_t wrapped around
> (uptime).
> heartbeat[2869]: 2009/07/24_10:56:13 info: time_longclock: clock_t wrapped
> around (uptime).
> attrd[2927]: 2009/07/24_10:56:13 info: time_longclock: clock_t wrapped
> around (uptime).
> stonithd[2926]: 2009/07/24_10:56:13 info: time_longclock: clock_t wrapped
> around (uptime).
> crmd[2928]: 2009/07/24_10:56:13 info: time_longclock: clock_t wrapped around
> (uptime).
> mgmtd[2929]: 2009/07/24_10:56:13 info: time_longclock: clock_t wrapped
> around (uptime).
> tengine[2988]: 2009/07/24_10:56:13 info: time_longclock: clock_t wrapped
> around (uptime).
> pengine[2989]: 2009/07/24_10:56:13 info: time_longclock: clock_t wrapped
> around (uptime).
> heartbeat[2876]: 2009/07/24_10:56:13 info: time_longclock: clock_t wrapped
> around (uptime).
> lrmd[2925]: 2009/07/24_10:56:13 info: time_longclock: clock_t wrapped around
> (uptime).
> heartbeat[2877]: 2009/07/24_10:56:14 info: time_longclock: clock_t wrapped
> around (uptime).
> heartbeat[2879]: 2009/07/24_10:56:14 info: time_longclock: clock_t wrapped
> around (uptime).
> cib[2924]: 2009/07/24_10:56:14 info: time_longclock: clock_t wrapped around
> (uptime).
> cib[2924]: 2009/07/24_11:03:40 info: cib_stats: Processed 71 operations
> (704.00us average, 0% utilization) in the last 10min
> heartbeat[2880]: 2009/07/24_11:04:18 info: time_longclock: clock_t wrapped
> around (uptime).
> heartbeat[2869]: 2009/07/24_11:04:18 info: Heartbeat restart on node
> haproxy1.b.cel.brightcove.com
> heartbeat[2869]: 2009/07/24_11:04:18 info: Link
> haproxy1.b.cel.brightcove.com:eth0 up.
> heartbeat[2869]: 2009/07/24_11:04:18 info: Status update for node
> haproxy1.b.cel.brightcove.com: status init
> heartbeat[2869]: 2009/07/24_11:04:18 info: Status update for node
> haproxy1.b.cel.brightcove.com: status up
> crmd[2928]: 2009/07/24_11:04:18 notice: crmd_ha_status_callback: Status
> update: Node haproxy1.b.cel.brightcove.com now has status [init]
> crmd[2928]: 2009/07/24_11:04:18 notice: crmd_ha_status_callback: Status
> update: Node haproxy1.b.cel.brightcove.com now has status [up]
> heartbeat[2869]: 2009/07/24_11:04:18 info: all clients are now paused
> heartbeat[2869]: 2009/07/24_11:04:47 info: Status update for node
> haproxy1.b.cel.brightcove.com: status active
> crmd[2928]: 2009/07/24_11:04:47 notice: crmd_ha_status_callback: Status
> update: Node haproxy1.b.cel.brightcove.com now has status [active]
> cib[2924]: 2009/07/24_11:04:47 info: cib_client_status_callback: Status
> update: Client haproxy1.b.cel.brightcove.com/cib now has status [join]
> heartbeat[2869]: 2009/07/24_11:04:51 WARN: 1 lost packet(s) for [
> haproxy1.b.cel.brightcove.com] [58:60]
> heartbeat[2869]: 2009/07/24_11:04:51 info: No pkts missing from
> haproxy1.b.cel.brightcove.com!
> crmd[2928]: 2009/07/24_11:04:51 notice: crmd_client_status_callback: Status
> update: Client haproxy1.b.cel.brightcove.com/crmd now has status [online]
> heartbeat[2869]: 2009/07/24_11:04:52 WARN: 1 lost packet(s) for [
> haproxy1.b.cel.brightcove.com] [62:64]
> heartbeat[2869]: 2009/07/24_11:04:52 info: No pkts missing from
> haproxy1.b.cel.brightcove.com!
> cib[2924]: 2009/07/24_11:05:26 WARN: cib_peer_callback: Discarding
> cib_slave_all message (66) from haproxy1.b.cel.brightcove.com: not in our
> membership
> cib[2924]: 2009/07/24_11:05:26 WARN: cib_peer_callback: Discarding
> cib_apply_diff message (68) from haproxy1.b.cel.brightcove.com: not in our
> membership
> cib[2924]: 2009/07/24_11:05:26 WARN: cib_peer_callback: Discarding
> cib_apply_diff message (69) from haproxy1.b.cel.brightcove.com: not in our
> membership
> cib[2924]: 2009/07/24_11:05:26 WARN: cib_peer_callback: Discarding
> cib_replace message (6b) from haproxy1.b.cel.brightcove.com: not in our
> membership
> cib[2924]: 2009/07/24_11:05:26 WARN: cib_peer_callback: Discarding
> cib_apply_diff message (6d) from haproxy1.b.cel.brightcove.com: not in our
> membership
> cib[2924]: 2009/07/24_11:05:26 WARN: cib_peer_callback: Discarding
> cib_apply_diff message (6e) from haproxy1.b.cel.brightcove.com: not in our
> membership
> cib[2924]: 2009/07/24_11:05:27 WARN: cib_peer_callback: Discarding
> cib_apply_diff message (71) from haproxy1.b.cel.brightcove.com: not in our
> membership
> cib[2924]: 2009/07/24_11:05:27 WARN: cib_peer_callback: Discarding
> cib_apply_diff message (72) from haproxy1.b.cel.brightcove.com: not in our
> membership
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>
> Here is log from the new haproxy1:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> logd[3213]: 2009/07/24_11:04:17 info: G_main_add_SignalHandler: Added signal
> handler for signal 15
> logd[3208]: 2009/07/24_11:04:17 info: G_main_add_SignalHandler: Added signal
> handler for signal 15
> heartbeat[3229]: 2009/07/24_11:04:17 info: Enabling logging daemon
> heartbeat[3229]: 2009/07/24_11:04:17 info: logfile and debug file are those
> specified in logd config file (default /etc/logd.cf)
> heartbeat[3229]: 2009/07/24_11:04:17 info: Version 2 support: on
> heartbeat[3229]: 2009/07/24_11:04:17 WARN: logd is enabled but
> logfile/debugfile/logfacility is still configured in ha.cf
> heartbeat[3229]: 2009/07/24_11:04:17 info: **************************
> heartbeat[3229]: 2009/07/24_11:04:17 info: Configuration validated. Starting
> heartbeat 2.1.3
> heartbeat[3230]: 2009/07/24_11:04:17 info: heartbeat: version 2.1.3
> heartbeat[3230]: 2009/07/24_11:04:17 WARN: No Previous generation - starting
> at 1248447858
> heartbeat[3230]: 2009/07/24_11:04:17 info: Heartbeat generation: 1248447858
> heartbeat[3230]: 2009/07/24_11:04:17 info: Creating FIFO
> /var/lib/heartbeat/fifo.
> heartbeat[3230]: 2009/07/24_11:04:17 info: glib: ucast: write socket
> priority set to IPTOS_LOWDELAY on eth0
> heartbeat[3230]: 2009/07/24_11:04:17 info: glib: ucast: bound send socket to
> device: eth0
> heartbeat[3230]: 2009/07/24_11:04:17 info: glib: ucast: bound receive socket
> to device: eth0
> heartbeat[3230]: 2009/07/24_11:04:17 info: glib: ucast: started on port 694
> interface eth0 to 10.254.166.36
> heartbeat[3230]: 2009/07/24_11:04:17 info: glib: ucast: write socket
> priority set to IPTOS_LOWDELAY on eth0
> heartbeat[3230]: 2009/07/24_11:04:17 info: glib: ucast: bound send socket to
> device: eth0
> heartbeat[3230]: 2009/07/24_11:04:17 info: glib: ucast: bound receive socket
> to device: eth0
> heartbeat[3230]: 2009/07/24_11:04:17 info: glib: ucast: started on port 694
> interface eth0 to 10.255.122.48
> heartbeat[3230]: 2009/07/24_11:04:17 info: G_main_add_TriggerHandler: Added
> signal manual handler
> heartbeat[3230]: 2009/07/24_11:04:17 info: G_main_add_TriggerHandler: Added
> signal manual handler
> heartbeat[3230]: 2009/07/24_11:04:17 info: G_main_add_SignalHandler: Added
> signal handler for signal 17
> heartbeat[3230]: 2009/07/24_11:04:17 info: Local status now set to: 'up'
> heartbeat[3230]: 2009/07/24_11:04:47 WARN: node
> haproxy2.b.cel.brightcove.com: is dead
> heartbeat[3230]: 2009/07/24_11:04:47 info: Comm_now_up(): updating status to
> active
> heartbeat[3230]: 2009/07/24_11:04:47 info: Local status now set to: 'active'
> heartbeat[3230]: 2009/07/24_11:04:47 info: Starting child client
> "/usr/lib/heartbeat/ccm" (498,496)
> heartbeat[3230]: 2009/07/24_11:04:47 info: Starting child client
> "/usr/lib/heartbeat/cib" (498,496)
> heartbeat[3230]: 2009/07/24_11:04:47 info: Starting child client
> "/usr/lib/heartbeat/lrmd -r" (0,0)
> heartbeat[3230]: 2009/07/24_11:04:47 info: Starting child client
> "/usr/lib/heartbeat/stonithd" (0,0)
> heartbeat[3230]: 2009/07/24_11:04:47 info: Starting child client
> "/usr/lib/heartbeat/attrd" (498,496)
> heartbeat[3230]: 2009/07/24_11:04:47 info: Starting child client
> "/usr/lib/heartbeat/crmd" (498,496)
> heartbeat[3230]: 2009/07/24_11:04:47 info: Starting child client
> "/usr/lib/heartbeat/mgmtd -v" (0,0)
> heartbeat[3265]: 2009/07/24_11:04:47 info: Starting "/usr/lib/heartbeat/ccm"
> as uid 498  gid 496 (pid 3265)
> heartbeat[3266]: 2009/07/24_11:04:47 info: Starting "/usr/lib/heartbeat/cib"
> as uid 498  gid 496 (pid 3266)
> heartbeat[3267]: 2009/07/24_11:04:47 info: Starting "/usr/lib/heartbeat/lrmd
> -r" as uid 0  gid 0 (pid 3267)
> heartbeat[3268]: 2009/07/24_11:04:47 info: Starting
> "/usr/lib/heartbeat/stonithd" as uid 0  gid 0 (pid 3268)
> heartbeat[3269]: 2009/07/24_11:04:47 info: Starting
> "/usr/lib/heartbeat/attrd" as uid 498  gid 496 (pid 3269)
> heartbeat[3270]: 2009/07/24_11:04:47 info: Starting
> "/usr/lib/heartbeat/crmd" as uid 498  gid 496 (pid 3270)
> heartbeat[3271]: 2009/07/24_11:04:47 info: Starting
> "/usr/lib/heartbeat/mgmtd -v" as uid 0  gid 0 (pid 3271)
> lrmd[3267]: 2009/07/24_11:04:47 info: G_main_add_SignalHandler: Added signal
> handler for signal 15
> stonithd[3268]: 2009/07/24_11:04:47 info: G_main_add_SignalHandler: Added
> signal handler for signal 10
> stonithd[3268]: 2009/07/24_11:04:47 info: G_main_add_SignalHandler: Added
> signal handler for signal 12
> ccm[3265]: 2009/07/24_11:04:47 info: Hostname: haproxy1.b.cel.brightcove.com
> stonithd[3268]: 2009/07/24_11:04:47 info: Signing in with heartbeat.
> attrd[3269]: 2009/07/24_11:04:47 info: G_main_add_SignalHandler: Added
> signal handler for signal 15
> attrd[3269]: 2009/07/24_11:04:47 info: register_with_ha: Hostname:
> haproxy1.b.cel.brightcove.com
> stonithd[3268]: 2009/07/24_11:04:47 notice: /usr/lib/heartbeat/stonithd
> start up successfully.
> stonithd[3268]: 2009/07/24_11:04:47 info: G_main_add_SignalHandler: Added
> signal handler for signal 17
> crmd[3270]: 2009/07/24_11:04:47 info: main: CRM Hg Version: node:
> 552305612591183b1628baa5bc6e903e0f1e26a3
>
> crmd[3270]: 2009/07/24_11:04:47 info: crmd_init: Starting crmd
> crmd[3270]: 2009/07/24_11:04:47 info: G_main_add_SignalHandler: Added signal
> handler for signal 15
> crmd[3270]: 2009/07/24_11:04:47 info: G_main_add_TriggerHandler: Added
> signal manual handler
> crmd[3270]: 2009/07/24_11:04:47 info: G_main_add_SignalHandler: Added signal
> handler for signal 17
> lrmd[3267]: 2009/07/24_11:04:47 info: G_main_add_SignalHandler: Added signal
> handler for signal 17
> lrmd[3267]: 2009/07/24_11:04:47 info: G_main_add_SignalHandler: Added signal
> handler for signal 10
> lrmd[3267]: 2009/07/24_11:04:47 info: G_main_add_SignalHandler: Added signal
> handler for signal 12
> lrmd[3267]: 2009/07/24_11:04:47 info: Started.
> mgmtd[3271]: 2009/07/24_11:04:47 info: G_main_add_SignalHandler: Added
> signal handler for signal 15
> mgmtd[3271]: 2009/07/24_11:04:47 info: G_main_add_SignalHandler: Added
> signal handler for signal 10
> mgmtd[3271]: 2009/07/24_11:04:47 info: G_main_add_SignalHandler: Added
> signal handler for signal 12
> attrd[3269]: 2009/07/24_11:04:47 info: register_with_ha: UUID:
> 7285647b-01fe-48b3-adc2-b12250b98c96
> mgmtd[3271]: 2009/07/24_11:04:47 info: init_crm
> mgmtd[3271]: 2009/07/24_11:04:47 info: login to cib: 0, ret:-10
> cib[3266]: 2009/07/24_11:04:47 info: G_main_add_SignalHandler: Added signal
> handler for signal 15
> cib[3266]: 2009/07/24_11:04:47 info: G_main_add_TriggerHandler: Added signal
> manual handler
> cib[3266]: 2009/07/24_11:04:47 info: G_main_add_SignalHandler: Added signal
> handler for signal 17
> cib[3266]: 2009/07/24_11:04:47 info: main: Retrieval of a per-action CIB:
> disabled
> cib[3266]: 2009/07/24_11:04:47 info: retrieveCib: Reading cluster
> configuration from: /var/lib/heartbeat/crm/cib.xml (digest:
> /var/lib/heartbeat/crm/cib.xml.sig)
> cib[3266]: 2009/07/24_11:04:47 WARN: retrieveCib: Cluster configuration not
> found: /var/lib/heartbeat/crm/cib.xml
> cib[3266]: 2009/07/24_11:04:47 WARN: readCibXmlFile: Primary configuration
> corrupt or unusable, trying backup...
> cib[3266]: 2009/07/24_11:04:47 info: retrieveCib: Reading cluster
> configuration from: /var/lib/heartbeat/crm/cib.xml.last (digest:
> /var/lib/heartbeat/crm/cib.xml.sig.last)
> cib[3266]: 2009/07/24_11:04:47 WARN: retrieveCib: Cluster configuration not
> found: /var/lib/heartbeat/crm/cib.xml.last
> cib[3266]: 2009/07/24_11:04:47 WARN: readCibXmlFile: Continuing with an
> empty configuration.
> cib[3266]: 2009/07/24_11:04:47 WARN: readCibXmlFile: No value for
> admin_epoch was specified in the configuration.
> cib[3266]: 2009/07/24_11:04:47 WARN: readCibXmlFile: The reccomended course
> of action is to shutdown, run crm_verify and fix any errors it reports.
> cib[3266]: 2009/07/24_11:04:47 WARN: readCibXmlFile: We will default to zero
> and continue but may get confused about which configuration to use if
> multiple nodes are powered up at the same time.
> cib[3266]: 2009/07/24_11:04:47 info: log_data_element: readCibXmlFile:
> [on-disk] <cib generated="true" admin_epoch="0" epoch="0" num_updates="0"
> have_quorum="false">
> cib[3266]: 2009/07/24_11:04:47 info: log_data_element: readCibXmlFile:
> [on-disk]   <configuration>
> cib[3266]: 2009/07/24_11:04:47 info: log_data_element: readCibXmlFile:
> [on-disk]     <crm_config/>
> cib[3266]: 2009/07/24_11:04:47 info: log_data_element: readCibXmlFile:
> [on-disk]     <nodes/>
> cib[3266]: 2009/07/24_11:04:47 info: log_data_element: readCibXmlFile:
> [on-disk]     <resources/>
> cib[3266]: 2009/07/24_11:04:47 info: log_data_element: readCibXmlFile:
> [on-disk]     <constraints/>
> cib[3266]: 2009/07/24_11:04:47 info: log_data_element: readCibXmlFile:
> [on-disk]   </configuration>
> cib[3266]: 2009/07/24_11:04:47 info: log_data_element: readCibXmlFile:
> [on-disk]   <status/>
> cib[3266]: 2009/07/24_11:04:47 info: log_data_element: readCibXmlFile:
> [on-disk] </cib>
> cib[3266]: 2009/07/24_11:04:47 notice: readCibXmlFile: Enabling DTD
> validation on the existing (sane) configuration
> cib[3266]: 2009/07/24_11:04:47 info: startCib: CIB Initialization completed
> successfully
> cib[3266]: 2009/07/24_11:04:47 info: cib_register_ha: Signing in with
> Heartbeat
> cib[3266]: 2009/07/24_11:04:47 info: cib_register_ha: FSA Hostname:
> haproxy1.b.cel.brightcove.com
> cib[3266]: 2009/07/24_11:04:47 info: ccm_connect: Registering with CCM...
> cib[3266]: 2009/07/24_11:04:47 WARN: ccm_connect: CCM Activation failed
> cib[3266]: 2009/07/24_11:04:47 WARN: ccm_connect: CCM Connection failed 1
> times (30 max)
> ccm[3265]: 2009/07/24_11:04:50 info: Break tie for 2 nodes cluster
> ccm[3265]: 2009/07/24_11:04:50 info: G_main_add_SignalHandler: Added signal
> handler for signal 15
> cib[3266]: 2009/07/24_11:04:50 info: ccm_connect: Registering with CCM...
> cib[3266]: 2009/07/24_11:04:51 info: cib_init: Starting cib mainloop
> cib[3266]: 2009/07/24_11:04:51 info: mem_handle_event: Got an event
> OC_EV_MS_NEW_MEMBERSHIP from ccm
> cib[3266]: 2009/07/24_11:04:51 info: mem_handle_event: instance=1, nodes=1,
> new=1, lost=0, n_idx=0, new_idx=0, old_idx=3
> cib[3266]: 2009/07/24_11:04:51 info: cib_ccm_msg_callback: PEER:
> haproxy1.b.cel.brightcove.com
> crmd[3270]: 2009/07/24_11:04:51 info: do_cib_control: CIB connection
> established
> crmd[3270]: 2009/07/24_11:04:51 info: register_with_ha: Hostname:
> haproxy1.b.cel.brightcove.com
> cib[3266]: 2009/07/24_11:04:51 info: cib_client_status_callback: Status
> update: Client haproxy1.b.cel.brightcove.com/cib now has status [join]
> cib[3266]: 2009/07/24_11:04:51 info: cib_client_status_callback: Status
> update: Client haproxy1.b.cel.brightcove.com/cib now has status [online]
> cib[3266]: 2009/07/24_11:04:51 info: cib_null_callback: Setting
> cib_refresh_notify callbacks for crmd: on
> cib[3266]: 2009/07/24_11:04:51 info: cib_null_callback: Setting
> cib_diff_notify callbacks for mgmtd: on
> crmd[3270]: 2009/07/24_11:04:51 info: register_with_ha: UUID:
> 7285647b-01fe-48b3-adc2-b12250b98c96
> cib[3272]: 2009/07/24_11:04:52 info: write_cib_contents: Wrote version 0.0.0
> of the CIB to disk (digest: 61470affa950eb34c520f6117d4966a5)
> crmd[3270]: 2009/07/24_11:04:52 info: populate_cib_nodes: Requesting the
> list of configured nodes
> cib[3272]: 2009/07/24_11:04:52 info: retrieveCib: Reading cluster
> configuration from: /var/lib/heartbeat/crm/cib.xml (digest:
> /var/lib/heartbeat/crm/cib.xml.sig)
> mgmtd[3271]: 2009/07/24_11:04:52 info: Started.
> crmd[3270]: 2009/07/24_11:04:53 WARN: get_uuid: Could not calculate UUID for
> haproxy2.b.cel.brightcove.com
> crmd[3270]: 2009/07/24_11:04:53 WARN: populate_cib_nodes: Node
> haproxy2.b.cel.brightcove.com: no uuid found
> crmd[3270]: 2009/07/24_11:04:53 notice: populate_cib_nodes: Node:
> haproxy1.b.cel.brightcove.com (uuid: 7285647b-01fe-48b3-adc2-b12250b98c96)
> crmd[3270]: 2009/07/24_11:04:53 info: do_ha_control: Connected to Heartbeat
> crmd[3270]: 2009/07/24_11:04:53 info: do_ccm_control: CCM connection
> established... waiting for first callback
> crmd[3270]: 2009/07/24_11:04:53 info: do_started: Delaying start, CCM
> (0000000000100000) not connected
> crmd[3270]: 2009/07/24_11:04:53 info: crmd_init: Starting crmd's mainloop
> crmd[3270]: 2009/07/24_11:04:53 notice: crmd_client_status_callback: Status
> update: Client haproxy1.b.cel.brightcove.com/crmd now has status [online]
> cib[3273]: 2009/07/24_11:04:53 info: retrieveCib: Reading cluster
> configuration from: /var/lib/heartbeat/crm/cib.xml (digest:
> /var/lib/heartbeat/crm/cib.xml.sig)
> cib[3273]: 2009/07/24_11:04:53 info: retrieveCib: Reading cluster
> configuration from: /var/lib/heartbeat/crm/cib.xml (digest:
> /var/lib/heartbeat/crm/cib.xml.sig)
> cib[3273]: 2009/07/24_11:04:53 info: retrieveCib: Reading cluster
> configuration from: /var/lib/heartbeat/crm/cib.xml.last (digest:
> /var/lib/heartbeat/crm/cib.xml.sig.last)
> cib[3273]: 2009/07/24_11:04:53 info: write_cib_contents: Wrote version 0.0.0
> of the CIB to disk (digest: 9b863505d9ee412f3320c664be001f0a)
> cib[3273]: 2009/07/24_11:04:53 info: retrieveCib: Reading cluster
> configuration from: /var/lib/heartbeat/crm/cib.xml (digest:
> /var/lib/heartbeat/crm/cib.xml.sig)
> cib[3273]: 2009/07/24_11:04:53 info: retrieveCib: Reading cluster
> configuration from: /var/lib/heartbeat/crm/cib.xml.last (digest:
> /var/lib/heartbeat/crm/cib.xml.sig.last)
> crmd[3270]: 2009/07/24_11:04:54 notice: crmd_client_status_callback: Status
> update: Client haproxy1.b.cel.brightcove.com/crmd now has status [online]
> crmd[3270]: 2009/07/24_11:04:54 info: do_started: Delaying start, CCM
> (0000000000100000) not connected
> crmd[3270]: 2009/07/24_11:04:54 info: mem_handle_event: Got an event
> OC_EV_MS_NEW_MEMBERSHIP from ccm
> crmd[3270]: 2009/07/24_11:04:54 info: mem_handle_event: instance=1, nodes=1,
> new=1, lost=0, n_idx=0, new_idx=0, old_idx=3
> crmd[3270]: 2009/07/24_11:04:54 info: crmd_ccm_msg_callback: Quorum
> (re)attained after event=NEW MEMBERSHIP (id=1)
> crmd[3270]: 2009/07/24_11:04:54 info: ccm_event_detail: NEW MEMBERSHIP:
> trans=1, nodes=1, new=1, lost=0 n_idx=0, new_idx=0, old_idx=3
> crmd[3270]: 2009/07/24_11:04:54 info: ccm_event_detail:         CURRENT:
> haproxy1.b.cel.brightcove.com [nodeid=0, born=1]
> crmd[3270]: 2009/07/24_11:04:54 info: ccm_event_detail:         NEW:
> haproxy1.b.cel.brightcove.com [nodeid=0, born=1]
> crmd[3270]: 2009/07/24_11:04:54 info: do_started: The local CRM is
> operational
> crmd[3270]: 2009/07/24_11:04:54 info: do_state_transition: State transition
> S_STARTING -> S_PENDING [ input=I_PENDING cause=C_CCM_CALLBACK
> origin=do_started ]
> attrd[3269]: 2009/07/24_11:04:57 info: main: Starting mainloop...
> crmd[3270]: 2009/07/24_11:05:25 info: crm_timer_popped: Election Trigger
> (I_DC_TIMEOUT) just popped!
> crmd[3270]: 2009/07/24_11:05:25 WARN: do_log: [[FSA]] Input I_DC_TIMEOUT
> from crm_timer_popped() received in state (S_PENDING)
> crmd[3270]: 2009/07/24_11:05:25 info: do_state_transition: State transition
> S_PENDING -> S_ELECTION [ input=I_DC_TIMEOUT cause=C_TIMER_POPPED
> origin=crm_timer_popped ]
> crmd[3270]: 2009/07/24_11:05:25 info: do_election_count_vote: Updated voted
> hash for haproxy1.b.cel.brightcove.com to vote
> crmd[3270]: 2009/07/24_11:05:25 info: do_election_count_vote: Election
> ignore: our vote (haproxy1.b.cel.brightcove.com)
> crmd[3270]: 2009/07/24_11:05:25 info: do_state_transition: State transition
> S_ELECTION -> S_INTEGRATION [ input=I_ELECTION_DC cause=C_FSA_INTERNAL
> origin=do_election_check ]
> crmd[3270]: 2009/07/24_11:05:25 info: start_subsystem: Starting sub-system
> "tengine"
> crmd[3270]: 2009/07/24_11:05:25 info: start_subsystem: Starting sub-system
> "pengine"
> cib[3266]: 2009/07/24_11:05:25 info: cib_process_readwrite: We are now in
> R/W mode
> crmd[3270]: 2009/07/24_11:05:25 info: do_dc_takeover: Taking over DC status
> for this partition
> cib[3266]: 2009/07/24_11:05:25 info: revision_check: Updating CIB revision
> to 2.0
> cib[3266]: 2009/07/24_11:05:25 info: log_data_element: cib:diff: - <cib
> epoch="0"/>
> cib[3266]: 2009/07/24_11:05:25 info: log_data_element: cib:diff: + <cib
> epoch="1">
> cib[3266]: 2009/07/24_11:05:25 info: log_data_element: cib:diff: +
> <configuration>
> cib[3266]: 2009/07/24_11:05:25 info: log_data_element: cib:diff: +
> <crm_config>
> cib[3266]: 2009/07/24_11:05:25 info: log_data_element: cib:diff: +
> <cluster_property_set id="cib-bootstrap-options">
> cib[3266]: 2009/07/24_11:05:25 info: log_data_element: cib:diff: +
> <attributes>
> cib[3266]: 2009/07/24_11:05:25 info: log_data_element: cib:diff: +
> <nvpair id="cib-bootstrap-options-dc-version" name="dc-version"
> value="2.1.3-node: 552305612591183b1628baa5bc6e903e0f1e26a3"/>
> cib[3266]: 2009/07/24_11:05:25 info: log_data_element: cib:diff: +
> </attributes>
> cib[3266]: 2009/07/24_11:05:25 info: log_data_element: cib:diff: +
> </cluster_property_set>
> cib[3266]: 2009/07/24_11:05:25 info: log_data_element: cib:diff: +
> </crm_config>
> cib[3266]: 2009/07/24_11:05:25 info: log_data_element: cib:diff: +
> </configuration>
> cib[3266]: 2009/07/24_11:05:25 info: log_data_element: cib:diff: + </cib>
> cib[3310]: 2009/07/24_11:05:25 info: retrieveCib: Reading cluster
> configuration from: /var/lib/heartbeat/crm/cib.xml (digest:
> /var/lib/heartbeat/crm/cib.xml.sig)
> cib[3310]: 2009/07/24_11:05:25 info: retrieveCib: Reading cluster
> configuration from: /var/lib/heartbeat/crm/cib.xml (digest:
> /var/lib/heartbeat/crm/cib.xml.sig)
> cib[3310]: 2009/07/24_11:05:25 info: retrieveCib: Reading cluster
> configuration from: /var/lib/heartbeat/crm/cib.xml.last (digest:
> /var/lib/heartbeat/crm/cib.xml.sig.last)
> crmd[3270]: 2009/07/24_11:05:25 info: join_make_offer: Making join offers
> based on membership 1
> crmd[3270]: 2009/07/24_11:05:25 info: do_dc_join_offer_all: join-1: Waiting
> on 1 outstanding join acks
> tengine[3308]: 2009/07/24_11:05:25 info: G_main_add_SignalHandler: Added
> signal handler for signal 15
> tengine[3308]: 2009/07/24_11:05:25 info: G_main_add_TriggerHandler: Added
> signal manual handler
> tengine[3308]: 2009/07/24_11:05:25 info: G_main_add_TriggerHandler: Added
> signal manual handler
> cib[3266]: 2009/07/24_11:05:25 info: cib_null_callback: Setting
> cib_diff_notify callbacks for tengine: on
> tengine[3308]: 2009/07/24_11:05:25 info: te_init: Registering TE UUID:
> 649f4c3c-a203-488e-bc62-91a34106a8cb
> tengine[3308]: 2009/07/24_11:05:25 info: set_graph_functions: Setting custom
> graph functions
> tengine[3308]: 2009/07/24_11:05:25 info: unpack_graph: Unpacked transition
> -1: 0 actions in 0 synapses
> tengine[3308]: 2009/07/24_11:05:25 info: te_init: Starting tengine
> tengine[3308]: 2009/07/24_11:05:25 info: te_connect_stonith: Attempting
> connection to fencing daemon...
> pengine[3309]: 2009/07/24_11:05:25 info: G_main_add_SignalHandler: Added
> signal handler for signal 15
> pengine[3309]: 2009/07/24_11:05:25 info: pe_init: Starting pengine
> cib[3310]: 2009/07/24_11:05:25 info: write_cib_contents: Wrote version 0.1.1
> of the CIB to disk (digest: e0890b65fda4792e35d6a3c8ba7b6da2)
> cib[3310]: 2009/07/24_11:05:25 info: retrieveCib: Reading cluster
> configuration from: /var/lib/heartbeat/crm/cib.xml (digest:
> /var/lib/heartbeat/crm/cib.xml.sig)
> cib[3310]: 2009/07/24_11:05:25 info: retrieveCib: Reading cluster
> configuration from: /var/lib/heartbeat/crm/cib.xml.last (digest:
> /var/lib/heartbeat/crm/cib.xml.sig.last)
> crmd[3270]: 2009/07/24_11:05:26 info: update_dc: Set DC to
> haproxy1.b.cel.brightcove.com (2.0)
> crmd[3270]: 2009/07/24_11:05:26 info: do_state_transition: State transition
> S_INTEGRATION -> S_FINALIZE_JOIN [ input=I_INTEGRATED cause=C_FSA_INTERNAL
> origin=check_join_state ]
> crmd[3270]: 2009/07/24_11:05:26 info: do_state_transition: All 1 cluster
> nodes responded to the join offer.
> crmd[3270]: 2009/07/24_11:05:26 info: update_attrd: Connecting to attrd...
> attrd[3269]: 2009/07/24_11:05:26 info: attrd_local_callback: Sending full
> refresh
> cib[3266]: 2009/07/24_11:05:26 info: sync_our_cib: Syncing CIB to all peers
> crmd[3270]: 2009/07/24_11:05:26 info: update_dc: Set DC to
> haproxy1.b.cel.brightcove.com (2.0)
> tengine[3308]: 2009/07/24_11:05:26 info: te_connect_stonith: Connected
> crmd[3270]: 2009/07/24_11:05:27 info: do_dc_join_ack: join-1: Updating node
> state to member for haproxy1.b.cel.brightcove.com
> crmd[3270]: 2009/07/24_11:05:27 info: do_state_transition: State transition
> S_FINALIZE_JOIN -> S_POLICY_ENGINE [ input=I_FINALIZED cause=C_FSA_INTERNAL
> origin=check_join_state ]
> crmd[3270]: 2009/07/24_11:05:27 info: do_state_transition: All 1 cluster
> nodes are eligible to run resources.
> tengine[3308]: 2009/07/24_11:05:27 info: update_abort_priority: Abort
> priority upgraded to 1000000
> tengine[3308]: 2009/07/24_11:05:27 info: update_abort_priority: 'DC
> Takeover' abort superceeded
> pengine[3309]: 2009/07/24_11:05:27 info: determine_online_status: Node
> haproxy1.b.cel.brightcove.com is online
> crmd[3270]: 2009/07/24_11:05:27 info: do_state_transition: State transition
> S_POLICY_ENGINE -> S_TRANSITION_ENGINE [ input=I_PE_SUCCESS
> cause=C_IPC_MESSAGE origin=route_message ]
> tengine[3308]: 2009/07/24_11:05:27 info: unpack_graph: Unpacked transition
> 0: 1 actions in 1 synapses
> tengine[3308]: 2009/07/24_11:05:27 info: send_rsc_command: Initiating action
> 2: probe_complete on haproxy1.b.cel.brightcove.com
> tengine[3308]: 2009/07/24_11:05:27 info: run_graph: Transition 0:
> (Complete=1, Pending=0, Fired=0, Skipped=0, Incomplete=0)
> tengine[3308]: 2009/07/24_11:05:27 info: notify_crmd: Transition 0 status:
> te_complete - <null>
> crmd[3270]: 2009/07/24_11:05:27 info: do_state_transition: State transition
> S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS cause=C_IPC_MESSAGE
> origin=route_message ]
> tengine[3308]: 2009/07/24_11:05:27 info: extract_event: Aborting on
> transient_attributes changes for 7285647b-01fe-48b3-adc2-b12250b98c96
> tengine[3308]: 2009/07/24_11:05:27 info: update_abort_priority: Abort
> priority upgraded to 1000000
> crmd[3270]: 2009/07/24_11:05:27 info: do_state_transition: State transition
> S_IDLE -> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_IPC_MESSAGE
> origin=route_message ]
> crmd[3270]: 2009/07/24_11:05:27 info: do_state_transition: All 1 cluster
> nodes are eligible to run resources.
> pengine[3309]: 2009/07/24_11:05:27 info: process_pe_message: Transition 0:
> PEngine Input stored in: /var/lib/heartbeat/pengine/pe-input-0.bz2
> crmd[3270]: 2009/07/24_11:05:27 info: do_state_transition: State transition
> S_POLICY_ENGINE -> S_TRANSITION_ENGINE [ input=I_PE_SUCCESS
> cause=C_IPC_MESSAGE origin=route_message ]
> tengine[3308]: 2009/07/24_11:05:27 info: unpack_graph: Unpacked transition
> 1: 0 actions in 0 synapses
> pengine[3309]: 2009/07/24_11:05:27 info: determine_online_status: Node
> haproxy1.b.cel.brightcove.com is online
> tengine[3308]: 2009/07/24_11:05:27 info: run_graph: Transition 1:
> (Complete=0, Pending=0, Fired=0, Skipped=0, Incomplete=0)
> tengine[3308]: 2009/07/24_11:05:27 info: notify_crmd: Transition 1 status:
> te_complete - <null>
> crmd[3270]: 2009/07/24_11:05:27 info: do_state_transition: State transition
> S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS cause=C_IPC_MESSAGE
> origin=route_message ]
> pengine[3309]: 2009/07/24_11:05:27 info: process_pe_message: Transition 1:
> PEngine Input stored in: /var/lib/heartbeat/pengine/pe-input-1.bz2
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>
>
> On Fri, Jul 24, 2009 at 10:37 PM, Andrew Beekhof <[email protected]> wrote:
>
>> On Fri, Jul 24, 2009 at 4:34 PM, Jiayin Mao<[email protected]> wrote:
>> > Do you mean "autojoin any"?
>>
>> right
>>
>> > I had it there, but heartbeat still complained that
>> > the new node was not in its membership.
>>
>> logs?
>> _______________________________________________
>> Linux-HA mailing list
>> [email protected]
>> http://lists.linux-ha.org/mailman/listinfo/linux-ha
>> See also: http://linux-ha.org/ReportingProblems
>>
>
>
>
> --
> Max Mao (Mao Jia Yin)
> Abao Scrum Team, Engineering Department
> -----------------------------------------------------------
> I am located at the Beijing office, and usually work between 9:00 PM and
> 6:00 AM ET.
