Hi,

I was running the old RPMs from the openSUSE repo and wanted to switch to the latest packages from the clusterlabs repo on my RHEL 5.5 machines.

Steps I took
1. Disabled the old repo
2. Set the nodes to standby (two-node DRBD cluster) and turned off openais
3. Enabled the new repo.
4. Performed an update with yum -y update which replaced all packages.
5. The configuration file for ais was renamed openais.conf.rpmsave
6. I ran corosync-keygen and copied the key to the second machine
7. I copied the file openais.conf.rpmsave to /etc/corosync/corosync.conf and modified it by removing the service section and moving that to /etc/corosync/service.d/pcmk
8. I copied the configurations to the other machine.
9. When I try to start either openais or corosync with the init scripts, it fails, and nothing in the logs points me to an error.
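
The split described in step 7 can be sketched as below. This is a generic illustration, not the file from these machines: the stanza contents (name: pacemaker, ver: 0) are the usual ones for Pacemaker on corosync 1.x and are an assumption here, as is the CONF_DIR variable, which defaults to a scratch directory so the sketch can be tried without touching /etc/corosync.

```shell
#!/bin/sh
# Sketch of step 7: keep the Pacemaker service stanza in its own file.
# CONF_DIR is a hypothetical variable; on a real node it would be /etc/corosync.
CONF_DIR="${CONF_DIR:-$(mktemp -d)}"
mkdir -p "$CONF_DIR/service.d"

# Assumed stanza contents; adjust to match the section removed from the old openais.conf.
cat > "$CONF_DIR/service.d/pcmk" <<'EOF'
service {
        # Load the Pacemaker Cluster Resource Manager
        name: pacemaker
        ver:  0
}
EOF

# Warn if a service section is still present in corosync.conf itself,
# since the stanza should then exist in only one of the two places.
if [ -f "$CONF_DIR/corosync.conf" ] && \
   grep -q '^[[:space:]]*service[[:space:]]*{' "$CONF_DIR/corosync.conf"; then
    echo "service section still present in $CONF_DIR/corosync.conf"
fi
```

Running it with CONF_DIR=/etc/corosync on both nodes would leave the same pcmk file in place on each, which is what step 8 does by copying.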

Updated packages:
May 26 14:29:32 Updated: cluster-glue-libs-1.0.5-1.el5.x86_64
May 26 14:29:32 Updated: resource-agents-1.0.3-2.el5.x86_64
May 26 14:29:34 Updated: cluster-glue-1.0.5-1.el5.x86_64
May 26 14:29:34 Installed: libibverbs-1.1.3-2.el5.x86_64
May 26 14:29:34 Installed: corosync-1.2.2-1.1.el5.x86_64
May 26 14:29:34 Installed: librdmacm-1.0.10-1.el5.x86_64
May 26 14:29:34 Installed: corosynclib-1.2.2-1.1.el5.x86_64
May 26 14:29:34 Installed: openaislib-1.1.0-2.el5.x86_64
May 26 14:29:34 Updated: openais-1.1.0-2.el5.x86_64
May 26 14:29:34 Installed: libnes-0.9.0-2.el5.x86_64
May 26 14:29:35 Installed: heartbeat-libs-3.0.3-2.el5.x86_64
May 26 14:29:35 Updated: pacemaker-libs-1.0.8-6.1.el5.x86_64
May 26 14:29:36 Updated: heartbeat-3.0.3-2.el5.x86_64
May 26 14:29:36 Updated: pacemaker-1.0.8-6.1.el5.x86_64

Apparently corosync is segfaulting when run from the command line:

# /usr/sbin/corosync -f
Segmentation fault

Any help would be greatly appreciated.

Diego
May 27 08:36:22 phys-ha01 corosync[32243]:   [MAIN  ] Corosync Cluster Engine ('1.2.2'): started and ready to provide service.
May 27 08:36:22 phys-ha01 corosync[32243]:   [MAIN  ] Corosync built-in features: nss rdma
May 27 08:36:22 phys-ha01 corosync[32243]:   [MAIN  ] Successfully read main configuration file '/etc/corosync/corosync.conf'.
May 27 08:36:22 phys-ha01 corosync[32243]:   [TOTEM ] Token Timeout (3000 ms) retransmit timeout (294 ms)
May 27 08:36:22 phys-ha01 corosync[32243]:   [TOTEM ] token hold (225 ms) retransmits before loss (10 retrans)
May 27 08:36:22 phys-ha01 corosync[32243]:   [TOTEM ] join (1000 ms) send_join (0 ms) consensus (7500 ms) merge (200 ms)
May 27 08:36:22 phys-ha01 corosync[32243]:   [TOTEM ] downcheck (1000 ms) fail to recv const (50 msgs)
May 27 08:36:22 phys-ha01 corosync[32243]:   [TOTEM ] seqno unchanged const (30 rotations) Maximum network MTU 1402
May 27 08:36:22 phys-ha01 corosync[32243]:   [TOTEM ] window size per rotation (50 messages) maximum messages per rotation (20 messages)
May 27 08:36:22 phys-ha01 corosync[32243]:   [TOTEM ] send threads (0 threads)
May 27 08:36:22 phys-ha01 corosync[32243]:   [TOTEM ] RRP token expired timeout (294 ms)
May 27 08:36:22 phys-ha01 corosync[32243]:   [TOTEM ] RRP token problem counter (2000 ms)
May 27 08:36:22 phys-ha01 corosync[32243]:   [TOTEM ] RRP threshold (10 problem count)
May 27 08:36:22 phys-ha01 corosync[32243]:   [TOTEM ] RRP mode set to none.
May 27 08:36:22 phys-ha01 corosync[32243]:   [TOTEM ] heartbeat_failures_allowed (0)
May 27 08:36:22 phys-ha01 corosync[32243]:   [TOTEM ] max_network_delay (50 ms)
May 27 08:36:22 phys-ha01 corosync[32243]:   [TOTEM ] HeartBeat is Disabled. To enable set heartbeat_failures_allowed > 0
May 27 08:36:22 phys-ha01 corosync[32243]:   [TOTEM ] Initializing transport (UDP/IP).
May 27 08:36:22 phys-ha01 corosync[32243]:   [TOTEM ] Initializing transmit/receive security: libtomcrypt SOBER128/SHA1HMAC (mode 0).
May 27 08:36:22 phys-ha01 corosync[32243]:   [IPC   ] you are using ipc api v2
May 27 08:36:22 phys-ha01 corosync[32243]:   [TOTEM ] Receive multicast socket recv buffer size (262142 bytes).
May 27 08:36:22 phys-ha01 corosync[32243]:   [TOTEM ] Transmit multicast socket send buffer size (262142 bytes).
May 27 08:36:22 phys-ha01 corosync[32243]:   [TOTEM ] The network interface [10.0.0.16] is now up.
May 27 08:36:22 phys-ha01 corosync[32243]:   [TOTEM ] Created or loaded sequence id 4.10.0.0.16 for this ring.
May 27 08:36:22 phys-ha01 corosync[32243]:   [pcmk  ] info: process_ais_conf: Reading configure
May 27 08:36:22 phys-ha01 corosync[32243]:   [pcmk  ] info: config_find_init: Local handle: 5650605097994944514 for logging
May 27 08:36:22 phys-ha01 corosync[32243]:   [pcmk  ] info: config_find_next: Processing additional logging options...
May 27 08:36:22 phys-ha01 corosync[32243]:   [pcmk  ] info: get_config_opt: Found 'on' for option: debug
May 27 08:36:22 phys-ha01 corosync[32243]:   [pcmk  ] info: get_config_opt: Defaulting to 'off' for option: to_file
May 27 08:36:22 phys-ha01 corosync[32243]:   [pcmk  ] info: get_config_opt: Found 'yes' for option: to_syslog
May 27 08:36:22 phys-ha01 corosync[32243]:   [pcmk  ] info: get_config_opt: Found 'local6' for option: syslog_facility
May 27 08:36:22 phys-ha01 corosync[32243]:   [pcmk  ] info: config_find_init: Local handle: 2730409743423111171 for service
May 27 08:36:22 phys-ha01 corosync[32243]:   [pcmk  ] info: config_find_next: Processing additional service options...
May 27 08:36:22 phys-ha01 corosync[32243]:   [pcmk  ] info: get_config_opt: Defaulting to 'pcmk' for option: clustername
May 27 08:36:22 phys-ha01 corosync[32243]:   [pcmk  ] info: get_config_opt: Defaulting to 'no' for option: use_logd
May 27 08:36:22 phys-ha01 corosync[32243]:   [pcmk  ] info: get_config_opt: Defaulting to 'no' for option: use_mgmtd
May 27 08:36:22 phys-ha01 corosync[32243]:   [pcmk  ] info: pcmk_startup: CRM: Initialized
May 27 08:36:22 phys-ha01 corosync[32243]:   [pcmk  ] Logging: Initialized pcmk_startup
May 27 08:36:22 phys-ha01 corosync[32243]:   [pcmk  ] info: pcmk_startup: Maximum core file size is: 18446744073709551615
aisexec {
        # Run as root - this is necessary to be able to manage resources with Pacemaker
        user:   root
        group:  root
}

totem {
        version: 2
        # How long before declaring a token lost (ms)
        token:          3000
        # How many token retransmits before forming a new configuration
        token_retransmits_before_loss_const: 10
        # How long to wait for join messages in the membership protocol (ms)
        join:           1000
        # How long to wait for consensus to be achieved before starting a new round of membership configuration (ms)
        consensus:      7500
        # Turn off the virtual synchrony filter
        vsftype:        none
        # Number of messages that may be sent by one processor on receipt of the token
        max_messages:   20
        # Limit generated nodeids to 31-bits (positive signed integers)
        clear_node_high_bit: yes
        # Enable encryption
        secauth:        on
        # How many threads to use for encryption/decryption
        threads:        0
        # Optionally assign a fixed node id (integer)
        # nodeid:         1234
        #rrp_mode: passive

        interface {
                ringnumber: 0
                bindnetaddr: 10.0.0.0
                mcastaddr: 226.94.0.1
                mcastport: 5405
        }
}

logging {
        debug: on
        fileline: off
        to_syslog: yes
        to_stderr: no
        syslog_facility: local6
        timestamp: on
        logger {
                ident: CRM
                debug:on
                tags: enter|leave|trace1|trace2|trace3|trace4|trace6
                fileline: off
        }
}

amf {
        mode: disabled
}
_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
