I set up a two-server cluster on CentOS 5.4 x86_64 with pacemaker 1.06 and corosync 1.1.2.

I installed only the x86_64 packages (yum install pacemaker also tries to install the 32-bit ones).

I configured a shared cluster IP (a public IP) and a cluster website.

Everything works fine if I stop corosync on one of the two servers (the services move from one machine to the other without problems), but if I reboot one server, it cannot rejoin the cluster when it comes back up. I also noticed that several corosync threads are running at that point; if I kill all of them and then start corosync again, everything works fine again.
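
The workaround, as a minimal sketch: it assumes the stock killall (psmisc) and service commands on CentOS and has to be run as root on the rebooted node. The function name restart_corosync_clean and the RUN dry-run variable are made up for illustration only.

```shell
#!/bin/sh
# Hypothetical helper illustrating the manual workaround.
# Set RUN=echo to print the commands instead of executing them (dry run).
RUN=${RUN:-}

restart_corosync_clean() {
    # Ask any leftover corosync processes to exit (SIGTERM) ...
    $RUN killall corosync
    # ... give them a few seconds, then force-kill whatever is still there.
    $RUN sleep 5
    $RUN killall -9 corosync
    # Start corosync again through the init script.
    $RUN service corosync start
}
```

Calling "RUN=echo restart_corosync_clean" only prints the four commands; calling "restart_corosync_clean" runs them. This clears the symptom but does not explain why the node fails to join after a reboot.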

I don't know what is happening, and I have not been able to reproduce the same situation on some virtual servers!

Thanks,
Giovanni



The corosync configuration is the following:

##############################################
# Please read the corosync.conf.5 manual page
compatibility: whitetank

aisexec {
        # Run as root - this is necessary to be able to manage resources with Pacemaker
        user:   root
        group:  root
}

service {
        # Load the Pacemaker Cluster Resource Manager
        ver:       0
        name:      pacemaker
        use_mgmtd: yes
        use_logd:  yes
}

totem {
        version: 2

        # How long before declaring a token lost (ms)
        token:          5000

        # How many token retransmits before forming a new configuration
        token_retransmits_before_loss_const: 10

        # How long to wait for join messages in the membership protocol (ms)
        join:           1000

        # How long to wait for consensus to be achieved before starting a new round of membership configuration (ms)
        consensus:      2500

        # Turn off the virtual synchrony filter
        vsftype:        none

        # Number of messages that may be sent by one processor on receipt of the token
        max_messages:   20

        # Stagger sending the node join messages by 1..send_join ms
        send_join: 45

        # Limit generated nodeids to 31-bits (positive signed integers)
        clear_node_high_bit: yes

        # Disable encryption
        secauth:        off

        # How many threads to use for encryption/decryption
        threads:        0

        # Optionally assign a fixed node id (integer)
        # nodeid:         1234

        interface {
                ringnumber: 0

                # The following values need to be set based on your environment
                bindnetaddr: XXX.XXX.XXX.0 # here I put the right IP for my configuration
                mcastaddr: 226.94.1.1
                mcastport: 4000
        }
}

logging {
        fileline: off
        to_stderr: yes
        to_logfile: yes
        to_syslog: yes
        logfile: /tmp/corosync.log
        debug: off
        timestamp: on
        logger_subsys {
                subsys: AMF
                debug: off
        }
}

amf {
        mode: disabled
}

##################################################
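
As I understand corosync.conf(5), bindnetaddr is the network address of the interface, that is, the interface IP ANDed with its netmask, which is why mine ends in .0. A small sketch of that calculation follows; network_addr is a hypothetical helper name, and the addresses in the example are placeholders, not my real ones.

```shell
#!/bin/sh
# network_addr IP NETMASK: print IP AND NETMASK, octet by octet.
# (Hypothetical helper, for illustration only.)
network_addr() {
    # Split both dotted quads into their four octets.
    IFS=. read -r i1 i2 i3 i4 <<EOF
$1
EOF
    IFS=. read -r m1 m2 m3 m4 <<EOF
$2
EOF
    # Bitwise AND of each octet pair gives the network address.
    echo "$((i1 & m1)).$((i2 & m2)).$((i3 & m3)).$((i4 & m4))"
}

# Placeholder interface address and netmask.
network_addr 192.168.5.92 255.255.255.0   # prints 192.168.5.0
```

The printed value is what would go into bindnetaddr for an interface with that address and netmask.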



_______________________________________________
Pacemaker mailing list
Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker
