Nikola, Sorry, I don't have a solution, but I'm curious about your setup. Which version of DLM are you using? Did you have to compile it yourself?
Regards, Mark On Tue, Nov 10, 2009 at 7:28 AM, Nikola Ciprich <extmaill...@linuxbox.cz> wrote: > Hello Andrew et al, > few days ago, I asked about pacemaker + corosync + clvmd etc. With Your > advice, I got this working well. > It was in testing virtual machines, I'm now trying to install similar setup > on raw hardware but for some > reasong attrd and cib seem to be crashing. > > here's snippet from corosync log: > Nov 10 14:12:21 vbox3 corosync[4299]: [MAIN ] Corosync Cluster Engine > ('1.1.2'): started and ready to provide service. > Nov 10 14:12:21 vbox3 corosync[4299]: [MAIN ] Corosync built-in features: > nss rdma > Nov 10 14:12:21 vbox3 corosync[4299]: [MAIN ] Successfully read main > configuration file '/etc/corosync/corosync.conf'. > Nov 10 14:12:21 vbox3 corosync[4299]: [TOTEM ] Initializing transport > (UDP/IP). > Nov 10 14:12:21 vbox3 corosync[4299]: [TOTEM ] Initializing > transmit/receive security: libtomcrypt SOBER128/SHA1HMAC (mode 0). > Nov 10 14:12:21 vbox3 corosync[4299]: [MAIN ] Compatibility mode set to > whitetank. Using V1 and V2 of the synchronization engine. > Nov 10 14:12:21 vbox3 corosync[4299]: [TOTEM ] The network interface > [10.58.0.1] is now up. > Nov 10 14:12:21 vbox3 corosync[4299]: [pcmk ] info: process_ais_conf: > Reading configure > Nov 10 14:13:16 vbox3 corosync[4348]: [MAIN ] Corosync Cluster Engine > ('1.1.2'): started and ready to provide service. > Nov 10 14:13:16 vbox3 corosync[4348]: [MAIN ] Corosync built-in features: > nss rdma > Nov 10 14:13:16 vbox3 corosync[4348]: [MAIN ] Successfully read main > configuration file '/etc/corosync/corosync.conf'. > Nov 10 14:13:16 vbox3 corosync[4348]: [TOTEM ] Initializing transport > (UDP/IP). > Nov 10 14:13:16 vbox3 corosync[4348]: [TOTEM ] Initializing > transmit/receive security: libtomcrypt SOBER128/SHA1HMAC (mode 0). > Nov 10 14:13:16 vbox3 corosync[4348]: [MAIN ] Compatibility mode set to > whitetank. Using V1 and V2 of the synchronization engine. > Nov 10 14:13:16 vbox3 corosync[4348]: [TOTEM ] The network interface > [10.58.0.1] is now up. > Nov 10 14:13:16 vbox3 corosync[4348]: [pcmk ] info: process_ais_conf: > Reading configure > Nov 10 14:13:24 vbox3 corosync[4357]: [MAIN ] Corosync Cluster Engine > ('1.1.2'): started and ready to provide service. > Nov 10 14:13:24 vbox3 corosync[4357]: [MAIN ] Corosync built-in features: > nss rdma > Nov 10 14:13:24 vbox3 corosync[4357]: [MAIN ] Successfully read main > configuration file '/etc/corosync/corosync.conf'. > Nov 10 14:13:24 vbox3 corosync[4357]: [TOTEM ] Initializing transport > (UDP/IP). > Nov 10 14:13:24 vbox3 corosync[4357]: [TOTEM ] Initializing > transmit/receive security: libtomcrypt SOBER128/SHA1HMAC (mode 0). > Nov 10 14:13:24 vbox3 corosync[4357]: [MAIN ] Compatibility mode set to > whitetank. Using V1 and V2 of the synchronization engine. > Nov 10 14:13:24 vbox3 corosync[4357]: [TOTEM ] The network interface > [10.58.0.1] is now up. > Nov 10 14:13:24 vbox3 corosync[4357]: [pcmk ] info: process_ais_conf: > Reading configure > Nov 10 14:13:57 vbox3 corosync[4380]: [MAIN ] Corosync Cluster Engine > ('1.1.2'): started and ready to provide service. > Nov 10 14:13:57 vbox3 corosync[4380]: [MAIN ] Corosync built-in features: > nss rdma > Nov 10 14:13:57 vbox3 corosync[4380]: [MAIN ] Successfully read main > configuration file '/etc/corosync/corosync.conf'. > Nov 10 14:13:57 vbox3 corosync[4380]: [TOTEM ] Initializing transport > (UDP/IP). > Nov 10 14:13:57 vbox3 corosync[4380]: [TOTEM ] Initializing > transmit/receive security: libtomcrypt SOBER128/SHA1HMAC (mode 0). > Nov 10 14:13:57 vbox3 corosync[4380]: [MAIN ] Compatibility mode set to > whitetank. Using V1 and V2 of the synchronization engine. > Nov 10 14:13:58 vbox3 corosync[4380]: [TOTEM ] The network interface > [10.58.0.1] is now up. > Nov 10 14:13:58 vbox3 corosync[4380]: [pcmk ] info: process_ais_conf: > Reading configure > Nov 10 14:13:58 vbox3 corosync[4380]: [pcmk ] info: config_find_init: > Local handle: 9213452461992312833 for logging > Nov 10 14:13:58 vbox3 corosync[4380]: [pcmk ] info: config_find_next: > Processing additional logging options... > Nov 10 14:13:58 vbox3 corosync[4380]: [pcmk ] info: get_config_opt: Found > 'off' for option: debug > Nov 10 14:13:58 vbox3 corosync[4380]: [pcmk ] info: get_config_opt: > Defaulting to 'off' for option: to_file > Nov 10 14:13:58 vbox3 corosync[4380]: [pcmk ] info: get_config_opt: > Defaulting to 'daemon' for option: syslog_facility > Nov 10 14:13:58 vbox3 corosync[4380]: [pcmk ] info: config_find_init: > Local handle: 2013064636357672962 for service > Nov 10 14:13:58 vbox3 corosync[4380]: [pcmk ] info: config_find_next: > Processing additional service options... > Nov 10 14:13:58 vbox3 corosync[4380]: [pcmk ] info: get_config_opt: > Defaulting to 'no' for option: use_logd > Nov 10 14:13:58 vbox3 corosync[4380]: [pcmk ] info: get_config_opt: Found > 'no' for option: use_mgmtd > Nov 10 14:13:58 vbox3 corosync[4380]: [pcmk ] info: pcmk_startup: CRM: > Initialized > Nov 10 14:13:58 vbox3 corosync[4380]: [pcmk ] Logging: Initialized > pcmk_startup > Nov 10 14:13:58 vbox3 corosync[4380]: [pcmk ] info: pcmk_startup: Maximum > core file size is: 18446744073709551615 > Nov 10 14:13:58 vbox3 corosync[4380]: [pcmk ] info: pcmk_startup: Service: > 9 > Nov 10 14:13:58 vbox3 corosync[4380]: [pcmk ] info: pcmk_startup: Local > hostname: vbox3 > Nov 10 14:13:58 vbox3 corosync[4380]: [pcmk ] info: pcmk_update_nodeid: > Local node id: 16792074 > Nov 10 14:13:58 vbox3 corosync[4380]: [pcmk ] info: update_member: > Creating entry for node 16792074 born on 0 > Nov 10 14:13:58 vbox3 corosync[4380]: [pcmk ] info: update_member: > 0x260ee10 Node 16792074 now known as vbox3 (was: (null)) > Nov 10 14:13:58 vbox3 corosync[4380]: [pcmk ] info: update_member: Node > vbox3 now has 1 quorum votes (was 0) > Nov 10 14:13:58 vbox3 corosync[4380]: [pcmk ] info: update_member: Node > 16792074/vbox3 is now: member > Nov 10 14:13:58 vbox3 corosync[4380]: [pcmk ] info: spawn_child: Forked > child 4384 for process stonithd > Nov 10 14:13:58 vbox3 corosync[4380]: [pcmk ] info: spawn_child: Forked > child 4385 for process cib > Nov 10 14:13:58 vbox3 corosync[4380]: [pcmk ] info: spawn_child: Forked > child 4386 for process lrmd > Nov 10 14:13:58 vbox3 corosync[4380]: [pcmk ] info: spawn_child: Forked > child 4387 for process attrd > Nov 10 14:13:58 vbox3 corosync[4380]: [pcmk ] info: spawn_child: Forked > child 4388 for process pengine > Nov 10 14:13:58 vbox3 corosync[4380]: [pcmk ] info: spawn_child: Forked > child 4389 for process crmd > Nov 10 14:13:58 vbox3 corosync[4380]: [SERV ] Service engine loaded: > Pacemaker Cluster Manager 1.0.6 > Nov 10 14:13:58 vbox3 corosync[4380]: [SERV ] Service engine loaded: > corosync extended virtual synchrony service > Nov 10 14:13:58 vbox3 corosync[4380]: [SERV ] Service engine loaded: > corosync configuration service > Nov 10 14:13:58 vbox3 corosync[4380]: [SERV ] Service engine loaded: > corosync cluster closed process group service v1.01 > Nov 10 14:13:58 vbox3 corosync[4380]: [SERV ] Service engine loaded: > corosync cluster config database access v1.01 > Nov 10 14:13:58 vbox3 corosync[4380]: [SERV ] Service engine loaded: > corosync profile loading service > Nov 10 14:13:58 vbox3 corosync[4380]: [SERV ] Service engine loaded: > corosync cluster quorum service v0.1 > Nov 10 14:13:58 vbox3 corosync[4380]: [pcmk ] notice: pcmk_peer_update: > Transitional membership event on ring 4: memb=0, new=0, lost=0 > Nov 10 14:13:58 vbox3 stonithd: [4384]: info: G_main_add_SignalHandler: Added > signal handler for signal 10 > Nov 10 14:13:58 vbox3 stonithd: [4384]: info: G_main_add_SignalHandler: Added > signal handler for signal 12 > Nov 10 14:13:58 vbox3 stonithd: [4384]: info: Stack hogger failed 0xffffffff > Nov 10 14:13:58 vbox3 corosync[4380]: [pcmk ] notice: pcmk_peer_update: > Stable membership event on ring 4: memb=1, new=1, lost=0 > Nov 10 14:13:58 vbox3 corosync[4380]: [pcmk ] info: pcmk_peer_update: NEW: > vbox3 16792074 > Nov 10 14:13:58 vbox3 corosync[4380]: [pcmk ] info: pcmk_peer_update: > MEMB: vbox3 16792074 > Nov 10 14:13:58 vbox3 corosync[4380]: [pcmk ] info: update_member: Node > vbox3 now has process list: 00000000000000000000000000013312 (78610) > Nov 10 14:13:58 vbox3 corosync[4380]: [TOTEM ] A processor joined or left > the membership and a new membership was formed. > Nov 10 14:13:58 vbox3 corosync[4380]: [MAIN ] Completed service > synchronization, ready to provide service. > Nov 10 14:13:58 vbox3 cib: [4385]: info: Invoked: /usr/lib64/heartbeat/cib > Nov 10 14:13:58 vbox3 cib: [4385]: info: G_main_add_TriggerHandler: Added > signal manual handler > Nov 10 14:13:58 vbox3 cib: [4385]: info: G_main_add_SignalHandler: Added > signal handler for signal 17 > Nov 10 14:13:58 vbox3 cib: [4385]: info: retrieveCib: Reading c > Nov 10 14:13:58 vbox3 cib: [4385]: WARN: retrieveCib: Cluster configuration > not found: /var/lib/heartbeat/crm/cib.xml > Nov 10 14:13:58 vbox3 cib: [4385]: WARN: readCibXmlFile: Primary > configuration corrupt or unusable, trying backup... > Nov 10 14:13:58 vbox3 cib: [4385]: WARN: readCibXmlFile: Continuing with an > empty configuration. > Nov 10 14:13:58 vbox3 cib: [4385]: info: startCib: CIB Initialization > completed successfully > Nov 10 14:13:58 vbox3 cib: [4385]: info: crm_cluster_connect: Connecting to > OpenAIS > Nov 10 14:13:58 vbox3 cib: [4385]: info: init_ais_connection: Creating > connection to our AIS plugin > Nov 10 14:13:58 vbox3 crmd: [4389]: info: Invoked: /usr/lib64/heartbeat/crmd > Nov 10 14:13:58 vbox3 crmd: [4389]: info: main: CRM Hg Version: > cebe2b6ff49b36b29a3bd7ada1c4701c7470febe > Nov 10 14:13:58 vbox3 crmd: [4389]: info: crmd_init: Starting crmd > Nov 10 14:13:58 vbox3 crmd: [4389]: info: G_main_add_SignalHandler: Added > signal handler for signal 17 > Nov 10 14:13:58 vbox3 pengine: [4388]: info: Invoked: > /usr/lib64/heartbeat/pengine > Nov 10 14:13:58 vbox3 pengine: [4388]: info: main: Starting pengine > Nov 10 14:13:58 vbox3 lrmd: [4386]: info: G_main_add_SignalHandler: Added > signal handler for signal 15 > Nov 10 14:13:58 vbox3 lrmd: [4386]: info: G_main_add_SignalHandler: Added > signal handler for signal 17 > Nov 10 14:13:58 vbox3 lrmd: [4386]: info: G_main_add_SignalHandler: Added > signal handler for signal 10 > Nov 10 14:13:58 vbox3 lrmd: [4386]: info: G_main_add_SignalHandler: Added > signal handler for signal 12 > Nov 10 14:13:58 vbox3 lrmd: [4386]: info: Started. > Nov 10 14:13:58 vbox3 attrd: [4387]: info: Invoked: /usr/lib64/heartbeat/attrd > Nov 10 14:13:58 vbox3 attrd: [4387]: info: main: Starting up > Nov 10 14:13:58 vbox3 attrd: [4387]: info: crm_cluster_connect: Connecting to > OpenAIS > Nov 10 14:13:58 vbox3 attrd: [4387]: info: init_ais_connection: Creating > connection to our AIS plugin > Nov 10 14:13:58 vbox3 stonithd: [4384]: info: crm_cluster_connect: Connecting > to OpenAIS > Nov 10 14:13:58 vbox3 stonithd: [4384]: info: init_ais_connection: Creating > connection to our AIS plugin > Nov 10 14:13:58 vbox3 stonithd: [4384]: info: init_ais_connection: AIS > connection established > Nov 10 14:13:58 vbox3 corosync[4380]: [pcmk ] info: pcmk_ipc: Recorded > connection 0x2615120 for stonithd/4384 > Nov 10 14:13:58 vbox3 stonithd: [4384]: info: get_ais_nodeid: Server details: > id=16792074 uname=vbox3 > Nov 10 14:13:58 vbox3 stonithd: [4384]: info: crm_new_peer: Node vbox3 now > has id: 16792074 > Nov 10 14:13:58 vbox3 stonithd: [4384]: info: crm_new_peer: Node 16792074 is > now known as vbox3 > Nov 10 14:13:58 vbox3 stonithd: [4384]: notice: /usr/lib64/heartbeat/stonithd > start up successfully. > Nov 10 14:13:58 vbox3 stonithd: [4384]: info: G_main_add_SignalHandler: Added > signal handler for signal 17 > Nov 10 14:13:59 vbox3 corosync[4380]: [pcmk ] ERROR: pcmk_wait_dispatch: > Child process cib terminated with signal 11 (pid=4385, core=false) > Nov 10 14:13:59 vbox3 corosync[4380]: [pcmk ] notice: pcmk_wait_dispatch: > Respawning failed child process: cib > Nov 10 14:13:59 vbox3 corosync[4380]: [pcmk ] info: spawn_child: Forked > child 4391 for process cib > Nov 10 14:13:59 vbox3 corosync[4380]: [pcmk ] ERROR: pcmk_wait_dispatch: > Child process attrd terminated with signal 11 (pid=4387, core=false) > Nov 10 14:13:59 vbox3 corosync[4380]: [pcmk ] notice: pcmk_wait_dispatch: > Respawning failed child process: attrd > Nov 10 14:13:59 vbox3 corosync[4380]: [pcmk ] info: spawn_child: Forked > child 4392 for process attrd > Nov 10 14:13:59 vbox3 crmd: [4389]: info: do_cib_control: Could not connect > to the CIB service: connection failed > Nov 10 14:13:59 vbox3 crmd: [4389]: WARN: do_cib_control: Couldn't complete > CIB registration 1 times... pause and retry > Nov 10 14:13:59 vbox3 crmd: [4389]: info: crmd_init: Starting crmd's mainloop > Nov 10 14:13:59 vbox3 cib: [4391]: info: Invoked: /usr/lib64/heartbeat/cib > Nov 10 14:13:59 vbox3 cib: [4391]: info: G_main_add_TriggerHandler: Added > signal manual handler > Nov 10 14:13:59 vbox3 cib: [4391]: info: G_main_add_SignalHandler: Added > signal handler for signal 17 > Nov 10 14:13:59 vbox3 cib: [4391]: info: retrieveCib: Reading cluster > configuration from: /var/lib/heartbeat/crm/cib.xml (digest: > /var/lib/heartbeat/ > Nov 10 14:13:59 vbox3 cib: [4391]: WARN: retrieveCib: Cluster configuration > not found: /var/lib/heartbeat/crm/cib.xml > Nov 10 14:13:59 vbox3 cib: [4391]: WARN: readCibXmlFile: Primary > configuration corrupt or unusable, trying backup... > Nov 10 14:13:59 vbox3 cib: [4391]: WARN: readCibXmlFile: Continuing with an > empty configuration. > Nov 10 14:13:59 vbox3 cib: [4391]: info: startCib: CIB Initialization > completed successfully > Nov 10 14:13:59 vbox3 cib: [4391]: info: crm_cluster_connect: Connecting to > OpenAIS > Nov 10 14:13:59 vbox3 cib: [4391]: info: init_ais_connection: Creating > connection to our AIS plugin > Nov 10 14:13:59 vbox3 attrd: [4392]: info: Invoked: /usr/lib64/heartbeat/attrd > Nov 10 14:13:59 vbox3 attrd: [4392]: info: main: Starting up > Nov 10 14:13:59 vbox3 attrd: [4392]: info: crm_cluster_connect: Connecting to > OpenAIS > Nov 10 14:13:59 vbox3 attrd: [4392]: info: init_ais_connection: Creating > connection to our AIS plugin > Nov 10 14:14:00 vbox3 corosync[4380]: [pcmk ] ERROR: pcmk_wait_dispatch: > Child process cib terminated with signal 11 (pid=4391, core=false) > Nov 10 14:14:00 vbox3 corosync[4380]: [pcmk ] notice: pcmk_wait_dispatch: > Respawning failed child process: cib > Nov 10 14:14:00 vbox3 corosync[4380]: [pcmk ] info: spawn_child: Forked > child 4393 for process cib > Nov 10 14:14:00 vbox3 corosync[4380]: [pcmk ] ERROR: pcmk_wait_dispatch: > Child process attrd terminated with signal 11 (pid=4392, core=false) > Nov 10 14:14:00 vbox3 corosync[4380]: [pcmk ] notice: pcmk_wait_dispatch: > Respawning failed child process: attrd > Nov 10 14:14:00 vbox3 corosync[4380]: [pcmk ] info: spawn_child: Forked > child 4394 for process attrd > and last few lines then keep repeating... > > here's gdb backtrace obtained from core files: > cib: > #0 0x00007f9f07218f48 in sem_init@@GLIBC_2.2.5 () from /lib64/libpthread.so.0 > #1 0x00007f9f0949bf06 in coroipcc_service_connect () from > /usr/lib64/libcoroipcc.so.4 > #2 0x00007f9f096a5c37 in init_ais_connection (dispatch=0x40d516 > <cib_ais_dispatch>, destroy=0x40d658 <cib_ais_destroy>, our_uuid=0x0, > our_uname=0x616f28, nodeid=0x0) at ais.c:588 > #3 0x00007f9f096a1576 in crm_cluster_connect (our_uname=0x616f28, > our_uuid=0x0, dispatch=0x40d516, destroy=0x40d658, hb_conn=0x0) > at cluster.c:56 > #4 0x000000000040d753 in cib_init () at main.c:424 > #5 0x000000000040d08e in main (argc=1, argv=0x7fff9ec48f98) at main.c:218 > > > attrd: > #0 0x00007f194ea0cf48 in sem_init@@GLIBC_2.2.5 () from /lib64/libpthread.so.0 > #1 0x00007f1950c8ff06 in coroipcc_service_connect () from > /usr/lib64/libcoroipcc.so.4 > #2 0x00007f1950e99c37 in init_ais_connection (dispatch=0x402891 > <attrd_ais_dispatch>, destroy=0x402af3 <attrd_ais_destroy>, > our_uuid=0x605918, our_uname=0x605910, nodeid=0x0) at ais.c:588 > #3 0x00007f1950e95576 in crm_cluster_connect (our_uname=0x605910, > our_uuid=0x605918, dispatch=0x402891, destroy=0x402af3, hb_conn=0x0) > at cluster.c:56 > #4 0x0000000000403185 in main (argc=1, argv=0x7fffd3548b38) at attrd.c:569 > > Unfortunately I'm not 100% sure that all the packages I installed on those > machines are compiled the same way, as I > deleted old (testing) packages. But the versions are the same. > Any idea where I should look for possible culprit? > thanks a lot for reply! > with best regards > nik > > > -- > ------------------------------------- > Nikola CIPRICH > LinuxBox.cz, s.r.o. > 28. rijna 168, 709 01 Ostrava > > tel.: +420 596 603 142 > fax: +420 596 621 273 > mobil: +420 777 093 799 > www.linuxbox.cz > > mobil servis: +420 737 238 656 > email servis: ser...@linuxbox.cz > ------------------------------------- > > _______________________________________________ > Pacemaker mailing list > Pacemaker@oss.clusterlabs.org > http://oss.clusterlabs.org/mailman/listinfo/pacemaker > _______________________________________________ Pacemaker mailing list Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker