Nikola,
Sorry, I don't have a solution, but I'm curious about your setup.
Which version of DLM are you using?  Did you have to compile it
yourself?

Regards,
Mark

On Tue, Nov 10, 2009 at 7:28 AM, Nikola Ciprich <extmaill...@linuxbox.cz> wrote:
> Hello Andrew et al,
> a few days ago I asked about pacemaker + corosync + clvmd etc. With your 
> advice, I got this working well.
> That was on virtual test machines. I'm now trying to install a similar setup 
> on raw hardware, but for some
> reason attrd and cib seem to be crashing.
>
> Here's a snippet from the corosync log:
> Nov 10 14:12:21 vbox3 corosync[4299]:   [MAIN  ] Corosync Cluster Engine 
> ('1.1.2'): started and ready to provide service.
> Nov 10 14:12:21 vbox3 corosync[4299]:   [MAIN  ] Corosync built-in features: 
> nss rdma
> Nov 10 14:12:21 vbox3 corosync[4299]:   [MAIN  ] Successfully read main 
> configuration file '/etc/corosync/corosync.conf'.
> Nov 10 14:12:21 vbox3 corosync[4299]:   [TOTEM ] Initializing transport 
> (UDP/IP).
> Nov 10 14:12:21 vbox3 corosync[4299]:   [TOTEM ] Initializing 
> transmit/receive security: libtomcrypt SOBER128/SHA1HMAC (mode 0).
> Nov 10 14:12:21 vbox3 corosync[4299]:   [MAIN  ] Compatibility mode set to 
> whitetank.  Using V1 and V2 of the synchronization engine.
> Nov 10 14:12:21 vbox3 corosync[4299]:   [TOTEM ] The network interface 
> [10.58.0.1] is now up.
> Nov 10 14:12:21 vbox3 corosync[4299]:   [pcmk  ] info: process_ais_conf: 
> Reading configure
> Nov 10 14:13:16 vbox3 corosync[4348]:   [MAIN  ] Corosync Cluster Engine 
> ('1.1.2'): started and ready to provide service.
> Nov 10 14:13:16 vbox3 corosync[4348]:   [MAIN  ] Corosync built-in features: 
> nss rdma
> Nov 10 14:13:16 vbox3 corosync[4348]:   [MAIN  ] Successfully read main 
> configuration file '/etc/corosync/corosync.conf'.
> Nov 10 14:13:16 vbox3 corosync[4348]:   [TOTEM ] Initializing transport 
> (UDP/IP).
> Nov 10 14:13:16 vbox3 corosync[4348]:   [TOTEM ] Initializing 
> transmit/receive security: libtomcrypt SOBER128/SHA1HMAC (mode 0).
> Nov 10 14:13:16 vbox3 corosync[4348]:   [MAIN  ] Compatibility mode set to 
> whitetank.  Using V1 and V2 of the synchronization engine.
> Nov 10 14:13:16 vbox3 corosync[4348]:   [TOTEM ] The network interface 
> [10.58.0.1] is now up.
> Nov 10 14:13:16 vbox3 corosync[4348]:   [pcmk  ] info: process_ais_conf: 
> Reading configure
> Nov 10 14:13:24 vbox3 corosync[4357]:   [MAIN  ] Corosync Cluster Engine 
> ('1.1.2'): started and ready to provide service.
> Nov 10 14:13:24 vbox3 corosync[4357]:   [MAIN  ] Corosync built-in features: 
> nss rdma
> Nov 10 14:13:24 vbox3 corosync[4357]:   [MAIN  ] Successfully read main 
> configuration file '/etc/corosync/corosync.conf'.
> Nov 10 14:13:24 vbox3 corosync[4357]:   [TOTEM ] Initializing transport 
> (UDP/IP).
> Nov 10 14:13:24 vbox3 corosync[4357]:   [TOTEM ] Initializing 
> transmit/receive security: libtomcrypt SOBER128/SHA1HMAC (mode 0).
> Nov 10 14:13:24 vbox3 corosync[4357]:   [MAIN  ] Compatibility mode set to 
> whitetank.  Using V1 and V2 of the synchronization engine.
> Nov 10 14:13:24 vbox3 corosync[4357]:   [TOTEM ] The network interface 
> [10.58.0.1] is now up.
> Nov 10 14:13:24 vbox3 corosync[4357]:   [pcmk  ] info: process_ais_conf: 
> Reading configure
> Nov 10 14:13:57 vbox3 corosync[4380]:   [MAIN  ] Corosync Cluster Engine 
> ('1.1.2'): started and ready to provide service.
> Nov 10 14:13:57 vbox3 corosync[4380]:   [MAIN  ] Corosync built-in features: 
> nss rdma
> Nov 10 14:13:57 vbox3 corosync[4380]:   [MAIN  ] Successfully read main 
> configuration file '/etc/corosync/corosync.conf'.
> Nov 10 14:13:57 vbox3 corosync[4380]:   [TOTEM ] Initializing transport 
> (UDP/IP).
> Nov 10 14:13:57 vbox3 corosync[4380]:   [TOTEM ] Initializing 
> transmit/receive security: libtomcrypt SOBER128/SHA1HMAC (mode 0).
> Nov 10 14:13:57 vbox3 corosync[4380]:   [MAIN  ] Compatibility mode set to 
> whitetank.  Using V1 and V2 of the synchronization engine.
> Nov 10 14:13:58 vbox3 corosync[4380]:   [TOTEM ] The network interface 
> [10.58.0.1] is now up.
> Nov 10 14:13:58 vbox3 corosync[4380]:   [pcmk  ] info: process_ais_conf: 
> Reading configure
> Nov 10 14:13:58 vbox3 corosync[4380]:   [pcmk  ] info: config_find_init: 
> Local handle: 9213452461992312833 for logging
> Nov 10 14:13:58 vbox3 corosync[4380]:   [pcmk  ] info: config_find_next: 
> Processing additional logging options...
> Nov 10 14:13:58 vbox3 corosync[4380]:   [pcmk  ] info: get_config_opt: Found 
> 'off' for option: debug
> Nov 10 14:13:58 vbox3 corosync[4380]:   [pcmk  ] info: get_config_opt: 
> Defaulting to 'off' for option: to_file
> Nov 10 14:13:58 vbox3 corosync[4380]:   [pcmk  ] info: get_config_opt: 
> Defaulting to 'daemon' for option: syslog_facility
> Nov 10 14:13:58 vbox3 corosync[4380]:   [pcmk  ] info: config_find_init: 
> Local handle: 2013064636357672962 for service
> Nov 10 14:13:58 vbox3 corosync[4380]:   [pcmk  ] info: config_find_next: 
> Processing additional service options...
> Nov 10 14:13:58 vbox3 corosync[4380]:   [pcmk  ] info: get_config_opt: 
> Defaulting to 'no' for option: use_logd
> Nov 10 14:13:58 vbox3 corosync[4380]:   [pcmk  ] info: get_config_opt: Found 
> 'no' for option: use_mgmtd
> Nov 10 14:13:58 vbox3 corosync[4380]:   [pcmk  ] info: pcmk_startup: CRM: 
> Initialized
> Nov 10 14:13:58 vbox3 corosync[4380]:   [pcmk  ] Logging: Initialized 
> pcmk_startup
> Nov 10 14:13:58 vbox3 corosync[4380]:   [pcmk  ] info: pcmk_startup: Maximum 
> core file size is: 18446744073709551615
> Nov 10 14:13:58 vbox3 corosync[4380]:   [pcmk  ] info: pcmk_startup: Service: 
> 9
> Nov 10 14:13:58 vbox3 corosync[4380]:   [pcmk  ] info: pcmk_startup: Local 
> hostname: vbox3
> Nov 10 14:13:58 vbox3 corosync[4380]:   [pcmk  ] info: pcmk_update_nodeid: 
> Local node id: 16792074
> Nov 10 14:13:58 vbox3 corosync[4380]:   [pcmk  ] info: update_member: 
> Creating entry for node 16792074 born on 0
> Nov 10 14:13:58 vbox3 corosync[4380]:   [pcmk  ] info: update_member: 
> 0x260ee10 Node 16792074 now known as vbox3 (was: (null))
> Nov 10 14:13:58 vbox3 corosync[4380]:   [pcmk  ] info: update_member: Node 
> vbox3 now has 1 quorum votes (was 0)
> Nov 10 14:13:58 vbox3 corosync[4380]:   [pcmk  ] info: update_member: Node 
> 16792074/vbox3 is now: member
> Nov 10 14:13:58 vbox3 corosync[4380]:   [pcmk  ] info: spawn_child: Forked 
> child 4384 for process stonithd
> Nov 10 14:13:58 vbox3 corosync[4380]:   [pcmk  ] info: spawn_child: Forked 
> child 4385 for process cib
> Nov 10 14:13:58 vbox3 corosync[4380]:   [pcmk  ] info: spawn_child: Forked 
> child 4386 for process lrmd
> Nov 10 14:13:58 vbox3 corosync[4380]:   [pcmk  ] info: spawn_child: Forked 
> child 4387 for process attrd
> Nov 10 14:13:58 vbox3 corosync[4380]:   [pcmk  ] info: spawn_child: Forked 
> child 4388 for process pengine
> Nov 10 14:13:58 vbox3 corosync[4380]:   [pcmk  ] info: spawn_child: Forked 
> child 4389 for process crmd
> Nov 10 14:13:58 vbox3 corosync[4380]:   [SERV  ] Service engine loaded: 
> Pacemaker Cluster Manager 1.0.6
> Nov 10 14:13:58 vbox3 corosync[4380]:   [SERV  ] Service engine loaded: 
> corosync extended virtual synchrony service
> Nov 10 14:13:58 vbox3 corosync[4380]:   [SERV  ] Service engine loaded: 
> corosync configuration service
> Nov 10 14:13:58 vbox3 corosync[4380]:   [SERV  ] Service engine loaded: 
> corosync cluster closed process group service v1.01
> Nov 10 14:13:58 vbox3 corosync[4380]:   [SERV  ] Service engine loaded: 
> corosync cluster config database access v1.01
> Nov 10 14:13:58 vbox3 corosync[4380]:   [SERV  ] Service engine loaded: 
> corosync profile loading service
> Nov 10 14:13:58 vbox3 corosync[4380]:   [SERV  ] Service engine loaded: 
> corosync cluster quorum service v0.1
> Nov 10 14:13:58 vbox3 corosync[4380]:   [pcmk  ] notice: pcmk_peer_update: 
> Transitional membership event on ring 4: memb=0, new=0, lost=0
> Nov 10 14:13:58 vbox3 stonithd: [4384]: info: G_main_add_SignalHandler: Added 
> signal handler for signal 10
> Nov 10 14:13:58 vbox3 stonithd: [4384]: info: G_main_add_SignalHandler: Added 
> signal handler for signal 12
> Nov 10 14:13:58 vbox3 stonithd: [4384]: info: Stack hogger failed 0xffffffff
> Nov 10 14:13:58 vbox3 corosync[4380]:   [pcmk  ] notice: pcmk_peer_update: 
> Stable membership event on ring 4: memb=1, new=1, lost=0
> Nov 10 14:13:58 vbox3 corosync[4380]:   [pcmk  ] info: pcmk_peer_update: NEW: 
>  vbox3 16792074
> Nov 10 14:13:58 vbox3 corosync[4380]:   [pcmk  ] info: pcmk_peer_update: 
> MEMB: vbox3 16792074
> Nov 10 14:13:58 vbox3 corosync[4380]:   [pcmk  ] info: update_member: Node 
> vbox3 now has process list: 00000000000000000000000000013312 (78610)
> Nov 10 14:13:58 vbox3 corosync[4380]:   [TOTEM ] A processor joined or left 
> the membership and a new membership was formed.
> Nov 10 14:13:58 vbox3 corosync[4380]:   [MAIN  ] Completed service 
> synchronization, ready to provide service.
> Nov 10 14:13:58 vbox3 cib: [4385]: info: Invoked: /usr/lib64/heartbeat/cib
> Nov 10 14:13:58 vbox3 cib: [4385]: info: G_main_add_TriggerHandler: Added 
> signal manual handler
> Nov 10 14:13:58 vbox3 cib: [4385]: info: G_main_add_SignalHandler: Added 
> signal handler for signal 17
> Nov 10 14:13:58 vbox3 cib: [4385]: info: retrieveCib: Reading c
> Nov 10 14:13:58 vbox3 cib: [4385]: WARN: retrieveCib: Cluster configuration 
> not found: /var/lib/heartbeat/crm/cib.xml
> Nov 10 14:13:58 vbox3 cib: [4385]: WARN: readCibXmlFile: Primary 
> configuration corrupt or unusable, trying backup...
> Nov 10 14:13:58 vbox3 cib: [4385]: WARN: readCibXmlFile: Continuing with an 
> empty configuration.
> Nov 10 14:13:58 vbox3 cib: [4385]: info: startCib: CIB Initialization 
> completed successfully
> Nov 10 14:13:58 vbox3 cib: [4385]: info: crm_cluster_connect: Connecting to 
> OpenAIS
> Nov 10 14:13:58 vbox3 cib: [4385]: info: init_ais_connection: Creating 
> connection to our AIS plugin
> Nov 10 14:13:58 vbox3 crmd: [4389]: info: Invoked: /usr/lib64/heartbeat/crmd
> Nov 10 14:13:58 vbox3 crmd: [4389]: info: main: CRM Hg Version: 
> cebe2b6ff49b36b29a3bd7ada1c4701c7470febe
> Nov 10 14:13:58 vbox3 crmd: [4389]: info: crmd_init: Starting crmd
> Nov 10 14:13:58 vbox3 crmd: [4389]: info: G_main_add_SignalHandler: Added 
> signal handler for signal 17
> Nov 10 14:13:58 vbox3 pengine: [4388]: info: Invoked: 
> /usr/lib64/heartbeat/pengine
> Nov 10 14:13:58 vbox3 pengine: [4388]: info: main: Starting pengine
> Nov 10 14:13:58 vbox3 lrmd: [4386]: info: G_main_add_SignalHandler: Added 
> signal handler for signal 15
> Nov 10 14:13:58 vbox3 lrmd: [4386]: info: G_main_add_SignalHandler: Added 
> signal handler for signal 17
> Nov 10 14:13:58 vbox3 lrmd: [4386]: info: G_main_add_SignalHandler: Added 
> signal handler for signal 10
> Nov 10 14:13:58 vbox3 lrmd: [4386]: info: G_main_add_SignalHandler: Added 
> signal handler for signal 12
> Nov 10 14:13:58 vbox3 lrmd: [4386]: info: Started.
> Nov 10 14:13:58 vbox3 attrd: [4387]: info: Invoked: /usr/lib64/heartbeat/attrd
> Nov 10 14:13:58 vbox3 attrd: [4387]: info: main: Starting up
> Nov 10 14:13:58 vbox3 attrd: [4387]: info: crm_cluster_connect: Connecting to 
> OpenAIS
> Nov 10 14:13:58 vbox3 attrd: [4387]: info: init_ais_connection: Creating 
> connection to our AIS plugin
> Nov 10 14:13:58 vbox3 stonithd: [4384]: info: crm_cluster_connect: Connecting 
> to OpenAIS
> Nov 10 14:13:58 vbox3 stonithd: [4384]: info: init_ais_connection: Creating 
> connection to our AIS plugin
> Nov 10 14:13:58 vbox3 stonithd: [4384]: info: init_ais_connection: AIS 
> connection established
> Nov 10 14:13:58 vbox3 corosync[4380]:   [pcmk  ] info: pcmk_ipc: Recorded 
> connection 0x2615120 for stonithd/4384
> Nov 10 14:13:58 vbox3 stonithd: [4384]: info: get_ais_nodeid: Server details: 
> id=16792074 uname=vbox3
> Nov 10 14:13:58 vbox3 stonithd: [4384]: info: crm_new_peer: Node vbox3 now 
> has id: 16792074
> Nov 10 14:13:58 vbox3 stonithd: [4384]: info: crm_new_peer: Node 16792074 is 
> now known as vbox3
> Nov 10 14:13:58 vbox3 stonithd: [4384]: notice: /usr/lib64/heartbeat/stonithd 
> start up successfully.
> Nov 10 14:13:58 vbox3 stonithd: [4384]: info: G_main_add_SignalHandler: Added 
> signal handler for signal 17
> Nov 10 14:13:59 vbox3 corosync[4380]:   [pcmk  ] ERROR: pcmk_wait_dispatch: 
> Child process cib terminated with signal 11 (pid=4385, core=false)
> Nov 10 14:13:59 vbox3 corosync[4380]:   [pcmk  ] notice: pcmk_wait_dispatch: 
> Respawning failed child process: cib
> Nov 10 14:13:59 vbox3 corosync[4380]:   [pcmk  ] info: spawn_child: Forked 
> child 4391 for process cib
> Nov 10 14:13:59 vbox3 corosync[4380]:   [pcmk  ] ERROR: pcmk_wait_dispatch: 
> Child process attrd terminated with signal 11 (pid=4387, core=false)
> Nov 10 14:13:59 vbox3 corosync[4380]:   [pcmk  ] notice: pcmk_wait_dispatch: 
> Respawning failed child process: attrd
> Nov 10 14:13:59 vbox3 corosync[4380]:   [pcmk  ] info: spawn_child: Forked 
> child 4392 for process attrd
> Nov 10 14:13:59 vbox3 crmd: [4389]: info: do_cib_control: Could not connect 
> to the CIB service: connection failed
> Nov 10 14:13:59 vbox3 crmd: [4389]: WARN: do_cib_control: Couldn't complete 
> CIB registration 1 times... pause and retry
> Nov 10 14:13:59 vbox3 crmd: [4389]: info: crmd_init: Starting crmd's mainloop
> Nov 10 14:13:59 vbox3 cib: [4391]: info: Invoked: /usr/lib64/heartbeat/cib
> Nov 10 14:13:59 vbox3 cib: [4391]: info: G_main_add_TriggerHandler: Added 
> signal manual handler
> Nov 10 14:13:59 vbox3 cib: [4391]: info: G_main_add_SignalHandler: Added 
> signal handler for signal 17
> Nov 10 14:13:59 vbox3 cib: [4391]: info: retrieveCib: Reading cluster 
> configuration from: /var/lib/heartbeat/crm/cib.xml (digest: 
> /var/lib/heartbeat/
> Nov 10 14:13:59 vbox3 cib: [4391]: WARN: retrieveCib: Cluster configuration 
> not found: /var/lib/heartbeat/crm/cib.xml
> Nov 10 14:13:59 vbox3 cib: [4391]: WARN: readCibXmlFile: Primary 
> configuration corrupt or unusable, trying backup...
> Nov 10 14:13:59 vbox3 cib: [4391]: WARN: readCibXmlFile: Continuing with an 
> empty configuration.
> Nov 10 14:13:59 vbox3 cib: [4391]: info: startCib: CIB Initialization 
> completed successfully
> Nov 10 14:13:59 vbox3 cib: [4391]: info: crm_cluster_connect: Connecting to 
> OpenAIS
> Nov 10 14:13:59 vbox3 cib: [4391]: info: init_ais_connection: Creating 
> connection to our AIS plugin
> Nov 10 14:13:59 vbox3 attrd: [4392]: info: Invoked: /usr/lib64/heartbeat/attrd
> Nov 10 14:13:59 vbox3 attrd: [4392]: info: main: Starting up
> Nov 10 14:13:59 vbox3 attrd: [4392]: info: crm_cluster_connect: Connecting to 
> OpenAIS
> Nov 10 14:13:59 vbox3 attrd: [4392]: info: init_ais_connection: Creating 
> connection to our AIS plugin
> Nov 10 14:14:00 vbox3 corosync[4380]:   [pcmk  ] ERROR: pcmk_wait_dispatch: 
> Child process cib terminated with signal 11 (pid=4391, core=false)
> Nov 10 14:14:00 vbox3 corosync[4380]:   [pcmk  ] notice: pcmk_wait_dispatch: 
> Respawning failed child process: cib
> Nov 10 14:14:00 vbox3 corosync[4380]:   [pcmk  ] info: spawn_child: Forked 
> child 4393 for process cib
> Nov 10 14:14:00 vbox3 corosync[4380]:   [pcmk  ] ERROR: pcmk_wait_dispatch: 
> Child process attrd terminated with signal 11 (pid=4392, core=false)
> Nov 10 14:14:00 vbox3 corosync[4380]:   [pcmk  ] notice: pcmk_wait_dispatch: 
> Respawning failed child process: attrd
> Nov 10 14:14:00 vbox3 corosync[4380]:   [pcmk  ] info: spawn_child: Forked 
> child 4394 for process attrd
> and the last few lines then keep repeating...
>
> Here's the gdb backtrace obtained from the core files:
> cib:
> #0  0x00007f9f07218f48 in sem_init@@GLIBC_2.2.5 () from /lib64/libpthread.so.0
> #1  0x00007f9f0949bf06 in coroipcc_service_connect () from 
> /usr/lib64/libcoroipcc.so.4
> #2  0x00007f9f096a5c37 in init_ais_connection (dispatch=0x40d516 
> <cib_ais_dispatch>, destroy=0x40d658 <cib_ais_destroy>, our_uuid=0x0,
>    our_uname=0x616f28, nodeid=0x0) at ais.c:588
> #3  0x00007f9f096a1576 in crm_cluster_connect (our_uname=0x616f28, 
> our_uuid=0x0, dispatch=0x40d516, destroy=0x40d658, hb_conn=0x0)
>    at cluster.c:56
> #4  0x000000000040d753 in cib_init () at main.c:424
> #5  0x000000000040d08e in main (argc=1, argv=0x7fff9ec48f98) at main.c:218
>
>
> attrd:
> #0  0x00007f194ea0cf48 in sem_init@@GLIBC_2.2.5 () from /lib64/libpthread.so.0
> #1  0x00007f1950c8ff06 in coroipcc_service_connect () from 
> /usr/lib64/libcoroipcc.so.4
> #2  0x00007f1950e99c37 in init_ais_connection (dispatch=0x402891 
> <attrd_ais_dispatch>, destroy=0x402af3 <attrd_ais_destroy>,
>    our_uuid=0x605918, our_uname=0x605910, nodeid=0x0) at ais.c:588
> #3  0x00007f1950e95576 in crm_cluster_connect (our_uname=0x605910, 
> our_uuid=0x605918, dispatch=0x402891, destroy=0x402af3, hb_conn=0x0)
>    at cluster.c:56
> #4  0x0000000000403185 in main (argc=1, argv=0x7fffd3548b38) at attrd.c:569
>
> Unfortunately, I'm not 100% sure that all the packages I installed on those 
> machines were compiled the same way, as I
> deleted the old (testing) packages. But the versions are the same.
> Any idea where I should look for a possible culprit?
> Thanks a lot for any reply!
> with best regards
> nik
>
>
> --
> -------------------------------------
> Nikola CIPRICH
> LinuxBox.cz, s.r.o.
> 28. rijna 168, 709 01 Ostrava
>
> tel.:   +420 596 603 142
> fax:    +420 596 621 273
> mobil:  +420 777 093 799
> www.linuxbox.cz
>
> mobil servis: +420 737 238 656
> email servis: ser...@linuxbox.cz
> -------------------------------------
>
> _______________________________________________
> Pacemaker mailing list
> Pacemaker@oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>
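
For anyone triaging a log like the one quoted above, a small script can tally
which child processes corosync reports as killed by a signal. This is only a
sketch, not part of the original exchange: the regular expression is derived
from the pcmk_wait_dispatch lines in this post, the function name
crashed_children is made up for illustration, and it assumes the input is a
mail-wrapped excerpt such as the one here.

```python
import re
from collections import Counter

# Pattern derived from the pcmk_wait_dispatch ERROR lines quoted in this thread.
CRASH_RE = re.compile(
    r"Child process (?P<name>\S+) terminated with signal (?P<sig>\d+) "
    r"\(pid=(?P<pid>\d+), core=(?P<core>\w+)\)"
)

def crashed_children(log_text):
    """Count (process name, signal) pairs reported by pcmk_wait_dispatch."""
    # Mail clients wrap long log lines and prefix them with "> ".
    # Strip the quote markers, then collapse whitespace so that
    # entries split across two lines match the pattern again.
    lines = [re.sub(r"^>\s?", "", line) for line in log_text.splitlines()]
    flat = re.sub(r"\s+", " ", " ".join(lines))
    return Counter((m["name"], int(m["sig"])) for m in CRASH_RE.finditer(flat))

# Two entries copied from the quoted log, including the mail wrapping.
sample = """\
> Nov 10 14:13:59 vbox3 corosync[4380]:   [pcmk  ] ERROR: pcmk_wait_dispatch:
> Child process cib terminated with signal 11 (pid=4385, core=false)
> Nov 10 14:13:59 vbox3 corosync[4380]:   [pcmk  ] ERROR: pcmk_wait_dispatch:
> Child process attrd terminated with signal 11 (pid=4387, core=false)
"""
print(crashed_children(sample))
```

Run over the full excerpt, this would make it easy to confirm that only cib
and attrd are being respawned, and always after signal 11.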

