Hi,
I did a bt on the core; this is what I found:

==========
Core was generated by `/usr/lib64/heartbeat/cib'.
Program terminated with signal 11, Segmentation fault.
[New process 12340]
#0  0x00007f23acc553fa in strncmp () from /lib64/libc.so.6
(gdb) bt
#0  0x00007f23acc553fa in strncmp () from /lib64/libc.so.6
#1  0x00007f23acf87c39 in __xmlParserInputBufferCreateFilename () from /usr/lib64/libxml2.so.2
#2  0x00007f23acf6147b in xmlNewInputFromFile () from /usr/lib64/libxml2.so.2
#3  0x00007f23acf641d4 in xmlCreateURLParserCtxt () from /usr/lib64/libxml2.so.2
#4  0x00007f23acf78f3a in xmlReadFile () from /usr/lib64/libxml2.so.2
#5  0x00007f23ad0167b1 in xmlRelaxNGParse () from /usr/lib64/libxml2.so.2
#6  0x00007f23ae967321 in validate_with_relaxng (doc=0x626020, to_logs=1, relaxng_file=0x7f23ae97ba10 "/usr/share/pacemaker/pacemaker-1.2.rng") at xml.c:2222
#7  0x00007f23ae967769 in validate_with (xml=0x6260d0, method=6, to_logs=1) at xml.c:2287
#8  0x00007f23ae967b9f in validate_xml (xml_blob=0x6260d0, validation=0x626910 "pacemaker-1.2", to_logs=1) at xml.c:2373
#9  0x0000000000405b23 in readCibXmlFile (dir=0x41b580 "/var/lib/heartbeat/crm", file=0x41c40a "cib.xml", discard_status=1) at io.c:396
#10 0x0000000000412285 in startCib (filename=0x41c40a "cib.xml") at main.c:613
#11 0x0000000000411309 in cib_init () at main.c:408
#12 0x000000000041064a in main (argc=1, argv=0x7fff942e0f58) at main.c:218
==========

On a fresh install, say, cib.xml will not exist yet. So why is it looking for this file on startup?

Sincerely
Shravan

On Tue, Sep 28, 2010 at 10:24 AM, Shravan Mishra <shravan.mis...@gmail.com> wrote:
> Sorry, I forgot to attach my corosync.conf.
>
>
> =========
> totem {
>     version: 2
>     # token: 3000
>     # token_retransmits_before_loss_const: 10
>     # join: 60
>     # consensus: 1500
>     # vsftype: none
>     # max_messages: 20
>     # clear_node_high_bit: yes
>     secauth: off
>     threads: 0
>     # rrp_mode: passive
>
>     interface {
>         ringnumber: 0
>         bindnetaddr: 192.168.2.0
>         #mcastaddr: 226.94.1.1
>         broadcast: yes
>         mcastport: 5405
>     }
>     # interface {
>     # ringnumber: 1
>     # bindnetaddr: 172.20.20.0
>     #mcastaddr: 226.94.1.1
>     # broadcast: yes
>     # mcastport: 5405
>     # }
> }
>
> logging {
>     fileline: off
>     to_stderr: yes
>     to_logfile: yes
>     to_syslog: yes
>     logfile: /tmp/corosync.log
>     debug: off
>     timestamp: on
>     logger_subsys {
>         subsys: AMF
>         debug: off
>     }
> }
>
> service {
>     name: pacemaker
>     ver: 0
> }
>
> aisexec {
>     user:root
>     group: root
> }
>
> amf {
>     mode: disabled
> }
>
> =========
>
> On Tue, Sep 28, 2010 at 10:10 AM, Shravan Mishra
> <shravan.mis...@gmail.com> wrote:
>> Hi Andrew,
>>
>> I'm attaching another log file, as I reflashed my machine and started
>> everything from scratch.
>> It looks like my old system got a little messed up while I was trying to
>> install the old HA libraries (corosync/pacemaker) that were initially
>> working for me.
>>
>>
>> Here are the details:
>>
>> For now I just want to see cib/attrd up, so I have only one machine,
>> on which I want to see things in a sane state.
>>
>> [r...@ha2 ~]# /usr/sbin/corosync -v
>> Corosync Cluster Engine, version '1.2.8' SVN revision '3035'
>> Copyright (c) 2006-2009 Red Hat, Inc.
>>
>> [r...@ha2 ~]# /usr/lib64/heartbeat/crmd version
>> CRM Version: 1.1.2 (e0d731c2b1be446b27a73327a53067bf6230fb6a)
>>
>>
>>
>> The Pacemaker version is 1.1; based on the above output the release is
>> 1.1.2, if I understand correctly.
>>
>> This one is showing:
>>
>> Sep 27 12:30:45 corosync [pcmk ] ERROR: pcmk_wait_dispatch: Child
>> process cib terminated with signal 11 (pid=9216, core=false)
>>
>>
>> Please find corosync logs attached.
>>
>> Thanks
>> Shravan
>>
>>
>> On Tue, Sep 28, 2010 at 5:47 AM, Andrew Beekhof <and...@beekhof.net> wrote:
>>> On Mon, Sep 27, 2010 at 6:26 AM, Shravan Mishra
>>> <shravan.mis...@gmail.com> wrote:
>>>> Thanks Raoul for the response.
>>>>
>>>> Changing the permission to hacluster:haclient did stop that error.
>>>>
>>>> Now I'm hitting another problem whereby cib is failing to start
>>>
>>> Very strange logs.
>>> Which distribution is this?
>>> What does your corosync.conf look like?
>>>
>>>
>>>> =====
>>>> Sep 27 00:16:29 corosync [pcmk ] info: update_member: Node
>>>> ha2.itactics.com now has process list:
>>>> 00000000000000000000000000110012 (1114130)
>>>> Sep 27 00:16:29 corosync [pcmk ] info: update_member: Node
>>>> ha2.itactics.com now has 1 quorum votes (was 0)
>>>> Sep 27 00:16:29 corosync [pcmk ] info: send_member_notification:
>>>> Sending membership update 100 to 0 children
>>>> Sep 27 00:16:29 corosync [MAIN ] Completed service synchronization,
>>>> ready to provide service.
>>>> Sep 27 00:16:30 corosync [pcmk ] ERROR: pcmk_wait_dispatch: Child
>>>> process cib exited (pid=14889, rc=127)
>>>> Sep 27 00:16:30 corosync [pcmk ] notice: pcmk_wait_dispatch:
>>>> Respawning failed child process: cib
>>>> Sep 27 00:16:30 corosync [pcmk ] info: spawn_child: Forked child
>>>> 14896 for process cib
>>>> crmd[14893]: 2010/09/27_00:16:30 WARN: do_cib_control: Couldn't
>>>> complete CIB registration 1 times... pause and retry
>>>> Sep 27 00:16:31 corosync [pcmk ] ERROR: pcmk_wait_dispatch: Child
>>>> process cib exited (pid=14896, rc=127)
>>>> Sep 27 00:16:31 corosync [pcmk ] notice: pcmk_wait_dispatch:
>>>> Respawning failed child process: cib
>>>> Sep 27 00:16:31 corosync [pcmk ] info: spawn_child: Forked child
>>>> 14901 for process cib
>>>> Sep 27 00:16:32 corosync [pcmk ] ERROR: pcmk_wait_dispatch: Child
>>>> process cib exited (pid=14901, rc=1
>>>> ======
>>>>
>>>>
>>>> I have attached the full logs.
>>>>
>>>> We are using corosync 1.2.8 and pacemaker 1.1.3.
>>>>
>>>>
>>>> Thanks.
>>>> Shravan
>>>>
>>>>
>>>>
>>>> On Sat, Sep 25, 2010 at 4:36 AM, Raoul Bhatia [IPAX] <r.bha...@ipax.at>
>>>> wrote:
>>>>> On 24.09.2010 21:41, Shravan Mishra wrote:
>>>>>>
>>>>>> crmd[20612]: 2010/09/24_15:29:57 ERROR: crm_log_init_worker: Cannot
>>>>>> change active directory to /var/lib/heartbeat/cores/hacluster:
>>>>>> Permission denied (13)
>>>>>
>>>>> ls -ald /var/lib/heartbeat/cores/hacluster /var/lib/heartbeat/cores/
>>>>> /var/lib/heartbeat/ /var/lib/ /var/
>>>>>
>>>>> is haclient allowed to cd all the way into
>>>>> /var/lib/heartbeat/cores/hacluster ?
>>>>>
>>>>> cheers,
>>>>>
>>>>
>>>> _______________________________________________
>>>> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
>>>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>>>>
>>>> Project Home: http://www.clusterlabs.org
>>>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>>>> Bugs:
>>>> http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker