On 06/08/2013, at 2:29 AM, Thomas Glanzmann <[email protected]> wrote:
> Hello Andrew,
>
>> You will need to run crm_report and email us the resulting tarball.
>> This will include the version of the software you're running and log
>> files (both system and cluster) - without which we can't do anything.
>
> Find the files here:
>
> I manually packaged it because crm_report output was empty.

I can try and fix that if you re-run with -x and paste the output.

> If I forget
> something, please let me know. I included the daemon syslog output from
> both nodes from the central syslog server and the crm file, the ha.cf
> which is the same on both nodes and the /var/lib/heartbeat directory
> which seems to keep all files from the first node.

I can't do anything with the core file, I'm afraid.
I don't run Debian at all, let alone that particular version with the same binaries, libraries and symbols as you.
Without those, the core file is meaningless (which is why crm_report generates backtraces).

>
> The reason for the crash in unmanaged mode seems to be the same as
> before:
>
> Aug 4 18:50:27 apache-03 crmd: [29398]: ERROR: crm_abort:
> abort_transition_graph: Triggered assert at te_utils.c:339 : transition_graph
> != NULL

That shouldn't have resulted in a crash.
I see a lot of this though:

Aug 4 18:50:27 apache-03 crmd: [29398]: ERROR: lrm_get_rsc(666): failed to send a getrsc message to lrmd via ch_cmd channel.
Aug 4 18:50:27 apache-03 crmd: [29398]: ERROR: lrm_get_rsc(666): failed to send a getrsc message to lrmd via ch_cmd channel.
Aug 4 18:50:27 apache-03 crmd: [29398]: ERROR: lrm_add_rsc(870): failed to send a addrsc message to lrmd via ch_cmd channel.
Aug 4 18:50:27 apache-03 crmd: [29398]: ERROR: lrm_get_rsc(666): failed to send a getrsc message to lrmd via ch_cmd channel.
Aug 4 18:50:27 apache-03 crmd: [29398]: ERROR: get_lrm_resource: Could not add resource nfs-common to LRM

Which looks more concerning.

I would _really_ recommend upgrading to something a little more recent.
And it might be time to get off heartbeat while you're at it.

>
> Probably I should update it.
>
> But why the config got lost, I have no idea what went wrong here.
>
> https://thomas.glanzmann.de/tmp/linux_ha_crash.2013-08-05.tar.gz
>
> Cheers,
> Thomas

_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
