On 2013-02-05T14:33:14, Ulrich Windl <[email protected]> wrote:

> I had an unexplained failure of the stonith monitor for SBD. When examining 
> the syslog, I got the impression that the RA configuration data got 
> corrupted, causing an RA failure.

Interesting. Please file a bug report.

In the meantime, the easiest workaround is to drop sbd_device from the
resource configuration entirely. If no parameters are specified,
external/sbd sources /etc/sysconfig/sbd and just works.
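
A minimal sketch of what that could look like, assuming the resource is
called prm_stonith_sbd and is monitored every 180s (both taken from the
log excerpt below), and using a made-up device path that you would
replace with your own:

```
# /etc/sysconfig/sbd (device path is a placeholder, not a real device)
SBD_DEVICE="/dev/disk/by-id/EXAMPLE-sbd-part1"

# crm configuration: no "params sbd_device=..." on the primitive,
# so the agent falls back to the sysconfig file above
primitive prm_stonith_sbd stonith:external/sbd \
        op monitor interval="180s"
```

Any clone or constraint definitions that reference the primitive would
stay as they are; only the params line goes away.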

> I discovered more bad things: stonithd crashed:
> crmd: [9801]: info: process_lrm_event: LRM operation 
> prm_stonith_sbd:1_monitor_180000 (call=89, status=1, cib-update=0, 
> confirmed=true) Cancelled
> stonith-ng: [9797]: WARN: free_device: Removal of device 'prm_stonith_sbd:1' 
> purged operation monitor
> kernel: [  323.648355] show_signal_msg: 30 callbacks suppressed
> kernel: [  323.648361] stonithd[9797]: segfault at 0 ip 00007f70528afb94 sp 
> 00007fffaf06a410 error 4 in libcrmcommon.so.2.0.0[7f70528a4000+2d000]
> lrm-stonith: [14098]: ERROR: stonith_send_command: STONITH disconnected: 3
> lrm-stonith: [14098]: WARN: map_ra_retvalue: Mapped the invalid return code 
> -10.
> lrmd: [9798]: info: operation stop[90] on prm_stonith_sbd:1 for client 9801: 
> pid 14098 exited with return code 1
> crmd: [9801]: info: process_lrm_event: LRM operation prm_stonith_sbd:1_stop_0 
> (call=90, rc=1, cib-update=145, confirmed=true) unknown error
> [...]
> 
> It happened again (after another hard reset):
> kernel: [  300.400783] stonithd[9798]: segfault at 0 ip 00007f8e32a18b94 sp 
> 00007fffa5c954f0 error 4 in libcrmcommon.so.2.0.0[7f8e32a0d000+2d000]

This one really needs a bug report. Please include an hb_report, which
should contain a parsed coredump.
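
Something along these lines should do, assuming the start time is
adjusted to shortly before the first segfault; the timestamp and the
destination name here are placeholders only:

```
# collect logs, configuration and core dumps from the given time onwards
hb_report -f "2013/02/05 12:00" /tmp/stonithd-segfault
```

The resulting tarball is what should be attached to the bug report.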


Regards,
    Lars

-- 
Architect Storage/HA
SUSE LINUX Products GmbH, GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer, HRB 
21284 (AG Nürnberg)
"Experience is the name everyone gives to their mistakes." -- Oscar Wilde

_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
