On 7/9/2013 12:56 PM, Andrew Beekhof wrote:
>
> On 09/07/2013, at 8:49 PM, Martin Gazak <martin.ga...@microstep-mis.sk> wrote:
>
>> On 7/9/2013 12:42 PM, Andrew Beekhof wrote:
>>>
>>> On 09/07/2013, at 5:05 PM, Martin Gazak <martin.ga...@microstep-mis.sk>
>>> wrote:
>
> It looks to be a bug in 1.1.7, you'll want to contact SUSE so they can get
> the fix from upstream.

Dear Andrew,

thanks for your effort. May I ask three questions:

- Which version did you use to detect the bug? You labeled it just
  "current version".

- We have downloaded the corosync SuSE packages 1.1.8 and 1.1.9. Could you
  please confirm whether one (or both) of these SuSE versions has the bug
  fixed? Or do you need the package itself as an attachment to inspect it?
  Or is there a way to check whether our package has the bug fixed?

- We are going to test the 1.1.9 package with the stress tests anyway. As I
  wrote, this situation occurred extremely rarely on the testing cluster
  (but often enough to cause trouble in a production environment). Do you
  have any idea how to reproduce it deterministically? Just blindly killing
  the master instance of the application from cron does not help: the
  system survived 70+ failovers correctly over the weekend.

Best regards
Martin Gazak

>
> Your version:
>
> Jul 04 23:45:02 ims0 pengine: [3933]: WARN: unpack_rsc_op: Processing failed
> op ims:0_last_failure_0 on ims0: not running (7)
> Jul 04 23:45:02 ims0 pengine: [3933]: notice: LogActions: Recover ims:0
> (Master ims0)
> Jul 04 23:45:02 ims0 pengine: [3933]: notice: LogActions: Restart ims-ip
> (Started ims0)
> Jul 04 23:45:02 ims0 pengine: [3933]: notice: LogActions: Restart ims-ip-src
> (Started ims0)
> Jul 04 23:45:02 ims0 pengine: [3933]: notice: process_pe_message: Transition
> 4036: PEngine Input stored in: /var/lib/pengine/pe-input-2819.bz2
>
> vs.
> the current version:
>
> notice: LogActions: Demote ims:0 (Master -> Stopped ims0)
> notice: LogActions: Promote ims:1 (Slave -> Master ims1)
> notice: LogActions: Start ims-ip (ims1)
> notice: LogActions: Start ims-ip-src (ims1)
>
> and
>
> Jul 04 23:45:02 ims0 pengine: [3933]: notice: LogActions: Recover ims:0
> (Master ims0)
> Jul 04 23:45:02 ims0 pengine: [3933]: notice: LogActions: Restart ims-ip
> (Started ims0)
> Jul 04 23:45:02 ims0 pengine: [3933]: notice: LogActions: Start ims-ip-src
> (ims0)
> Jul 04 23:45:02 ims0 pengine: [3933]: notice: process_pe_message: Transition
> 4037: PEngine Input stored in: /var/lib/pengine/pe-input-2820.bz2
>
>
> vs. the current version:
>
> notice: LogActions: Demote ims:0 (Master -> Stopped ims0)
> notice: LogActions: Promote ims:1 (Slave -> Master ims1)
> notice: LogActions: Start ims-ip (ims1)
> notice: LogActions: Start ims-ip-src (ims1)
>

-- 
Regards,

Martin Gazak
MicroStep-MIS, spol. s r.o.
System Development Manager
Tel.: +421 2 602 00 128
Fax: +421 2 602 00 180
martin.ga...@microstep-mis.sk
http://www.microstep-mis.com

_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
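
The quoted LogActions excerpts above ("your version" vs. "the current version") can also be compared mechanically instead of by eye, which may help when checking the output of another package version against the same transition. The script below is only a rough sketch: the names parse_log_actions and diff_actions are made up for illustration, and the regular expression assumes nothing more than the log line layout visible in the first pair of excerpts in this message.

```python
import re

# Matches "LogActions: <Action> <resource> (<detail>)". This pattern is an
# assumption derived only from the excerpts quoted in this thread.
ACTION_RE = re.compile(r"LogActions:\s+(\w+)\s+(\S+)\s+\(([^)]*)\)")


def parse_log_actions(text):
    """Return a list of (action, resource, detail) tuples found in text."""
    # Mail quoting wraps lines before "(detail)", so drop the leading
    # quote markers and join everything into one line before matching.
    unquoted = re.sub(r"^[> ]+", "", text, flags=re.MULTILINE)
    flat = " ".join(unquoted.split())
    return ACTION_RE.findall(flat)


def diff_actions(old, new):
    """Map each resource to (old_action, new_action) where the two differ."""
    old_by_rsc = {rsc: action for action, rsc, _ in old}
    new_by_rsc = {rsc: action for action, rsc, _ in new}
    resources = sorted(set(old_by_rsc) | set(new_by_rsc))
    return {
        rsc: (old_by_rsc.get(rsc), new_by_rsc.get(rsc))
        for rsc in resources
        if old_by_rsc.get(rsc) != new_by_rsc.get(rsc)
    }


# First pair of excerpts from the message above (transition 4036).
OLD_LOG = """\
> Jul 04 23:45:02 ims0 pengine: [3933]: notice: LogActions: Recover ims:0
> (Master ims0)
> Jul 04 23:45:02 ims0 pengine: [3933]: notice: LogActions: Restart ims-ip
> (Started ims0)
> Jul 04 23:45:02 ims0 pengine: [3933]: notice: LogActions: Restart ims-ip-src
> (Started ims0)
"""

NEW_LOG = """\
> notice: LogActions: Demote ims:0 (Master -> Stopped ims0)
> notice: LogActions: Promote ims:1 (Slave -> Master ims1)
> notice: LogActions: Start ims-ip (ims1)
> notice: LogActions: Start ims-ip-src (ims1)
"""

if __name__ == "__main__":
    changes = diff_actions(parse_log_actions(OLD_LOG), parse_log_actions(NEW_LOG))
    for rsc, (old_action, new_action) in changes.items():
        print(f"{rsc}: {old_action} -> {new_action}")
```

A resource that appears in only one of the two excerpts (here ims:1) is reported with None on the other side, so a missing promotion is easy to spot.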