Hello,

Has no one else run into an issue like this?

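To make the pengine lines quoted below easier to read, here is a minimal sketch in Python that pulls the resource states out of `native_print` entries. The regular expression, the function names and the assumption that each log entry sits on a single unwrapped line are illustrative guesses based only on the excerpt in this thread, not on any Pacemaker log specification.

```python
import re

# Assumed layout, taken from the excerpt quoted in this thread:
#   "... native_print: <name> (<agent>): <state> [<node>]"
NATIVE_PRINT = re.compile(
    r"native_print:\s+(?P<name>\S+)\s+\((?P<agent>[^)]+)\):"
    r"\s+(?P<state>\S+)(?:\s+(?P<node>\S+))?\s*$"
)


def resource_states(lines):
    """Map resource name -> (state, node or None) for every native_print line."""
    states = {}
    for line in lines:
        match = NATIVE_PRINT.search(line)
        if match:
            states[match.group("name")] = (match.group("state"), match.group("node"))
    return states


def not_started(states):
    """Return the sorted names of resources whose state is not 'Started'."""
    return sorted(name for name, (state, _node) in states.items() if state != "Started")


# The four native_print entries from the quoted log, unwrapped to one line each.
SAMPLE = [
    "Feb 8 13:06:39 server01 pengine: [19469]: notice: native_print: fs-data (ocf::heartbeat:Filesystem): Stopped",
    "Feb 8 13:06:39 server01 pengine: [19469]: notice: native_print: nagios-ip (ocf::heartbeat:IPaddr2): Stopped",
    "Feb 8 13:06:39 server01 pengine: [19469]: notice: native_print: apache2 (ocf::heartbeat:apache): Started server01",
    "Feb 8 13:06:39 server01 pengine: [19469]: notice: native_print: nagios (lsb:nagios3): Stopped",
]

if __name__ == "__main__":
    states = resource_states(SAMPLE)
    for name, (state, node) in states.items():
        print(f"{name}: {state}" + (f" on {node}" if node else ""))
    print("not started:", ", ".join(not_started(states)))
```

Run against the excerpt, this shows apache2 as the only member of the group reported as started while the resources listed before it are stopped.
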
Regards,

On 13 February 2012 11:57, Hugo Deprez <hugo.dep...@gmail.com> wrote:

> Hello,
>
> does anyone have an idea?
>
> It seems that at 13:06:38 resources get started on the slave member.
> But then there is something wrong on server01:
>
> Feb 8 13:06:39 server01 pengine: [19469]: info: determine_online_status: Node server01 is online
> Feb 8 13:06:39 server01 pengine: [19469]: notice: unpack_rsc_op: Operation apache2_monitor_0 found resource apache2 active on server01
> Feb 8 13:06:39 server01 pengine: [19469]: notice: group_print: Resource Group: supervision-grp
> Feb 8 13:06:39 server01 pengine: [19469]: notice: native_print: fs-data (ocf::heartbeat:Filesystem): Stopped
> Feb 8 13:06:39 server01 pengine: [19469]: notice: native_print: nagios-ip (ocf::heartbeat:IPaddr2): Stopped
> Feb 8 13:06:39 server01 pengine: [19469]: notice: native_print: apache2 (ocf::heartbeat:apache): Started server01
> Feb 8 13:06:39 server01 pengine: [19469]: notice: native_print: nagios (lsb:nagios3): Stopped
>
> But I don't understand what fails, or whether it is DRBD or apache2 that
> causes the issue.
>
> Any idea?
>
> On 10 February 2012 09:39, Hugo Deprez <hugo.dep...@gmail.com> wrote:
>
>> Hello,
>>
>> please find the corosync logs attached to this mail.
>> If you have any tips :)
>>
>> Regards,
>>
>> Hugo
>>
>> On 8 February 2012 15:39, Florian Haas <flor...@hastexo.com> wrote:
>>
>>> On Wed, Feb 8, 2012 at 2:29 PM, Hugo Deprez <hugo.dep...@gmail.com> wrote:
>>> > Dear community,
>>> >
>>> > I am currently running several corosync / drbd clusters on VMs
>>> > hosted on a VMware ESXi host.
>>> > The guest OS is Debian Squeeze.
>>> >
>>> > The active member of the cluster just froze; the VM was unreachable.
>>> > But the resources did not manage to move to the other node.
>>> >
>>> > My cluster has the following resources:
>>> >
>>> > Resource Group: grp
>>> >     fs-data (ocf::heartbeat:Filesystem):
>>> >     nagios-ip (ocf::heartbeat:IPaddr2):
>>> >     apache2 (ocf::heartbeat:apache):
>>> >     nagios (lsb:nagios3):
>>> >     pnp (lsb:npcd):
>>> >
>>> > I am currently troubleshooting this issue. I don't really know where
>>> > to look. Of course I had a look at the logs, but it is pretty hard
>>> > for me to understand what happened.
>>>
>>> It's pretty hard for anyone else to understand _without_ logs. :)
>>>
>>> > I noticed that the VM crashed at 12:09 and that the cluster only
>>> > tried to move the resources at 12:58; this does not make sense to
>>> > me. Or maybe the host wasn't totally down?
>>> >
>>> > Do you have any idea how I can troubleshoot?
>>>
>>> Log analysis is where I would start.
>>>
>>> > Last thing, I noticed that if I start apache2 on the slave server,
>>> > corosync doesn't detect that the resource is started. Could that be
>>> > an issue?
>>>
>>> Sure it could, but Pacemaker should happily recover from that.
>>>
>>> Cheers,
>>> Florian
>>>
>>> --
>>> Need help with High Availability?
>>> http://www.hastexo.com/now
>>>
>>> _______________________________________________
>>> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
>>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>>>
>>> Project Home: http://www.clusterlabs.org
>>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>>> Bugs: http://bugs.clusterlabs.org