Hi,

On Tue, Sep 18, 2012 at 11:18:54AM +0200, Fernando Pereira wrote:
> Hi there.
> This is my first post to this list, as I haven't had problems with
> heartbeat until now :)
>
> We have a dual server fail-back configuration in place, in which the two
> servers have identical resources (nfs, drbd...).
> Last week I upgraded a system, replaced one of the servers with a virtual
> machine, and installed the latest version of heartbeat available
> via yum (3.0.4).
>
> Since then I'm having a lot of problems with "Late heartbeat" and false dead
> nodes. Before, we could have a "Dead time" of 10 sec, while now 30 is not
> enough.
>
> Looking into the log files I could find the following entry, among other
> similar ones:
> "Gmain_timeout_dispatch: Dispatch function for send local status was
> delayed 30590 ms (> 1010 ms) before being called (GSource: 0x14209a0)"
>
> I guess it means that for some reason the function call took over 30
> seconds??
> In my understanding this number is at least three orders of magnitude
> higher than any acceptable value, even under the worst machine load
> scenarios.
> Is there a known problem with this version of heartbeat? Or does anybody
> experience this kind of problem when running on a virtual machine (ESXi
> 5.0)?
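
Going by its wording, the message says the function was called about 30 seconds late, not that it ran for 30 seconds. To see how often and how badly that happens, the warnings can be pulled out of the log. A rough sketch in Python, assuming every such entry has the same wording as the one quoted above; the function name and the log path are only illustrative:

```python
import re
import sys

# Pattern built from the single log entry quoted above; other heartbeat
# versions may word the message differently (assumption).
DELAY_RE = re.compile(
    r"Dispatch function for (?P<what>.+?) was delayed "
    r"(?P<delay>\d+) ms \(> (?P<limit>\d+) ms\)"
)


def find_delays(lines):
    """Return (what, delay_ms, limit_ms) for each line with a delay warning."""
    found = []
    for line in lines:
        m = DELAY_RE.search(line)
        if m:
            found.append(
                (m.group("what"), int(m.group("delay")), int(m.group("limit")))
            )
    return found


if __name__ == "__main__" and len(sys.argv) > 1:
    # The log location is site-specific; pass it as the first argument.
    with open(sys.argv[1], errors="replace") as f:
        delays = find_delays(f)
    # Print the worst delays first.
    for what, delay, limit in sorted(delays, key=lambda r: r[1], reverse=True):
        print(f"{delay:>8} ms (limit {limit} ms)  {what}")
```

If the large delays cluster around particular times, that would fit with the guest not getting scheduled during those periods.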

I'd suspect a scheduler issue. The VM is probably starved, hence the
long delays. You should check the VMware docs or forums.

Thanks,

Dejan

> Thanks a lot for any help.
> Cheers
> _______________________________________________
> Linux-HA mailing list
> [email protected]
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems
