Possibly the node is really busy and Heartbeat isn't getting enough CPU... But I'm just guessing; I don't use Heartbeat much these days.
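One rough way to test that guess is to tally the dispatch delays from syslog and see whether they cluster around the time the other node was rebooted. The following is only a sketch: it assumes each warning sits on a single line containing "was delayed N ms", as in the messages quoted below, and the helper name summarize_delays is made up for illustration (it is not part of Heartbeat). A large, steady delay would be consistent with CPU starvation but would not prove it.

```python
import re

# Matches the delay figure in Gmain_timeout_dispatch warnings,
# e.g. "... was delayed 2750 ms (> 1000 ms) before being called ...".
DELAY_RE = re.compile(r"was delayed (\d+) ms")


def summarize_delays(lines):
    """Return (count, max_ms, mean_ms) for dispatch delays found in lines."""
    delays = []
    for line in lines:
        match = DELAY_RE.search(line)
        if match:
            delays.append(int(match.group(1)))
    if not delays:
        return (0, 0, 0.0)
    return (len(delays), max(delays), sum(delays) / len(delays))


if __name__ == "__main__":
    # Two of the warnings from the quoted log, joined onto single lines.
    sample = [
        "15:53:10 SYSLOG warning heartbeat [2748]: WARN: Gmain_timeout_dispatch: "
        "Dispatch function for retransmit request was delayed 2750 ms (> 1000 ms) "
        "before being called (GSource: 0x959e298)",
        "15:53:10 SYSLOG warning heartbeat [2748]: WARN: Gmain_timeout_dispatch: "
        "Dispatch function for retransmit request was delayed 2740 ms (> 1000 ms) "
        "before being called (GSource: 0x959e300)",
    ]
    print(summarize_delays(sample))
```

To run it against a real log, pass an open file object instead of the sample list; the path depends on your syslog setup.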
On Mon, Oct 5, 2009 at 3:55 PM, Johan Verrept <[email protected]> wrote:
>
> Hi,
>
> when playing with the RA at a certain point the stonith failed (it
> didn't find the host in gethosts) and I rebooted the other node
> manually. The result was a whole bunch of messages in my logs:
>
> 15:53:10 SYSLOG warning heartbeat [2748]: WARN: Gmain_timeout_dispatch:
> Dispatch function for retransmit request was delayed 2750 ms (> 1000 ms)
> before being called (GSource: 0x959e298)
> 15:53:10 SYSLOG info heartbeat [2748]: info: Gmain_timeout_dispatch:
> started at 429631770 should have started at 429631495
> 15:53:10 SYSLOG warning heartbeat [2748]: WARN: Gmain_timeout_dispatch:
> Dispatch function for retransmit request took too long to execute: 20 ms
> (> 10 ms) (GSource: 0x959e298)
> 15:53:10 SYSLOG warning heartbeat [2748]: WARN: Gmain_timeout_dispatch:
> Dispatch function for retransmit request was delayed 2740 ms (> 1000 ms)
> before being called (GSource: 0x959e300)
> 15:53:10 SYSLOG info heartbeat [2748]: info: Gmain_timeout_dispatch:
> started at 429631772 should have started at 429631498
> 15:53:10 SYSLOG warning heartbeat [2748]: WARN: Gmain_timeout_dispatch:
> Dispatch function for retransmit request took too long to execute: 30 ms
> (> 10 ms) (GSource: 0x959e300)
> 15:53:10 SYSLOG warning heartbeat [2748]: WARN: Gmain_timeout_dispatch:
> Dispatch function for retransmit request was delayed 2750 ms (> 1000 ms)
> before being called (GSource: 0x959e368)
> 15:53:10 SYSLOG info heartbeat [2748]: info: Gmain_timeout_dispatch:
> started at 429631775 should have started at 429631500
> 15:53:10 SYSLOG warning heartbeat [2748]: WARN: Gmain_timeout_dispatch:
> Dispatch function for retransmit request took too long to execute: 30 ms
> (> 10 ms) (GSource: 0x959e368)
> 15:53:10 SYSLOG warning heartbeat [2748]: WARN: Gmain_timeout_dispatch:
> Dispatch function for retransmit request was delayed 2750 ms (> 1000 ms)
> before being called (GSource: 0x959e3d0)
> 15:53:10 SYSLOG info heartbeat [2748]: info: Gmain_timeout_dispatch:
> started at 429631778 should have started at 429631503
> 15:53:10 SYSLOG warning heartbeat [2748]: WARN: Gmain_timeout_dispatch:
> Dispatch function for retransmit request took too long to execute: 20 ms
> (> 10 ms) (GSource: 0x959e3d0)
>
>
> with the rebooted node reporting:
>
> 15:53:10 SYSLOG warning heartbeat [2721]: WARN: Rexmit of seq 251
> requested. 131 is max.
> 15:53:10 SYSLOG warning heartbeat [2721]: WARN: Rexmit of seq 242
> requested. 131 is max.
> 15:53:10 SYSLOG warning heartbeat [2721]: WARN: Rexmit of seq 252
> requested. 131 is max.
> 15:53:10 SYSLOG warning heartbeat [2721]: WARN: Rexmit of seq 251
> requested. 131 is max.
> 15:53:10 SYSLOG warning heartbeat [2721]: WARN: Rexmit of seq 314
> requested. 131 is max.
> 15:53:10 SYSLOG warning heartbeat [2721]: WARN: Rexmit of seq 252
> requested. 131 is max.
>
> I got about a 100 of these per second.
>
> What happened? How do I clean up something like this without rebooting
> my cluster?
>
> J.
>
>
> _______________________________________________
> Pacemaker mailing list
> [email protected]
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
