On 09/02/2011 20:00, J. Ryan Earl wrote:
On Wed, Feb 9, 2011 at 6:30 AM, Dario Fiumicello - Antek<
[email protected]>  wrote:

Hi all, I have two VirtualBox VMs running on two different physical hosts.
The VMs are interconnected with two gigabit Ethernet links for DRBD sync and
heartbeat.

Suddenly I got this on the master machine:

Feb  9 10:53:24 mail1 kernel: [136200.650336] INFO: task jbd2/drbd0-8:13739 blocked for more than 120 seconds.
Feb  9 10:53:24 mail1 kernel: [136200.650967] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.

This is a warning, not an error.  It simply states that a task has been
blocked for more than 2 minutes.  Some tasks legitimately take more than
120 seconds to complete; the above is simply informative.
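
For anyone who wants to see which tasks were reported and how often, here is a
minimal Python sketch that pulls the task name, PID and timeout out of reports
like the one quoted above. The function name find_hung_tasks is made up for
illustration, and the pattern assumes the message format shown in this thread;
task names containing spaces are not handled.

```python
import re

# Matches the kernel hung-task report quoted in this thread, e.g.
# "INFO: task jbd2/drbd0-8:13739 blocked for more than 120 seconds."
# Assumption: the task name contains no whitespace.
HUNG_TASK_RE = re.compile(
    r"INFO: task (?P<comm>\S+):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds"
)


def find_hung_tasks(log_text):
    """Return a list of (task name, pid, seconds) for each hung-task report."""
    # Mail clients often wrap long log lines, so collapse all whitespace
    # (including newlines) into single spaces before matching.
    flat = " ".join(log_text.split())
    return [
        (m.group("comm"), int(m.group("pid")), int(m.group("secs")))
        for m in HUNG_TASK_RE.finditer(flat)
    ]
```

It could be fed the contents of the kernel log, for example
find_hung_tasks(open("/var/log/kern.log").read()); that path is an assumption
and differs between distributions.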


And from that moment many other blocked-task errors appeared (postfix,
pickup and so on). The machine load was more than 25!

It sounds like the DRBD block device is hung due to slow I/O response from
one of the backing-devices on your VMs.


Obviously I could not use the machine anymore and I needed to kill it in
order to force the takeover on the slave. Halt didn't work either.

That's not obvious at all.  Your system shouldn't be entirely on DRBD. Even
if your DRBD block device is unresponsive you should still be able to log in
and look around.  What was your CPU load?

Sorry, I wasn't precise. The machine still allowed me to log in via ssh and check its status. Uptime showed a load average of 25 while top showed almost 0% CPU usage. I suspect this was due to a hang in I/O. The services I wasn't able to use were the ones using DRBD (postfix, dovecot and so on). When I tried to force a takeover on the other machine I wasn't able to do it because the hung master didn't release its resources.
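
A high load average with an idle CPU fits that suspicion: on Linux, tasks in
uninterruptible sleep (state "D", typically waiting on I/O) count toward the
load average even though they use no CPU. As a rough sketch of how to confirm
it, the Python below lists processes currently in state D by reading
/proc/<pid>/stat. The helper names parse_stat and blocked_tasks are made up for
illustration, and this only looks at processes, not at every individual thread.

```python
import glob


def parse_stat(stat_text):
    """Extract (pid, command name, state) from the text of /proc/<pid>/stat."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or parentheses, so split around the LAST ")" in the line.
    head, _, tail = stat_text.rpartition(")")
    pid_str, _, comm = head.partition("(")
    state = tail.split()[0]  # first field after the command name
    return int(pid_str), comm, state


def blocked_tasks():
    """Return (pid, command name) for processes in uninterruptible sleep."""
    found = []
    for path in glob.glob("/proc/[0-9]*/stat"):
        try:
            with open(path) as fh:
                pid, comm, state = parse_stat(fh.read())
        except (OSError, ValueError, IndexError):
            continue  # the process exited while we were reading it
        if state == "D":
            found.append((pid, comm))
    return found
```

Calling blocked_tasks() on the hung master and printing the result should show
whether the stuck processes are the ones touching the DRBD device.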

My question is: why did I get this error? What can I do to avoid it?

You got this error because one of your VMs likely couldn't keep up, probably
because of load on one of the host servers.  You can avoid it by going
bare-metal.

The VMs are on different host servers, right?
Exactly, the VMs are on two sibling hosts, each with three SATA HDs in RAID1.

Thank you for your answer, cheers

--
Dario Fiumicello - Antek S.r.l.
+3902890380 73


_______________________________________________
drbd-user mailing list
[email protected]
http://lists.linbit.com/mailman/listinfo/drbd-user
