On 03/04/2014 03:38 PM, Marcus wrote:
On Tue, Mar 4, 2014 at 3:34 AM, France <mailingli...@isg.si> wrote:
Hi Marcus and others.

There is no need to kill off the entire hypervisor if one of the primary
storages fails.
You just need to kill the affected VMs and probably disable the SR on
XenServer, because all the other SRs and VMs have no problems.
If you kill those VMs, you can then safely start them elsewhere. On XenServer
6.2 you can destroy the VMs which lost access to NFS without any problems.
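
For illustration, the XenServer side of that would be something along these
lines (only a sketch of the idea, the UUIDs are placeholders):

    # mark the hung VM as halted without touching the rest of the host
    xe vm-reset-powerstate uuid=<vm-uuid> force=true
    # detach and forget the stale SR
    xe pbd-unplug uuid=<pbd-uuid>
    xe sr-forget uuid=<sr-uuid>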

That's a great idea, but as already mentioned, it doesn't work in
practice. You can't kill a VM that is hanging in D state, waiting on
storage. I also mentioned that it causes problems for libvirt and for
much of the rest of the system that isn't even using that storage.


Just tuning in here, and Marcus is right. If NFS hangs, the processes go into D state, both Qemu/KVM and libvirt.

The only remedy at that point, to fence the host, is a reboot; you can't do anything with the processes that are blocked.

When you run stuff that lives entirely in userspace, like Ceph with librbd, it's a different story, but with NFS you are stuck.
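
You can spot the stuck processes like this (a sketch; exact output and wait
channel names vary by kernel, but on an NFS hang the wchan is typically an
nfs/rpc wait function):

    # list processes in uninterruptible sleep (D state) and what they wait on
    ps -eo pid,stat,wchan:32,comm | awk '$2 ~ /^D/'

kill -9 does nothing to these; the kernel won't deliver the signal until the
I/O completes, which on a dead hard-mounted NFS server is never.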


If you still really want to kill the entire host and its VMs in one go, I
would suggest live-migrating the VMs which have not lost their storage off
first, and then killing the VMs on the stale NFS with a hard reboot. The
additional time spent migrating the working VMs would even give the NFS
some grace time to maybe recover. :-)

You won't be able to live-migrate a VM that is stuck in D state, and
libvirt won't cooperate anyway if one of its storage pools is
unresponsive.


Indeed, same issue again. Libvirt blocks COMPLETELY, not just for the one storage pool.


A hard reboot to recover from the NFS client's D state can also be avoided
by using soft mount options.

As mentioned, soft and intr very rarely actually work, in my
experience. I wish they did, as I have truly come to loathe NFS for this.


Indeed, they almost never work. I've been working with NFS for over 10 years now and those damn options have NEVER worked properly.

That's just the downside of having stuff go through kernel space.
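
For reference, the options under discussion look like this (server, export
path and timeout values are only illustrative):

    # 'soft' is supposed to fail an RPC after 'retrans' retries instead of
    # retrying forever; 'timeo' is in tenths of a second
    mount -t nfs -o soft,intr,timeo=30,retrans=3 nfs-server:/export /mnt/primary

Note that 'intr' has been a no-op since kernel 2.6.25, and as said above,
even 'soft' does not reliably unstick clients in practice.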


I run a bunch of Pacemaker/Corosync/CMAN/Heartbeat/etc. clusters, and we
don't just kill whole nodes but fence services from specific nodes. STONITH
kicks in only when the node loses quorum.

Sure, but how do you fence a KVM host from an NFS server? I don't
think we've written a firewall plugin that can fence hosts from an
arbitrary NFS server. Regardless, what CloudStack does is more of a poor
man's clustering; the mgmt server acts as the lock manager in the sense
that it is managing what's going on, but it's not a real clustering
service. Heck, it doesn't even STONITH, it attempts a clean shutdown,
which fails as well due to the hanging NFS (per the mentioned bug; to fix
it they'll need IPMI fencing or something like that).
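
For what it's worth, such a hypothetical firewall fence would amount to a
rule like this on the host (the address is a placeholder):

    # block outbound NFS traffic to the failed server
    iptables -I OUTPUT -d 192.0.2.10 -p tcp --dport 2049 -j DROP

But that only stops new traffic; on a hard mount the client just keeps
retransmitting and the D state processes stay stuck, so it doesn't really
fence anything that is already hung.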


IPMI fencing is something I've been thinking about as well. It would be a great benefit for HA in CloudStack.
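
Something as simple as the mgmt server shelling out to the host's BMC would
do; a sketch (address and credentials are placeholders):

    # power-cycle the wedged host out-of-band, bypassing the hung OS entirely
    ipmitool -I lanplus -H <bmc-address> -U <user> -P <password> chassis power cycle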

I didn't write the code, I'm just saying that I can completely
understand why it kills nodes when it deems that their storage has
gone belly-up. It's dangerous to leave that D state VM hanging around,
and it will hang around until the NFS storage comes back. In a perfect
world you'd just stop the VMs that were having the issue, or if there
were no VMs you'd just de-register the storage pool from libvirt, I agree.


De-registering won't work either... libvirt tries a umount, which will block as well.
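
For example (the pool name is a placeholder):

    # deactivating a netfs pool makes libvirt umount the export,
    # and that umount hangs on a dead NFS server
    virsh pool-destroy primary-nfs

A lazy 'umount -l' by hand can sometimes detach the mount point, but that
doesn't help the processes already stuck in D state.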

Wido


Regards,
F.


On 3/3/14 5:35 PM, Marcus wrote:

It's the standard clustering problem. Any software that does any sort
of active clustering is going to fence nodes that have problems, or
should if it cares about your data. If the risk of losing a host due
to a storage pool outage is too great, you could perhaps look at
rearranging your pool-to-host correlations (certain hosts run VMs from
certain pools) via clusters. Note that if you register a storage pool
with a cluster, the pool gets registered with libvirt on the hosts
whenever it is not in maintenance. When the storage pool goes down,
that causes problems for the host even if no VMs from that storage are
running (fetching storage stats, for example, will cause agent threads
to hang if it's NFS). So you'd need to put Ceph in its own cluster and
NFS in its own cluster.

It's far more dangerous to leave a host in an unknown/bad state. If a
host loses contact with one of your storage nodes, with HA, CloudStack
will want to start the affected VMs elsewhere. If it does so, and your
original host wakes up from its NFS hang, you suddenly have the same VM
running in two locations, and corruption ensues. You might think we could
just stop the affected VMs, but NFS tends to make things that touch it
go into D state, even with 'intr' and other parameters, which affects
libvirt and the agent.

We could perhaps open a feature request to disable all HA and just
leave things as-is, disallowing operations when there are outages. If
that sounds useful you can create the feature request on
https://issues.apache.org/jira.


On Mon, Mar 3, 2014 at 5:37 AM, Andrei Mikhailovsky <and...@arhont.com>
wrote:

Koushik, I understand that, and I will put the storage into
maintenance mode next time. However, things happen and servers crash from
time to time, and that is no reason to reboot all host servers, even those
which do not have any running VMs with volumes on the NFS storage. The
bloody agent just rebooted every single host server regardless of whether
they were running VMs with volumes on the rebooted NFS server. 95% of my
VMs are running from Ceph and those should never have been affected in the
first place.
----- Original Message -----

From: "Koushik Das" <koushik....@citrix.com>
To: "<us...@cloudstack.apache.org>" <us...@cloudstack.apache.org>
Cc: dev@cloudstack.apache.org
Sent: Monday, 3 March, 2014 5:55:34 AM
Subject: Re: ALARM - ACS reboots host servers!!!

The primary storage needs to be put in maintenance before doing any
upgrade/reboot as mentioned in the previous mails.

-Koushik

On 03-Mar-2014, at 6:07 AM, Marcus <shadow...@gmail.com> wrote:

Also, please note that the bug you referenced doesn't take issue with
the reboot being triggered, but with the fact that the reboot never
completes due to the hanging NFS mount (the inaccessible primary
storage is why the reboot occurs in the first place).

On Sun, Mar 2, 2014 at 5:26 PM, Marcus <shadow...@gmail.com> wrote:

Or do you mean you have multiple primary storages and this one was not
in use and put into maintenance?

On Sun, Mar 2, 2014 at 5:25 PM, Marcus <shadow...@gmail.com> wrote:

I'm not sure I understand. How do you expect to reboot your primary
storage while vms are running? It sounds like the host is being
fenced since it cannot contact the resources it depends on.

On Sun, Mar 2, 2014 at 3:24 PM, Nux! <n...@li.nux.ro> wrote:

On 02.03.2014 21:17, Andrei Mikhailovsky wrote:

Hello guys,


I've recently come across bug CLOUDSTACK-5429, which has rebooted
all of my host servers without properly shutting down the guest VMs.
I simply upgraded and rebooted one of the NFS primary storage
servers, and a few minutes later, to my horror, I found out that all
of my host servers had been rebooted. Is it just me, or should this
bug be fixed ASAP and be a blocker for any new ACS release? I mean,
not only does it cause downtime, but also possible data loss and
server corruption.


Hi Andrei,

Do you have HA enabled, and did you put that primary storage in
maintenance mode before rebooting it?
It's my understanding that ACS relies on the shared storage to perform
HA, so if the storage goes, it's expected to go berserk. I've noticed
similar behaviour in XenServer pools without ACS.
I'd imagine a "cure" for this would be to use network-distributed
"filesystems" like GlusterFS or Ceph.

Lucian

--
Sent from the Delta quadrant using Borg technology!

Nux!
www.nux.ro



