On 05/29/2017 09:36 PM, Andy Doan wrote:
On 05/28/2017 10:30 PM, Andy Doan wrote:
A big thunderstorm hit Austin this evening and at 9:04PM local time we
lost power to several of our most important infrastructure servers in
the lab. Most everything seems to have booted back up on its own.
However, 2 of the 3 servers providing Ceph storage to the
DeveloperCloud failed to boot back up on their own. Every VM in the
DeveloperCloud is backed by Ceph so this has caused quite a bit of
havoc. Additionally, the main network node providing external access to
the cloud failed to boot back up properly.
I currently have the Ceph cluster recovering. However, it's looking
like it could be a couple of hours until it decides all its data is in
the proper state and can be used for write access.
The network node is still giving me lots of trouble. I'll give an
update once I have more information.
The DeveloperCloud is still down.
The Ceph cluster has been restored, but the Neutron network node is
still failing to manage external network access.
Additionally, I've found one of our top-of-rack servers will no longer
boot. If you have a server in rack 2, i.e., an r2-* server, you will not
have serial console access. I'm actually just bringing the server home
to try and recover stuff, so it could be a couple of days before serial
consoles are restored.
Our rack 2 top-of-rack server has been restored and serial access should
be available again.