Most of the outage is already resolved; we are finishing up the recovery.

CloudVPS projects should be mostly functional:
* Most instances have network access at this point. The remaining VMs
without network access are currently being fixed.
* Most NFS shared storage servers have been rebooted. Some NFS clients may
have gotten stuck as a result. We are looking for those, but if your project
has a stuck client, rebooting it should unblock it.

Toolforge is mostly functional:
* NFS had a hiccup and we had to reboot all the worker nodes. The
Kubernetes side should be fully functional, but the grid is still
restarting web services. We estimate this will take less than an hour
from this point.

Paws should be fully functional.
Superset should be fully functional.

Will update as soon as everything is fixed (or in a few hours if there are
still issues).

Thanks!

On Fri, Sep 29, 2023 at 9:31 AM David Caro <dc...@wikimedia.org> wrote:

> There is an ongoing outage affecting all cloud vps projects (this includes
> toolforge and paws) that prevents the machines from getting IP refreshes
> (DHCP client got uninstalled).
>
>
> We are working on it and the service should be restored soon. We will
> update once everything is up and running.
>
> Working task https://phabricator.wikimedia.org/T347665
>
> Feel free to add a message there if your project is affected, we will make
> sure to verify that it's back online once we roll out the fix.
>
> Thanks for your patience!
>
_______________________________________________
Cloud-announce mailing list -- cloud-annou...@lists.wikimedia.org
List information: 
https://lists.wikimedia.org/postorius/lists/cloud-announce.lists.wikimedia.org/
_______________________________________________
Cloud mailing list -- cloud@lists.wikimedia.org
List information: 
https://lists.wikimedia.org/postorius/lists/cloud.lists.wikimedia.org/
