While investigating an ongoing issue with volume attachment (see, for
example, [0]) I reinitialized our backend messaging system[1]. This
reset should have been only barely user-facing, but for reasons yet
unknown this restart caused many network agents to crash and, along with
them, some of the cloud-vps and toolforge network.
After the initialization completed everything came back online, but it
was rough for a few minutes there. Most toolforge and cloud-vps users
should not need to do anything in response to this outage, but if your
services are sensitive to network outages and don't know how to restart
themselves, you may need to restart them.
Sorry for the downtime! Please let me know if you see any unexpected
symptoms from this outage.
-Andrew
[0] https://phabricator.wikimedia.org/T397517
[1] https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/RabbitMQ
_______________________________________________
Cloud-announce mailing list -- [email protected]
List information:
https://lists.wikimedia.org/postorius/lists/cloud-announce.lists.wikimedia.org/
_______________________________________________
Cloud mailing list -- [email protected]
List information:
https://lists.wikimedia.org/postorius/lists/cloud.lists.wikimedia.org/