While investigating an ongoing issue with volume attachment (see, for example, [0]) I reinitialized our backend messaging system[1]. This reset should have been only barely user-facing, but for reasons yet unknown this restart caused many network agents to crash and, along with them, some of the cloud-vps and toolforge network.

After the initialization completed everything came back online, but it was rough for a few minutes there. Most toolforge and cloud-vps users should not need to do anything in response to this outage, but if your services are sensitive to network outages and don't know how to restart themselves, you may need to restart them.

Sorry for the downtime! Please let me know if you see any unexpected symptoms from this outage.

-Andrew


[0] https://phabricator.wikimedia.org/T397517

[1] https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/RabbitMQ

_______________________________________________
Cloud-announce mailing list -- [email protected]
List information: 
https://lists.wikimedia.org/postorius/lists/cloud-announce.lists.wikimedia.org/
_______________________________________________
Cloud mailing list -- [email protected]
List information: 
https://lists.wikimedia.org/postorius/lists/cloud.lists.wikimedia.org/

Reply via email to