Hello!

Next week we'll be rebuilding and upgrading the hardware that provides
DNS service to Cloud VPS and Toolforge. These rebuilds will start at
14:00 UTC and the whole process may take 2-3 hours. It's likely that DNS
lookups will be somewhat slower as clients fail over between the
in-progress and the working server. In theory there should be few other
user-facing effects from these upgrades.

In practice, though, this isn't something that we've done for quite a
while, and touching DNS is always risky since it underlies pretty much
everything. Here are some things to be ready for:

- As a precaution we'll be disabling Horizon during the window to
prevent new VMs or DNS changes from landing in an inconsistent state.
- Some badly-behaved DNS clients won't fail over properly and will
report errors when their primary DNS server is down.
- Puppet will almost certainly experience transient failures, since
Puppet is known to be one of those badly-behaved clients.
- If things go very badly there may be periods of total DNS outage, which
would cause many WMCS-hosted services to fail. There's no particular
reason that this /should/ happen, but this is the worst-case scenario.
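
If your own tool or service does DNS lookups and you want it to ride out
short failures during the window, something like the following may help.
This is only a minimal sketch, not anything we run ourselves: the
function name resolve_with_retry, its parameters, and the retry counts
and delays are all made up for illustration, and it assumes your code
resolves names through Python's standard socket module.

```python
import socket
import time


def resolve_with_retry(host, port=443, attempts=5, delay=2.0,
                       resolver=socket.getaddrinfo, sleep=time.sleep):
    """Resolve host, retrying on DNS errors with a growing delay.

    Hypothetical helper: the name, defaults, and injectable resolver and
    sleep arguments are illustrative assumptions, not an existing API.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            # getaddrinfo raises socket.gaierror when the lookup fails.
            return resolver(host, port)
        except socket.gaierror as err:
            last_error = err
            # Wait a little longer after each failure, except the last.
            if attempt < attempts - 1:
                sleep(delay * (attempt + 1))
    # Every attempt failed; surface the most recent DNS error.
    raise last_error
```

A caller would replace a bare socket.getaddrinfo(host, port) with
resolve_with_retry(host, port). Retrying only helps with brief
interruptions; it won't hide a long outage.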

For additional context, the Phabricator task for this work is
https://phabricator.wikimedia.org/T253780

- Andrew + the WMCS team
_______________________________________________
Wikimedia Cloud Services announce mailing list
cloud-annou...@lists.wikimedia.org (formerly labs-annou...@lists.wikimedia.org)
https://lists.wikimedia.org/mailman/listinfo/cloud-announce