Hello!

I have done the shutdown and reboot as requested -- please let me know if you see any bad effects.

Regarding the Horizon login: this may be a case of https://phabricator.wikimedia.org/T383370, in which case clearing cookies or trying a different browser would work. However, I discovered a different Horizon login issue while traveling: the network where I'm staying today blocks access to high-numbered ports, and a complete login to Horizon requires access to https://openstack.codfw1dev.wikimediacloud.org:25000 (as part of authentication). So if you have a way to unblock port 25000, that may also solve the problem.
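
One rough way to tell whether a network blocks that port is to attempt a plain TCP connection to it. The sketch below is only an illustration: the function name port_reachable and the 5-second timeout are made up for this example, and a successful connection only shows that the port is not blocked, not that authentication will succeed.

```python
import socket


def port_reachable(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds.

    Hypothetical helper for illustration; it only tests TCP reachability
    and does not attempt any TLS handshake or login.
    """
    try:
        # create_connection resolves the host name and opens a TCP connection.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures alike.
        return False


# Example call (host and port taken from the message above):
# port_reachable("openstack.codfw1dev.wikimediacloud.org", 25000)
```

If the call returns False on one network but True on another (for example, a phone hotspot), a blocked high port is a likely explanation.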

Please let me know if you sort out the Horizon login issue; I'm concerned that the high-port problem might be widespread and that I just haven't heard much about it.

Thank you!

-Andrew


On 2/2/25 5:33 AM, Dirk Hünniger via Cloud wrote:
Hi Andrew,

I just added a @reboot line to the crontab of the mediawiki2latex instance, and everything came up well after I rebooted from the command line. Unfortunately, I cannot log into Horizon at the moment, so I cannot issue a hard reset as you requested. But it is perfectly OK with me if you do it any time you like. I just think it is a good idea to shut the machine down normally before you hard-reset it, in order to avoid any data corruption.

Thanks a lot for your help.

Yours Dirk

On 1/31/25 16:27, Andrew Bogott wrote:
The issue that resulted in partial VM reboots last week[0] (see "VM reboots coming Tuesday, 2024-01-20") turns out to be more widespread, affecting virtually all instances. The primary symptom is that it restricts our ability to drain and maintain WMCS hardware.

In order to resolve the issue, all that's needed is a hard reboot of each VM. Note that a simple in-place reboot (for instance, one issued in the shell of the VM) does NOT resolve the issue. Hard reboots must be performed via Horizon, by selecting 'Hard Reboot Instance' on the Instances panel.

So, please, at your convenience, log into Horizon and hard reboot any of your VMs that appear on the list below. For informal coordination I have also populated an etherpad[1] with the list of affected VMs.

Anything that remains in need of a reboot by late next week (Thursday, February 6th) I will reboot for you. So, if you don't care when/if your VM is rebooted you can ignore this message :)

Thank you! And, sorry for the inconvenience. We have a plan in place[2] which should prevent this issue from re-appearing in the future.

-Andrew

_______________________________________________
Cloud mailing list -- cloud@lists.wikimedia.org
List information: https://lists.wikimedia.org/postorius/lists/cloud.lists.wikimedia.org/

