[Cloud] [Cloud-announce] Fwd: Daily VM migrations starting Monday, September 14th -- includes bastion outages

2020-09-08 Thread Andrew Bogott
These changes are now in effect.  Please let me know if you see any 
unexpected behavior.


(btw, there was just now a hardware issue in the datacenter which caused 
some bad behavior in toolforge.  That's unrelated to the naming change, 
and should be largely resolved now.)



-Andrew



 Forwarded Message 
Subject:    Daily VM migrations starting Monday, September 14th -- includes bastion outages

Date:   Wed, 2 Sep 2020 11:22:29 -0500
From:   Andrew Bogott 
Reply-To:   andrewbog...@gmail.com
To: cloud-annou...@lists.wikimedia.org



tl;dr #1: Some VMs will have brief downtime the week of the 14th; check 
the lists at the bottom of this email for affected instances and timing.


tl;dr #2: Several bastions (including secondary.bastion.wmcloud.org) 
will be moved and rebooted at 14:00 UTC on Monday the 14th.


tl;dr #3: New ‘g2’ VM flavors will soon be available in Horizon, at 
which point you are discouraged from using the old ‘m1’ names.


tl;dr #4: Don’t let this announcement distract from the other important 
thing that’s happening: the deprecation of the .wmflabs domain for new 
VMs next week.



== what's happening ==

In a few weeks we will begin moving VMs to our new storage platform, 
Ceph[0]. This move requires a full shutdown of each VM while it is 
copied over. We'll begin by evacuating our oldest hypervisors, 
cloudvirt1001-1009, two per day during the week of the 14th. Only one VM 
will be moved at a time, but the timing will be unpredictable for any 
given server.


To avoid unpredictable interruptions to ongoing work, I'm going to move 
the following bastion hosts first, at 14:00 UTC (that's 7AM Pacific 
time) on Monday the 14th.  Those bastions are:


- bastion-eqiad1-02 (AKA instance-bastion-eqiad1-02.bastion.wmflabs.org 
AKA instance-bastion-eqiad1-02.bastion.wmcloud.org AKA 
secondary.bastion.wmcloud.org AKA secondary.bastion.wmflabs.org)


- bastion-restricted-eqiad1-01

- tools-sgebastion-09
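
For readers outside the Pacific time zone, the 14:00 UTC start can be converted with a few lines of Python. This is only a sketch: the function name and the example offsets are illustrative assumptions, and fixed UTC offsets are used (PDT is UTC-7 and CEST is UTC+2 in mid-September) rather than a time zone database.

```python
from datetime import datetime, timedelta, timezone

# Start of the bastion move, as stated in the announcement.
MOVE_START_UTC = datetime(2020, 9, 14, 14, 0, tzinfo=timezone.utc)


def start_at_offset(hours: float, label: str) -> str:
    """Format the move start time for a fixed UTC offset.

    `hours` and `label` are supplied by the caller; this helper is
    hypothetical and does not handle daylight-saving rules itself.
    """
    tz = timezone(timedelta(hours=hours), label)
    return MOVE_START_UTC.astimezone(tz).strftime("%Y-%m-%d %H:%M %Z")


if __name__ == "__main__":
    print(start_at_offset(-7, "PDT"))   # Pacific Daylight Time
    print(start_at_offset(2, "CEST"))   # Central European Summer Time
```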


== before the move ==

If your VMs appear in the list below you should either plan a three-hour 
downtime on the day listed (14:00-17:00 UTC), or contact me on IRC to 
have your VM moved by hand ahead of time.
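
If you look after many VMs, one way to check them against the schedule is to paste the list into a small script. The sketch below assumes entries have the form "hostname (uuid)" as in the schedule at the bottom of this email, possibly wrapped across two lines; the function names and the sample text are illustrative only.

```python
import re
from typing import Dict

# A schedule entry is a *.wikimedia.cloud hostname followed by a
# 36-character UUID in parentheses; \s* lets the UUID sit on the next line.
ENTRY_RE = re.compile(r"([\w.-]+\.wikimedia\.cloud)\s*\(([0-9a-f-]{36})\)")


def scheduled_vms(schedule_text: str) -> Dict[str, str]:
    """Map each scheduled hostname to its instance UUID."""
    return {host: uuid for host, uuid in ENTRY_RE.findall(schedule_text)}


def is_scheduled(hostname: str, schedule_text: str) -> bool:
    """Return True if `hostname` appears as a complete schedule entry."""
    return hostname in scheduled_vms(schedule_text)


# Two entries copied from the schedule, one of them line-wrapped.
SAMPLE = """\
dumps-4.dumps.eqiad1.wikimedia.cloud (8b8e9f64-4491-4cb0-85b1-f41e06772a2c)
deploy-1002.devtools.eqiad1.wikimedia.cloud
(8ffecd5f-4de6-4c89-904a-3879612da6a5)
"""

if __name__ == "__main__":
    print(is_scheduled("deploy-1002.devtools.eqiad1.wikimedia.cloud", SAMPLE))
```

An entry whose UUID is cut off will not match the pattern, so a truncated copy of the list can under-report affected VMs.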


In the days preceding this move, you will see several new flavor options 
appear in the Horizon interface for new VMs.  They will have standard 
stats-based names preceded by ‘g2’, for example 
‘g2.cores1.ram2.disk80’.  These new flavors will be bound to the Ceph 
backend such that any new VMs created with those flavors will be run on 
new hypervisors and stored on the Ceph backend. You're encouraged to 
start using these new flavors as soon as they appear.
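
Because the new names encode the specs, tooling can read them back out of the name. The following is a rough sketch, assuming every new flavor follows the same pattern as the example above (generation, then cores, ram, and disk); the GB units and all names in the code are assumptions, not something Horizon guarantees.

```python
import re
from typing import NamedTuple, Optional


class FlavorSpec(NamedTuple):
    generation: str  # e.g. "g2"
    cores: int
    ram_gb: int      # unit assumed to be GB
    disk_gb: int     # unit assumed to be GB


# Pattern inferred from the single example 'g2.cores1.ram2.disk80'.
_FLAVOR_RE = re.compile(r"^(g\d+)\.cores(\d+)\.ram(\d+)\.disk(\d+)$")


def parse_flavor(name: str) -> Optional[FlavorSpec]:
    """Parse a stats-based flavor name; return None for other names."""
    match = _FLAVOR_RE.match(name)
    if match is None:
        return None
    generation, cores, ram, disk = match.groups()
    return FlavorSpec(generation, int(cores), int(ram), int(disk))


if __name__ == "__main__":
    print(parse_flavor("g2.cores1.ram2.disk80"))
    print(parse_flavor("m1.small"))  # old-style name, not parsed
```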



== during the move ==

Each VM will be shut down, copied, have its flavor adjusted, and then 
restarted. The total downtime will vary depending on the size of the VM 
but will generally be measured in minutes rather than hours.



== after the move ==

Migrated VMs will display in Horizon with a new flavor name.  The new 
flavors will have the same specs (cores, RAM, disk) as the former 
flavor but will include Ceph-specific metadata.


A side-effect of the move is that VMs from cloudvirt1001-1009 will be 
running on much newer hardware, so CPU-intensive activities should be 
quite a bit faster.  File IO will be moderately slower than before.  If 
you have a workflow that is rendered impractical by the IO changes, 
please open a Phabricator task to discuss your options.



== eventually ==

After the dust has settled from the first round of migrations (probably 
sometime during the week of the 23rd) we will disable creation of new 
non-Ceph VMs.  That means that the old "m1.small"-style flavors will 
still display in Horizon and openstack-browser but only VMs marked with 
new ‘g2’ flavor names will build successfully.


Remaining VMs (on cloudvirt1012 through cloudvirt1030) will be moved to 
Ceph in future weeks. Keep an eye out for emails announcing such moves.



== Schedule ==

Monday, 2020-09-14, 14:00-17:00 UTC: cloudvirt1001

cn-staging-1.centralnotice-staging.eqiad1.wikimedia.cloud (178d85a9-cc6f-4b61-bc3a-eaedc7e4a219)
cloud-puppetmaster-04.cloudinfra.eqiad1.wikimedia.cloud (5ea5ac40-43a7-42f3-b986-26f5803b89fc)
deploy-1002.devtools.eqiad1.wikimedia.cloud (8ffecd5f-4de6-4c89-904a-3879612da6a5)
dumps-4.dumps.eqiad1.wikimedia.cloud (8b8e9f64-4491-4cb0-85b1-f41e06772a2c)
pontoon-puppet-01.monitoring.eqiad1.wikimedia.cloud (43cdcd6e-7259-41b1-ae0d-8c4d8c1e2977)
ores-worker-02.ores.eqiad1.wikimedia.cloud (0ff5345d-53b5-4c75-998d-c7ee235c469e)
ores-misc-01.ores-staging.eqiad1.wikimedia.cloud (ee7b9541-ee56-4c83-ba6e-221a5427eb61)
wikibase-scisrc.sciencesource.eqiad1.wikimedia.cloud (da82ab0b-1a42-42fe-b139-77586aadef80)
techblog-puppetmaster-01.techblog.eqiad1.wikimedia.cloud (e16ebfcc-ee77-4f70-9057-dc5fa5fc900c)
tools-k8s-etcd-4.tools.eqiad1.wikimedia.cloud (963ec21f-7976-4765-87c2-fe66e4b1538d)
toolsbeta-workflow-test.toolsbeta.eqiad1.wikimedia.cloud (2e989cfb-5ade-4067-844a-55a9d0a96

Re: [Cloud] [Cloud-announce] Fwd: Daily VM migrations starting Monday, September 14th -- includes bastion outages

2020-09-08 Thread Andrew Bogott
My mistake!  I forwarded the wrong email -- this work is NOT happening 
today, but rather next week.


What did happen today is the domain name changes.

I apologize for the confusion.

-Andrew


On 9/8/20 9:42 AM, Andrew Bogott wrote:


These changes are now in effect.  Please let me know if you see any 
unexpected behavior.


(btw, there was just now a hardware issue in the datacenter which 
caused some bad behavior in toolforge.  That's unrelated to the naming 
change, and should be largely resolved now.)




Re: [Cloud] [Cloud-announce] Fwd: Phasing out the .wmflabs tld on September 8th

2020-09-08 Thread Daniel Zahn
If I am currently using bastion-restricted.wmflabs.org in my ProxyCommand,
am I also supposed to replace it with bastion-restricted.wmcloud.org?
And is the distinction between regular bastion and restricted bastion
still a thing?
___
Wikimedia Cloud Services mailing list
Cloud@lists.wikimedia.org (formerly lab...@lists.wikimedia.org)
https://lists.wikimedia.org/mailman/listinfo/cloud


Re: [Cloud] [Cloud-announce] Fwd: Phasing out the .wmflabs tld on September 8th

2020-09-08 Thread Andrew Bogott

On 9/8/20 11:57 AM, Daniel Zahn wrote:
If I am currently using bastion-restricted.wmflabs.org in my 
ProxyCommand, am I also supposed to replace it?
With bastion-restricted.wmcloud.org?


The modern name for that bastion is 'restricted.bastion.wmcloud.org'. 
Note, though, that today's change did not affect public domains, only 
private .eqiad.wmflabs domain names.  We have no plans to eliminate 
existing .wmflabs.org domain names because we don't want to break 
existing URLs.
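
For anyone who does want to move to the modern name, here is a small sketch of rewriting an SSH client config. Only the one mapping stated above is taken from this thread; the function name, the sample config, and the user name in it are hypothetical, and since the old public name keeps working the change is optional.

```python
# Old public bastion name -> modern name, as given in this thread.
MODERN_BASTION_NAMES = {
    "bastion-restricted.wmflabs.org": "restricted.bastion.wmcloud.org",
}


def modernize_hosts(config_text: str) -> str:
    """Replace known old bastion hostnames in SSH config text."""
    for old, new in MODERN_BASTION_NAMES.items():
        config_text = config_text.replace(old, new)
    return config_text


# Hypothetical ~/.ssh/config fragment; 'exampleuser' is a placeholder.
EXAMPLE_CONFIG = """\
Host *.eqiad1.wikimedia.cloud
    ProxyCommand ssh -a -W %h:%p exampleuser@bastion-restricted.wmflabs.org
"""

if __name__ == "__main__":
    print(modernize_hosts(EXAMPLE_CONFIG))
```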



And is the distinction between regular bastion and restricted bastion 
still a thing?


It is, although I'm not 100% convinced that it has value.




