[Cloud] Mono framework upgraded in toolforge

2018-05-28 Thread Arturo Borrero Gonzalez
Hi,

Earlier this month we found that we needed to upgrade the Mono framework
in Toolforge to a newer version [0].

Maintainers of affected tools/bots were already aware of the upcoming
changes, but in case you are developing a new tool/bot, please note the
change:

mono-complete was upgraded:
* from 3.2.8+dfsg-4ubuntu1.1
* to 5.12.0.226-0xamarin3+ubuntu1404b1

Users who were running their own Mono framework versions (because ours
was old) can now try the new one we provide.

If you detect any regression, please let me know; we can roll this back
if required.

[0] https://phabricator.wikimedia.org/T194665


___
Wikimedia Cloud Services mailing list
Cloud@lists.wikimedia.org (formerly lab...@lists.wikimedia.org)
https://lists.wikimedia.org/mailman/listinfo/cloud

[Cloud] [Cloud-announce] Heads up tools/bots using Mono/.NET in Toolforge/GridEngine

2018-05-31 Thread Arturo Borrero Gonzalez
We upgraded the Mono/.NET framework in Toolforge/GridEngine from the 3.x
version to 5.x [0].

We discovered that some tweaking is required because of unexpected
memory allocation behavior in the framework [1].
The first symptom you will see is your bot causing high CPU load (it spins).

The fix is easy: tell Mono that more memory is available when running
the tool/bot. However, you need to cancel your job submissions and
resubmit them. Please refer to the Phabricator bug [1] for more details.

Sorry for the inconvenience.

[0] https://phabricator.wikimedia.org/T194665
[1] https://phabricator.wikimedia.org/T195834

___
Wikimedia Cloud Services announce mailing list
cloud-annou...@lists.wikimedia.org (formerly labs-annou...@lists.wikimedia.org)
https://lists.wikimedia.org/mailman/listinfo/cloud-announce

[Cloud] [Cloud-announce] prometheus user issue

2018-06-05 Thread Arturo Borrero Gonzalez
Hi!

We deleted the prometheus user from LDAP and created it locally [0].

This may cause puppet failures, since there is a timeframe in which the
uid/gid owning /var/lib/prometheus is still the old LDAP one.

We are running a massive, CloudVPS-wide deluser/adduser/chown operation
to fix this.
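
For anyone who wants to check an instance by hand, the ownership mismatch
described above can be detected with a few lines of Python. This is only a
minimal sketch, not the operation we are running: the helper names
`local_ids` and `find_stale_paths` are hypothetical, and the lookup goes
through whatever user database the system resolves (NSS), so it assumes the
new local account is already the one being returned.

```python
import os
import pwd


def local_ids(username):
    """Look up the uid/gid currently resolved for `username` (via NSS)."""
    entry = pwd.getpwnam(username)  # raises KeyError if the user is unknown
    return entry.pw_uid, entry.pw_gid


def find_stale_paths(root, uid, gid):
    """Yield paths under `root` whose owner or group differs from uid/gid."""
    for dirpath, _dirnames, filenames in os.walk(root):
        candidates = [dirpath] + [os.path.join(dirpath, n) for n in filenames]
        for path in candidates:
            st = os.lstat(path)  # lstat: inspect symlinks, do not follow them
            if (st.st_uid, st.st_gid) != (uid, gid):
                yield path
```

A possible use would be
`list(find_stale_paths("/var/lib/prometheus", *local_ids("prometheus")))`;
an empty list means no leftover LDAP ownership was found.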

[0] https://phabricator.wikimedia.org/T196137

[Cloud] [Cloud-announce] Operation on Cloud VPS next monday 13th Aug

2018-08-07 Thread Arturo Borrero Gonzalez
Hi!

Next Monday, 13th August, we will be doing some maintenance on the main
Cloud VPS deployment to merge the Keystone services of the main and eqiad1
deployments (eqiad1 is the new one that we will eventually put into
production).

Toolforge users will not be affected by this outage.

Day: Monday 13th August
Start time: 14:00 UTC
Finish time: 16:00 UTC or ASAP

Keystone is a central component of OpenStack, so most Horizon operations,
such as logging in or creating/deleting VMs, could be affected. On the other
hand, VMs will keep working and we don't expect any network outage.

This operation will allow us to have a smooth transition in the future
when we move all projects and instances to the new eqiad1 deployment, and
it is a prerequisite for multi-region support in our Cloud VPS service.

Please let us know if you have any questions or suggestions.

best regards.

Re: [Cloud] [Cloud-announce] Operation on Cloud VPS next monday 13th Aug

2018-08-13 Thread Arturo Borrero Gonzalez
On 07/08/18 18:24, Arturo Borrero Gonzalez wrote:
> Hi!
> 
> Next monday 13th we will be doing some maintenance on the main Cloud VPS
> deployment to merge the keystone service of both main and eqiad1
> deployments (the new one that we will eventually put into production).
> 
> Toolforge users will not be affected by this outage.
> 
> Day: Monday 13th August
> Start time: 14:00 UTC
> Finish time: 16:00 UTC or ASAP
> 
> Keystone is a central point in openstack, so most horizon operations
> like login, creating/deleting VMs could be affected. On the other hand,
> VMs will keep working and we don't expect any network outage.
> 
> This operation will allow us to have a smooth transition in the future
> when we move all projects and instances to the new eqiad1 deployment and
> is a previous step to having multi-region support in our Cloud VPS service.
> 
> Please let us know any question or suggestions you may have.
> 

Reminder, this is happening today in 30 minutes.

Re: [Cloud] [Cloud-announce] Operation on Cloud VPS next monday 13th Aug

2018-08-13 Thread Arturo Borrero Gonzalez
On 13/08/18 15:30, Arturo Borrero Gonzalez wrote:
> On 07/08/18 18:24, Arturo Borrero Gonzalez wrote:
>> Hi!
>>
>> Next monday 13th we will be doing some maintenance on the main Cloud VPS
>> deployment to merge the keystone service of both main and eqiad1
>> deployments (the new one that we will eventually put into production).
>>
>> Toolforge users will not be affected by this outage.
>>
>> Day: Monday 13th August
>> Start time: 14:00 UTC
>> Finish time: 16:00 UTC or ASAP
>>
>> Keystone is a central point in openstack, so most horizon operations
>> like login, creating/deleting VMs could be affected. On the other hand,
>> VMs will keep working and we don't expect any network outage.
>>
>> This operation will allow us to have a smooth transition in the future
>> when we move all projects and instances to the new eqiad1 deployment and
>> is a previous step to having multi-region support in our Cloud VPS service.
>>
>> Please let us know any question or suggestions you may have.
>>
> 
> Reminder, this is happening today in 30 minutes.
> 


The work has been done, and everything should be working again.

Please let us know about any issues you find.

[Cloud] [Cloud-announce] Ubuntu deprecation plans

2018-09-28 Thread Arturo Borrero Gonzalez
Hi!

We would like to share some information regarding Wikimedia Cloud
Services plans for deprecating Ubuntu, especially Trusty.

Ubuntu Trusty's end-of-life is April 2019 and the WMF decided to
consolidate on a single operating system, which is Debian.

In Cloud VPS, projects containing Ubuntu virtual machine instances have
been contacted by means of a Phabricator task. Toolforge users aren't
affected by this right now, because Toolforge itself is currently
running Trusty. But we are already working on the next, Debian-based,
Toolforge version.

All this information, along with more details and timelines, can be found
on this Wikitech page:

https://wikitech.wikimedia.org/wiki/News/Trusty_deprecation

Please let us know if you have any questions or doubts.

Best regards.

[Cloud] [Cloud-announce] Brief service interruption next Monday 2018-11-19 at 13:00 UTC

2018-11-15 Thread Arturo Borrero Gonzalez
Next Monday 2018-11-19 we will be rebooting several Cloud VPS
infrastructure servers [0] for maintenance and security updates.

This is just a simple reboot of servers and we don't expect any outage
or major interruptions, but some services may be down briefly:

* Horizon and Wikitech may misbehave
* instance creation, deletion, shutdown, etc. may fail
* CI tests may stop running

Apologies in advance for any inconvenience, and please let us know about
any issues you find after these operations.

[0] cloudcontrol1003, cloudservices1003, labcontrol1001, labservices1001

[Cloud] [Cloud-announce] OSM database reboot next Tuesday 2018-11-20 at 17:30 UTC

2018-11-15 Thread Arturo Borrero Gonzalez
Hi,

next Tuesday 2018-11-20 at 17:30 UTC we will be rebooting the OSM
database (part of our data services) for maintenance and security updates.

Specifically, the labstore1006.eqiad.wmnet (osmdb.eqiad.wmnet) server will
be rebooted. The other server in the cluster, labstore1007.eqiad.wmnet,
has already been rebooted, but we won't be doing any pre-failover for
operational reasons.

Apologies in advance for any inconvenience, and please let us know about
any issues you find after these operations.

Re: [Cloud] [Cloud-announce] Brief service interruption next Monday 2018-11-19 at 13:00 UTC

2018-11-19 Thread Arturo Borrero Gonzalez
On 11/15/18 2:03 PM, Arturo Borrero Gonzalez wrote:
> Next monday 2018-11-19 we will be rebooting several Cloud VPS
> infrastructure servers [0] for maintenance and security updates.
> 
> This is just a simple reboot of servers and we don't expect any outage
> or major interruptions, but some services may be down briefly:
> 
> * Horizon and Wikitech may misbehave
> * instance creation/deletion/shutdown, etc
> * CI tests may stop running
> 
> Apologies in advance for any inconvenience, and please let us know any
> issue you may find after these operations.
> 
> [0] cloudcontrol1003, cloudservices1003, labcontrol1001, labservices1001
> 

Reminder: this is happening right now.

Re: [Cloud] [Cloud-announce] OSM database reboot next Tuesday 2018-11-20 at 17:30 UTC

2018-11-20 Thread Arturo Borrero Gonzalez
On 11/15/18 5:58 PM, Arturo Borrero Gonzalez wrote:
> Hi,
> 
> next Tuesday 2018-11-20 at 17:30 UTC we will be rebooting the OSM
> database (part of our data services) for maintenance and security updates.
> 
> In concrete the labstore1006.eqiad.wmnet (osmdb.eqiad.wmnet) server will
> be rebooted. The other server in the cluster, labstore1007.eqiad.wmnet
> has been rebooted already, but we won't be doing any pre-failover for
> operative reasons.
> 
> Apologies in advance for any inconvenience, and please let us know any
> issue you may find after these operations.
> 


Reminder: this is happening in ~10 minutes.

Re: [Cloud] [Cloud-announce] OSM database reboot next Tuesday 2018-11-20 at 17:30 UTC

2018-11-20 Thread Arturo Borrero Gonzalez
On 11/20/18 6:19 PM, Arturo Borrero Gonzalez wrote:
> On 11/15/18 5:58 PM, Arturo Borrero Gonzalez wrote:
>> Hi,
>>
>> next Tuesday 2018-11-20 at 17:30 UTC we will be rebooting the OSM
>> database (part of our data services) for maintenance and security updates.
>>
>> In concrete the labstore1006.eqiad.wmnet (osmdb.eqiad.wmnet) server will
>> be rebooted. The other server in the cluster, labstore1007.eqiad.wmnet
>> has been rebooted already, but we won't be doing any pre-failover for
>> operative reasons.
>>
>> Apologies in advance for any inconvenience, and please let us know any
>> issue you may find after these operations.
>>
> 
> 
> Reminder: this is happening in ~10 minutes.
> 

We are done! Please report any issue you may find.

Thanks, best regards.

[Cloud] [Cloud-announce] CloudVPS network maintenance 2018-11-27 @ 17:30 UTC

2018-11-21 Thread Arturo Borrero Gonzalez
Hi,

next Tuesday, 2018-11-27 @ 17:30 UTC, we will reboot the
labnet1001.eqiad.wmnet server for maintenance and security updates.

This server provides virtual networking services for CloudVPS in the
main deployment (the old one, different from the eqiad1 deployment).
We won't be doing any failover prior to the reboot for operational reasons
(we measured that the failover downtime is longer than the actual reboot time).

The impact of this brief reboot downtime will be:

* all VMs in the main CloudVPS deployment won't have network connectivity
* ongoing network connections (downloads, uploads) will fail and will
have to be restarted
* cross connectivity between VM instances in the main and eqiad1
deployment won't be possible

Thanks for your understanding, and let us know about any issues you find
after the reboot next week.

Re: [Cloud] [Cloud-announce] CloudVPS network maintenance 2018-11-27 @ 17:30 UTC

2018-11-27 Thread Arturo Borrero Gonzalez
On 11/21/18 10:54 AM, Arturo Borrero Gonzalez wrote:
> Hi,
> 
> next Tuesday, 2018-11-27 @ 17:30UTC we will reboot the
> labnet1001.eqiad.wmnet server for maintenance and security updates.
> 
> This server provides virtual networking services for CloudVPS in the
> main deployment (the old one, different from the eqiad1 deployment).
> We won't be doing any failover prior to the reboot for operative reasons
> (we measured the failover downtime is longer than the actual reboot time).
> 
> The impact of this brief reboot downtime will be:
> 
> * all VMs in the main CloudVPS deployment won't have network connectivity
> * ongoing network connections (downloads, uploads) will fail and will
> have to be restarted
> * cross connectivity between VM instances in the main and eqiad1
> deployment won't be possible
> 
> Thanks for your understanding, and let us know any issues you may find
> after the reboot next week.
> 

Reminder, this is happening in ~10 minutes.

[Cloud] [Cloud-announce] CloudVPS network maintenance tomorrow 2018-12-20 @ 17:00 UTC

2018-12-19 Thread Arturo Borrero Gonzalez
Hi!

Tomorrow 2018-12-20 @ 17:00 UTC (~24h from now) we will be conducting
some network maintenance in Cloud VPS (openstack).

We will be doing some work on the transport network that connects the
Neutron server to the rest of the internet. Running CloudVPS instances
will see a brief interruption in connections to any external service
(outside CloudVPS).

If everything goes well, as our tests suggest it should, all
operations will be finished in just a couple of minutes.

Let us know about any issues you find. Thanks.

Re: [Cloud] [Cloud-announce] CloudVPS network maintenance tomorrow 2018-12-20 @ 17:00 UTC

2018-12-20 Thread Arturo Borrero Gonzalez
On 12/19/18 6:16 PM, Arturo Borrero Gonzalez wrote:
> Hi!
> 
> Tomorrow 2018-12-20 @ 17:00 UTC (~24h from now) we will be conducting
> some network maintenance in Cloud VPS (openstack).
> 
> We will be doing some works on the transport network that connects the
> Neutron server to the rest of the internet. Running CloudVPS instances
> will see a brief connection problem if connected to any external service
> (outside CloudVPS).
> 
> If everything goes fine, according to our tests all should be fine, all
> operations will be finished in just a couple of minutes.
> 
> Let us know any issue you may find. Thanks.
> 


Reminder, this is happening now.

[Cloud] CloudVPS Trusty instances shutdown

2019-01-18 Thread Arturo Borrero Gonzalez
(list cross-posting on purpose, sorry for that)

Hi!

Today is the deadline for Ubuntu Trusty instances running in CloudVPS
[0]. We will be shutting down the remaining instances next Monday
(2019-01-21) to avoid having the weekend in-between.

This situation has been communicated in the corresponding phabricator
task to the involved people. The only exception to this deadline is for
projects actively working on migrating to Debian.

For the record, affected projects are:

* queryrapi https://phabricator.wikimedia.org/T204683
* telnet https://phabricator.wikimedia.org/T204694
* wikidataconcepts https://phabricator.wikimedia.org/T204695 [1]
* wildcat https://phabricator.wikimedia.org/T204703
* design https://phabricator.wikimedia.org/T204502
* dumps https://phabricator.wikimedia.org/T204503
* maps https://phabricator.wikimedia.org/T204506 [1]
* getstarted https://phabricator.wikimedia.org/T204508
* tools/toolsbeta https://phabricator.wikimedia.org/T204530 [1]

(please check individual phabricator tasks to see which concrete VMs are
affected)

Toolforge gridengine users have a separate deprecation process, and you
may find additional information on wikitech [2].

[0] https://wikitech.wikimedia.org/wiki/News/Trusty_deprecation
[1] project seems to be actively working on a replacement, we will grant
an exception.
[2] https://wikitech.wikimedia.org/wiki/News/Toolforge_Trusty_deprecation

[Cloud] Current status of Toolforge and Cloud VPS (2019-02-16)

2019-02-16 Thread Arturo Borrero Gonzalez
Hi,

Here is a brief update on the status of Toolforge and CloudVPS as of today,
2019-02-16, along with some rough estimates and what to expect in the coming
days. Keeping track of all the events we had this week may be difficult, because
there were several of them and they were heavily intertwined.

* CloudVPS suffered severe hardware issues this week [0]. We solved most of the
problems and added spare hardware [1] because our server capacity was severely
reduced. This service should be mostly stable right now.

* Toolsdb (tools.db.svc.eqiad.wmflabs) is currently overloaded and suffering
from hardware errors. We are already working on a replacement for this service
[2]. Services depending on this database aren't working properly (like PAWS) and
Toolforge tools that use it are also affected.

An honest estimate is that services (especially Toolsdb) won't be fully
recovered until at least next Tuesday (2019-02-26).

Our current plans involve replacing the Toolsdb hardware with virtual machines
inside CloudVPS [3]. We are trying to be extra cautious to prevent data loss and
other problems usually associated with doing things in a rush.

Finally, I would like to mention that we are all well aware of the importance of
these services for the community and we are doing our best to get things fixed.
Thanks for your understanding and patience.

regards

[0] https://wikitech.wikimedia.org/wiki/Incident_documentation/20190213-cloudvps
[1] CloudVPS: drain and rebuild labvirt1009 as cloudvirt1009
https://phabricator.wikimedia.org/T216239
[2] ToolsDB overload and cleanup https://phabricator.wikimedia.org/T216208
[3] Replace labsdb100[4567] with instances on cloudvirt1019 and cloudvirt1020
https://phabricator.wikimedia.org/T193264
-- 
Arturo Borrero Gonzalez
Operations Engineer / Wikimedia Cloud Services
Wikimedia Foundation

[Cloud] Horizon and Toolsadmin issues

2019-03-21 Thread Arturo Borrero Gonzalez
Hi,

Following some vandalism attempts, both Horizon and Toolsadmin are affected by a
general OAuth issue in Wikitech that prevents proper user authentication.

Affected URLs are:
 * https://horizon.wikimedia.org/
 * https://toolsadmin.wikimedia.org/auth/login

Horizon is the web UI used to create and manage Cloud VPS.
Toolsadmin (also known as striker) is the web UI used to create and maintain
Toolforge accounts.

We have no estimate right now of when a fix will be available, but several
people are actively involved in trying to get things back to normal.

regards

-- 
Arturo Borrero Gonzalez
Operations Engineer / Wikimedia Cloud Services
Wikimedia Foundation

Re: [Cloud] Fwd: Cron jlocal rm -f /data/project/map-of-monuments/generate.*; /usr/bin/jsub -N generate -once -quiet bash /data/project/map-of-monuments/suppo

2019-05-03 Thread Arturo Borrero Gonzalez
On 5/3/19 9:59 AM, Martin Urbanec wrote:
> Hi everyone, 
> 
> I get those mails from time to time. Is there a way to prevent them?
> 

That error doesn't sound familiar to me. Please open a Phabricator ticket with
all the information you have and we will investigate further.

regards

-- 
Arturo Borrero Gonzalez
Operations Engineer / Wikimedia Cloud Services
Wikimedia Foundation

[Cloud] Electric maintenance on 2019-05-16

2019-05-14 Thread Arturo Borrero Gonzalez
Hi!

On 2019-05-16 at 13:00 UTC there will be a maintenance operation in one of the
Wikimedia Foundation datacenter racks that affects 2 of our servers running
virtual machines [0]. There is a risk that this maintenance operation will result
in power loss for the servers, affecting the virtual machines running on them.
However, there is no way to know for sure if there will be any outage at all.

If you are an admin of any of the VMs in the list and you want the VM to be
relocated to another server before the operation, please get in touch
with us as soon as possible. Remember that, right now, relocating a VM to
another server means shutting down the VM briefly.

Here is a list of affected virtual machines:

cloudvirt1028.eqiad.wmnet:
af-puppetdb01.automation-framework.eqiad.wmflabs
bastion-eqiad1-02.bastion.eqiad.wmflabs
fridolin.catgraph.eqiad.wmflabs
cloud-puppetmaster-02.cloudinfra.eqiad.wmflabs
cloudstore-dev-01.cloudstore.eqiad.wmflabs
commtech-nsfw.commtech.eqiad.wmflabs
clm-test-01.community-labs-monitoring.eqiad.wmflabs
cyberbot-exec-iabot-01.cyberbot.eqiad.wmflabs
deployment-db05.deployment-prep.eqiad.wmflabs
deployment-memc05.deployment-prep.eqiad.wmflabs
deployment-sca01.deployment-prep.eqiad.wmflabs
deployment-pdfrender02.deployment-prep.eqiad.wmflabs
ign.ign2commons.eqiad.wmflabs
integration-slave-docker-1050.integration.eqiad.wmflabs
integration-castor03.integration.eqiad.wmflabs
api.openocr.eqiad.wmflabs
osmit-umap.osmit.eqiad.wmflabs
builder-envoy.packaging.eqiad.wmflabs
jmm-buster.puppet.eqiad.wmflabs
a11y.reading-web-staging.eqiad.wmflabs
adhoc-utils01.security-tools.eqiad.wmflabs
util-abogott-stretch.testlabs.eqiad.wmflabs
canary1028-01.testlabs.eqiad.wmflabs
stretch.thumbor.eqiad.wmflabs
tools-worker-1023.tools.eqiad.wmflabs
tools-proxy-04.tools.eqiad.wmflabs
tools-docker-builder-06.tools.eqiad.wmflabs
tools-sgewebgrid-generic-0904.tools.eqiad.wmflabs
tools-sgeexec-0942.tools.eqiad.wmflabs
tools-sgeexec-0941.tools.eqiad.wmflabs
tools-sgeexec-0940.tools.eqiad.wmflabs
tools-sgeexec-0939.tools.eqiad.wmflabs
tools-sgeexec-0937.tools.eqiad.wmflabs
tools-sgeexec-0929.tools.eqiad.wmflabs
tools-sgeexec-0921.tools.eqiad.wmflabs
tools-sgeexec-0920.tools.eqiad.wmflabs
tools-sgeexec-0911.tools.eqiad.wmflabs
tools-sgeexec-0909.tools.eqiad.wmflabs
toolsbeta-proxy-01.toolsbeta.eqiad.wmflabs
vconverter-instance.videowiki.eqiad.wmflabs
perfbot.webperf.eqiad.wmflabs
wdhqs-1.wikidata-history-query-service.eqiad.wmflabs

cloudvirt1014.eqiad.wmnet:
commonsarchive-prod.commonsarchive.eqiad.wmflabs
deployment-imagescaler03.deployment-prep.eqiad.wmflabs
dumps-5.dumps.eqiad.wmflabs
dumps-4.dumps.eqiad.wmflabs
incubator-mw.incubator.eqiad.wmflabs
webperformance.integration.eqiad.wmflabs
saucelabs-01.integration.eqiad.wmflabs
integration-puppetmaster01.integration.eqiad.wmflabs
maps-puppetmaster.maps.eqiad.wmflabs
maps-wma.maps.eqiad.wmflabs
mwoffliner3.mwoffliner.eqiad.wmflabs
mwoffliner1.mwoffliner.eqiad.wmflabs
phlogiston-5.phlogiston.eqiad.wmflabs
discovery-testing-01.shiny-r.eqiad.wmflabs
snuggle-enwiki-01.snuggle.eqiad.wmflabs
canary-1014-01.testlabs.eqiad.wmflabs
tools-sgeexec-0901.tools.eqiad.wmflabs
wdqs-test.wikidata-query.eqiad.wmflabs
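
To find out quickly whether any of your projects appears in the lists above,
the names can be grouped by project. This is a minimal sketch that assumes
every name has the form `<instance>.<project>.eqiad.wmflabs`, as in the lists
above; `project_of` and `group_by_project` are hypothetical helper names, and
the sample only contains three of the listed VMs.

```python
from collections import defaultdict


def project_of(fqdn):
    """Return the project label of an '<instance>.<project>.eqiad.wmflabs' name."""
    labels = fqdn.strip().split(".")
    if len(labels) != 4 or labels[2:] != ["eqiad", "wmflabs"]:
        raise ValueError(f"unexpected VM name: {fqdn!r}")
    return labels[1]


def group_by_project(fqdns):
    """Map each project to the sorted list of its instance names."""
    groups = defaultdict(list)
    for fqdn in fqdns:
        instance = fqdn.strip().split(".")[0]
        groups[project_of(fqdn)].append(instance)
    return {project: sorted(names) for project, names in groups.items()}


# Three names taken from the cloudvirt1014 list above.
sample = [
    "dumps-5.dumps.eqiad.wmflabs",
    "dumps-4.dumps.eqiad.wmflabs",
    "maps-wma.maps.eqiad.wmflabs",
]
print(group_by_project(sample))
# {'dumps': ['dumps-4', 'dumps-5'], 'maps': ['maps-wma']}
```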


Toolforge won't be affected by this operation.
You can read more details about the datacenter operation itself in
Phabricator [1].

Sorry for the short notice,

regards.

[0] Cloud Services: reallocate workload from rack B5-eqiad
https://phabricator.wikimedia.org/T223148
[1] Install new PDUs into b5-eqiad https://phabricator.wikimedia.org/T223126
-- 
Arturo Borrero Gonzalez
Operations Engineer / Wikimedia Cloud Services
Wikimedia Foundation

Re: [Cloud] Electric maintenance on 2019-05-16

2019-05-16 Thread Arturo Borrero Gonzalez
On 5/14/19 2:16 PM, Arturo Borrero Gonzalez wrote:
> Hi!
> 
> on 2019-05-16 13:00 UTC there will be a maintenance operation in one of the
> Wikimedia Foundation datacenter racks that affects 2 of our servers running
> virtual machines [0]. There is a risk that this maintenance operation can result
> in power loss of the servers, affecting the virtual machines running on it.
> However, there is no way to know for sure if there will be any outage at all.
> 

Hi!

This has been done with no issues detected. All clear.

regards.

-- 
Arturo Borrero Gonzalez
Operations Engineer / Wikimedia Cloud Services
Wikimedia Foundation

Re: [Cloud] Where is webchat for wikimedia-labs?

2019-05-24 Thread Arturo Borrero Gonzalez
On 5/24/19 12:44 AM, Thomas Stieve wrote:
> Hello all,
> 
> Could someone tell me where webchat for wikimedia-labs is now?
> https://webchat.freenode.net/?channels=wikimedia-labs
> 

It has been a while since we migrated to #wikimedia-cloud, and we may need to
update some leftover references in some docs.

Where did you find a reference to this channel?

regards.

-- 
Arturo Borrero Gonzalez
Operations Engineer / Wikimedia Cloud Services
Wikimedia Foundation

[Cloud] The sql command (Was: Re: Where is webchat for wikimedia-labs?)

2019-05-24 Thread Arturo Borrero Gonzalez
On 5/24/19 8:37 AM, Valerio Bozzolan via Cloud wrote:
> Hello Thomas,
> 
> Just try to type 'mysql' instead of 'sql'. I don't know any 'sql' command.
> 
> Regards
> 
> On May 24, 2019 12:59:33 AM GMT+02:00, Maximilian Doerr wrote:
>> You may need to point the command to the location by calling an
>> absolute path.  Use “which sql” to figure out where the command is
>> located.
>>
>> Cyberpower678
>> English Wikipedia Account Creation Team
>> English Wikipedia Administrator
>> Global User Renamer
>>
>>> On May 23, 2019, at 18:57, Thomas Stieve wrote:
>>>
>>> Also, my question for the webchat was about how to run commands using
>>> a bash file. I used to be able to run:
>>>
>>> sql enwiki_p 'select * from logging where log_title = "A.S._Roma" and
>>> log_namespace = 0 and log_timestamp > 20160101000 and log_action =
>>> "move"' > A.S._Roma.txt;
>>>
>>> Now, I just get command not found.
>>>
>>>

You can read more about the `sql` command here:

https://wikitech.wikimedia.org/wiki/Help:Toolforge/Database#Connecting_to_the_database_replicas

It's a custom wrapper aiming to ease interaction with the DB.
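
The `which sql` suggestion quoted above can also be checked from a script
before running the query. This is only a sketch for a POSIX system:
`locate_command` and its `search_path` parameter are hypothetical names, and
nothing here is part of the `sql` wrapper itself.

```python
import shutil


def locate_command(name, search_path=None):
    """Return the path where `name` is found, similar to `which name`."""
    # search_path=None makes shutil.which use the PATH environment variable.
    found = shutil.which(name, path=search_path)
    if found is None:
        raise FileNotFoundError(f"{name!r} not found on the search path")
    return found
```

If `locate_command("sql")` raises, the wrapper is not on the PATH of the
environment the script runs in, which would match the "command not found"
symptom.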

-- 
Arturo Borrero Gonzalez
Operations Engineer / Wikimedia Cloud Services
Wikimedia Foundation

[Cloud] [Cloud-announce] cloudservices1003 rebuild on 2019-06-03

2019-05-28 Thread Arturo Borrero Gonzalez
Hi!

On 2019-06-03 at 14:00 UTC+2 (next Monday) we will be rebuilding the
cloudservices1003 server, which hosts the Designate service that serves DNS
requests for CloudVPS and Toolforge.
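
For those who track maintenance windows in UTC, the announced start time can
be converted as follows. A minimal sketch, assuming the announced offset is a
fixed UTC+2:

```python
from datetime import datetime, timedelta, timezone

# Announced start: 2019-06-03 14:00 at a fixed UTC+2 offset.
announced = datetime(2019, 6, 3, 14, 0, tzinfo=timezone(timedelta(hours=2)))

# Convert to UTC, which is what the other announcements on this list use.
start_utc = announced.astimezone(timezone.utc)
print(start_utc.isoformat())  # 2019-06-03T12:00:00+00:00
```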

We have a backup server (cloudservices1004), so we don't expect a lot of
downtime. But DNS queries are very fast, so many of them may still fail
during the short window in which we stabilize the DNS service.
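
Tools that do their own name lookups may want to retry instead of failing on
the first error during such a window. The following is a minimal sketch, not
something we deploy: `resolve_with_retry` is a hypothetical helper, the
`resolver` parameter exists only to make it testable, and the attempt count
and delays are arbitrary assumptions.

```python
import socket
import time


def resolve_with_retry(host, attempts=5, delay=1.0, resolver=socket.getaddrinfo):
    """Resolve `host`, retrying with exponential backoff on DNS errors."""
    last_error = None
    for attempt in range(attempts):
        try:
            return resolver(host, None)
        except socket.gaierror as exc:
            last_error = exc
            if attempt < attempts - 1:
                # Wait delay, 2*delay, 4*delay, ... between attempts.
                time.sleep(delay * (2 ** attempt))
    raise last_error
```

With the defaults this waits 1, 2, 4 and 8 seconds between the five attempts
and re-raises the last `socket.gaierror` if all of them fail.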

Please reach out to the WMCS team if you need more details or have any doubts.

regards.

-- 
Arturo Borrero Gonzalez
Operations Engineer / Wikimedia Cloud Services
Wikimedia Foundation

Re: [Cloud] cloudservices1003 rebuild on 2019-06-03

2019-06-03 Thread Arturo Borrero Gonzalez
On 5/28/19 8:11 PM, Arturo Borrero Gonzalez wrote:
> Hi!
> 
> On 2019-06-03 at 14:00 UTC+2 (next Monday) we will be rebuilding the
> cloudservices1003 server, which hosts the designate service that serves DNS
> requests for CloudVPS and Toolforge.
> 
> We have a backup server (cloudservices1004), so we don't expect a lot of
> downtime. But DNS queries are really fast, so many of them may fail while we
> stabilize the DNS service.
> 
> Please reach out to the WMCS team if you need more details or have any doubts.
> 

Just a heads up, this is starting now.


-- 
Arturo Borrero Gonzalez
Operations Engineer / Wikimedia Cloud Services
Wikimedia Foundation

[Cloud] PDU upgrades in the eqiad datacenter affects CloudVPS hypervisor

2019-07-23 Thread Arturo Borrero Gonzalez
Hi there!

There is ongoing maintenance in the eqiad datacenter that involves changing the
power connectors of the servers. More info in this Phabricator task: T226778
[0].

The PDU upgrade could potentially leave our hypervisors without power briefly.
For some hypervisors, we plan to take the risk of leaving them running. For
some other hypervisors (those running important DBs in the form of virtual
machines) we will probably do a controlled shutdown before the operation to
ensure no data corruption happens in the databases.

The PDU upgrades will happen this very week (see phab task [0]) and they could
potentially affect every virtual machine we run in CloudVPS. This includes
Toolforge.
In the case of power loss, we expect the disruptions to be very brief and not
to cause extended downtime.

Please let us know about any issues you find related to this operation.

regards.

[0] https://phabricator.wikimedia.org/T226778
-- 
Arturo Borrero Gonzalez
SRE / Wikimedia Cloud Services
Wikimedia Foundation

Re: [Cloud] PDU upgrades in the eqiad datacenter affects CloudVPS hypervisor

2019-07-25 Thread Arturo Borrero Gonzalez
On 7/23/19 8:31 PM, Arturo Borrero Gonzalez wrote:
> Hi there!
> 
> There is ongoing maintenance in the eqiad datacenter that involves changing the
> power connectors of the servers. More info in this Phabricator task: T226778
> [0].
[..]
> 
> [0] https://phabricator.wikimedia.org/T226778
> 

This has been delayed and will likely be scheduled to start in 2 weeks.

-- 
Arturo Borrero Gonzalez
SRE / Wikimedia Cloud Services
Wikimedia Foundation

[Cloud] CloudVPS reboots for security updates

2019-07-29 Thread Arturo Borrero Gonzalez
Hi there!

Unrelated to other operations that were communicated recently (datacenter PDU
upgrades, operating system upgrades, etc.), we need to reboot all the cloudvirt
servers to apply some security updates for CPU vulnerabilities.

Along with the physical hardware reboot we also need to reboot all the virtual
machines running in CloudVPS.

This operation is a bit disruptive but very quick and should not lead to any
unexpected errors (it is just a reboot). We already tried the same upgrades on
some other servers.

We will be doing the reboots during this week (starting 2019-07-29). If you see
any problems related to this, please contact us.

regards.

-- 
Arturo Borrero Gonzalez
SRE / Wikimedia Cloud Services
Wikimedia Foundation

Re: [Cloud] CloudVPS reboots for security updates

2019-07-29 Thread Arturo Borrero Gonzalez
On 7/29/19 3:22 PM, Maximilian Doerr wrote:
> Aww man.  I was hoping to push 365 days of continuous up time for my VMs.
> 
> Cyberpower678
> English Wikipedia Account Creation Team
> English Wikipedia Administrator
> Global User Renamer
> 


Well, a reboot once in a while prevents other major issues :-P

Bonus: some people say that the industry standard is 100 days as the maximum
uptime you may have on your servers. Some unix tools (like htop) will warn you
if the uptime is >100 days (only a small "(!)" marker though).
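
For anyone curious, the check is easy to reproduce. This is only a sketch,
assuming a Linux host where `/proc/uptime` exists; the function names and the
"(!)" suffix are choices made for this example, not a copy of any tool's code.

```python
SECONDS_PER_DAY = 86400


def uptime_days(path="/proc/uptime"):
    """Read the uptime in days from a /proc/uptime-style file (Linux only)."""
    with open(path) as f:
        # The first field is the number of seconds since boot.
        seconds = float(f.read().split()[0])
    return seconds / SECONDS_PER_DAY


def format_uptime(days, limit=100):
    """Format whole days and append a marker once the limit is exceeded."""
    marker = "(!)" if days > limit else ""
    return f"{int(days)} days{marker}"
```

With these definitions, `format_uptime(364.5)` gives `364 days(!)` while
`format_uptime(99.9)` gives `99 days`.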

regards.

-- 
Arturo Borrero Gonzalez
SRE / Wikimedia Cloud Services
Wikimedia Foundation

[Cloud] Networking incident today in CloudVPS (ferm update)

2019-09-30 Thread Arturo Borrero Gonzalez
Hi,

Today, 2019-09-30, we ran an operation on all CloudVPS virtual machines to
update ferm in order to fix a bug [0]. Ferm is a firewalling utility.

The fleet-wide operation resulted in ferm being installed on every VM, even on
those VMs not requiring it. This resulted in a network outage for most of the
virtual machines and projects that were not previously configured to use ferm.
Many Toolforge tools (webservices, grid jobs, etc.) stopped working, database
connections were lost, the proxy reported bad gateway errors, etc.

To resolve the issue, we quickly removed ferm from every VM and ran the puppet
agent to install it only on the VMs that had ferm in their puppet manifests.
As soon as we did this, everything went back to normal.
This incident lasted about 1 hour.
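
The difference between "has ferm installed" and "declares ferm in its
manifests" is plain set arithmetic. The sketch below is not what was run during
the incident (ferm was removed everywhere and puppet put it back where
needed); it only illustrates the comparison, and the function name and VM names
in the usage note are hypothetical.

```python
def plan_ferm_cleanup(installed, wanted):
    """Compare VMs that have ferm installed with VMs whose manifests want it.

    installed: iterable of VM names where ferm is currently installed
    wanted:    iterable of VM names whose puppet manifests declare ferm
    """
    installed, wanted = set(installed), set(wanted)
    return {
        "remove": sorted(installed - wanted),   # installed by mistake
        "keep": sorted(installed & wanted),     # installed and declared
        "install": sorted(wanted - installed),  # declared but missing
    }
```

For example, `plan_ferm_cleanup(["vm-a", "vm-b"], ["vm-b"])` reports `vm-a`
under "remove" and `vm-b` under "keep".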

Please get in touch if you see any issues or have any questions about this
incident.

regards.

[0] https://phabricator.wikimedia.org/T153468
-- 
Arturo Borrero Gonzalez
SRE / Wikimedia Cloud Services
Wikimedia Foundation

[Cloud] [Cloud-announce] CloudVPS maintenance on Wednesday 2019-10-09 (round of cloudvirt reboots)

2019-10-02 Thread Arturo Borrero Gonzalez
Hi there,

Next Wednesday 2019-10-09 at 09:00 UTC we will be doing a maintenance operation
on some of our cloudvirt servers (the hypervisor servers) that involves
rebooting both the physical servers and the virtual machines running on them.
The reason is that we need to update the Linux kernel version they are running.

In this window we will reboot 4 hypervisors:
* cloudvirt1008
* cloudvirt1009
* cloudvirt1012
* cloudvirt1013

The procedure will be to reboot a server, wait for it to come back online (could
take up to 5 minutes) and wait for all the VMs to come back online. Then move to
the next server.
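
The procedure is a simple one-at-a-time loop. A minimal sketch, assuming the
caller supplies the actual `reboot`, `host_is_up` and `vms_are_up` functions
(all hypothetical here); the 300 second host timeout comes from the "up to 5
minutes" estimate, while the VM timeout is just a placeholder value.

```python
import time


def wait_until(check, timeout, interval=5.0):
    """Poll check() until it returns True or `timeout` seconds have passed."""
    deadline = time.monotonic() + timeout
    while True:
        if check():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


def rolling_reboot(hosts, reboot, host_is_up, vms_are_up,
                   host_timeout=300, vm_timeout=600):
    """Reboot hosts one by one, moving on only once host and VMs are back."""
    done = []
    for host in hosts:
        reboot(host)
        if not wait_until(lambda: host_is_up(host), host_timeout):
            raise RuntimeError(f"{host} did not come back online")
        if not wait_until(lambda: vms_are_up(host), vm_timeout):
            raise RuntimeError(f"VMs on {host} did not come back online")
        done.append(host)
    return done
```

Stopping at the first failure keeps the remaining hypervisors untouched, so
only one server's worth of VMs is down at any time.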

Toolforge users may see their tools and webservices briefly disrupted due to
several components of the Toolforge infrastructure being rebooted in this 
operation.

If nothing changes (reallocated or new virtual machines, etc.), this is the list
of affected VM instances on each hypervisor:

* cloudvirt1008:

VM: tools-sgebastion-09 PROJECT: tools
VM: tools-k8s-master-01 PROJECT: tools
VM: deployment-cache-upload05 PROJECT: deployment-prep
VM: toolsbeta-paws-worker-1002 PROJECT: toolsbeta
VM: toolsbeta-puppetmaster-02 PROJECT: toolsbeta
VM: tools-mail-02 PROJECT: tools
VM: tools-prometheus-02 PROJECT: tools
VM: tools-elastic-01 PROJECT: tools
VM: tracker1 PROJECT: lta-tracker
VM: tools-clushmaster-02 PROJECT: tools
VM: tools-worker-1020 PROJECT: tools
VM: tools-k8s-etcd-01 PROJECT: tools
VM: tools-worker-1010 PROJECT: tools
VM: tools-worker-1008 PROJECT: tools
VM: tools-worker-1007 PROJECT: tools
VM: tools-worker-1003 PROJECT: tools
VM: tools-sgeexec-0937 PROJECT: tools

* cloudvirt1009:

VM: toolsbeta-paws-master-01 PROJECT: toolsbeta
VM: tools-elastic-02 PROJECT: tools
VM: tools-paws-worker-1005 PROJECT: tools
VM: tools-prometheus-01 PROJECT: tools
VM: tools-paws-worker-1002 PROJECT: tools
VM: puppet-lta PROJECT: lta-tracker
VM: tools-flannel-etcd-03 PROJECT: tools
VM: tools-worker-1017 PROJECT: tools
VM: tools-k8s-etcd-02 PROJECT: tools
VM: tools-worker-1013 PROJECT: tools
VM: tools-worker-1012 PROJECT: tools
VM: tools-worker-1009 PROJECT: tools
VM: tools-worker-1006 PROJECT: tools
VM: tools-worker-1004 PROJECT: tools

* cloudvirt1012:

VM: tools-paws-master-01 PROJECT: tools
VM: deployment-ms-be06 PROJECT: deployment-prep
VM: toolsbeta-worker-1001 PROJECT: toolsbeta
VM: deployment-cumin02 PROJECT: deployment-prep
VM: toolsbeta-k8s-master-01 PROJECT: toolsbeta
VM: toolsbeta-k8s-etcd-01 PROJECT: toolsbeta
VM: toolsbeta-puppetdb-01 PROJECT: toolsbeta
VM: tools-redis-1002 PROJECT: tools
VM: tools-paws-worker-1003 PROJECT: tools
VM: tools-paws-worker-1001 PROJECT: tools
VM: tools-elastic-03 PROJECT: tools
VM: tools-worker-1025 PROJECT: tools
VM: tools-worker-1026 PROJECT: tools
VM: tools-worker-1022 PROJECT: tools
VM: tools-worker-1019 PROJECT: tools
VM: tools-worker-1018 PROJECT: tools
VM: tools-k8s-etcd-03 PROJECT: tools
VM: tools-worker-1016 PROJECT: tools
VM: tools-flannel-etcd-01 PROJECT: tools
VM: tools-worker-1014 PROJECT: tools
VM: phlogiston-5 PROJECT: phlogiston
VM: dumps-3 PROJECT: dumps
VM: codesearch4 PROJECT: codesearch
VM: wikispeech-wiki-stretch PROJECT: wikispeech
VM: ores-worker-01 PROJECT: ores
VM: puppet-jmm-kernel-stretch2 PROJECT: puppet
VM: mcr-base PROJECT: mcr-dev
VM: rel2 PROJECT: search
VM: mc-clusterA-2 PROJECT: test-twemproxy
VM: wikibrain-embeddings-02 PROJECT: wikibrain
VM: qube-node1 PROJECT: k8splay
VM: cindy PROJECT: pluggableauth
VM: cvn-apache9 PROJECT: cvn
VM: zk1-2 PROJECT: analytics

* cloudvirt1013:

VM: tools-flannel-etcd-02 PROJECT: tools
VM: paws-ext-lb-01 PROJECT: paws
VM: abogott-puppetclient PROJECT: testlabs
VM: tools-worker-1028 PROJECT: tools
VM: tools-worker-1005 PROJECT: tools
VM: cloudstore-dev-02 PROJECT: cloudstore
VM: cloudstore-puppetmaster-01 PROJECT: cloudstore
VM: deployment-aqs03 PROJECT: deployment-prep
VM: osmit-test PROJECT: osmit
VM: tools-sgewebgrid-lighttpd-0927 PROJECT: tools
VM: tools-sgewebgrid-lighttpd-0926 PROJECT: tools
VM: tools-sgewebgrid-lighttpd-0925 PROJECT: tools
VM: tools-sgewebgrid-lighttpd-0924 PROJECT: tools
VM: tools-sgewebgrid-lighttpd-0923 PROJECT: tools
VM: tools-sgewebgrid-lighttpd-0922 PROJECT: tools
VM: tools-sgewebgrid-lighttpd-0920 PROJECT: tools
VM: tools-sgewebgrid-lighttpd-0917 PROJECT: tools
VM: tools-sgewebgrid-lighttpd-0909 PROJECT: tools
VM: tools-sgeexec-0925 PROJECT: tools
VM: tools-sgeexec-0923 PROJECT: tools
VM: tools-sgeexec-0910 PROJECT: tools
VM: cyberbot-db-01 PROJECT: cyberbot
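
Project admins who want to check quickly whether they are affected can group
the list by project. A small sketch, assuming the lines keep the
`VM: <name> PROJECT: <project>` layout used above; the function name is made up
for this example.

```python
import re
from collections import defaultdict

# Matches lines such as "VM: tools-mail-02 PROJECT: tools".
LINE_RE = re.compile(r"^VM:\s+(\S+)\s+PROJECT:\s+(\S+)$")


def group_by_project(lines):
    """Return a dict mapping each project to the list of its VMs."""
    groups = defaultdict(list)
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:  # skip hypervisor headers and blank lines
            vm, project = match.groups()
            groups[project].append(vm)
    return dict(groups)
```

Feeding it the text of this email (split into lines) and looking up your
project name gives the VMs to keep an eye on.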


regards.
-- 
Arturo Borrero Gonzalez
SRE / Wikimedia Cloud Services
Wikimedia Foundation

Re: [Cloud] [Cloud-announce] CloudVPS maintenance on Wednesday 2019-10-09 (round of cloudvirt reboots)

2019-10-09 Thread Arturo Borrero Gonzalez
Hi,

A reminder: this is happening now!

On 10/2/19 11:02 AM, Arturo Borrero Gonzalez wrote:
> Hi there,
> 
> Next Wednesday 2019-10-09 at 09:00 UTC we will be doing a maintenance
> operation on some of our cloudvirt servers (the hypervisor servers) that
> involves rebooting both the physical servers and the virtual machines running
> on them.
> The reason is that we need to update the running linux kernel version they
> have.
> 
> In this window we will reboot 4 hypervisors:
> * cloudvirt1008
> * cloudvirt1009
> * cloudvirt1012
> * cloudvirt1013
> 
> The procedure will be to reboot a server, wait for it to come back online
> (could take up to 5 minutes) and wait for all the VMs to come back online.
> Then move to the next server.
> 
> Toolforge users may see their tools and webservices briefly disrupted due to
> several components of the Toolforge infrastructure being rebooted in this
> operation.
> 
> If nothing changes (reallocated or new virtual machine, etc) this is the list
> of affected VM instances in each hypervisor:
> 
> [..]

[Cloud] CloudVPS maintenance on Wednesday 2019-10-16 (round 2 of cloudvirt reboots)

2019-10-09 Thread Arturo Borrero Gonzalez
VM: wikidata-misc PROJECT: wikidata-dev
VM: packaging PROJECT: thumbor
VM: neon PROJECT: rcm
VM: oxygen PROJECT: rcm
VM: hafnium PROJECT: rcm
VM: hound-app-01 PROJECT: hound
VM: mediawiki2latex PROJECT: collection-alt-renderer
VM: deployment-sca02 PROJECT: deployment-prep
VM: deployment-memc04 PROJECT: deployment-prep
VM: deployment-fluorine02 PROJECT: deployment-prep
VM: deployment-mcs01 PROJECT: deployment-prep
VM: deployment-parsoid09 PROJECT: deployment-prep
VM: deployment-sca04 PROJECT: deployment-prep
VM: deployment-kafka-jumbo-2 PROJECT: deployment-prep
VM: deployment-kafka-main-1 PROJECT: deployment-prep
VM: deployment-mediawiki-09 PROJECT: deployment-prep
VM: deployment-webperf12 PROJECT: deployment-prep
VM: deployment-deploy02 PROJECT: deployment-prep
VM: deployment-deploy01 PROJECT: deployment-prep
VM: deployment-maps04 PROJECT: deployment-prep
VM: twlight-tracker PROJECT: twl
VM: encoding02 PROJECT: video
VM: encoding03 PROJECT: video
VM: wikispeech-tts-dev PROJECT: wikispeech
VM: pub2 PROJECT: wikiapiary
VM: integration-slave-jessie-1001 PROJECT: integration
VM: ores-staging-01 PROJECT: ores-staging
VM: ve-font PROJECT: design
VM: visualeditor-test2 PROJECT: visualeditor
VM: ores-redis-02 PROJECT: ores
VM: quarry-worker-01 PROJECT: quarry
VM: fastcci-new-master PROJECT: fastcci
VM: cvn-app8 PROJECT: cvn

regards.
-- 
Arturo Borrero Gonzalez
SRE / Wikimedia Cloud Services
Wikimedia Foundation

Re: [Cloud] CloudVPS maintenance on Wednesday 2019-10-16 (round 2 of cloudvirt reboots)

2019-10-16 Thread Arturo Borrero Gonzalez
On 10/9/19 1:45 PM, Arturo Borrero Gonzalez wrote:
> Hello!
> 
> Next Wednesday 2019-10-16 at 09:00 UTC we will be doing another maintenance
> operation on some of our cloudvirt servers (the hypervisor servers) that
> involves rebooting both the physical servers and the virtual machines running
> on them.
> The reason is that we need to update the running linux kernel version they
> have.
> 
> In this window we will reboot 4 hypervisors:
> * cloudvirt1028
> * cloudvirt1029
> * cloudvirt1030
> 
> The procedure will be to reboot a server, wait for it to come back online
> (could take up to 5 minutes) and wait for all the VMs to come back online.
> Then move to the next server.
> 
> Toolforge users may see their tools and webservices briefly disrupted due to
> several components of the Toolforge infrastructure being rebooted in this
> operation.
> 

Reminder: this is happening today in about 10 minutes!

regards
-- 
Arturo Borrero Gonzalez
SRE / Wikimedia Cloud Services
Wikimedia Foundation

[Cloud] CloudVPS maintenance on Wednesday 2019-10-23 (round 3 of cloudvirt reboots)

2019-10-16 Thread Arturo Borrero Gonzalez
 PROJECT: tools
VM: canary1025-01 PROJECT: testlabs
VM: mathosphere PROJECT: math
VM: social-tools3 PROJECT: social-tools
VM: togetherjs PROJECT: visualeditor
VM: language-mleb-legacy PROJECT: language
VM: women-in-red PROJECT: globaleducation
VM: ntp-01 PROJECT: cloudinfra
VM: mc-clusterA-1 PROJECT: test-twemproxy
VM: wikifarm PROJECT: pluggableauth
VM: login-test PROJECT: catgraph
VM: puppenmeister PROJECT: planet

* cloudvirt1026:

VM: integration-agent-docker-1016 PROJECT: integration
VM: wikidata-new-wbterm PROJECT: wikidata-dev
VM: incubator-test PROJECT: incubator
VM: cloudinfra-internal-puppetmaster01 PROJECT: cloudinfra
VM: cloudinfra-db01 PROJECT: cloudinfra
VM: tools-checker-03 PROJECT: tools
VM: tools-static-13 PROJECT: tools
VM: wp1 PROJECT: mwoffliner
VM: pk8s PROJECT: planet
VM: arturo-k8s-test-4-1 PROJECT: openstack
VM: banner PROJECT: wikidumpparse
VM: packager01 PROJECT: packaging
VM: tools-package-builder-02 PROJECT: tools
VM: canary1026-02 PROJECT: testlabs
VM: security-checker1 PROJECT: packagist-mirror
VM: logstack02 PROJECT: security-tools
VM: logstack01 PROJECT: security-tools
VM: mediawiki2latex-large PROJECT: collection-alt-renderer
VM: tools-sge-services-03 PROJECT: tools
VM: tools-sgewebgrid-lighttpd-0928 PROJECT: tools
VM: tools-sgewebgrid-lighttpd-0921 PROJECT: tools
VM: tools-sgewebgrid-generic-0903 PROJECT: tools
VM: tools-sgeexec-0938 PROJECT: tools
VM: tools-sgeexec-0936 PROJECT: tools
VM: tools-sgeexec-0935 PROJECT: tools
VM: tools-sgeexec-0919 PROJECT: tools
VM: tools-sgeexec-0917 PROJECT: tools
VM: tools-sgeexec-0916 PROJECT: tools
VM: tools-sgeexec-0915 PROJECT: tools
VM: tools-sgeexec-0914 PROJECT: tools
VM: tools-paws-worker-1010 PROJECT: tools
VM: tools-paws-worker-1019 PROJECT: tools
VM: openstack-puppetmaster-01 PROJECT: openstack
VM: web1 PROJECT: graphql
VM: etytree-b PROJECT: etytree
VM: canary1026-01 PROJECT: testlabs
VM: db-instance PROJECT: videowiki
VM: tools-sgeexec-0906 PROJECT: tools
VM: mwoffliner5 PROJECT: mwoffliner

regards
-- 
Arturo Borrero Gonzalez
SRE / Wikimedia Cloud Services
Wikimedia Foundation

Re: [Cloud] CloudVPS maintenance on Wednesday 2019-10-16 (round 2 of cloudvirt reboots)

2019-10-16 Thread Arturo Borrero Gonzalez
On 10/16/19 12:42 PM, Zoran Dori wrote:
> Hi,
> you said 4 servers but also you said cloudvirt1028, cloudvirt1029 and
> cloudvirt1030. Where is the fourth?
> 

That's a typo. Sorry for that. We are rebooting *3* cloudvirts.

Good catch! :-P

regards.

-- 
Arturo Borrero Gonzalez
SRE / Wikimedia Cloud Services
Wikimedia Foundation

Re: [Cloud] CloudVPS maintenance on Wednesday 2019-10-23 (round 3 of cloudvirt reboots)

2019-10-16 Thread Arturo Borrero Gonzalez
Follow-up:

We just discovered cloudvirt1014 doesn't require a reboot, so this operation is
only for cloudvirt1025 and cloudvirt1026.

regards.

On 10/16/19 1:10 PM, Arturo Borrero Gonzalez wrote:
> Hello!
> 
> Next Wednesday 2019-10-23 at 09:00 UTC we will be doing another maintenance
> operation on some of our cloudvirt servers (the hypervisor servers) that
> involves rebooting both the physical servers and the virtual machines running
> on them.
> The reason is that we need to update the running linux kernel version they
> have.
> 
> In this window we will reboot 2 hypervisors:
> * cloudvirt1025
> * cloudvirt1026
> 
> The procedure will be to reboot a server, wait for it to come back online
> (could take up to 5 minutes) and wait for all the VMs to come back online.
> Then move to the next server.
> 
> Toolforge users may see their tools and webservices briefly disrupted due to
> several components of the Toolforge infrastructure being rebooted in this
> operation.
> 
> If nothing changes (reallocated or new virtual machine, etc) this is the list
> of affected VM instances in each hypervisor:
> 
> * cloudvirt1025:
> 
> VM: integration-agent-docker-1006 PROJECT: integration
> VM: striker-deploy04 PROJECT: striker
> VM: rec-wiki-2 PROJECT: recommendation-api
> VM: deployment-ms-fe03 PROJECT: deployment-prep
> VM: deployment-poolcounter05 PROJECT: deployment-prep
> VM: deployment-ms-be05 PROJECT: deployment-prep
> VM: readers-web-stephen PROJECT: reading-web-staging
> VM: traffic-upload-stretch PROJECT: traffic
> VM: traffic-recdns-anycast PROJECT: traffic
> VM: deployment-maps05 PROJECT: deployment-prep
> VM: gerrit-sizzle PROJECT: security-tools
> VM: tools-sgewebgrid-generic-0901 PROJECT: tools
> VM: shinken-puppetmaster-01 PROJECT: shinken
> VM: osmit-due PROJECT: osmit
> VM: deployment-acme-chief03 PROJECT: deployment-prep
> VM: meza-cindy PROJECT: pluggableauth
> VM: accounts-db4 PROJECT: account-creation-assistance
> VM: krenair-clientpackages-py3-jessie PROJECT: testlabs
> VM: deployment-sessionstore01 PROJECT: deployment-prep
> VM: paws-worker-04 PROJECT: paws
> VM: paws-ext-lb-02 PROJECT: paws
> VM: paws-int-lb-01 PROJECT: paws
> VM: paws-master-03 PROJECT: paws
> VM: paws-master-01 PROJECT: paws
> VM: language-readership PROJECT: language
> VM: wmde-wikidiff2-patched-stretch PROJECT: wikidiff2-wmde-dev
> VM: tools-sgebastion-08 PROJECT: tools
> VM: compiler1002 PROJECT: puppet-diffs
> VM: phragile-db PROJECT: phragile
> VM: cloud-puppetmaster-01 PROJECT: cloudinfra
> VM: chicotest-cappy01 PROJECT: chicotestproject
> VM: visualeditor-prototype2 PROJECT: visualeditor
> VM: programs-and-events-dashboard PROJECT: globaleducation
> VM: osmit-uno PROJECT: osmit
> VM: tools-sgewebgrid-lighttpd-0904 PROJECT: tools
> VM: canary1025-01 PROJECT: testlabs
> VM: mathosphere PROJECT: math
> VM: social-tools3 PROJECT: social-tools
> VM: togetherjs PROJECT: visualeditor
> VM: language-mleb-legacy PROJECT: language
> VM: women-in-red PROJECT: globaleducation
> VM: ntp-01 PROJECT: cloudinfra
> VM: mc-clusterA-1 PROJECT: test-twemproxy
> VM: wikifarm PROJECT: pluggableauth
> VM: login-test PROJECT: catgraph
> VM: puppenmeister PROJECT: planet
> 
> * cloudvirt1026:
> 
> VM: integration-agent-docker-1016 PROJECT: integration
> VM: wikidata-new-wbterm PROJECT: wikidata-dev
> VM: incubator-test PROJECT: incubator
> VM: cloudinfra-internal-puppetmaster01 PROJECT: cloudinfra
> VM: cloudinfra-db01 PROJECT: cloudinfra
> VM: tools-checker-03 PROJECT: tools
> VM: tools-static-13 PROJECT: tools
> VM: wp1 PROJECT: mwoffliner
> VM: pk8s PROJECT: planet
> VM: arturo-k8s-test-4-1 PROJECT: openstack
> VM: banner PROJECT: wikidumpparse
> VM: packager01 PROJECT: packaging
> VM: tools-package-builder-02 PROJECT: tools
> VM: canary1026-02 PROJECT: testlabs
> VM: security-checker1 PROJECT: packagist-mirror
> VM: logstack02 PROJECT: security-tools
> VM: logstack01 PROJECT: security-tools
> VM: mediawiki2latex-large PROJECT: collection-alt-renderer
> VM: tools-sge-services-03 PROJECT: tools
> VM: tools-sgewebgrid-lighttpd-0928 PROJECT: tools
> VM: tools-sgewebgrid-lighttpd-0921 PROJECT: tools
> VM: tools-sgewebgrid-generic-0903 PROJECT: tools
> VM: tools-sgeexec-0938 PROJECT: tools
> VM: tools-sgeexec-0936 PROJECT: tools
> VM: tools-sgeexec-0935 PROJECT: tools
> VM: tools-sgeexec-0919 PROJECT: tools
> VM: tools-sgeexec-0917 PROJECT: tools
> VM: tools-sgeexec-0916 PROJECT: tools
> VM: tools-sgeexec-0915 PROJECT: tools
> VM: tools-sgeexec-0914 PROJECT: tools
> VM: tools-paws-worker-1010 PROJECT: tools
> VM: tools-paws-worker-1019 PROJECT: tools
> VM: openstack-pup

[Cloud] [Toolforge] Proxy maintenance operation next Monday 2019-10-28 @ 14:30 UTC

2019-10-21 Thread Arturo Borrero Gonzalez
Hi there!

Next Monday 2019-10-28 @ 14:30 UTC we will do a maintenance operation on
Toolforge which consists of rebuilding the main front proxy [0] used to serve
webservices. We expect this to be done within a 30-minute window.

The operation consists of replacing the old virtual machines supporting the
proxy (currently running Debian Jessie) with more modern instances running
Debian Buster. Both Grid/Kubernetes backends are affected by this change. We
don't expect a lot of service downtime, but there is a key point in the
operation which is migrating data stored in Redis which can be tricky. The o

Examples of things affected by this change:

* Browsing Toolforge webservices
* Browsing to https://tools.wmflabs.org/
* Browsing to https://tools.wmflabs.org/admin/ (Toolforge landing page)
* Browsing PAWS (to some extent, since it shares part of the toolforge proxy)

Examples of things not affected by this change:

* webservices backend operations
* SSH bastions
* grid queues, grid jobs
* wiki-replicas, toolsdb
* other CloudVPS projects

regards.

[0] https://phabricator.wikimedia.org/T235627

-- 
Arturo Borrero Gonzalez
SRE / Wikimedia Cloud Services
Wikimedia Foundation

Re: [Cloud] CloudVPS maintenance on Wednesday 2019-10-23 (round 3 of cloudvirt reboots)

2019-10-23 Thread Arturo Borrero Gonzalez
On 10/16/19 3:34 PM, Arturo Borrero Gonzalez wrote:
> Follow-up:
> 
> We just discovered cloudvirt1014 doesn't require a reboot, so this operation is
> only for cloudvirt1025 and cloudvirt1026.
> 

Reminder:

this is happening in a few minutes!

regards

-- 
Arturo Borrero Gonzalez
SRE / Wikimedia Cloud Services
Wikimedia Foundation

Re: [Cloud] [Cloud-announce] Brief ToolsDB Outage - Thursday 10/24 @11am UTC

2019-10-24 Thread Arturo Borrero Gonzalez
On 10/21/19 9:49 PM, Brooke Storm wrote:
> With a redundant power supply upgrade going on this week in the datacenter
> that could affect the VM that Toolsdb runs on, we anticipate a brief outage
> Thursday 10/24 @11am UTC of the mysql service to protect data in case anything
> goes wrong. This may require a restart of a tool to reconnect to the database.
> We do not anticipate any worse disruptions, but if there is any disruption
> beyond what is planned, a failover may be necessary, which will not include
> the non-replicated tables mentioned here
> https://wikitech.wikimedia.org/wiki/Help:Toolforge/Database#ToolsDB_Backups_and_Replication
> 
> The maintenance requiring this notice and action is detailed
> here https://phabricator.wikimedia.org/T227540.  The VM resides on the
> cloudvirt1019 hypervisor, which is why it is in scope.
> 
> We sincerely apologize for the short notice.
> 

Reminder, this is happening in a few minutes!

-- 
Arturo Borrero Gonzalez
SRE / Wikimedia Cloud Services
Wikimedia Foundation

Re: [Cloud] [Toolforge] Proxy maintenance operation next Monday 2019-10-28 @ 14:30 UTC

2019-10-24 Thread Arturo Borrero Gonzalez
On 10/21/19 7:56 PM, Martin Urbanec wrote:
> Is there something you forgot to say?
> 
> "operation which is migrating data stored in Redis which can be tricky. The o"

That's a typo/leftover from me rewording that sentence.

Sorry for that :-)

-- 
Arturo Borrero Gonzalez
SRE / Wikimedia Cloud Services
Wikimedia Foundation

Re: [Cloud] [Toolforge] Proxy maintenance operation next Monday 2019-10-28 @ 14:30 UTC

2019-10-28 Thread Arturo Borrero Gonzalez
On 10/21/19 12:16 PM, Arturo Borrero Gonzalez wrote:
> Hi there!
> 
> Next Monday 2019-10-28 @ 14:30 UTC we will do a maintenance operation on
> Toolforge which consists of rebuilding the main front proxy [0] used to serve
> webservices. We expect this to be done within a 30-minute window.
> 
> The operation consists of replacing the old virtual machines supporting the
> proxy (currently running Debian Jessie) with more modern instances running
> Debian Buster. Both Grid/Kubernetes backends are affected by this change. We
> don't expect a lot of service downtime, but there is a key point in the
> operation which is migrating data stored in Redis which can be tricky. The o
> 
> Examples of things affected by this change:
> 
> * Browsing Toolforge webservices
> * Browsing to https://tools.wmflabs.org/
> * Browsing to https://tools.wmflabs.org/admin/ (Toolforge landing page)
> * Browsing PAWS (to some extent, since it shares part of the toolforge proxy)
> 
> Examples of things not affected by this change:
> 
> * webservices backend operations
> * SSH bastions
> * grid queues, grid jobs
> * wiki-replicas, toolsdb
> * other CloudVPS projects
> 
> regards.
> 
> [0] https://phabricator.wikimedia.org/T235627
> 


Reminder, this is happening now.

-- 
Arturo Borrero Gonzalez
SRE / Wikimedia Cloud Services
Wikimedia Foundation

___
Wikimedia Cloud Services mailing list
Cloud@lists.wikimedia.org (formerly lab...@lists.wikimedia.org)
https://lists.wikimedia.org/mailman/listinfo/cloud

Re: [Cloud] [Cloud-announce] MOSTLY COMPLETE cloud-vps maintenance Thursday, 2019-12-12

2019-12-12 Thread Arturo Borrero Gonzalez
On 12/12/19 11:50 AM, Andrew Bogott wrote:
> We are still chasing down stray issues (in particular, some of the dump and
> scratch mounts on toolforge are now wrong) but for almost all use cases things
> should be back to normal.
> 
> -Andrew
> 

We consider the operations finished. Everything has been done.

NFS, being one of the weakest components of our infra, suffered during today's
operation. We were forced to reboot most of the Toolforge servers, so grid jobs
and webservices in both the web grid and Kubernetes have most likely been
restarted and may show error log entries corresponding to the window of this
operation.

Other CloudVPS projects that use NFS (dumps shares, maps, etc.) might also
require some checking. Please get in touch if you are a project admin of such a
project.

regards.

-- 
Arturo Borrero Gonzalez
SRE / Wikimedia Cloud Services
Wikimedia Foundation

___
Wikimedia Cloud Services mailing list
Cloud@lists.wikimedia.org (formerly lab...@lists.wikimedia.org)
https://lists.wikimedia.org/mailman/listinfo/cloud

Re: [Cloud] [Cloud-announce] cloud-vps maintenance Tuesday, 2019-01-14

2020-01-14 Thread Arturo Borrero Gonzalez
On 1/7/20 6:12 AM, Andrew Bogott wrote:
> We'll be upgrading the cloud services OpenStack install next Tuesday, beginning
> at 12:00 noon UTC
> 
> The entire upgrade process may take an hour or two.  Early on in the process,
> Horizon (and associated OpenStack APIs) will be disabled (probably for 20 to 30
> minutes.)  There may also be brief network interruptions during the upgrade.
> 
> Toolforge and existing VMs should be largely unaffected apart from possible
> network hiccups.
> 

Reminder,

this will be happening in about 30 minutes!

regards.

-- 
Arturo Borrero Gonzalez
SRE / Wikimedia Cloud Services
Wikimedia Foundation

___
Wikimedia Cloud Services mailing list
Cloud@lists.wikimedia.org (formerly lab...@lists.wikimedia.org)
https://lists.wikimedia.org/mailman/listinfo/cloud

Re: [Cloud] Webservice down

2020-01-22 Thread Arturo Borrero Gonzalez
On 1/22/20 11:32 AM, David Richfield wrote:
> Hi all,
> 
> The parliament diagram tool (
> https://tools.wmflabs.org/parliamentdiagram/parlitest.php ) is down.
> Last time it happened was a week ago: I just restarted the webservice
> like Alex did, but now it's down again and I'm at work, so I can't log
> in for the next six hours or so. Can someone restart it for me?
> 

here you go!

tools.parliamentdiagram@tools-sgebastion-07:~$ webservice status
Your webservice is not running

tools.parliamentdiagram@tools-sgebastion-07:~$ webservice start
Starting webservice...

tools.parliamentdiagram@tools-sgebastion-07:~$ webservice status
Your webservice of type lighttpd is running



> Also, how can I find out why it keeps going down?
> 
Try inspecting log files. For example:

tools.parliamentdiagram@tools-sgebastion-07:~$ wc -l error.log
283822 error.log

You have plenty of information there, including some "funny" things like:

Traceback (most recent call last):
  File "/data/project/parliamentdiagram/public_html//westminster.py", line 81, in <module>
    sumdelegates['left'],
    sumdelegates['right'])/float(optionlist['wingrows']['left'])))
ZeroDivisionError: float division by zero
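
With a log of this size, counting which exception lines occur most often can
point to the most frequent failure. The following is a minimal Python sketch,
not part of the tool itself: the function name summarize_exceptions and the
sample lines are hypothetical, and it assumes the log contains Python tracebacks
whose final line looks like "SomeError: message".

```python
import re
from collections import Counter

# Matches the last line of a Python traceback, e.g.
# "ZeroDivisionError: float division by zero".
EXCEPTION_LINE = re.compile(r"^(\w+(?:Error|Exception)): (.*)$")


def summarize_exceptions(lines):
    """Count how often each 'ExceptionName: message' line appears."""
    counts = Counter()
    for line in lines:
        match = EXCEPTION_LINE.match(line.strip())
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts


# Hypothetical sample. In practice you would pass an open file instead, e.g.
#   with open("error.log", errors="replace") as log:
#       counts = summarize_exceptions(log)
sample = [
    "Traceback (most recent call last):",
    "ZeroDivisionError: float division by zero",
    "ZeroDivisionError: float division by zero",
    "KeyError: 'wingrows'",
]
counts = summarize_exceptions(sample)
for (name, message), count in counts.most_common():
    print(count, name, message)
```

Exceptions whose class name does not end in "Error" or "Exception" are skipped
by this pattern, so treat the result as a rough overview only.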




Arturo Borrero Gonzalez
SRE / Wikimedia Cloud Services
Wikimedia Foundation

___
Wikimedia Cloud Services mailing list
Cloud@lists.wikimedia.org (formerly lab...@lists.wikimedia.org)
https://lists.wikimedia.org/mailman/listinfo/cloud

Re: [Cloud] [Cloud-announce] [Toolforge] 2020 Kubernetes cluster automatic migration phase beginning

2020-02-24 Thread Arturo Borrero Gonzalez
On 2/21/20 5:14 PM, Arthur Smith wrote:
> One question - I seem to be getting some more timeout-related 500 server errors.
> Was there a change in how that is handled somehow (i.e. reduced time limit for
> response from the server)? I realize it's good practice to respond quickly, just
> some of the existing cases don't at the moment and I'm hitting them occasionally.
> 

There are at least 3 proxies involved in serving Toolforge webservices requests:

1) tool main front proxy (dynamicproxy) (http)
2) kubernetes front haproxy (tcp)
3) kubernetes nginx-ingress (http) and perhaps kube-proxy (tcp)

More information here:
https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Networking_and_ingress

This is to say, yes, serving your request as soon as possible should help keep
the different proxy connections alive and working smoothly.
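
To illustrate why a slow response is fragile when several proxies are chained:
the time a backend has to answer is bounded by the shortest timeout along the
path. In the Python sketch below, only the proxy names come from the list
above. The timeout values are invented for illustration and are NOT the real
Toolforge settings, and the function names are hypothetical.

```python
# Hypothetical per-hop timeouts in seconds; NOT the real Toolforge settings.
PROXY_TIMEOUTS = {
    "dynamicproxy (http)": 180.0,
    "kubernetes front haproxy (tcp)": 90.0,
    "nginx-ingress (http)": 60.0,
}


def effective_timeout(timeouts):
    """Return the hop with the shortest timeout, and that timeout."""
    hop = min(timeouts, key=timeouts.get)
    return hop, timeouts[hop]


def survives(response_seconds, timeouts):
    """True if the response arrives before the shortest timeout expires."""
    _, limit = effective_timeout(timeouts)
    return response_seconds < limit
```

Under these made-up numbers, a request that takes 75 seconds would be cut off
by the hop with the 60-second limit, even though the other hops would have
waited longer.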

As of this email, we don't have any particular metrics or insights into proxy
performance. This is something we could explore in the near future (for example,
by creating a specific Grafana dashboard).

regards.

-- 
Arturo Borrero Gonzalez
SRE / Wikimedia Cloud Services
Wikimedia Foundation

___
Wikimedia Cloud Services mailing list
Cloud@lists.wikimedia.org (formerly lab...@lists.wikimedia.org)
https://lists.wikimedia.org/mailman/listinfo/cloud

Re: [Cloud] [Cloud-announce] [Toolforge] 2020 Kubernetes cluster automatic migration phase beginning

2020-02-24 Thread Arturo Borrero Gonzalez
On 2/23/20 8:51 PM, Arthur Smith wrote:
> Actually I am beginning to suspect the 500 server errors are caused by an
> out-of-memory condition. Do the new kubernetes containers have lower memory
> usage limits than the old ones?
> 

Yes, you are right:

https://wikitech.wikimedia.org/wiki/News/2020_Kubernetes_cluster_migration#Lower_default_resource_limits_for_webservice

hope that helps.

regards.


-- 
Arturo Borrero Gonzalez
SRE / Wikimedia Cloud Services
Wikimedia Foundation

___
Wikimedia Cloud Services mailing list
Cloud@lists.wikimedia.org (formerly lab...@lists.wikimedia.org)
https://lists.wikimedia.org/mailman/listinfo/cloud

[Cloud] Changes to CloudVPS web proxy (XFF) on 2020-04-15

2020-04-01 Thread Arturo Borrero Gonzalez
Hi there!

If you use a CloudVPS web proxy, this email is for you. Toolforge
developers/users can ignore this email.

We are introducing a change to eliminate the 'X-Forwarded-For' HTTP header that
the CloudVPS web proxy adds when forwarding the HTTP request to your instance.
This header contains the original IP address of the internet client that sent
the request. This is private information that we would like to reduce in our
environment [0].

You use the web proxy if you have a public web endpoint hosted in CloudVPS under
the wmflabs.org domain. These are generally configured using Horizon in the
"DNS > Web Proxies" section.

Examples of web proxy names:
 * accounts.wmflabs.org
 * glampipe.wmflabs.org
 * incubator.wmflabs.org

The full list can be seen in the OpenStack Browser tool [1].

We are ready to introduce this change [2], but wanted to give a heads-up to
projects that do require this information for whatever reason. We would like to
hear from you in the next couple of weeks. Please contact us in the Phabricator
task [0] and include some rationale for why you need the XFF header.

This is the timeline this change will follow:

* 2020-04-01: this email, start collecting list of things that require XFF
* 2020-04-07: start evaluating list of things that require XFF
* 2020-04-15: introduce the change, with proper case whitelisting

When the change is introduced, in two weeks from now, proxy backends that were
not whitelisted will stop receiving the XFF header.
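
The effect on a backend can be sketched in a few lines of Python. This is only
an illustration of the behavior described above, not the actual proxy
implementation (see the Gerrit change [2] for that). The function name, the
host names and the client address are hypothetical.

```python
def headers_for_backend(headers, backend, allowlist):
    """Return the headers to forward, dropping X-Forwarded-For unless allowed."""
    if backend in allowlist:
        return dict(headers)
    return {
        name: value
        for name, value in headers.items()
        if name.lower() != "x-forwarded-for"
    }


# Hypothetical request and allowlist, for illustration only.
request_headers = {
    "Host": "example.wmflabs.org",
    "X-Forwarded-For": "203.0.113.7",
}
allowlist = {"needs-xff.wmflabs.org"}

plain = headers_for_backend(request_headers, "example.wmflabs.org", allowlist)
listed = headers_for_backend(request_headers, "needs-xff.wmflabs.org", allowlist)
```

An application behind a proxy that is not whitelisted should therefore be
prepared for the header to be absent, rather than assume it is always set.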

Please reach out for any questions or comments.

regards.

[0] https://phabricator.wikimedia.org/T135046
[1] https://openstack-browser.toolforge.org/project/project-proxy
[2] https://gerrit.wikimedia.org/r/c/operations/puppet/+/583098

-- 
Arturo Borrero Gonzalez
SRE / Wikimedia Cloud Services
Wikimedia Foundation

___
Wikimedia Cloud Services mailing list
Cloud@lists.wikimedia.org (formerly lab...@lists.wikimedia.org)
https://lists.wikimedia.org/mailman/listinfo/cloud

[Cloud] CloudVPS network change (routing source IP) on 2020-04-13

2020-04-06 Thread Arturo Borrero Gonzalez
Hi there!

In a few days from now (2020-04-13), the CloudVPS network will see a change
happening that will likely go unnoticed, but it is important enough to share it
with you beforehand.

We will be changing the IPv4 address that we use as the main source NAT for
egress connections (initiated in the VM instances). This change won't affect VM
instances using floating IPs.

Old IP address: 185.15.56.1
New IP address: 208.80.155.92

If you know of anywhere (a firewall, ACL or any other mechanism) that had this
address hardcoded, you will need to update it.
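
A quick way to look for the old address in configuration files is a small
script like the Python sketch below. The two addresses are the ones given
above. The function name and the sample configuration lines are hypothetical.

```python
import re

OLD_NAT_IP = "185.15.56.1"    # old address, from this announcement
NEW_NAT_IP = "208.80.155.92"  # new address, from this announcement

# Match the old address only as a complete IPv4 address, so that for example
# 185.15.56.10 or 1185.15.56.1 are not reported.
_OLD_IP_PATTERN = re.compile(
    r"(?<!\d)(?<!\d\.)" + re.escape(OLD_NAT_IP) + r"(?!\d)(?!\.\d)"
)


def find_hardcoded(lines):
    """Return (line_number, line) pairs that mention the old NAT address."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if _OLD_IP_PATTERN.search(line)
    ]


# Hypothetical firewall/ACL snippet, for illustration only.
sample_config = [
    "allow 185.15.56.1;\n",
    "allow 185.15.56.10;\n",
    "allow 208.80.155.92;\n",
]
for number, line in find_hardcoded(sample_config):
    print(f"line {number}: {line}")
```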

See this wikitech page for more details:

https://wikitech.wikimedia.org/wiki/News/CloudVPS_NAT_change

Please reach out if you have any doubts, questions, or any other issue.

regards.

-- 
Arturo Borrero Gonzalez
SRE / Wikimedia Cloud Services
Wikimedia Foundation

___
Wikimedia Cloud Services mailing list
Cloud@lists.wikimedia.org (formerly lab...@lists.wikimedia.org)
https://lists.wikimedia.org/mailman/listinfo/cloud

Re: [Cloud] CloudVPS network change (routing source IP) on 2020-04-13

2020-04-09 Thread Arturo Borrero Gonzalez
On 4/6/20 8:00 PM, Arturo Borrero Gonzalez wrote:
> Hi there!
> 
> In a few days from now (2020-04-13), the CloudVPS network will see a change
> happening that will likely go unnoticed, but it is important enough to share it
> with you beforehand.
> 
> We will be changing the IPv4 address that we use as the main source NAT for
> egress connections (initiated in the VM instances). This change won't affect VM
> instances using floating IPs.
> 
> Old IP address: 185.15.56.1
> New IP address: 208.80.155.92
> 
> If you know of anywhere (a firewall, ACL or any other mechanism) that had this
> address hardcoded, you will need to update it.
> 
> See this wikitech page for more details:
> 
> https://wikitech.wikimedia.org/wiki/News/CloudVPS_NAT_change
> 

In the end, it has been decided that this change will not happen.

You can safely ignore the information that was initially shared.

regards.

-- 
Arturo Borrero Gonzalez
SRE / Wikimedia Cloud Services
Wikimedia Foundation

___
Wikimedia Cloud Services mailing list
Cloud@lists.wikimedia.org (formerly lab...@lists.wikimedia.org)
https://lists.wikimedia.org/mailman/listinfo/cloud

[Cloud] Toolforge: new domain toolforge.org

2020-04-13 Thread Arturo Borrero Gonzalez
Hi!

We are happy to announce the new domain 'toolforge.org' is now ready to be
adopted by our Toolforge community.

There is a lot of information related to this change in a wikitech page we have
for this:

https://wikitech.wikimedia.org/wiki/News/Toolforge.org

The most important change you will see happening is a new domain/scheme for
Toolforge-hosted webservices:

* from https://tools.wmflabs.org/<toolname>/
* to   https://<toolname>.toolforge.org/

A live example of this change can be found in our internal openstack-browser
webservice tool:

* legacy URL: https://tools.wmflabs.org/openstack-browser/
* new URL:    https://openstack-browser.toolforge.org
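
The mapping between the two schemes can be sketched in Python as follows. This
is a minimal illustration based only on the example above. The function name is
hypothetical, and the real redirection logic may handle special cases that this
sketch ignores.

```python
from urllib.parse import urlsplit, urlunsplit

LEGACY_HOST = "tools.wmflabs.org"
NEW_DOMAIN = "toolforge.org"


def to_canonical(url):
    """Map a legacy tools.wmflabs.org URL to the per-tool toolforge.org form.

    URLs that are not on the legacy host, or that have no tool name in the
    path, are returned unchanged.
    """
    parts = urlsplit(url)
    if parts.hostname != LEGACY_HOST:
        return url
    # The first path segment is the tool name; the rest is the tool's own path.
    tool, _, rest = parts.path.lstrip("/").partition("/")
    if not tool:
        return url
    return urlunsplit(
        (parts.scheme, f"{tool}.{NEW_DOMAIN}", "/" + rest, parts.query, parts.fragment)
    )


print(to_canonical("https://tools.wmflabs.org/openstack-browser/"))
```

The tool name moves from the first path segment into the host name, and the
rest of the path and the query string are kept as they were.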

This domain change is something we have been working on for months prior to
this announcement. Part of our work has been to ensure we have a smooth
transition from the old domain (and URL scheme) to the new canonical one.
However, we acknowledge the ride might be bumpy for some folks, due to technical
challenges or cases we didn't consider when planning this migration. Please
reach out immediately if you find any limitation or failure anywhere related
to this change. The wikitech page also contains a section with information on
common problems.

You can check now whether your webservice needs any specific change by creating a
temporary redirect to the new canonical URL:

$ webservice --canonical --backend=kubernetes start [..]
$ webservice --canonical --backend=gridengine start [..]

The --canonical switch will create a temporary redirect that you can turn on/off.
Please use this to check how your webservice behaves with the new domain/URL
scheme. If you start the webservice without --canonical, the temporary redirect
will be removed.

We aim to introduce permanent redirects for the legacy URLs on 2020-06-15. We
expect to keep serving legacy URLs forever, by means of redirections to the new
URLs. More information on the redirections can also be found in the wikitech page.

The toolforge.org domain is finally here! <3

-- 
Arturo Borrero Gonzalez
SRE / Wikimedia Cloud Services
Wikimedia Foundation

___
Wikimedia Cloud Services mailing list
Cloud@lists.wikimedia.org (formerly lab...@lists.wikimedia.org)
https://lists.wikimedia.org/mailman/listinfo/cloud

Re: [Cloud] Toolforge: new domain toolforge.org

2020-04-13 Thread Arturo Borrero Gonzalez
On 4/13/20 4:18 PM, Maarten Dammers wrote:
> We sure like to rename and move around. I hope Toolforge.org lasts a lot longer!
> 

Hi Maarten,

I think I understand your concern. Sometimes, naming things is hard :-)

However, let me point out that toolserver and toolforge, while similar in spirit
and scope, are different services, with different technologies involved, and
more things to offer to the users (developers).

The new domain, for me, means the service is evolving even more, in a good way.

Also, please note the change is not only a cosmetic one. It involves a more
secure approach to hosting each tool's webservice, moving from an all-shared
domain to a domain per tool.

-- 
Arturo Borrero Gonzalez
SRE / Wikimedia Cloud Services
Wikimedia Foundation

___
Wikimedia Cloud Services mailing list
Cloud@lists.wikimedia.org (formerly lab...@lists.wikimedia.org)
https://lists.wikimedia.org/mailman/listinfo/cloud

Re: [Cloud] [Cloud-announce] Changes to CloudVPS web proxy (XFF) on 2020-04-15

2020-04-14 Thread Arturo Borrero Gonzalez
On 4/14/20 6:25 PM, Jason Sherman wrote:
> Hi there,
> 
> I was wondering if you were planning on exposing some kind of rate-limiting
> option for the web proxies in horizon? I'm thinking this will effectively mean
> no more rate-limiting per remote address at the instance level. Every once in a
> while, our project gets hammered by script kiddies and our application service
> gets brought down. I've gone ahead and implemented rate limiting in nginx that
> has a very high limit set across all ip addresses that should basically work,
> but typically I would set the limits to be per-client-ip to the extent allowed
> by the practicalities of NAT. This is not a blocker in any way for us, and I'd
> rather make do with less user info wherever possible.
> 

Hi there!

What you did seems correct to me, that is, implementing the controls on your own
servers.
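
For readers who want to apply the same idea inside an application, a limit
shared across all clients can be sketched as a token bucket. The Python code
below is an illustration only and is not the nginx configuration mentioned in
the quoted message. The class name and the parameter values are hypothetical.

```python
import time


class TokenBucket:
    """Global token bucket: one shared budget for all clients."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate            # tokens added per second
        self.capacity = capacity    # maximum burst size
        self.tokens = float(capacity)
        self.clock = clock          # injectable clock, useful for testing
        self.last = clock()

    def allow(self):
        """Consume one token if available; return whether the request may pass."""
        now = self.clock()
        elapsed = now - self.last
        self.last = now
        # Refill according to elapsed time, never beyond the burst capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

Because the bucket is not keyed by client address, it needs no knowledge of the
remote IP. The trade-off is that one abusive client can use up the budget for
everyone, which is why the limit has to be set fairly high.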

That being said, I understand your concern. We have mechanisms in place for
banning specific abusers. If we detect more widespread problems, we could
introduce other mechanisms and controls to ensure service availability.

Should you detect someone hammering your servers in CloudVPS, please contact us.

regards.
-- 
Arturo Borrero Gonzalez
SRE / Wikimedia Cloud Services
Wikimedia Foundation

___
Wikimedia Cloud Services mailing list
Cloud@lists.wikimedia.org (formerly lab...@lists.wikimedia.org)
https://lists.wikimedia.org/mailman/listinfo/cloud

Re: [Cloud] Changes to CloudVPS web proxy (XFF) on 2020-04-15

2020-04-15 Thread Arturo Borrero Gonzalez
On 4/1/20 2:16 PM, Arturo Borrero Gonzalez wrote:
> Hi there!
> 
> If you use a CloudVPS web proxy, this email is for you. Toolforge
> developers/users can ignore this email.
> 
> We are introducing a change to eliminate the 'X-Forwarded-For' HTTP header that
> the CloudVPS web proxy adds when forwarding the HTTP request to your instance.
> This header contains the original IP address of the internet client that sent
> the request. This is private information that we would like to reduce in our
> environment [0].
> 
> You use the web proxy if you have a public web endpoint hosted in CloudVPS under
> the wmflabs.org domain. These are generally configured using Horizon in the
> "DNS > Web Proxies" section.
> 
> Examples of web proxy names:
>  * accounts.wmflabs.org
>  * glampipe.wmflabs.org
>  * incubator.wmflabs.org
> 
> The full list can be seen in the OpenStack Browser tool [1].
> 
> We are ready to introduce this change [2], but wanted to give a heads-up to
> projects that do require this information for whatever reason. We would like to
> hear from you in the next couple of weeks. Please contact us in the Phabricator
> task [0] and include some rationale for why you need the XFF header.
> 
> This is the timeline this change will follow:
> 
> * 2020-04-01: this email, start collecting list of things that require XFF
> * 2020-04-07: start evaluating list of things that require XFF
> * 2020-04-15: introduce the change, with proper case whitelisting
> 
> When the change is introduced, in two weeks from now, proxy backends that were
> not whitelisted will stop receiving the XFF header.
> 
> Please reach out for any questions or comments.
> 
> regards.
> 
> [0] https://phabricator.wikimedia.org/T135046
> [1] https://openstack-browser.toolforge.org/project/project-proxy
> [2] https://gerrit.wikimedia.org/r/c/operations/puppet/+/583098
> 

Hi there!

This change is being applied now!

regards.
-- 
Arturo Borrero Gonzalez
SRE / Wikimedia Cloud Services
Wikimedia Foundation

___
Wikimedia Cloud Services mailing list
Cloud@lists.wikimedia.org (formerly lab...@lists.wikimedia.org)
https://lists.wikimedia.org/mailman/listinfo/cloud

Re: [Cloud] Toolforge: new domain toolforge.org

2020-05-04 Thread Arturo Borrero Gonzalez
Hi there!

This is a reminder about the ongoing migration to the new domain and URL/path
scheme for webservices running in Toolforge. The soft deadline for this
migration period is 2020-05-31.

To try the change, you only need to run the webservice command with the
--canonical argument. Please review the documentation here:

https://wikitech.wikimedia.org/wiki/News/Toolforge.org

Adopting this change early is worthwhile for many reasons, especially the more
secure environment provided by proper domain isolation. Also, the new domain
better reflects the identity of the Toolforge service :-)
Moreover, during the compatibility period, we would like to collect feedback and
bug reports from our users before the soft deadline.

As of today, we have 40 tool webservices that are running in the new domain and
using the new path scheme. Find them here:

https://replag.toolforge.org
https://testwikis.toolforge.org
https://meetingtimes.toolforge.org
https://speedpatrolling.toolforge.org
https://wiki-tennis.toolforge.org
https://xslack.toolforge.org
https://urbanecm-test-1.toolforge.org
https://wdmm.toolforge.org
https://quickcategories.toolforge.org
https://wikiportretdev.toolforge.org
https://zppixbot-test.toolforge.org
https://anticompositetools.toolforge.org
https://james.toolforge.org
https://bd808-ruby.toolforge.org
https://ytcleaner.toolforge.org
https://wd-shex-infer.toolforge.org
https://github-pr-closer.toolforge.org
https://wikistream.toolforge.org
https://secwatch.toolforge.org
https://giftbot.toolforge.org
https://pagepile-visual-filter.toolforge.org
https://zppixbot.toolforge.org
https://stashbot.toolforge.org
https://moedata.toolforge.org
https://sal.toolforge.org
https://covid-obit.toolforge.org
https://docker-registry.toolforge.org
https://wb2rdf.toolforge.org
https://massmailer.toolforge.org
https://wd-image-positions.toolforge.org
https://versions.toolforge.org
https://lexeme-forms.toolforge.org
https://ukbot.toolforge.org
https://machtsinn.toolforge.org
https://bikeshed.toolforge.org
https://gmt.toolforge.org
https://ipcheck.toolforge.org
https://signatures.toolforge.org
https://templatedata-filler.toolforge.org
https://wordcount.toolforge.org

Please reach out for any comments, doubts or questions.

regards.

-- 
Arturo Borrero Gonzalez
SRE / Wikimedia Cloud Services
Wikimedia Foundation

___
Wikimedia Cloud Services mailing list
Cloud@lists.wikimedia.org (formerly lab...@lists.wikimedia.org)
https://lists.wikimedia.org/mailman/listinfo/cloud

[Cloud] Toolforge grid now using tesseract-ocr 4.1.1

2020-05-20 Thread Arturo Borrero Gonzalez
Hi there!

We just deployed tesseract-ocr v4.1.1 in the Toolforge grid.
The context of this update is the phabricator task T247422 [0].

Please report any issue you may find.

regards!

[0] https://phabricator.wikimedia.org/T247422
-- 
Arturo Borrero Gonzalez
SRE / Wikimedia Cloud Services
Wikimedia Foundation

___
Wikimedia Cloud Services mailing list
Cloud@lists.wikimedia.org (formerly lab...@lists.wikimedia.org)
https://lists.wikimedia.org/mailman/listinfo/cloud

Re: [Cloud] Toolforge: new domain toolforge.org

2020-06-02 Thread Arturo Borrero Gonzalez
Hi there!

The soft deadline for migrating to the toolforge.org domain was two days ago, on
2020-05-31.

To adopt the change in a controlled way, you only need to run the webservice
command with the --canonical argument. Please review the documentation here:

https://wikitech.wikimedia.org/wiki/News/Toolforge.org

The next event is the hard deadline. In about 2 weeks (on 2020-06-15) we will
introduce a forced redirect from the legacy URL to the new one.

Please contact your fellow tool developers if you think they aren't aware of
this migration. And please reach out to us if you have doubts or need help.

I checked today, and at least 110 webservices are already running on the
new domain and URL scheme:

https://alphatest.toolforge.org/
https://anticompositetools.toolforge.org/
https://author-disambiguator.toolforge.org/
https://base-encode.toolforge.org/
https://bd808-ruby.toolforge.org/
https://bd808-test2.toolforge.org/
https://bd808-test.toolforge.org/
https://bikeshed.toolforge.org/
https://bookreader.toolforge.org/
https://cdnjs-beta.toolforge.org/
https://cdnjs.toolforge.org/
https://chie-bot.toolforge.org/
https://copypatrol.toolforge.org/
https://covid-obit.toolforge.org/
https://dna.toolforge.org/
https://docker-registry.toolforge.org/
https://event-streams.toolforge.org/
https://fastilybot-reports.toolforge.org/
https://fist.toolforge.org/
https://flickr2commons.toolforge.org/
https://flickrdash.toolforge.org/
https://fontcdn.toolforge.org/
https://fountain-test.toolforge.org/
https://fountain.toolforge.org/
https://ftools.toolforge.org/
https://giftbot.toolforge.org/
https://github-pr-closer.toolforge.org/
https://global-search-test.toolforge.org/
https://global-search.toolforge.org/
https://globalsearch.toolforge.org/
https://gmt.toolforge.org/
https://grantmetrics.toolforge.org/
https://hgztools.toolforge.org/
https://ia-upload.toolforge.org/
https://indic-wscontest.toolforge.org/
https://indic-wsstats.toolforge.org/
https://interaction-timeline.toolforge.org/
https://intersect-contribs.toolforge.org/
https://ipcheck.toolforge.org/
https://ip-range-calc.toolforge.org/
https://itwikinews-rss.toolforge.org/
https://james.toolforge.org/
https://k8s-status.toolforge.org/
https://langviews.toolforge.org/
https://ldap.toolforge.org/
https://lexeme-forms.toolforge.org/
https://machtsinn.toolforge.org/
https://majavah-bot.toolforge.org/
https://massmailer.toolforge.org/
https://meetingtimes.toolforge.org/
https://mix-n-match.toolforge.org/
https://moedata.toolforge.org/
https://morfeusz.toolforge.org/
https://musikanimal.toolforge.org/
https://mwph-api.toolforge.org/
https://mwversion.toolforge.org/
https://mysql-php-session-test.toolforge.org/
https://pagepile-visual-filter.toolforge.org/
https://pageviews-test.toolforge.org/
https://pageviews.toolforge.org/
https://pathoschild-contrib.toolforge.org/
https://phabsearchemail.toolforge.org/
https://phabulous.toolforge.org/
https://plagiabot.toolforge.org/
https://plnode.toolforge.org/
https://qrcode-generator.toolforge.org/
https://quickcategories.toolforge.org/
https://replacer.toolforge.org/
https://replag.toolforge.org/
https://sal.toolforge.org/
https://searchsbl.toolforge.org/
https://section-links.toolforge.org/
https://secwatch.toolforge.org/
https://signatures.toolforge.org/
https://siteviews.toolforge.org/
https://speedpatrolling.toolforge.org/
https://sql-optimizer.toolforge.org/
https://stashbot.toolforge.org/
https://superzerocool.toolforge.org/
https://svgtranslate-test.toolforge.org/
https://svgtranslate.toolforge.org/
https://taxoboxalyzer.toolforge.org/
https://templatedata-filler.toolforge.org/
https://testwikis.toolforge.org/
https://text2hash.toolforge.org/
https://tool-db-usage.toolforge.org/
https://toolviews.toolforge.org/
https://ukbot.toolforge.org/
https://urbanecm-test-1.toolforge.org/
https://url-converter.toolforge.org/
https://versions.toolforge.org/
https://wb2rdf.toolforge.org/
https://wd-image-positions.toolforge.org/
https://wdmm.toolforge.org/
https://wd-shex-infer.toolforge.org/
https://wikicontrib.toolforge.org/
https://wikidata-externalid-url.toolforge.org/
https://wikifile-transfer.toolforge.org/
https://wikiportretdev.toolforge.org/
https://wikisource-bot.toolforge.org/
https://wikistream.toolforge.org/
https://wiki-tennis.toolforge.org/
https://wiki-topic.toolforge.org/
https://wordcount.toolforge.org/
https://wsexport.toolforge.org/
https://xn--dk8hv9g.toolforge.org/
https://xslack.toolforge.org/
https://xtools.toolforge.org/
https://ytcleaner.toolforge.org/
https://zppixbot.toolforge.org/

-- 
Arturo Borrero Gonzalez
SRE / Wikimedia Cloud Services
Wikimedia Foundation

___
Wikimedia Cloud Services mailing list
Cloud@lists.wikimedia.org (formerly lab...@lists.wikimedia.org)
https://lists.wikimedia.org/mailman/listinfo/cloud

Re: [Cloud] Toolforge: new domain toolforge.org

2020-06-18 Thread Arturo Borrero Gonzalez
Hi there!

The hard deadline for migrating to the toolforge.org domain was 3 days ago, on
2020-06-15.

We are aware of some folks from the community still working on finishing up this
migration, and we will give an additional 2 weeks before introducing the
legacy-redirector that will force-redirect all the legacy URLs to the new domain
and URL scheme.

If you need additional context about this migration, please read:

https://wikitech.wikimedia.org/wiki/News/Toolforge.org

We are tracking webservices with missing OAuth grants for the new domain in this
Phabricator task:

https://phabricator.wikimedia.org/T254857

If your tool is unchecked, it means it requires additional work to make sure
OAuth will work with the new domain.

Please contact your fellow tool developers if you think they aren't aware of
this migration. And please, reach out to us if you have doubts or need help.

regards.

-- 
Arturo Borrero Gonzalez
SRE / Wikimedia Cloud Services
Wikimedia Foundation

___
Wikimedia Cloud Services mailing list
Cloud@lists.wikimedia.org (formerly lab...@lists.wikimedia.org)
https://lists.wikimedia.org/mailman/listinfo/cloud

Re: [Cloud] Screen sessions

2020-06-23 Thread Arturo Borrero Gonzalez
On 2020-06-23 16:47, Isaac Johnson wrote:
> I'm interested in running some long-ish scripts that loop through the dump
> replicas on Toolforge. Eventually, this sort of thing might move to crontab, but
> for now it would be nice to run a screen session as we test / debug the scripts.
> The problem is that if I run the scripts from my tool account (i.e. after
> "become <toolname>"), I get the following error: Cannot open your terminal
> '/dev/pts/28' - please check.
> 

You shouldn't run such scripts on the bastions.

The grid is the way to go in this case:

https://wikitech.wikimedia.org/wiki/Help:Toolforge/Grid

Run your script with jsub and it will be scheduled on a grid worker node to run
until it finishes.

regards.

-- 
Arturo Borrero Gonzalez
SRE / Wikimedia Cloud Services
Wikimedia Foundation

___
Wikimedia Cloud Services mailing list
Cloud@lists.wikimedia.org (formerly lab...@lists.wikimedia.org)
https://lists.wikimedia.org/mailman/listinfo/cloud

[Cloud] [Cloud-announce] Toolforge email server now enforcing ratelimiting

2020-06-24 Thread Arturo Borrero Gonzalez
Hi,

We just enabled email rate limiting on our MTA server [0] in Toolforge.
Please report any problem or issue you may find related to this.

The current limit is 100 messages per hour per sender address. We may tune the
value as we observe the behavior of the system and the users.
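
A tool that sends mail in bulk may want to throttle itself below this limit.
The Python sketch below shows one way to count messages per sender. It assumes
a sliding one-hour window, which is an assumption: how the MTA actually measures
the hour is not stated here. The class name and the sender addresses are
hypothetical.

```python
from collections import defaultdict, deque

LIMIT = 100     # messages per sender, from this announcement
WINDOW = 3600   # seconds in one hour


class SenderRateLimiter:
    """Sliding-window counter of accepted messages per sender address."""

    def __init__(self, limit=LIMIT, window=WINDOW):
        self.limit = limit
        self.window = window
        self._sent = defaultdict(deque)  # sender -> timestamps of accepted mail

    def accept(self, sender, now):
        """Return True and record the message if `sender` is under the limit."""
        stamps = self._sent[sender]
        # Forget messages that have fallen out of the window.
        while stamps and now - stamps[0] >= self.window:
            stamps.popleft()
        if len(stamps) >= self.limit:
            return False
        stamps.append(now)
        return True
```

Here `now` is a timestamp in seconds, passed in explicitly so the behavior is
easy to check. A real caller could pass time.time().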

regards.

[0] https://en.wikipedia.org/wiki/Message_transfer_agent

-- 
Arturo Borrero Gonzalez
SRE / Wikimedia Cloud Services
Wikimedia Foundation

___
Wikimedia Cloud Services announce mailing list
cloud-annou...@lists.wikimedia.org (formerly labs-annou...@lists.wikimedia.org)
https://lists.wikimedia.org/mailman/listinfo/cloud-announce
___
Wikimedia Cloud Services mailing list
Cloud@lists.wikimedia.org (formerly lab...@lists.wikimedia.org)
https://lists.wikimedia.org/mailman/listinfo/cloud

Re: [Cloud] Toolforge: new domain toolforge.org

2020-07-06 Thread Arturo Borrero Gonzalez
Hi there!

Tomorrow, 2020-07-07, at about 10:00 UTC we will enable the legacy redirector and
this migration will be completed.

All requests to tools.wmflabs.org/<toolname> will be permanently redirected to
<toolname>.toolforge.org.

If you need additional context about this, please read:

https://wikitech.wikimedia.org/wiki/News/Toolforge.org

Please reach out if you need help or have doubts.

regards.

-- 
Arturo Borrero Gonzalez
SRE / Wikimedia Cloud Services
Wikimedia Foundation

___
Wikimedia Cloud Services mailing list
Cloud@lists.wikimedia.org (formerly lab...@lists.wikimedia.org)
https://lists.wikimedia.org/mailman/listinfo/cloud

Re: [Cloud] Toolforge: new domain toolforge.org

2020-07-07 Thread Arturo Borrero Gonzalez
On 2020-07-06 17:59, Arturo Borrero Gonzalez wrote:
> Hi there!
> 
> Tomorrow, 2020-07-07, at about 10:00 UTC we will enable the legacy redirector and
> this migration will be completed.
> 
> All requests to tools.wmflabs.org/<toolname> will be permanently redirected to
> <toolname>.toolforge.org.
> 
> If you need additional context about this, please read:
> 
> https://wikitech.wikimedia.org/wiki/News/Toolforge.org
> 
> Please reach out if you need help or have doubts.
> 

This has been done!

regards.

-- 
Arturo Borrero Gonzalez
SRE / Wikimedia Cloud Services
Wikimedia Foundation

___
Wikimedia Cloud Services mailing list
Cloud@lists.wikimedia.org (formerly lab...@lists.wikimedia.org)
https://lists.wikimedia.org/mailman/listinfo/cloud

[Cloud] unscheduled keystone maintenance

2020-10-06 Thread Arturo Borrero Gonzalez
Hi there,

we need to perform some unscheduled keystone maintenance right now.

Authentication to some cloud services, in particular Horizon, might be
interrupted during this maintenance period. We expect the maintenance to last no
more than 1 hour.

regards.

-- 
Arturo Borrero Gonzalez
SRE / Wikimedia Cloud Services
Wikimedia Foundation

___
Wikimedia Cloud Services mailing list
Cloud@lists.wikimedia.org (formerly lab...@lists.wikimedia.org)
https://lists.wikimedia.org/mailman/listinfo/cloud


[Cloud] General CloudVPS network maintenance on 2020-10-29

2020-10-22 Thread Arturo Borrero Gonzalez
Hi!

There will be a general CloudVPS network maintenance on 2020-10-29, from 16:00
UTC to 17:00 UTC.

During the operation window, all cloud services might be intermittently down or
inaccessible.

This operation affects all CloudVPS projects, including Toolforge, PAWS and
Quarry. Services running in the cloud might fail to contact external entities,
and connections to ToolsDB, NFS, wiki-replicas or LDAP might be affected as well.
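
Tools that talk to external services or databases can ride out short
interruptions like this by retrying with an increasing delay. The following
Python sketch is a generic illustration and not something Toolforge provides.
The function name, the default values and the choice of exceptions to retry on
are assumptions.

```python
import time


def call_with_retries(operation, attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call `operation` until it succeeds, doubling the wait after each failure.

    Re-raises the last ConnectionError or TimeoutError if every attempt fails.
    """
    for attempt in range(attempts):
        try:
            return operation()
        except (ConnectionError, TimeoutError):
            if attempt == attempts - 1:
                raise
            # Wait 1x, 2x, 4x, ... the base delay before the next attempt.
            sleep(base_delay * 2 ** attempt)
```

The `sleep` parameter exists only so the waiting can be replaced in tests.
Library-specific errors (for example from a database driver) would need to be
added to the except clause.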

In the best case scenario, the changes (and downtime) will be barely noticed.
The maintenance consists of introducing new hardware into the CloudVPS edge
network. You can find additional details in Phabricator [0].

regards.

[0] https://phabricator.wikimedia.org/T265288
-- 
Arturo Borrero Gonzalez
SRE / Wikimedia Cloud Services
Wikimedia Foundation

___
Wikimedia Cloud Services mailing list
Cloud@lists.wikimedia.org (formerly lab...@lists.wikimedia.org)
https://lists.wikimedia.org/mailman/listinfo/cloud


Re: [Cloud] General CloudVPS network maintenance on 2020-10-29

2020-10-29 Thread Arturo Borrero Gonzalez
On 2020-10-22 17:41, Arturo Borrero Gonzalez wrote:
> Hi!
> 
> There will be a general CloudVPS network maintenance on 2020-10-29, from 16:00
> UTC to 17:00 UTC.
> 
> During the operation window, all cloud services might be intermittently down or
> inaccessible.
> 
> This operation affects all CloudVPS projects, including Toolforge, PAWS and
> Quarry. Services running in the cloud might fail to contact external entities,
> and connections to ToolsDB, NFS, wiki-replicas or LDAP might be affected as
> well.
> 
> In the best case scenario, the changes (and downtime) will be barely noticed.
> The maintenance consists of introducing new hardware into the CloudVPS
> edge network. You can find additional details in Phabricator [0].
> 
> regards.
> 
> [0] https://phabricator.wikimedia.org/T265288
> 

Reminder, this is happening now.

regards.

-- 
Arturo Borrero Gonzalez
SRE / Wikimedia Cloud Services
Wikimedia Foundation

___
Wikimedia Cloud Services mailing list
Cloud@lists.wikimedia.org (formerly lab...@lists.wikimedia.org)
https://lists.wikimedia.org/mailman/listinfo/cloud


Re: [Cloud] General CloudVPS network maintenance on 2020-10-29

2020-10-29 Thread Arturo Borrero Gonzalez
On 2020-10-29 16:59, Arturo Borrero Gonzalez wrote:
> On 2020-10-22 17:41, Arturo Borrero Gonzalez wrote:
>> Hi!
>>
>> There will be a general CloudVPS network maintenance on 2020-10-29, from 16:00
>> UTC to 17:00 UTC.
>>
>> During the operation window, all cloud services might be intermittently down or
>> inaccessible.
>>
>> This operation affects all CloudVPS projects, including Toolforge, PAWS and
>> Quarry. Services running in the cloud might fail to contact external entities,
>> and connections to ToolsDB, NFS, wiki-replicas or LDAP might be affected as
>> well.
>>
>> In the best case scenario, the changes (and downtime) will be barely noticed.
>> The maintenance consists of introducing new hardware into the CloudVPS
>> edge network. You can find additional details in Phabricator [0].
>>
>> regards.
>>
>> [0] https://phabricator.wikimedia.org/T265288
>>
> 
> Reminder, this is happening now.
> 

The operation is now completed.
There was a brief interruption of the service, but it should have recovered by now.

Let us know if you see anything weird matching the timing or somehow related to
this operation window.

regards.

-- 
Arturo Borrero Gonzalez
SRE / Wikimedia Cloud Services
Wikimedia Foundation

___
Wikimedia Cloud Services mailing list
Cloud@lists.wikimedia.org (formerly lab...@lists.wikimedia.org)
https://lists.wikimedia.org/mailman/listinfo/cloud


Re: [Cloud] General CloudVPS network maintenance on 2020-10-29

2020-10-30 Thread Arturo Borrero Gonzalez
On 2020-10-30 00:04, Maarten Dammers wrote:
> Hi Arturo,
> 
> On 29-10-2020 18:30, Arturo Borrero Gonzalez wrote:
>> Let us know if you see anything weird matching the timing or somehow related to
>> this operation window.
> 
> This was announced as network maintenance, but the tools-sgebastion-08 rebooted
> at Thu Oct 29 17:15. Is this related or did the server happen to crash in the
> same window?
> 

Hi there,

due to the network maintenance, we had NFS issues on some VMs, particularly
affecting our Grid service.

So indeed we rebooted a couple of servers (including tools-sgebastion-08) to get
to a stable situation and make sure everything was working as expected.

thanks for double checking.

regards.

-- 
Arturo Borrero Gonzalez
SRE / Wikimedia Cloud Services
Wikimedia Foundation

___
Wikimedia Cloud Services mailing list
Cloud@lists.wikimedia.org (formerly lab...@lists.wikimedia.org)
https://lists.wikimedia.org/mailman/listinfo/cloud


[Cloud] General CloudVPS network maintenance on 2020-11-09 (today)

2020-11-09 Thread Arturo Borrero Gonzalez
Hi!

There will be a general CloudVPS network maintenance on 2020-11-09 @ 12:30 UTC.
The operation window will last for 1h. During the operation, all cloud services
will be inaccessible or intermittently down.

This operation affects all CloudVPS projects, including Toolforge, PAWS and
Quarry. Services running in the cloud might fail to contact external entities,
and connections to ToolsDB, NFS, wiki-replicas or LDAP will be affected as well.

The operation we are doing today is a followup to what we did two weeks ago [0],
and involves changing the IP addressing of the network that connects the
CloudVPS network to the internet.

Sorry for the short notice, we couldn't avoid scheduling this for today.

regards.

[0] https://phabricator.wikimedia.org/T265288

-- 
Arturo Borrero Gonzalez
SRE / Wikimedia Cloud Services
Wikimedia Foundation

___
Wikimedia Cloud Services mailing list
Cloud@lists.wikimedia.org (formerly lab...@lists.wikimedia.org)
https://lists.wikimedia.org/mailman/listinfo/cloud


[Cloud] PAWS kubernetes upgrade

2020-11-30 Thread Arturo Borrero Gonzalez
Hi there,

we are about to upgrade the kubernetes version that runs PAWS, from 1.6 to 1.17.
We don't expect any major interruptions of the service, perhaps only some
hiccups when pods are restarted/rescheduled.

More information is available in this phabricator ticket:
 https://phabricator.wikimedia.org/T268669

The operation may take between 30 minutes and 1 hour, and we will start
right after I send this email.

Please, ping us if you see anything wrong.

regards.

-- 
Arturo Borrero Gonzalez
SRE / Wikimedia Cloud Services
Wikimedia Foundation

___
Wikimedia Cloud Services mailing list
Cloud@lists.wikimedia.org (formerly lab...@lists.wikimedia.org)
https://lists.wikimedia.org/mailman/listinfo/cloud


[Cloud] Toolforge kubernetes maintenance today 2020-12-10 @ 15:30 UTC

2020-12-10 Thread Arturo Borrero Gonzalez

Hi there!

Today 2020-12-10 @ 15:30 UTC we will perform an upgrade of the Toolforge 
kubernetes cluster [0].


We don't expect any major disruption of the service, but we detected in past 
upgrades that some components might be restarted, causing brief interruptions of 
network flows.


Given the number of worker nodes we have, more than 50, the operation will take 
us at least a couple of hours.


Tool maintainers: you don't have to do anything during this operation, but if 
you detect anything weird please contact us either in the phabricator task [0], 
in the IRC channel #wikimedia-cloud or in the cloud@lists.wikimedia.org [1] 
mailing list.


regards.

[0] https://phabricator.wikimedia.org/T263284
[1] https://lists.wikimedia.org/mailman/listinfo/cloud
--
Arturo Borrero Gonzalez
SRE / Wikimedia Cloud Services
Wikimedia Foundation

___
Wikimedia Cloud Services mailing list
Cloud@lists.wikimedia.org (formerly lab...@lists.wikimedia.org)
https://lists.wikimedia.org/mailman/listinfo/cloud


Re: [Cloud] Toolforge kubernetes maintenance today 2020-12-10 @ 15:30 UTC

2020-12-10 Thread Arturo Borrero Gonzalez

On 12/10/20 1:33 PM, Arturo Borrero Gonzalez wrote:

Hi there!

Today 2020-12-10 @ 15:30 UTC we will perform an upgrade of the Toolforge 
kubernetes cluster [0].


We don't expect any major disruption of the service, but we detected in past 
upgrades that some components might be restarted, causing brief interruptions of 
network flows.


Given the number of worker nodes we have, more than 50, the operation will take 
us at least a couple of hours.


Tool maintainers: you don't have to do anything during this operation, but if 
you detect anything weird please contact us either in the phabricator task [0], 
in the IRC channel #wikimedia-cloud or in the cloud@lists.wikimedia.org [1] 
mailing list.


regards.

[0] https://phabricator.wikimedia.org/T263284
[1] https://lists.wikimedia.org/mailman/listinfo/cloud


This is starting right now!

--
Arturo Borrero Gonzalez
SRE / Wikimedia Cloud Services
Wikimedia Foundation

___
Wikimedia Cloud Services mailing list
Cloud@lists.wikimedia.org (formerly lab...@lists.wikimedia.org)
https://lists.wikimedia.org/mailman/listinfo/cloud


[Cloud] Change to how Cloud VPS and Toolforge contact Wikis

2021-01-25 Thread Arturo Borrero Gonzalez

Hello,

we are planning to change how Cloud VPS instances and Toolforge tools contact 
WMF-hosted wikis, in particular the source IP address for the network connection.

The new IP address that wikis will see is 185.15.56.1.

The change is scheduled to go live on 2021-02-08.

More detailed information in wikitech:

 https://wikitech.wikimedia.org/wiki/News/CloudVPS_NAT_wikis

If you are a Cloud VPS user or Toolforge developer, check your tools after that 
date to make sure they are properly running. If you detect a block, a rate-limit 
or similar, please let us know.


If you are a WMF SRE or engineer involved with the wikis, be informed that this 
address could generate a significant traffic volume, perhaps about 30%-40% of total 
wiki edits. We are trying to make the change as smooth as possible, so please 
send your feedback if you think there is something we haven't accounted for yet.
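
For anyone on the wiki side who wants to recognize this traffic, here is a minimal sketch using only the Python standard library. The function name and the sample addresses other than 185.15.56.1 are illustrative assumptions, not part of any deployed configuration.

```python
import ipaddress

# The NAT address announced above; everything else here is illustrative.
CLOUD_VPS_NAT_ADDRESS = ipaddress.ip_address("185.15.56.1")


def is_cloud_vps_nat(source_ip: str) -> bool:
    """Return True if source_ip is the announced Cloud VPS NAT address."""
    try:
        return ipaddress.ip_address(source_ip) == CLOUD_VPS_NAT_ADDRESS
    except ValueError:
        # The string is not a valid IP address at all.
        return False


if __name__ == "__main__":
    for candidate in ("185.15.56.1", "198.51.100.7", "not-an-ip"):
        print(candidate, is_cloud_vps_nat(candidate))
```

Parsing with `ipaddress` rather than comparing strings avoids false negatives caused by formatting differences in the logged address.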


Thanks, best regards.
--
Arturo Borrero Gonzalez
SRE / Wikimedia Cloud Services
Wikimedia Foundation

___
Wikimedia Cloud Services mailing list
Cloud@lists.wikimedia.org (formerly lab...@lists.wikimedia.org)
https://lists.wikimedia.org/mailman/listinfo/cloud


Re: [Cloud] [Ops] Change to how Cloud VPS and Toolforge contact Wikis

2021-01-29 Thread Arturo Borrero Gonzalez

On 1/28/21 9:50 PM, Martin Urbanec wrote:

Hi Arturo,

a quick question: MediaWiki has a strict limit on bad logins. If all of WMCS 
is NATed, that would mean that /any/ bot having too many bad login attempts 
could block all other bots from logging in. Is that prevented through technical 
measures, somehow?
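
The concern can be illustrated with a toy per-IP failure counter. This is a hypothetical sketch, not MediaWiki's actual throttling code; the class name and the limit of 5 failures are assumptions chosen for the example.

```python
from collections import defaultdict


class PerIpLoginThrottle:
    """Toy throttle: block an IP once it accumulates too many failed logins."""

    def __init__(self, max_failures: int = 5) -> None:
        self.max_failures = max_failures
        self.failures = defaultdict(int)

    def record_failure(self, source_ip: str) -> None:
        self.failures[source_ip] += 1

    def is_blocked(self, source_ip: str) -> bool:
        return self.failures[source_ip] >= self.max_failures


def simulate_shared_nat() -> bool:
    """One misbehaving bot behind the NAT exhausts the limit for everyone."""
    throttle = PerIpLoginThrottle(max_failures=5)
    nat_ip = "185.15.56.1"  # every Cloud VPS client appears as this address
    for _ in range(5):
        throttle.record_failure(nat_ip)  # failures from a single broken bot
    # A different, well-behaved bot now shares the same blocked address.
    return throttle.is_blocked(nat_ip)
```

Because the counter is keyed only by source address, it cannot tell the broken bot apart from its neighbours once they all share one NAT address.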




Hi,

do you know where this limit configuration can be found?

thanks for the heads up.

regards.

--
Arturo Borrero Gonzalez
SRE / Wikimedia Cloud Services
Wikimedia Foundation

___
Wikimedia Cloud Services mailing list
Cloud@lists.wikimedia.org (formerly lab...@lists.wikimedia.org)
https://lists.wikimedia.org/mailman/listinfo/cloud


Re: [Cloud] [Ops] Change to how Cloud VPS and Toolforge contact Wikis

2021-01-29 Thread Arturo Borrero Gonzalez

On 1/29/21 10:29 AM, Amir Sarabadani wrote:
This is sorta (under-)documented in 
https://www.mediawiki.org/wiki/Manual:$wgRateLimits


I made a patch for it but I'm not sure if I did it correctly.



Excellent, thanks!

Could you please share a link to gerrit so I can keep that patch on my radar?

regards.
--
Arturo Borrero Gonzalez
SRE / Wikimedia Cloud Services
Wikimedia Foundation

___
Wikimedia Cloud Services mailing list
Cloud@lists.wikimedia.org (formerly lab...@lists.wikimedia.org)
https://lists.wikimedia.org/mailman/listinfo/cloud


Re: [Cloud] Change to how Cloud VPS and Toolforge contact Wikis

2021-02-02 Thread Arturo Borrero Gonzalez

On 1/25/21 11:55 AM, Arturo Borrero Gonzalez wrote:

Hello,

we are planning to change how Cloud VPS instances and Toolforge tools contact 
WMF-hosted wikis, in particular the source IP address for the network connection.

The new IP address that wikis will see is 185.15.56.1.

The change is scheduled to go live on 2021-02-08.

More detailed information in wikitech:

  https://wikitech.wikimedia.org/wiki/News/CloudVPS_NAT_wikis



Hi there,

based on the feedback we have collected so far, we decided to extend the 
timeline. This change won't go live on 2021-02-08 but at a later date instead.
We will use this extended timeline to review a few unexpected config changes 
that we need to introduce prior to this operation.


The exact new date is still to be decided, and we will share it once it is 
known.

Thanks to everyone for providing valuable feedback.

regards.

--
Arturo Borrero Gonzalez
SRE / Wikimedia Cloud Services
Wikimedia Foundation

___
Wikimedia Cloud Services mailing list
Cloud@lists.wikimedia.org (formerly lab...@lists.wikimedia.org)
https://lists.wikimedia.org/mailman/listinfo/cloud


[Cloud] Network change for VMs contacting NFS dumps

2021-02-23 Thread Arturo Borrero Gonzalez

Hello,

today 2021-02-23 in about 30 minutes (16:00 UTC) we will change how virtual 
machine instances running in Cloud VPS contact NFS dump servers [0].


There is no action required on your side.

We anticipate little to no impact as a result of the network changes. But in 
case you notice something is not properly working with dumps NFS in Cloud VPS 
(or Toolforge) please contact us [1] as soon as possible. The relevant 
phabricator ticket [2] is T272397.


regards.

[0] https://wikitech.wikimedia.org/wiki/Help:Toolforge/Dumps
[1] 
https://wikitech.wikimedia.org/wiki/Help:Cloud_Services_Introduction#Communication_and_support

[2] https://phabricator.wikimedia.org/T272397

--
Arturo Borrero Gonzalez
SRE / Wikimedia Cloud Services
Wikimedia Foundation

___
Wikimedia Cloud Services mailing list
Cloud@lists.wikimedia.org (formerly lab...@lists.wikimedia.org)
https://lists.wikimedia.org/mailman/listinfo/cloud


Re: [Cloud] [Cloud-announce] cloud-vps maintenance at 14:00 UTC

2021-04-28 Thread Arturo Borrero Gonzalez

On 4/27/21 5:34 PM, Roy Smith wrote:
I'm getting timeouts and 502's on both https://spi-tools.toolforge.org/
and https://spi-tools-dev.toolforge.org/.  Also:


ssh: connect to host dev.tools.wmflabs.org port 22: Network is unreachable




On a related note, I suggest you switch to using 'dev.toolforge.org' and 
'login.toolforge.org' for your SSH connections.


Stuff in the old tools.wmflabs.org domain may stop working at some point in the 
future as we deprecate that domain.
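
If you keep hostnames in scripts or configuration, a small lookup like the following could help with the switch. It is only a sketch: the `dev` mapping comes from this thread, while the legacy `login.tools.wmflabs.org` name and the function name are assumptions.

```python
# Mapping of legacy SSH hostnames to their replacements.
# The dev.* pair comes from the thread; the legacy login.* name is an assumption.
LEGACY_TO_CURRENT = {
    "dev.tools.wmflabs.org": "dev.toolforge.org",
    "login.tools.wmflabs.org": "login.toolforge.org",
}


def current_ssh_host(host: str) -> str:
    """Return the toolforge.org equivalent of a legacy bastion hostname.

    Hostnames that are not in the mapping are returned unchanged.
    """
    normalized = host.strip().lower()
    return LEGACY_TO_CURRENT.get(normalized, host)


if __name__ == "__main__":
    print(current_ssh_host("dev.tools.wmflabs.org"))
```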


regards.

--
Arturo Borrero Gonzalez
SRE / Wikimedia Cloud Services
Wikimedia Foundation

___
Wikimedia Cloud Services mailing list
Cloud@lists.wikimedia.org (formerly lab...@lists.wikimedia.org)
https://lists.wikimedia.org/mailman/listinfo/cloud


[Cloud] CloudVPS / Toolforge edge network maintenance 2021-05-06 @ 15:00 UTC

2021-05-03 Thread Arturo Borrero Gonzalez

Hello there,

We will be doing an upgrade to the CloudVPS edge network Thursday 2021-05-06 @ 
15:00 UTC that will likely impact user experience, including Toolforge.


We scheduled a 1h operation window. During that time, intermittent network 
interruption, packet loss and other network problems are to be expected.


The edge network maintenance will affect how virtual machines (and Toolforge 
tools) contact NFS, wiki-replicas, wikis API endpoints, and, in general, any 
network traffic that flows leaving or entering the cloud (also known as 
north-south traffic).


More information on the operation can be found in phabricator [0] and in 
wikitech [1].


Regards.

[0] https://phabricator.wikimedia.org/T270704
[1] 
https://wikitech.wikimedia.org/wiki/Wikimedia_Cloud_Services_team/EnhancementProposals/2020_Network_refresh

--
Arturo Borrero Gonzalez
SRE / Wikimedia Cloud Services
Wikimedia Foundation

___
Wikimedia Cloud Services mailing list
Cloud@lists.wikimedia.org (formerly lab...@lists.wikimedia.org)
https://lists.wikimedia.org/mailman/listinfo/cloud


Re: [Cloud] Expired certificates for wmflabs.org and wmcloud.org

2021-05-06 Thread Arturo Borrero Gonzalez

On 5/6/21 11:20 AM, Sebastian Berlin wrote:
I'm getting errors regarding expired certificates for wmflabs.org 
and wmcloud.org, e.g. 
https://wikispeech.wmflabs.org and 
https://codesearch.wmcloud.org. Is this related 
to the maintenance later today or has something gone wrong? Here's an example 
from curl:




Good catch!

The certificates expired because acme-chief failed to renew them. Apparently 
this is a known bug.


I just force-restarted acme-chief and everything worked.
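
For tool maintainers who want an early warning of their own, this is a minimal sketch of an expiry check. It assumes you already have the certificate's `notAfter` string (for example from `ssl.SSLSocket.getpeercert()`); the 14-day threshold and the function names are arbitrary assumptions, and this is not how acme-chief or our monitoring works.

```python
import ssl
import time
from typing import Optional


def days_until_expiry(not_after: str, now: Optional[float] = None) -> float:
    """Days left before a certificate's notAfter timestamp.

    not_after uses the format found in getpeercert() output,
    for example "May  6 09:00:00 2021 GMT" (a made-up sample value).
    """
    expires_at = ssl.cert_time_to_seconds(not_after)
    current = time.time() if now is None else now
    return (expires_at - current) / 86400.0


def needs_renewal(not_after: str, threshold_days: float = 14.0,
                  now: Optional[float] = None) -> bool:
    """True if the certificate expires within threshold_days or already has."""
    return days_until_expiry(not_after, now) < threshold_days
```

Passing `now` explicitly keeps the check testable without touching the network or the system clock.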

regards.
--
Arturo Borrero Gonzalez
SRE / Wikimedia Cloud Services
Wikimedia Foundation

___
Wikimedia Cloud Services mailing list
Cloud@lists.wikimedia.org (formerly lab...@lists.wikimedia.org)
https://lists.wikimedia.org/mailman/listinfo/cloud


Re: [Cloud] CloudVPS / Toolforge edge network maintenance 2021-05-06 @ 15:00 UTC

2021-05-06 Thread Arturo Borrero Gonzalez

On 5/3/21 11:27 AM, Arturo Borrero Gonzalez wrote:

Hello there,

We will be doing an upgrade to the CloudVPS edge network Thursday 2021-05-06 @ 
15:00 UTC that will likely impact user experience, including Toolforge.


We scheduled a 1h operation window. During that time, intermittent network 
interruption, packet loss and other network problems are to be expected.


The edge network maintenance will affect how virtual machines (and Toolforge 
tools) contact NFS, wiki-replicas, wikis API endpoints, and, in general, any 
network traffic that flows leaving or entering the cloud (also known as 
north-south traffic).


More information on the operation can be found in phabricator [0] and in 
wikitech [1].


Regards.

[0] https://phabricator.wikimedia.org/T270704
[1] 
https://wikitech.wikimedia.org/wiki/Wikimedia_Cloud_Services_team/EnhancementProposals/2020_Network_refresh 




Reminder, this is happening now!

See you on the other side :-)

--
Arturo Borrero Gonzalez
SRE / Wikimedia Cloud Services
Wikimedia Foundation

___
Wikimedia Cloud Services mailing list
Cloud@lists.wikimedia.org (formerly lab...@lists.wikimedia.org)
https://lists.wikimedia.org/mailman/listinfo/cloud


Re: [Cloud] CloudVPS / Toolforge edge network maintenance 2021-05-06 @ 15:00 UTC

2021-05-06 Thread Arturo Borrero Gonzalez

On 5/6/21 5:00 PM, Arturo Borrero Gonzalez wrote:


Reminder, this is happening now!

See you on the other side :-)



Hello from the other side.

This is now done. Sorry for the bumpy ride in Toolforge bastions.

regards

--
Arturo Borrero Gonzalez
SRE / Wikimedia Cloud Services
Wikimedia Foundation

___
Wikimedia Cloud Services mailing list
Cloud@lists.wikimedia.org (formerly lab...@lists.wikimedia.org)
https://lists.wikimedia.org/mailman/listinfo/cloud


Re: [Cloud] Expired certificates for wmflabs.org and wmcloud.org

2021-05-07 Thread Arturo Borrero Gonzalez

On 5/7/21 7:40 AM, Sascha Brawer wrote:
Curious, does the Wikimedia cloud have some kind of monitoring system that could 
have noticed and sent an alert?




Yeah, we have monitoring. We could always do better with monitoring in general, 
of course.



regards.
--
Arturo Borrero Gonzalez
SRE / Wikimedia Cloud Services
Wikimedia Foundation

___
Wikimedia Cloud Services mailing list
Cloud@lists.wikimedia.org (formerly lab...@lists.wikimedia.org)
https://lists.wikimedia.org/mailman/listinfo/cloud


[Cloud] [Cloud-announce] wiki replicas maintenance on 2021-07-22

2021-07-19 Thread Arturo Borrero Gonzalez

Hi there,

on Thurs July 22nd at 15:00 UTC (08:00 PDT / 11:00 EDT / 17:00 CEST) there is a 
planned network maintenance that will affect the availability of the wiki 
replica database service.


The expected operation window is about 5 minutes long and it will affect any 
wiki replicas users including Toolforge tools, PAWS, and any other Cloud VPS 
project using them.


More information can be found on phabricator: 
https://phabricator.wikimedia.org/T286614


regards.

--
Arturo Borrero Gonzalez
SRE / Wikimedia Cloud Services
Wikimedia Foundation
___
Cloud-announce mailing list -- cloud-annou...@lists.wikimedia.org
List information: 
https://lists.wikimedia.org/postorius/lists/cloud-announce.lists.wikimedia.org/
___
Cloud mailing list -- cloud@lists.wikimedia.org
List information: 
https://lists.wikimedia.org/postorius/lists/cloud.lists.wikimedia.org/


[Cloud] [Cloud-announce] 2021-11-02: Cloud VPS network outage

2021-11-02 Thread Arturo Borrero Gonzalez

Hi,

Today 2021-11-02 we had a severe network outage on Cloud VPS and Toolforge.

Several network connections were affected from 11:40 UTC to 13:20 UTC (1h40m 
duration). As of this writing the problem has been corrected.


Detailed information can be seen in Phabricator:

  https://phabricator.wikimedia.org/T294853

Sorry for the inconvenience.

regards.

--
Arturo Borrero Gonzalez
SRE / Wikimedia Cloud Services
Wikimedia Foundation
___
Cloud-announce mailing list -- cloud-annou...@lists.wikimedia.org
List information: 
https://lists.wikimedia.org/postorius/lists/cloud-announce.lists.wikimedia.org/
___
Cloud mailing list -- cloud@lists.wikimedia.org
List information: 
https://lists.wikimedia.org/postorius/lists/cloud.lists.wikimedia.org/


[Cloud] Re: preferred proxy

2022-01-26 Thread Arturo Borrero Gonzalez



On 1/26/22 17:30, Taavi Väänänen wrote:

On 1/26/22 18:26, Tim Moody wrote:
Which is the preferred domain for dns proxies, wmcloud.org 
or wmflabs.org?


Hi, for new services wmcloud.org is preferred.



Hi,

thanks Taavi for the clarification. It is true, wmcloud.org is the 
current domain and wmflabs.org is considered 'legacy' and in the [slow] 
process of being removed.


Hey @Tim, can you point to documentation or some information that needs 
updates that could be a source of confusion in the future?


thanks, regards.
--
Arturo Borrero Gonzalez
Site Reliability Engineer
Wikimedia Cloud Services
Wikimedia Foundation
___
Cloud mailing list -- cloud@lists.wikimedia.org
List information: 
https://lists.wikimedia.org/postorius/lists/cloud.lists.wikimedia.org/


[Cloud] Re: [Cloud-announce] [IMPORTANT] Announcing Toolforge Debian Stretch Grid Engine deprecation

2022-02-16 Thread Arturo Borrero Gonzalez

On 2/15/22 21:46, Maarten Dammers wrote:

Hi,

Why are we upgrading to Buster instead of Bullseye? According to 
https://wikitech.wikimedia.org/wiki/Operating_system_upgrade_policy 
Buster will be end of life around August this year.
So we're either stuck with an older version for a while or we have to do 
this whole exercise again much sooner than we would like. Can you explain?




Hi there,

Legit question. I'm happy to elaborate:

* this was all discussed back in September 2021 in phabricator, see 
https://phabricator.wikimedia.org/T277653#7378774 and 
https://phabricator.wikimedia.org/T277653#7381146. Our conclusion was 
not to skip Buster.


* we are hoping that there won't be a Buster->Bullseye migration for the 
grid. Hopefully by the time we need to remove Buster the Kubernetes 
backend will be a 100% suitable solution for every tool.


* this migration work started before Debian Bullseye was released, with 
our intention being to complete it before the release. For a couple of 
reasons the project was delayed.


* in the grid case, the engineering effort to do an N+1 upgrade is lower 
than doing an N+2 upgrade. If we had tried an N+2 upgrade directly, things 
would have been much slower and more difficult for us.


Your concern about doing the migration dance twice is 100% valid, and 
the only way to future-proof your tool is to remove dependency on 
GridEngine and migrate it to the Kubernetes backend.


regards.
--
Arturo Borrero Gonzalez
Site Reliability Engineer
Wikimedia Cloud Services
Wikimedia Foundation
___
Cloud mailing list -- cloud@lists.wikimedia.org
List information: 
https://lists.wikimedia.org/postorius/lists/cloud.lists.wikimedia.org/


[Cloud] Re: [Cloud-announce] [IMPORTANT] Announcing Toolforge Debian Stretch Grid Engine deprecation

2022-02-18 Thread Arturo Borrero Gonzalez



On 2/16/22 17:34, Russell Blau wrote:

Also, it is not possible to load Pywikibot in the tf-python39 runtime because a 
required module (requests, from https://python-requests.org) is not available. 
What is the process for requesting (no pun intended) that this (or any other 
resource) be added to the image?


See some documentation here:

https://wikitech.wikimedia.org/wiki/Help:Toolforge/Python#Kubernetes_python_jobs

I just created it, and it may need some polishing, but it should work!

We will review pywikibot specific workflows and documents soon.


regards.
--
Arturo Borrero Gonzalez
Site Reliability Engineer
Wikimedia Cloud Services
Wikimedia Foundation
___
Cloud mailing list -- cloud@lists.wikimedia.org
List information: 
https://lists.wikimedia.org/postorius/lists/cloud.lists.wikimedia.org/


[Cloud] Re: Kubernetes-based jobs engine -- how to use Python virtual environments?

2022-04-04 Thread Arturo Borrero Gonzalez




On 4/2/22 14:53, Martin Urbanec wrote:

Hello,

I just received a dozen emails about grid engine migration. I tried to 
migrate my personal tool (tool.martin-urbanec) first. This tool 
currently generates a Jupyter-notebook based report daily.


I do that by calling jupyter nbconvert --to html --execute 
community_configuration_usage.ipynb from a virtual environment where 
Jupyter is installed, together with a couple of other Python modules.


I managed to create a new virtual environment that works from the new 
Buster bastion, and it works when executed directly from the bastion, 
but I can't get it to execute via the k8s-based engine:


Your problem may be related to bootstrapping the venv. See if this 
information can help you:


https://wikitech.wikimedia.org/wiki/Help:Toolforge/Python#Kubernetes_python_jobs

This is very similar to what JMC89 replied in the other email.
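
A virtual environment is tied to the Python interpreter that created it, which is why one built on the bastion may not work inside the Kubernetes container. As a rough sketch, assuming the job itself runs a small Python script inside the container image (the script, the function name and the default `venv` path are hypothetical, and the wikitech page above remains the reference), the standard-library `venv` module can build the environment in place:

```python
import sys
import venv
from pathlib import Path


def bootstrap_venv(target: Path, with_pip: bool = True) -> Path:
    """Create (or recreate) a virtual environment at target.

    Run this inside the container that will execute the job, so the venv
    is tied to that container's Python rather than the bastion's.
    Returns the path of the generated pyvenv.cfg file.
    """
    builder = venv.EnvBuilder(clear=True, with_pip=with_pip)
    builder.create(str(target))
    return target / "pyvenv.cfg"


if __name__ == "__main__":
    # Hypothetical location; pass another path as the first argument.
    destination = Path(sys.argv[1]) if len(sys.argv) > 1 else Path("venv")
    print("created", bootstrap_venv(destination))
```

Installing the required modules (Jupyter and friends in Martin's case) would then be a second step using the pip that lives inside the new environment.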

--
Arturo Borrero Gonzalez
Site Reliability Engineer
Wikimedia Cloud Services
Wikimedia Foundation
___
Cloud mailing list -- cloud@lists.wikimedia.org
List information: 
https://lists.wikimedia.org/postorius/lists/cloud.lists.wikimedia.org/


[Cloud] [Toolforge] some updates on toolforge-jobs command line interface

2022-04-04 Thread Arturo Borrero Gonzalez

Hi there,

wanted to share a few small updates for the toolforge-jobs command line 
interface. The changes are being deployed right now.


1) listing jobs now shows fewer columns; use --long to show all columns.
2) the `containers` action has been renamed to `images`. A compatibility 
period will exist, and you will see a warning if you use `containers`.

3) when listing images, the table header no longer mentions "Docker".

These changes should be mostly cosmetic, and no functional or behavioral 
change is expected.
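
For readers curious how a rename with a compatibility period can look in a command line tool, here is a generic sketch using Python's `argparse`. It is not the toolforge-jobs source code; the program name `jobs-demo` and the function names are made up for the example.

```python
import argparse
import warnings


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="jobs-demo")
    subparsers = parser.add_subparsers(dest="action", required=True)
    # "images" is the new name; "containers" is kept as a deprecated alias.
    subparsers.add_parser("images", aliases=["containers"],
                          help="list available images")
    return parser


def resolve_action(argv: list) -> str:
    """Parse argv and map the deprecated action name to the new one."""
    args = build_parser().parse_args(argv)
    if args.action == "containers":
        warnings.warn("'containers' is deprecated, use 'images' instead",
                      DeprecationWarning, stacklevel=2)
        return "images"
    return args.action
```

Both spellings end up in the same code path, so the old name can be dropped later by removing the alias and the warning.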


Please report any problems you may find.

regards.

--
Arturo Borrero Gonzalez
Site Reliability Engineer
Wikimedia Cloud Services
Wikimedia Foundation
___
Cloud mailing list -- cloud@lists.wikimedia.org
List information: 
https://lists.wikimedia.org/postorius/lists/cloud.lists.wikimedia.org/


[Cloud] Network operations today 2022-04-06

2022-04-06 Thread Arturo Borrero Gonzalez

Hi there,

Today 2022-04-06 we're performing some network maintenance operations on 
Cloud VPS that could affect all cloud egress/ingress traffic, including 
Toolforge. The cuts, if noticeable, should last a few minutes at most.


Some operations were also conducted yesterday (without this email 
notice), and some unexpected hiccups occurred. That's why the email today.


regards.
--
Arturo Borrero Gonzalez
Site Reliability Engineer
Wikimedia Cloud Services
Wikimedia Foundation
___
Cloud mailing list -- cloud@lists.wikimedia.org
List information: 
https://lists.wikimedia.org/postorius/lists/cloud.lists.wikimedia.org/


[Cloud] [Cloud-announce] Network maintenance

2022-10-06 Thread Arturo Borrero Gonzalez

Hi there,

We are currently working on replacing older hardware servers with newer 
ones, in particular those dedicated to cloud networking [0].


We have discovered a few shortcomings related mostly to network 
interface naming in the newer servers, the latest openstack version 
behaving differently from how it used to, and also some base operating 
system (debian) bugs [1]. Some of these are hardware-dependent and 
difficult to reproduce/anticipate in our staging environment.


The result is that we are having a more challenging and noisy migration 
than we would like. We already had a few (brief) network outages trying 
to introduce the new servers into service.


We'll try to keep things as stable as possible in the next few days 
until the migration is completed, but we can't rule out having some more 
(brief) network outages until we are safely on the other side of the 
transition.


I'll send another note when this network maintenance is over.

regards.

[0] https://phabricator.wikimedia.org/T316284
[1] https://bugs.debian.org/989162
--
Arturo Borrero Gonzalez
Senior Site Reliability Engineer
Wikimedia Cloud Services
Wikimedia Foundation
___
Cloud-announce mailing list -- cloud-annou...@lists.wikimedia.org
List information: 
https://lists.wikimedia.org/postorius/lists/cloud-announce.lists.wikimedia.org/
___
Cloud mailing list -- cloud@lists.wikimedia.org
List information: 
https://lists.wikimedia.org/postorius/lists/cloud.lists.wikimedia.org/


[Cloud] [Cloud-announce] Re: Network maintenance

2022-10-06 Thread Arturo Borrero Gonzalez




On 10/6/22 12:04, Arturo Borrero Gonzalez wrote:

Hi there,

We are currently working on replacing older hardware servers with newer 
ones, in particular those dedicated to cloud networking [0].


We have discovered a few shortcomings related mostly to network 
interface naming in the newer servers, the latest openstack version 
behaving differently from how it used to, and also some base operating 
system (debian) bugs [1]. Some of these are hardware-dependent and 
difficult to reproduce/anticipate in our staging environment.


The result is that we are having a more challenging and noisy migration 
than we would like. We already had a few (brief) network outages trying 
to introduce the new servers into service.


We'll try to keep things as stable as possible in the next few days 
until the migration is completed, but we can't rule out having some more 
(brief) network outages until we are safely on the other side of the 
transition.


I'll send another note when this network maintenance is over.



Hi there,

this has been completed.

If you see any network problem from now on, consider it unexpected 
and please report it.


regards.

--
Arturo Borrero Gonzalez
Senior Site Reliability Engineer
Wikimedia Cloud Services
Wikimedia Foundation
___
Cloud-announce mailing list -- cloud-annou...@lists.wikimedia.org
List information: 
https://lists.wikimedia.org/postorius/lists/cloud-announce.lists.wikimedia.org/
___
Cloud mailing list -- cloud@lists.wikimedia.org
List information: 
https://lists.wikimedia.org/postorius/lists/cloud.lists.wikimedia.org/


[Cloud] Some Cloud VPS virtual machines briefly unavailable today (and rebooted)

2022-11-22 Thread Arturo Borrero Gonzalez

Hi there,

Today 2022-11-22 at about 12:25 UTC, as part of a routine operation I 
reimaged/reformatted a cloudvirt hypervisor without relocating all the 
virtual machines first.


The data survived the reimage, but the 32 (!) affected virtual machines 
were briefly unavailable and then hard-rebooted.


All virtual machines are now ACTIVE (up and running) from the openstack 
point of view, but please, let me know if you need assistance recovering 
them in any way.


As of this writing we don't have any automation to ensure we only 
reimage empty hypervisors, but we're working on it, to prevent this kind 
of human error in the future.
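
To give an idea of the kind of guard we have in mind, here is a purely illustrative sketch. The `Hypervisor` class, the `cloudvirt-example` name and the function are hypothetical and do not correspond to existing tooling; a real check would ask openstack for the list of virtual machines instead of using an in-memory list.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Hypervisor:
    """Hypothetical view of a hypervisor and the VMs it currently hosts."""
    name: str
    vm_names: List[str] = field(default_factory=list)


class HypervisorNotEmptyError(RuntimeError):
    """Raised when a reimage is requested for a hypervisor that still has VMs."""


def ensure_safe_to_reimage(hypervisor: Hypervisor) -> None:
    """Refuse to continue unless the hypervisor hosts no virtual machines."""
    if hypervisor.vm_names:
        listed = ", ".join(sorted(hypervisor.vm_names))
        raise HypervisorNotEmptyError(
            f"{hypervisor.name} still hosts {len(hypervisor.vm_names)} VM(s): {listed}"
        )


if __name__ == "__main__":
    busy = Hypervisor("cloudvirt-example", ["mx-wiki", "synapse01"])
    try:
        ensure_safe_to_reimage(busy)
    except HypervisorNotEmptyError as error:
        print("refusing to reimage:", error)
```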


regards. (and sorry!)

(!) Affected virtual machines are:

- ID: 78782628-4f9f-4263-84fc-06e767b3bfe1
  Name: mx-wiki
- ID: 1fa9f0d9-42e8-4273-bdb1-a7d49998c13f
  Name: synapse01
- ID: 2382fda0-e683-4d0c-95b6-bbbf323904d9
  Name: canary1048-04
- ID: 4b570277-e51f-459d-bea2-394c5ad7bc92
  Name: tools-sgeexec-10-16
- ID: 66529c1b-f3a3-4ff8-b30d-785f4f274965
  Name: feature-store-test
- ID: e153f69a-46a0-458a-ab50-de3d86aa861b
  Name: toolsbeta-test-k8s-worker-7
- ID: c3a2d1a9-f811-4da9-afba-3a113c8ff729
  Name: wbregistry-02
- ID: 2b56c575-08a5-4def-87cb-bee5bd43e4f9
  Name: prod
- ID: 141ac13c-f0fa-46d3-9d2a-cede8bc854c6
  Name: devtools-puppetdb1001
- ID: fdb15c24-0b41-42d6-9c4a-82afd1d9dcb9
  Name: tools-sgeweblight-10-31
- ID: 56e55a31-8d32-455e-b650-b7194e71d2fd
  Name: runner-1023
- ID: cb4a87e4-264e-4c8f-8197-3efff54346de
  Name: runner-1022
- ID: 5b6b5733-565d-456e-a4fc-85ce669d3fd2
  Name: deployment-mdb02
- ID: 75dce76d-36ad-4f9e-85e9-8a11ff6744db
  Name: wikibase-product-testing-2022
- ID: 868d3dca-3e5c-4089-89a9-2c7e756c3e31
  Name: toolsbeta-cumin-1
- ID: 42ac6d8a-453a-4620-b4b7-9c97994c98fb
  Name: integration-agent-docker-1030
- ID: 084da652-503d-49a7-9ffa-98a0cd5335fd
  Name: toolsbeta-sgeexec-10-5
- ID: f098fe82-18b6-49a9-962d-9b8f1f989b14
  Name: pcc-worker1001
- ID: 8eb272dc-8006-4e93-a966-5035809324d9
  Name: deployment-mx03
- ID: e67d0e4c-e07c-4d9a-8ddb-cb0bc8efa388
  Name: deployment-docker-api-gateway01
- ID: b958511a-10cb-4e62-bdbb-6da5013dd62f
  Name: soweego
- ID: 62045cf9-59ed-44b9-a268-1c9f171b5aae
  Name: tools-package-builder-04
- ID: 0127e905-f52e-4ed4-b60d-260102a8e625
  Name: pontoon-lb-02
- ID: 827bf744-262a-458b-951d-f2e9a377e075
  Name: toolsbeta-test-k8s-ingress-3
- ID: 3e6c31d7-b4db-4a5f-a610-a74d0013f631
  Name: pki-test01
- ID: 8893ba32-fb5c-4567-a242-b6c676978b7d
  Name: deployment-urldownloader03
- ID: f72e5b18-6376-4ccd-9e59-64447759e53f
  Name: deployment-deploy03
- ID: 006dea0a-a1eb-4de3-bf45-1a071ad87152
  Name: kafka-test-cloud-2
- ID: e05220d7-8ca1-4d9f-a933-01a843286ea8
  Name: toolsbeta-docker-imagebuilder-01
- ID: 416f445a-cad4-45c2-b32e-f17100f93eac
  Name: cloud-puppetmaster-05
- ID: 4e492051-25a3-4442-b8b9-1959f42917fe
  Name: tools-k8s-worker-76
- ID: df18863a-2da7-4951-aa69-936b3d889592
  Name: deployment-docker-cpjobqueue01

--
Arturo Borrero Gonzalez
Senior Site Reliability Engineer
Wikimedia Cloud Services
Wikimedia Foundation


[Cloud] Puppet error emails on 2022-11-28

2022-11-29 Thread Arturo Borrero Gonzalez

Hi there!

On 2022-11-28 and 2022-11-29 some misleading emails were sent: you may 
have received one (or more) emails about puppet failures on your 
Cloud VPS virtual machine.


Moreover, such emails were a bit contradictory, with messages like
"No failed resources", and "No exceptions happened".

There was a problem in the way the puppet errors were calculated, which 
has now been fixed [0].


This does not affect Toolforge.

sorry for the noise,

regards.

[0] https://gerrit.wikimedia.org/r/c/operations/puppet/+/861805/
--
Arturo Borrero Gonzalez
Senior Site Reliability Engineer
Wikimedia Cloud Services
Wikimedia Foundation


[Cloud] Toolforge jobs: briefly maintenance today 2023-01-10 @ 11:30 UTC

2023-01-10 Thread Arturo Borrero Gonzalez

Hi there,

the Toolforge jobs service [0] (the one you would use via the `toolforge-jobs` 
command line interface) will have a brief maintenance today 2023-01-10 @ 11:30 
UTC (in about 15 minutes).


We need to restart the API service and it will be down for a couple of minutes 
(perhaps even less).


During that time, using the toolforge-jobs command line interface will most 
likely fail.


regards.

[0] https://wikitech.wikimedia.org/wiki/Help:Toolforge/Jobs_framework
--
Arturo Borrero Gonzalez
Senior SRE / Wikimedia Cloud Services
Wikimedia Foundation


[Cloud] New toolforge-jobs features

2023-01-24 Thread Arturo Borrero Gonzalez

Hi there,

The Toolforge jobs framework just got upgraded with a few new features:

* support for custom logs
* support for job failure retry policy
* new behavior with job image listing
* some initial validation of YAML files

The documentation should be mostly up-to-date in wikitech:
 https://wikitech.wikimedia.org/wiki/Help:Toolforge/Jobs_framework

You can stop reading here unless you want more details :-)

The custom log files feature will allow you to do things like:
* using a custom directory to store log files
* merging stdout/stderr logs together into a single file
* ignoring one of the two log streams

The job retry policy lets you instruct the computing engine to restart failed 
jobs, up to 5 times.
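
As a rough illustration of what a bounded retry policy means, here is a minimal 
Python sketch. It is not how the jobs framework implements the feature (the 
computing engine takes care of that); `run_with_retries` and `MAX_RETRIES` are 
hypothetical names used only for this example, and a job is modelled as a plain 
function that returns True on success.

```python
from typing import Callable, Tuple

MAX_RETRIES = 5  # upper bound mentioned in the announcement


def run_with_retries(job: Callable[[], bool], retries: int) -> Tuple[bool, int]:
    """Run `job` once, then re-run it after each failure, at most `retries` times.

    Returns (succeeded, attempts). `job` must return True on success.
    """
    if not 0 <= retries <= MAX_RETRIES:
        raise ValueError(f"retries must be between 0 and {MAX_RETRIES}")
    attempts = 0
    # One initial run plus up to `retries` restarts.
    for _ in range(retries + 1):
        attempts += 1
        if job():
            return True, attempts
    return False, attempts
```

With this model, `run_with_retries(job, retries=3)` runs the job at most four 
times in total: the first run plus three restarts.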


Job images are now listed in a different format, and deprecated images are 
hidden by default, to encourage usage of newer ones.


Regarding the YAML validation, the toolforge-jobs utility will now emit a 
warning if some key is unknown. We plan to make this more robust in the future, 
also providing a schema file.
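
To give an idea of what an unknown-key warning amounts to, here is a small 
Python sketch. It is only an illustration under assumptions: the `KNOWN_KEYS` 
set and the `unknown_key_warnings` helper are made up for this example and do 
not reflect the actual list of keys or the code in the toolforge-jobs utility. 
It works on already-parsed job definitions (plain dictionaries), so it does not 
depend on any YAML library.

```python
from typing import Any, Dict, Iterable, List

# Hypothetical set of recognised keys, for illustration only; the real
# list is whatever the toolforge-jobs utility accepts.
KNOWN_KEYS = frozenset({"name", "command", "image"})


def unknown_key_warnings(jobs: Iterable[Dict[str, Any]]) -> List[str]:
    """Return one warning message per unknown key found in the job definitions."""
    messages = []
    for index, job in enumerate(jobs):
        # Keys present in the job but absent from the known set, in stable order.
        for key in sorted(set(job) - KNOWN_KEYS):
            messages.append(f"job #{index}: unknown key '{key}'")
    return messages


if __name__ == "__main__":
    # Already-parsed job definitions, e.g. the result of loading a YAML file.
    jobs = [
        {"name": "daily", "command": "./run.sh", "image": "example"},
        {"name": "typo", "comand": "./run.sh", "image": "example"},
    ]
    for message in unknown_key_warnings(jobs):
        print("WARNING:", message)
```

The check only warns and does not reject the file, which matches the "initial 
validation" described above: a misspelled key such as `comand` is reported, 
while the rest of the definition is left untouched.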


We don't usually announce upgrades, but this one in particular contained 
much-awaited features. This is the result of hard work by several folks, in 
particular Taavi (community member) and Raymond (WMF contractor).


Happy `toolforging`. Regards.
--
Arturo Borrero Gonzalez
Senior SRE / Wikimedia Cloud Services
Wikimedia Foundation


[Cloud] Toolforge: brief network maintenance today 2023-03-06

2023-03-06 Thread Arturo Borrero Gonzalez

Hi there!

Today 2023-03-06, in a few minutes, we will restart the Toolforge internal 
network. A brief interruption of network communications is expected during the 
maintenance.


This is because we're re-deploying calico to the kubernetes cluster [0].

No action required on your side.

regards.

[0] https://phabricator.wikimedia.org/T328539
--
Arturo Borrero Gonzalez
Senior SRE / Wikimedia Cloud Services
Wikimedia Foundation


[Cloud] [Cloud-announce] Re: Toolforge Kubernetes upgrade on 2023-04-03 (new date: 2023-04-10)

2023-03-30 Thread Arturo Borrero Gonzalez

On 3/28/23 00:13, Taavi Väänänen wrote:

Hi,

We will be upgrading the Toolforge Kubernetes cluster next Monday (2023-04-03) 
starting at around 10:00 UTC.


The expected impact is that tools running on the Kubernetes cluster will get 
restarted a couple of times over the course of the few hours it takes for us to 
upgrade the entire cluster. The ability to manage tools will remain operational.


Since the version we're upgrading to (1.22) removes a bunch of deprecated 
Kubernetes APIs, tools that use kubectl and raw Kubernetes resources directly 
may want to check that they're on the latest available versions. The vast 
majority of tools that are only using the Jobs framework and/or the webservice 
command are not affected by these changes.




This has been rescheduled to Monday 2023-04-10 to leave room for the other 
operations we have.


regards.

--
Arturo Borrero Gonzalez
Senior SRE / Wikimedia Cloud Services
Wikimedia Foundation


[Cloud] [Cloud-announce] Re: Toolforge Kubernetes upgrade on 2023-04-03 (new date: 2023-04-10)

2023-04-10 Thread Arturo Borrero Gonzalez

On 3/30/23 12:42, Arturo Borrero Gonzalez wrote:

On 3/28/23 00:13, Taavi Väänänen wrote:

Hi,

We will be upgrading the Toolforge Kubernetes cluster next Monday (2023-04-03) 
starting at around 10:00 UTC.


The expected impact is that tools running on the Kubernetes cluster will get 
restarted a couple of times over the course of the few hours it takes for us 
to upgrade the entire cluster. The ability to manage tools will remain 
operational.


Since the version we're upgrading to (1.22) removes a bunch of deprecated 
Kubernetes APIs, tools that use kubectl and raw Kubernetes resources directly 
may want to check that they're on the latest available versions. The vast 
majority of tools that are only using the Jobs framework and/or the webservice 
command are not affected by these changes.




This has been rescheduled to Monday 2023-04-10 to leave room for the other 
operations we have.




Hi there!

This is happening now!

https://phabricator.wikimedia.org/T286856

regards.

--
Arturo Borrero Gonzalez
Senior SRE / Wikimedia Cloud Services
Wikimedia Foundation

