I ask this because of these failures: where does cyberbot-db-01 live? The
data on it is critical.
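
(Aside, added for illustration and not taken from this thread: assuming admin
access to the OpenStack API and the openstacksdk Python library, here is a
minimal sketch of how one could look up which hypervisor a given instance
lives on. The instance name is just the one asked about above; whether this
kind of access is actually available to the asker is an assumption.)

    import openstack

    # Credentials come from clouds.yaml / OS_* environment variables; seeing
    # hypervisor attributes normally requires admin rights.
    conn = openstack.connect()

    server = conn.compute.find_server("cyberbot-db-01")
    if server is None:
        print("no such instance")
    else:
        # Re-fetch full details; hypervisor_hostname maps to the admin-only
        # OS-EXT-SRV-ATTR:hypervisor_hostname field.
        server = conn.compute.get_server(server.id)
        print(server.name, "is hosted on", server.hypervisor_hostname)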

Cyberpower678
English Wikipedia Account Creation Team
English Wikipedia Administrator
Global User Renamer

-----Original Message-----
From: Cloud <cloud-boun...@lists.wikimedia.org> On Behalf Of Andrew Bogott
Sent: Wednesday, February 13, 2019 14:51
To: cloud-annou...@lists.wikimedia.org
Subject: Re: [Cloud] [Cloud-announce] VPS hardware failure -- things are even 
worse!

Now cloudvirt1024 is dying in earnest, so VMs hosted there will be down for a 
while as well.  This is, as far as anyone can tell, just a stupid coincidence.

So far it appears that we are going to be able to rescue /most/ things without 
significant data loss. For now, though, there's going to be plenty more 
downtime.

VMs on cloudvirt1024 are:

| 8113d2c5-6788-43f6-beeb-123b0b717af3 | drmf-beta                     | math                     |
| 169b3260-4f7e-43dc-94c2-e699308a3426 | ecmabot                       | webperf                  |
| 29e875e3-15d5-4f74-9716-c0025c2ea098 | encoding02                    | video                    |
| 1b2b8b50-d463-4b7f-a3a9-6363eeb3ca8b | encoding03                    | video                    |
| 5421f938-7a11-499c-bc6a-534da1f4e27d | hafnium                       | rcm                      |
| 041d42b9-df36-4176-9f5d-a508989bbebc | hound-app-01                  | hound                    |
| 6149375b-8a08-4f03-882a-6fc0f5f77499 | integration-slave-docker-1044 | integration              |
| 4d64b032-d93a-4a8c-a7e5-569c17e5063f | integration-slave-docker-1046 | integration              |
| ad48959a-9eb9-46a9-bec4-a2bf23cdf655 | integration-slave-docker-1047 | integration              |
| 21644632-0972-448f-83d0-b76f9d1d28e0 | ldfclient-new                 | wikidata-query           |
| c2a30fe0-2c87-4b01-be53-8e2a3d0f40a7 | math-docker                   | math                     |
| df8f17fb-03fe-4725-b9cf-3d9fe76f4654 | mediawiki2latex               | collection-alt-renderer  |
| d73f36e6-7534-4910-9a6e-64a6b9088d1e | neon                          | rcm                      |
| 2d035965-ba53-41b3-b6ef-d2ebbe50656a | novaadminmadethis             | quotatest                |
| c84f61c0-4fd2-47a5-b6ab-dd6b5ea98d41 | ores-puppetmaster-01          | ores                     |
| 585bb328-8078-4437-b076-9e555683e27d | ores-sentinel-01              | ores                     |
| 0538bfed-d7b5-4751-9431-8feecbaf78c0 | oxygen                        | rcm                      |
| e8090d9e-7529-46a9-b1e1-c4ba523a2898 | packaging                     | thumbor                  |
| c7fe4663-7f2b-4d23-a79b-1a2e01c80d93 | twlight-prod                  | twl                      |
| 2370b38f-7a65-4ccf-a635-7a2fa5e12b3e | twlight-staging               | twl                      |
| 464577c6-86f0-42f9-9c49-86f9ec9a0210 | twlight-tracker               | twl                      |
| 5325322d-a57e-4a9b-85b7-37643f03bfea | wikidata-misc                 | wikidata-dev             |

On 2/13/19 11:23 AM, Andrew Bogott wrote:
> Here's the latest:
>
> cloudvirt1018 is up and running, and many of its VMs are fine. Many 
> other VMs are corrupted and won't start up.  Some of those VMs will 
> probably be lost for good, but we're still investigating rescue options.
>
> In the meantime, if your VM is up and you can access it then you're in 
> luck!  If not, stay tuned.
>
> -Andrew
>
>
>
> On 2/13/19 9:15 AM, Andrew Bogott wrote:
>> I spoke too soon -- we're still working on this.  Most of these VMs 
>> will remain down in the meantime.
>>
>> Sorry for the outage!
>>
>> On 2/13/19 8:21 AM, Andrew Bogott wrote:
>>> We don't fully understand what happened, but after Giovanni 
>>> performed a classic "turning it off and on again" things are now 
>>> running without warnings.  The VMs listed below are now coming back 
>>> online and everything should be back up shortly.
>>>
>>> We'll probably replace some of this hardware anyway, out of an 
>>> abundance of caution, but that's unlikely to produce further 
>>> downtime.  With luck, this is the last you'll hear about this.
>>>
>>> -Andrew
>>>
>>>
>>> On 2/13/19 7:25 AM, Andrew Bogott wrote:
>>>> We're currently experiencing a mysterious hardware failure in our 
>>>> datacenter -- three different SSDs failed overnight, two of them in 
>>>> cloudvirt1018 and one of them in cloudvirt1024.  The VMs on 1018 
>>>> are down entirely.  We may move those on 1024 to another host 
>>>> shortly in order to guard against additional drive failure.
>>>>
>>>> There's some possibility that we will experience permanent data 
>>>> loss on cloudvirt1018, but everyone is working hard to avoid this.
>>>>
>>>> The following VMs are on cloudvirt1018:
>>>>
>>>>
>>>> a11y                             | reading-web-staging
>>>> abogott-scapserver               | testlabs
>>>> af-puppetdb01                    | automation-framework
>>>> api                              | openocr
>>>> asdf                             | quotatest
>>>> bastion-eqiad1-02                | bastion
>>>> clm-test-01                      | community-labs-monitoring
>>>> compiler1002                     | puppet-diffs
>>>> cyberbot-exec-iabot-01           | cyberbot
>>>> deployment-db03                  | deployment-prep
>>>> deployment-db04                  | deployment-prep
>>>> deployment-memc05                | deployment-prep
>>>> deployment-pdfrender02           | deployment-prep
>>>> deployment-sca01                 | deployment-prep
>>>> design-lsg3                      | design
>>>> eventmetrics-dev01               | eventmetrics
>>>> fridolin                         | catgraph
>>>> gtirloni-puppetmaster-01         | testlabs
>>>> hadoop-master-3                  | analytics
>>>> ign                              | ign2commons
>>>> integration-castor03             | integration
>>>> integration-slave-docker-1017    | integration
>>>> integration-slave-docker-1033    | integration
>>>> integration-slave-docker-1038    | integration
>>>> integration-slave-jessie-1003    | integration
>>>> integration-slave-jessie-android | integration
>>>> k8s-master-01                    | general-k8s
>>>> k8s-node-03                      | general-k8s
>>>> k8s-node-05                      | general-k8s
>>>> k8s-node-06                      | general-k8s
>>>> kdc                              | analytics
>>>> labstash-jessie1                 | logging
>>>> language-mleb-legacy             | language
>>>> login-test                       | catgraph
>>>> lsg-01                           | design
>>>> mathosphere                      | math
>>>> mc-clusterA-1                    | test-twemproxy
>>>> mwoffliner5                      | mwoffliner
>>>> novaadminmadethis-4              | quotatest
>>>> ntp-01                           | cloudinfra
>>>> ntp-02                           | cloudinfra
>>>> ogvjs-testing                    | ogvjs-integration
>>>> phragile-pro                     | phragile
>>>> planet-hotdog                    | planet
>>>> pub2                             | wikiapiary
>>>> puppenmeister                    | planet
>>>> puppet-compiler-v4-other         | testlabs
>>>> puppet-compiler-v4-tools         | testlabs
>>>> quarry-beta-01                   | quarry
>>>> signwriting-swis                 | signwriting
>>>> signwriting-swserver             | signwriting
>>>> social-tools3                    | social-tools
>>>> striker-deploy04                 | striker
>>>> striker-puppet01                 | striker
>>>> t166878                          | otrs
>>>> togetherjs                       | visualeditor
>>>> tools-sgebastion-06              | tools
>>>> tools-sgeexec-0902               | tools
>>>> tools-sgeexec-0903               | tools
>>>> tools-sgewebgrid-generic-0901    | tools
>>>> tools-sgewebgrid-lighttpd-0901   | tools
>>>> ve-font                          | design
>>>> wikibase1                        | sciencesource
>>>> wikicitevis-prod                 | wikicitevis
>>>> wikifarm                         | pluggableauth
>>>> women-in-red                     | globaleducation
>>>>
>>>>
>>>
>>
>


_______________________________________________
Wikimedia Cloud Services announce mailing list
cloud-annou...@lists.wikimedia.org (formerly labs-annou...@lists.wikimedia.org)
https://lists.wikimedia.org/mailman/listinfo/cloud-announce
_______________________________________________
Wikimedia Cloud Services mailing list
Cloud@lists.wikimedia.org (formerly lab...@lists.wikimedia.org)
https://lists.wikimedia.org/mailman/listinfo/cloud

