[Wikitech-l] Re: Maintenance scripts are moving to Kubernetes

Brooke Vibber Tue, 08 Oct 2024 16:37:33 -0700

I'm starting some batch maintenance of video transcodes so I'm exercising
the new k8s-based maint script system on TMH's requeueTranscodes.php; good
news: no surprises so far, everything's working just fine. :D


Since I'm running the same scripts over multiple wikis, I went ahead and
manually wrapped them in a bash for loop that submits one job at a time
out of all.dblist. The wrapper loop runs in a screen session and tails each
job's logs to the session so they don't all smash out at once, and there's
a second, manually started run for Commons. :)
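
A wrapper along these lines would do the job. This is a minimal sketch,
not the exact loop used here: the run_over_dblist function name, the
DRY_RUN switch, and the dblist path in the usage comment are assumptions
for illustration. Only mwscript-k8s, its -f flag, and the -- separator come
from the announcement quoted below.

```shell
# Submit one mwscript-k8s job per wiki in a dblist, one at a time.
#   $1   = dblist file (one wiki per line; '#' starts a comment)
#   rest = maintenance script name and its options
# Set DRY_RUN=1 to print the commands instead of running them.
run_over_dblist() {
    local dblist="$1"
    shift
    local wiki
    while IFS= read -r wiki || [ -n "$wiki" ]; do
        # Drop comments and whitespace; skip lines that end up empty.
        wiki="$(printf '%s' "${wiki%%#*}" | tr -d '[:space:]')"
        [ -z "$wiki" ] && continue
        # -f follows the job's output; this sketch assumes it returns once
        # the output ends, so the next job is only submitted after that.
        # stdin is redirected so the command cannot swallow the dblist.
        ${DRY_RUN:+echo} mwscript-k8s -f -- "$@" --wiki="$wiki" < /dev/null
    done < "$dblist"
}

# Hypothetical usage (path and script name are assumptions):
#   DRY_RUN=1 run_over_dblist /srv/mediawiki/dblists/all.dblist requeueTranscodes.php
```

Running it with DRY_RUN=1 first is a cheap way to check the generated
commands before submitting anything to the cluster.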

First-class support for running over a dblist will be a very welcome
improvement, and should be pretty straightforward! Good work everybody. :D

The longest job (Commons) might take a couple days to run, so we'll see if
anything explodes later! hehe

-- brooke

On Wed, Sep 25, 2024 at 8:11 PM Reuven Lazarus <[email protected]>
wrote:

> Hi all,
>
> With MediaWiki at the WMF moving to Kubernetes, it's now time to start
> running manual maintenance scripts there. Any time you would previously SSH
> to a mwmaint host and run mwscript, follow these steps instead. The old way
> will continue working for a little while, but it will be going away.
>
>
> What's familiar:
>
> Starting a maintenance script looks like this:
>
>   rzl@deploy2002:~$ mwscript-k8s --comment="T341553" -- Version.php --wiki=enwiki
>
> Any options for the mwscript-k8s tool, as described below, go before the
> --.
>
> After the --, the first argument is the script name; everything else is
> passed to the script. This is the same as you're used to passing to
> mwscript.
>
>
> What's different:
>
> - Run mwscript-k8s on a deployment host, not the maintenance host. Either
> deployment host will work; your job will automatically run in whichever
> data center is active, so you no longer need to change hosts when there’s a
> switchover.
>
> - You don't need a tmux. By default the tool launches your maintenance
> script and exits immediately, without waiting for your job to finish. If
> you log out of the deployment host, your job keeps running on the
> Kubernetes cluster.
>
> - Kubernetes saves the maintenance script's output for seven days after
> completion. By default, mwscript-k8s prints a kubectl command that you (or
> anyone else) can paste and run to monitor the output or save it to a file.
>
> - As a convenience, you can pass -f (--follow) to mwscript-k8s to immediately
> begin tailing the script output. If you like, you can do this inside a
> tmux and keep the same workflow as before. Either way, you can safely
> disconnect and your script will continue running on Kubernetes.
>
>   rzl@deploy2002:~$ mwscript-k8s -f -- Version.php --wiki=testwiki
>
>   [...]
>
>   MediaWiki version: 1.43.0-wmf.24 LTS (built: 22:35, 23 September 2024)
>
> - For scripts that take input on stdin, you can pass --attach to
> mwscript-k8s, either interactively or in a pipeline.
>
>   rzl@deploy2002:~$ mwscript-k8s --attach -- shell.php --wiki=testwiki
>
>   [...]
>
>   Psy Shell v0.12.3 (PHP 7.4.33 — cli) by Justin Hileman
>
>   > $wmgRealm
>
>   = "production"
>
>   >
>
>   rzl@deploy2002:~$ cat example_url.txt | mwscript-k8s --attach -- purgeList.php
>
>   [...]
>
>   Purging 1 urls
>
>   Done!
>
> - Your maintenance script runs in a Docker container which will not
> outlive it, so it can't save persistent files to disk. Ensure your script
> logs its important output to stdout, or persists it in a database or other
> remote storage.
>
> - The --comment flag sets an optional (but encouraged) descriptive label,
> such as a task number.
>
> - Using standard kubectl commands[1][2], you can check the status, and
> view the output, of your running jobs or anyone else's. Example:
>
>   kube_env mw-script codfw; kubectl get pod -l username=rzl
>
> [1]: https://wikitech.wikimedia.org/wiki/Kubernetes/Kubectl
>
> [2]: https://kubernetes.io/docs/reference/kubectl/quick-reference/
>
>
> What's not supported yet:
>
> - Maintenance scripts launched automatically on a timer. We're working on
> migrating them -- for now, this is for one-off scripts launched by hand.
>
> - If your job is interrupted (e.g. by hardware problems), Kubernetes can
> automatically move it to another machine and restart it, babysitting it
> until it completes. But we only want to do that if your job is safe to
> restart. So by default, if your job is interrupted, it will stay stopped
> until you restart it yourself. Soon, we'll add an option to declare "this
> is idempotent, please restart it as needed"; writing scripts to be
> idempotent is the recommended design for new scripts.
>
> - No support yet for mwscriptwikiset, foreachwiki, foreachwikiindblist,
> etc., but we'll add similar functionality as flags to mwscript-k8s.
>
>
> Your feedback:
>
> Let me know by email or IRC, or on Phab (T341553
> <https://phabricator.wikimedia.org/T341553>). If mwscript-k8s doesn't
> work for you, for now you can fall back to using the mwmaint hosts as
> before -- but they will be going away. Please report any problems sooner
> rather than later, so that we can ensure the new system meets your needs
> before that happens.
>
> Thanks,
>
> Reuven, for Service Ops SRE
> _______________________________________________
> Wikitech-l mailing list -- [email protected]
> To unsubscribe send an email to [email protected]
> https://lists.wikimedia.org/postorius/lists/wikitech-l.lists.wikimedia.org/
