I'm starting some batch maintenance of video transcodes so I'm exercising the new k8s-based maint script system on TMH's requeueTranscodes.php; good news: no surprises so far, everything's working just fine. :D
Since I'm running the same scripts over multiple wikis, I went ahead and manually wrapped them in a bash for loop that submits one job at a time out of all.dblist. The wrapper loop runs in a screen session and tails each job's logs to the session so they don't all smash out at once, and there's a second manually-started run for Commons. :)

First-class support for running over a dblist will be a very welcome improvement, and should be pretty straightforward! Good work everybody. :D

The longest job (Commons) might take a couple days to run, so we'll see if anything explodes later! hehe

-- brooke

On Wed, Sep 25, 2024 at 8:11 PM Reuven Lazarus <[email protected]> wrote:

> Hi all,
>
> With MediaWiki at the WMF moving to Kubernetes, it's now time to start
> running manual maintenance scripts there. Any time you would previously SSH
> to a mwmaint host and run mwscript, follow these steps instead. The old way
> will continue working for a little while, but it will be going away.
>
>
> What's familiar:
>
> Starting a maintenance script looks like this:
>
> rzl@deploy2002:~$ mwscript-k8s --comment="T341553" -- Version.php --wiki=enwiki
>
> Any options for the mwscript-k8s tool, as described below, go before the --.
>
> After the --, the first argument is the script name; everything else is
> passed to the script. This is the same as you're used to passing to
> mwscript.
>
>
> What's different:
>
> - Run mwscript-k8s on a deployment host, not the maintenance host. Either
> deployment host will work; your job will automatically run in whichever
> data center is active, so you no longer need to change hosts when there’s a
> switchover.
>
> - You don't need a tmux. By default the tool launches your maintenance
> script and exits immediately, without waiting for your job to finish. If
> you log out of the deployment host, your job keeps running on the
> Kubernetes cluster.
>
> - Kubernetes saves the maintenance script's output for seven days after
> completion. By default, mwscript-k8s prints a kubectl command that you (or
> anyone else) can paste and run to monitor the output or save it to a file.
>
> - As a convenience, you can pass -f (--follow) to mwscript-k8s to immediately
> begin tailing the script output. If you like, you can do this inside a
> tmux and keep the same workflow as before. Either way, you can safely
> disconnect and your script will continue running on Kubernetes.
>
> rzl@deploy2002:~$ mwscript-k8s -f -- Version.php --wiki=testwiki
>
> [...]
>
> MediaWiki version: 1.43.0-wmf.24 LTS (built: 22:35, 23 September 2024)
>
> - For scripts that take input on stdin, you can pass --attach to
> mwscript-k8s, either interactively or in a pipeline.
>
> rzl@deploy2002:~$ mwscript-k8s --attach -- shell.php --wiki=testwiki
>
> [...]
>
> Psy Shell v0.12.3 (PHP 7.4.33 — cli) by Justin Hileman
>
> > $wmgRealm
>
> = "production"
>
> >
>
> rzl@deploy2002:~$ cat example_url.txt | mwscript-k8s --attach -- purgeList.php
>
> [...]
>
> Purging 1 urls
>
> Done!
>
> - Your maintenance script runs in a Docker container which will not
> outlive it, so it can't save persistent files to disk. Ensure your script
> logs its important output to stdout, or persists it in a database or other
> remote storage.
>
> - The --comment flag sets an optional (but encouraged) descriptive label,
> such as a task number.
>
> - Using standard kubectl commands[1][2], you can check the status, and
> view the output, of your running jobs or anyone else's. (Example:
> `kube_env mw-script codfw; kubectl get pod -l username=rzl`)
>
> [1]: https://wikitech.wikimedia.org/wiki/Kubernetes/Kubectl
>
> [2]: https://kubernetes.io/docs/reference/kubectl/quick-reference/
>
>
> What's not supported yet:
>
> - Maintenance scripts launched automatically on a timer. We're working on
> migrating them -- for now, this is for one-off scripts launched by hand.
>
> - If your job is interrupted (e.g. by hardware problems), Kubernetes can
> automatically move it to another machine and restart it, babysitting it
> until it completes. But we only want to do that if your job is safe to
> restart. So by default, if your job is interrupted, it will stay stopped
> until you restart it yourself. Soon, we'll add an option to declare "this
> is idempotent, please restart it as needed" and that design is recommended
> for new scripts.
>
> - No support yet for mwscriptwikiset, foreachwiki, foreachwikiindblist,
> etc., but we'll add similar functionality as flags to mwscript-k8s.
>
>
> Your feedback:
>
> Let me know by email or IRC, or on Phab (T341553
> <https://phabricator.wikimedia.org/T341553>). If mwscript-k8s doesn't
> work for you, for now you can fall back to using the mwmaint hosts as
> before -- but they will be going away. Please report any problems sooner
> rather than later, so that we can ensure the new system meets your needs
> before that happens.
>
> Thanks,
>
> Reuven, for Service Ops SRE
> _______________________________________________
> Wikitech-l mailing list -- [email protected]
> To unsubscribe send an email to [email protected]
> https://lists.wikimedia.org/postorius/lists/wikitech-l.lists.wikimedia.org/
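
For anyone who wants to try the same workaround until dblist support lands, the wrapper loop described in the reply can be sketched roughly as below. This is a minimal sketch and not the actual script used for the run. It assumes that `mwscript-k8s -f` stays in the foreground until the job finishes (which is what makes the loop submit one job at a time), that the dblist holds one database name per line with `#` for comments, and that Commons is skipped because it gets its own manual run. The names `run_over_dblist`, `RUNNER`, `SKIP_WIKI` and `COMMENT` are made up for illustration; only `-f`, `--comment`, `--` and `--wiki` come from the announcement.

```shell
#!/usr/bin/env bash
# Sketch: run one maintenance script per wiki in a dblist, one job at a time.

# Hypothetical knobs, overridable from the environment.
RUNNER="${RUNNER:-mwscript-k8s}"               # swap in a stub for a dry run
SKIP_WIKI="${SKIP_WIKI:-commonswiki}"          # assumed to be run separately
COMMENT="${COMMENT:-batch run over dblist}"    # label passed to --comment

# Usage: run_over_dblist <dblist-file> <script.php> [extra script args...]
run_over_dblist() {
    local dblist="$1" script="$2"
    shift 2
    local wiki
    # The "|| [ -n ... ]" part keeps a last line that lacks a trailing newline.
    while IFS= read -r wiki || [ -n "$wiki" ]; do
        # Skip blank lines and comment lines (assumed dblist format).
        case "$wiki" in
            ''|'#'*) continue ;;
        esac
        [ "$wiki" = "$SKIP_WIKI" ] && continue

        echo "== $wiki =="
        # -f follows the job output; assumed to return once the job ends.
        # stdin comes from /dev/null so the runner cannot eat the dblist.
        "$RUNNER" -f --comment="$COMMENT" -- "$script" --wiki="$wiki" "$@" \
            < /dev/null \
            || echo "job for $wiki exited non-zero" >&2
    done < "$dblist"
}
```

Inside a screen or tmux session this would be called as something like `run_over_dblist all.dblist requeueTranscodes.php`, with the dblist path adjusted to wherever the file lives on the deployment host. A failed job is only reported on stderr and the loop moves on to the next wiki; stopping on the first failure would be an equally reasonable choice.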
