Hi,

On 12/6/23 03:59, Magnus Manske via Cloud wrote:
> Hi all,
>
> I do appreciate the efforts to keep toolforge running, and that sometimes massive changes are necessary to do this, which has implications for tool maintainers.

+1. I am not sure people fully appreciate how massive a change this is: the grid engine is one of the few remaining parts of Toolforge that actually predates it. River announced[1] SGE support for the Toolserver back in September 2009, and then in January 2013 everyone was given exactly one month(!!) to move their bots to SGE[2].

So it's a real milestone, for both the infrastructure side and the maintainers, to have made it this far in getting rid of it, but it also means there's 10+ years of user familiarity, expectations and inertia towards the grid.

> K8s, as it's run right now on toolforge, can not
> - ...
> - has very limited per-tool resources, and the webservice reduces those even further

Just FYI, in case you weren't aware: the default quotas were recently raised to 8 CPU + 8 GB total, with a max of 3 CPU + 6 GB per pod. (This is also something I ran into.)
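
To make those numbers a bit more concrete, here's a rough sketch of checking whether a planned set of pods fits within the limits quoted above. It is purely illustrative: the `fits_quota` helper and the example pod sizes are made up, it is not part of any Toolforge tooling, and real Kubernetes quotas distinguish requests from limits, which this ignores.

```python
# Limits as quoted in this thread; everything else here is hypothetical.
TOTAL_CPU, TOTAL_MEM_GB = 8.0, 8.0   # per-tool totals
POD_CPU, POD_MEM_GB = 3.0, 6.0       # per-pod maximums


def fits_quota(pods):
    """Return True if every pod and the overall sum stay within the limits.

    `pods` is a list of (cpu, mem_gb) tuples, one per planned pod.
    """
    # First check each pod against the per-pod maximums.
    for cpu, mem_gb in pods:
        if cpu > POD_CPU or mem_gb > POD_MEM_GB:
            return False
    # Then check the sum across all pods against the per-tool totals.
    total_cpu = sum(cpu for cpu, _ in pods)
    total_mem_gb = sum(mem_gb for _, mem_gb in pods)
    return total_cpu <= TOTAL_CPU and total_mem_gb <= TOTAL_MEM_GB


if __name__ == "__main__":
    # Hypothetical plan: one webservice pod plus one big job pod.
    print(fits_quota([(1, 2), (3, 6)]))
    # Two maxed-out pods exceed the 8 GB total even though each is allowed.
    print(fits_quota([(3, 6), (3, 6)]))
```

The second case is the one that bit me: each pod is individually fine, but together they go over the total.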

> - Even the current Wikitech documentation still uses grid engine, eg https://wikitech.wikimedia.org/wiki/Help:Toolforge/Rust (I have tried, and failed, to get that running on k8s)

Sorry, this was on me. I had been pinged a while back to update it, but it took me a while to figure it out for my own tools, and I was still tweaking my own setup. I've updated it now, so the main "Rust" wiki page explains how to use the jobs framework, but I still need to update the "My first Rust tool" guide. It's kind of cumbersome because k8s doesn't spawn a login shell, so we have to do it manually, but I guess that's for the best? If there's any other Rust-related stuff I can help with, please let me know.
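
On the login shell part, here's a minimal sketch of what "doing it manually" can look like: wrap the real command in `bash -l -c`, so that bash reads the usual profile files (where things like cargo's PATH entry tend to live) before running it. This is one common approach and not necessarily what the wiki page recommends; the `login_shell_command` helper is made up for illustration, and it assumes the container image has bash.

```python
import shlex


def login_shell_command(command):
    """Wrap `command` so it runs inside a bash login shell.

    `-l` makes bash behave as a login shell (reading /etc/profile and the
    user's profile file), and `-c` runs the given command string.
    shlex.join takes care of quoting the inner command as one argument.
    """
    return shlex.join(["bash", "-l", "-c", command])


if __name__ == "__main__":
    # Hypothetical build step for a Rust tool.
    print(login_shell_command("cargo build --release"))
    # prints: bash -l -c 'cargo build --release'
```

The resulting string is what you'd hand over as the job's command, instead of the bare `cargo build --release`.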

Anyways, I feel like I'm in a similar boat overall: I've mostly spent the last two weeks just taking stock of my tools, rewriting some and shutting down others. I found it useful to spend a bit of time homogenizing how my tools are laid out, so I could write an Ansible playbook[3] to deploy all of them in a similar fashion and apply multi-tool changes easily too. (I plan to explain this in a forthcoming blog post, very soon now.)

I've already asked for the February extension for at least one of my tools, and I think it's pretty reasonable for you to ask as well. I am not sure how long the lifeline can last, though: the Debian Buster LTS end-of-life is coming up in June 2024, and I'm sure there are other considerations too.

[1] https://lists.wikimedia.org/hyperkitty/list/toolserve...@lists.wikimedia.org/message/6MIGIEJ6K3OM27CGYSVWYP3JITFTRFGG/
[2] https://lists.wikimedia.org/hyperkitty/list/toolserve...@lists.wikimedia.org/message/LFVUNVVLOLX364HKWV7IDIB4KXGGZOOU/
[3] https://gitlab.wikimedia.org/legoktm/toolforge-ansible

-- Legoktm
_______________________________________________
Cloud mailing list -- cloud@lists.wikimedia.org
List information: 
https://lists.wikimedia.org/postorius/lists/cloud.lists.wikimedia.org/
