Hi,
On 12/6/23 03:59, Magnus Manske via Cloud wrote:
> Hi all,
> I do appreciate the efforts to keep toolforge running, and that
> sometimes massive changes are necessary to do this, which has
> implications for tool maintainers.
+1. I am not sure people fully appreciate how massive a change this
is: the grid engine is one of the few remaining parts of Toolforge that
actually predates it. River announced[1] SGE support for the Toolserver
back in September 2009, and then in January 2013 everyone was given
exactly one month(!!) to move their bots to SGE[2].
So it's a real milestone, for the infrastructure side and for
maintainers, to have made it this far in getting rid of it, but it also
means there are 10+ years of user familiarity, expectations and inertia
around the grid.
> K8s, as it's run right now on toolforge, can not
> - ...
> - has very limited per-tool resources, and the webservice reduces those
>   even further
Just FYI, in case you weren't aware: the default quotas were recently
raised to 8 CPU + 8 GB total, with a max of 3 CPU + 6 GB per pod. (This
is also something I ran into.)
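To make those numbers concrete, here is a rough sketch that checks whether a set of pods fits under the limits above. It is not how Kubernetes enforces quotas (it ignores the requests/limits distinction, for one). The `Pod` type, the `fits_quota` helper and the example pod sizes are made up for illustration; only the 8 CPU / 8 GB total and the 3 CPU / 6 GB per-pod cap come from the paragraph above.

```python
from dataclasses import dataclass

# Figures quoted above; everything else in this sketch is hypothetical.
TOTAL_CPU, TOTAL_MEM_GB = 8.0, 8.0
MAX_POD_CPU, MAX_POD_MEM_GB = 3.0, 6.0


@dataclass(frozen=True)
class Pod:
    name: str
    cpu: float
    mem_gb: float


def fits_quota(pods: list[Pod]) -> bool:
    """True if each pod is within the per-pod cap and the sums are within the total."""
    for pod in pods:
        if pod.cpu > MAX_POD_CPU or pod.mem_gb > MAX_POD_MEM_GB:
            return False
    return (sum(p.cpu for p in pods) <= TOTAL_CPU
            and sum(p.mem_gb for p in pods) <= TOTAL_MEM_GB)


if __name__ == "__main__":
    build = Pod("build", cpu=3, mem_gb=6)            # a maximum-size pod
    web = Pod("webservice", cpu=0.5, mem_gb=0.5)     # made-up webservice size
    print(fits_quota([build, web]))    # 3.5 CPU and 6.5 GB in total
    print(fits_quota([build, build]))  # 12 GB in total exceeds the 8 GB quota
```

In other words, one maximum-size pod leaves 5 CPU but only 2 GB for everything else in the tool, so memory is likely the tighter constraint.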
> - Even the current Wikitech documentation still uses grid engine, eg
>   https://wikitech.wikimedia.org/wiki/Help:Toolforge/Rust (I have tried,
>   and failed, to get that running on k8s)
Sorry, this was on me. I had been pinged a while back to update it, but
it took me a while to figure it out for my own tools and I was still
tweaking my own setup. I've updated it now, so the main "Rust" wiki page
explains how to use the jobs framework, but I still need to update
the "My first Rust tool" guide. It's kind of cumbersome because k8s
doesn't spawn a login shell, so we have to do it manually, but I guess
that's for the best? If there's any other Rust-related stuff I can help
with, please let me know.
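As an illustration of what "doing it manually" can look like, here is a minimal sketch that wraps a command so that bash runs it as a login shell and reads the profile files first. This is an assumption-laden example, not what the wiki page literally says: the helper name `login_shell_command` and the `cd ~/mytool && cargo build --release` command are hypothetical, and it assumes `bash` exists in the container image and that the needed environment (e.g. PATH) is set up in a profile file.

```python
import shlex


def login_shell_command(command: str) -> str:
    """Wrap `command` so bash runs it as a login shell (profile files are read first)."""
    # `bash -l -c <string>` starts a login shell and then executes the string.
    # shlex.join quotes the command so it survives as a single argument.
    return shlex.join(["bash", "-l", "-c", command])


if __name__ == "__main__":
    # Hypothetical job command; substitute whatever your tool actually runs.
    print(login_shell_command("cd ~/mytool && cargo build --release"))
    # prints: bash -l -c 'cd ~/mytool && cargo build --release'
```

The resulting string is what you would hand over as the job's command, instead of relying on the environment a login session would normally have set up for you.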
Anyways, I feel like I'm in a similar boat overall. I've mostly spent
the last two weeks just taking stock of my tools, rewriting some and
shutting down others. I found it useful to spend a bit of time
homogenizing how my tools are laid out so I could write a single ansible
playbook[3] to deploy all of them in a similar fashion and apply
multi-tool changes easily too (I plan to explain this in a forthcoming
blog post, very soon now).
I've already asked for the February extension for at least one of my
tools, and I think it's pretty reasonable for you to ask as well. I am
not sure how long the lifeline can last though: the Debian Buster LTS
end-of-life is coming up in June 2024 and I'm sure there are other
considerations too.
[1] https://lists.wikimedia.org/hyperkitty/list/toolserve...@lists.wikimedia.org/message/6MIGIEJ6K3OM27CGYSVWYP3JITFTRFGG/
[2] https://lists.wikimedia.org/hyperkitty/list/toolserve...@lists.wikimedia.org/message/LFVUNVVLOLX364HKWV7IDIB4KXGGZOOU/
[3] https://gitlab.wikimedia.org/legoktm/toolforge-ansible
-- Legoktm