3 months later, i have some updates!

> TLDR1: we're shutting jenkins down at the end of 2021.

this is still the goal, exact shutdown date TBD.

> long term (until EOY):
> * decide what the future of spark builds and releases will look like
>   - do we need jenkins?
>   - if we do, who's responsible for hosting + ops?

this looks like github actions + some as-of-yet-tbd k8s solution for integration tests.

> medium term (in 6 months):
> * prepare jenkins worker ansible configs and stick in the spark repo

this is done: https://github.com/apache/spark/tree/master/dev/ansible-for-test-node

> * train up brian shiratsuki (cced) to help w/ops tasks and upgrades over the next ~6m

this is ongoing, and we now have reasonable monitoring!

> * get to all of the python version, library installation, etc etc jira requests

i think i've knocked out most of these.

> short term (weeks):
> * bring up additional workers
>   - finish hardware/system level repairs on the bare metal
>   - see above, re k8s jira
> * stabilize cluster
>   - recent jenkins LTS upgrade broke the web GUI
>   - finish deploying monitoring/alerting
>   - this hardware is OLD and literally falling over, so we have lots of random disk and ram failures. it's literally whack-a-mole and each trip to the colo to repair literally takes a full day

we're generally doing alright w/all of these: the hardware has been pretty stable, the jenkins administrative GUI is still broken (but at least i can hack the xml on the bare metal), and we've got 8 workers up and running.

i'll be sending out another email to this list soon regarding the impending jenkins 'freeze'.

shane
--
Shane Knapp
Computer Guy / Voice of Reason
UC Berkeley EECS Research / RISELab Staff Technical Lead
https://rise.cs.berkeley.edu