Added as agenda item for the next dev call (22 Aug)

On Mon, 12 Aug 2024 at 00:25, Jarek Potiuk <ja...@potiuk.com> wrote:

> I will let Hussein (if he has time) share some more details :).
>
> Generally speaking we are using GitHub Actions as CI - so what we
> **really** need is an auto-scaling K8s cluster where the K8s controller
> is deployed and connected (via ASF infrastructure's GitHub App):
> https://github.com/actions/actions-runner-controller
> The last state we had - as far as I remember - was that Hussein already
> had a (Terraform?) deployment for it, and it generally was depending on
> the ASF / Infra authorisation / setup. Then some fine-tuning / labels
> (small/medium/big instances) to define/finalize, and extending it to be
> able to also run ARM instances.
>
> J.
>
> On Mon, Aug 12, 2024 at 1:10 AM Neil <neil4r...@gmail.com> wrote:
>
> > I have solid AWS and EKS knowledge, I'd offer my help if my skills are
> > applicable.
> > Which Infrastructure as Code and CI/CD frameworks are being utilized
> > for the testing: Terraform, CloudFormation?
> > I've had good experiences with Pulumi (Python).
> > Have you considered using EFS to handle the disk space needs?
> >
> > On Sun, Aug 11, 2024 at 6:18 PM Jarek Potiuk <ja...@potiuk.com> wrote:
> >
> > > Hello here,
> > >
> > > It would be great to have someone (or better two people) to get
> > > engaged in our test infrastructure work - this will improve
> > > everyone's experience. I **REALLY** think we should have other
> > > people than those who have engaged so far, so that we can decrease
> > > the bus factor we have for our infrastructure.
> > >
> > > Just after I was away for 5 days and without too much connectivity,
> > > our main was broken (lack of disk space for constraints generation)
> > > and some mypy checks were failing for the last few days.
> > >
> > > This is unsustainable and we need to find people who will know and
> > > be able to fix this infrastructure.
> > >
> > > *Early warning* - I am planning 3 weeks of holidays after Airflow
> > > Summit - and I won't be looking at my email/GitHub during those
> > > days, which means that whoever will be working on Airflow 3 might
> > > be severely impacted by some of those failures.
> > >
> > > Just to remind - until we have the K8s controller set up on our AWS
> > > account and connected to our repo - we won't be able to use the
> > > credits that we got recently. So this is a good start.
> > >
> > > I created a high-level issue for that
> > > https://github.com/apache/airflow/issues/41388 and it waits for
> > > some volunteers to pick it up. It's a very important thing to do -
> > > we can speed up many parts of our builds (for example release
> > > preparation - but also likely most of our tests) up to 4 times,
> > > which means that a lot of waiting time can be saved.
> > >
> > > Kaxil - I propose we should add a point at the next devcall - and
> > > keep it as an unresolved Airflow 3 issue until it is, well,
> > > resolved.
> > >
> > > J.