I will let Hussein (if he has time) share some more details :). Generally speaking, we are using GitHub Actions as CI, so what we **really** need is an auto-scaling K8S cluster where the K8S controller is deployed and connected (via the ASF infrastructure's GitHub App): https://github.com/actions/actions-runner-controller. The last state we had, as far as I remember: Hussein already had a (Terraform?) deployment for it, and it generally depended on the ASF Infra authorisation/setup. What remains is some fine-tuning: defining and finalizing the labels (small/medium/big instances) and extending it to also run ARM instances.
J.

On Mon, Aug 12, 2024 at 1:10 AM Neil <neil4r...@gmail.com> wrote:

> I have solid AWS and EKS knowledge, I'd offer my help if my skills are
> applicable.
> Which Infrastructure as Code and CI/CD frameworks are being utilized for
> the testing Terraform Cloudformation?
> I've had good experiences with Pulumi python.
> Have you considered using EFS to handle the disk space needs?
>
> On Sun, Aug 11, 2024 at 6:18 PM Jarek Potiuk <ja...@potiuk.com> wrote:
>
> > Hello here,
> >
> > It would be great to have someone (or better two people) to get engaged in
> > our test infrastructure work - this will improve everyone's experience. I
> > **REALLY** think we should have other people that have engaged so far, so
> > that we can decrease the bus factor we have for our infrastructure.
> >
> > Just after I was away for 5 days and without too much connectivity our main
> > was broken (lack of disk space for constraints generation) and some mypy
> > checks were failing for the last few days.
> >
> > This is unsustainable and we need to find people who will know and be able
> > to fix this infrastructure.
> >
> > *Early warning* - I am planning 3 weeks holidays after Airflow Summit - and
> > I won't be looking at my email/github during those days, which means that
> > whoever will be working on Airflow 3 might be severely impacted by some of
> > those failures.
> >
> > Just to remind - until we have the k8S controller set up on our AWS
> > account and connected to our repo - we won't be able to use the credits
> > that we got recently. So this is a good start.
> >
> > I created a high-level issue for that
> > https://github.com/apache/airflow/issues/41388 and it waits for some
> > volunteers to pick it up. It's a very important thing to do - we can speed
> > up many parts of our builds (for example release preparation - but also
> > likely most of our tests) up to 4 times, which means that a lot of time can
> > be saved for waiting.
> >
> > Kaxil - I propose we should add a point at the next devcall - and keep it
> > as an unresolved Airflow 3 issue until it is well, unresolved.
> >
> > J.