Hi Everyone,

As usual, some great discussion is going on. I thought I'd let this thread continue for a while before joining in, but rest assured each and every email on this list gets read.
I emailed GitHub a few days ago asking a bunch of questions, and I hope to get another builds@ meeting with GitHub as guests pretty soon. I will let you know how it goes.

On Fri, Apr 16, 2021 at 3:46 PM Hyukjin Kwon <gurwls...@gmail.com> wrote:

> Yes, per project budget will be great.
>
> Beside cancellation workaround, Apache Spark is trying another workaround
> for the time being: distributing workflow runs
> to forked repositories to leverage contributor's GitHub Actions resources
> instead of consuming all ASF organisation resources.
>
> Please see also how I implemented it if anyone is interested in this - the
> PR description should be self-explanatory:
> - https://github.com/apache/spark/pull/32092
> - https://github.com/apache/spark/pull/32193
>
> Note that this is still a workaround and it disables GitHub Actions work
> out of the box.
>
>
> On Fri, 16 Apr 2021, 22:26 Matt Sicker, <boa...@gmail.com> wrote:
>
> > I thought one of the key points raised by Jarek before was that even with
> > infinite compute resources, unoptimized actions will fill up all the
> > compute (like how widening highways induces higher traffic demand). The
> > per-PMC compute budgets sounds like a great way to help “shift left” on the
> > problem to avoid demanding Infra fix hundreds of scripts they know nothing
> > about. The other optimizations Jarek has demonstrated here like ensuring
> > old commits get cancelled in favor of new commits, or more sophisticated
> > things like only running tests that are relevant to the commit, can all go
> > a long way toward optimizing build compute.
> >
> > Now if you’re attempting to run entire integration and end to end test
> > suites on every commit, I don’t think the earth has enough compute power
> > for that quite yet!
> >
> > On Fri, Apr 16, 2021 at 07:44 Hyukjin Kwon <gurwls...@gmail.com> wrote:
> >
> > > I thought Jarek was pretty clear on that.
> > > I meant this:
> > >
> > > > So it all has to start with 'per-project' resource limitation and
> > > > self-budgeting. It would be GREAT if infra could provide self-hosted GitHub
> > > > Runners SERVICE per project, where project could donate credits or money
> > > > for their own account, then the projects would have incentive to optimize
> > > > their own usage. I imagine this would be the best thing since the sliced
> > > > bread that INFRA could provide to all the projects.
> > >
> > > Maintaining and providing a self-hosted runners in GitHub Actions where the
> > > resources are managed in project level where each project can donate
> > > credits.
> > >
> > > In addition, Jarek mentioned that Airflow already has a working version -
> > > is it correct Jarek?
> > >
> > > If the infra team takes and improves it for other ASF projects, that would
> > > permanently resolve this issue.
> > >
> > > This suggestion looks reasonable and realistic to me.
> > >
> > > What do you think about this?
> > >
> > >
> > > On Fri, 16 Apr 2021, 21:36 Martin Grigorov, <mgrigo...@apache.org> wrote:
> > >
> > > > Hi Hyukjin,
> > > >
> > > > On Fri, Apr 16, 2021 at 3:04 AM Hyukjin Kwon <gurwls...@gmail.com> wrote:
> > > >
> > > > > Hi all,
> > > > >
> > > > > Is here the right place to expect feedback from the infra team or related
> > > > > people?
> > > > > It would be great to hear what the infra team thinks about Jarek's
> > > > > suggestion.
> > > > >
> > > >
> > > > What suggestion exactly do you mean?
> > > > I've just re-read Jarek's email and I see 3 tasks for GitHub Actions team,
> > > > but nothing specific for Apache Infra team.
> > > >
> > > > >
> > > > > On Tue, Apr 13, 2021 at 11:15 AM Hyukjin Kwon <gurwls...@gmail.com> wrote:
> > > > >
> > > > >> Hi all,
> > > > >>
> > > > >> Could we have any update and feedback from the INFRA team about Jarek's
> > > > >> suggestion please?
> > > > >>
> > > > >> On Fri, Apr 9, 2021 at 7:06 AM Jarek Potiuk <ja...@potiuk.com> wrote:
> > > > >>
> > > > >>>
> > > > >>>> That's a good idea. We do need to thank GitHub to give free resources to
> > > > >>>> ASF projects, but it's better if we can make it a business: we allow
> > > > >>>> individual projects to sign deals with GitHub to get dedicated
> > > > >>>> resources.
> > > > >>>> It's a bit wasteful to ask every project to set up its own dev ops,
> > > > >>>> using GitHub Actions is more convenient. Maybe we should raise it to
> > > > >>>> GitHub?
> > > > >>>>
> > > > >>>
> > > > >>> I do not think you can get per-project resources in GH - the most you
> > > > >>> can do are self-hosted runners for your project.
> > > > >>>
> > > > >>> (BTW I am not from the INFRA team - just a humble "CI person" of Apache
> > > > >>> Airflow but very much vested into GitHub Actions)
> > > > >>> maybe the infra team can chime in here. We did raise it to GitHub, we
> > > > >>> even had meeting with them
> > > > >>> organized by Gavin and several topics were raised that could be
> > > > >>> eventually addressed by GitHub:
> > > > >>>
> > > > >>> - observability (they could not give us per-project usage dashboard - we
> > > > >>> built our own imperfect (with API limitations) one by Tobiasz from Airflow)
> > > > >>> - security (limiting access to only project committers) - this we
> > > > >>> handled by the Ash's fork of Runner (but it's also imperfect - even today I
> > > > >>> had to fix a problem where we had list of committers desynchronised between
> > > > >>> our infra/CI.yml)
> > > > >>> - manageability (assigning resources per-project) - this works by having
> > > > >>> self-hosted runners assigned per project (we needed infra JIRA ticket and
> > > > >>> generation of a bunch of tokens for our runners and our own AWS account
> > > > >>> with auto-scaling).
> > > > >>>
> > > > >>> It would be indeed great if it could be available from GitHub, but so
> > > > >>> far we do not have any of those.
> > > > >>>
> > > > >>> J.
> > > > >>>
> > > > >>>
> > > > >>>> On Wed, Apr 7, 2021 at 9:31 PM Hyukjin Kwon <gurwls...@gmail.com>
> > > > >>>> wrote:
> > > > >>>>
> > > > >>>> > Thanks Martin for your feedback.
> > > > >>>> >
> > > > >>>> > > What was your reason to migrate from Apache Jenkins to GitHub
> > > > >>>> > > Actions?
> > > > >>>> >
> > > > >>>> > I am sure there were more reasons for migrating from AMPLab Jenkins
> > > > >>>> > <https://amplab.cs.berkeley.edu/jenkins/> to GitHub Actions but as far as
> > > > >>>> > I can remember:
> > > > >>>> > - To reduce the maintenance cost of machines
> > > > >>>> > - The Jenkins machines became unstable and slow causing CI jobs to fail or
> > > > >>>> > be very flaky.
> > > > >>>> > - Difficulty to manage the installed libraries.
> > > > >>>> > - Intermittent unknown issues in the machines
> > > > >>>> >
> > > > >>>> > Yes, one option might be to consider other options to migrate again.
> > > > >>>> > However, other projects will very likely suffer the
> > > > >>>> > same problem. In addition, the migration in a large project is not an
> > > > >>>> > easy work to do
> > > > >>>> >
> > > > >>>> > I would like to know the feasibility of having more resources in GitHub
> > > > >>>> > Actions, or, for example, having sub-groups where
> > > > >>>> > each group shares the resources - currently one GitHub organisation shares
> > > > >>>> > all resources across the projects.
> > > > >>>> >
> > > > >>>> >
> > > > >>>> > On Wed, Apr 7, 2021 at 10:04 PM Martin Grigorov <mgrigo...@apache.org> wrote:
> > > > >>>> >
> > > > >>>> >>
> > > > >>>> >> On Wed, Apr 7, 2021 at 3:41 PM Hyukjin Kwon <gurwls...@gmail.com> wrote:
> > > > >>>> >>
> > > > >>>> >>> Hi Greg,
> > > > >>>> >>>
> > > > >>>> >>> I raised this thread to figure out a way that we can work together to
> > > > >>>> >>> resolve this issue, gather feedback, and to understand how other projects
> > > > >>>> >>> work around.
> > > > >>>> >>> Several projects I observed, as far as I can tell, have made enough
> > > > >>>> >>> efforts
> > > > >>>> >>> to save the resources in GitHub Actions but still suffer from the lack of
> > > > >>>> >>> resources.
> > > > >>>> >>>
> > > > >>>> >>
> > > > >>>> >> And it will get even worse because:
> > > > >>>> >> 1) more and more Apache projects migrate from TravisCI to GitHub Actions
> > > > >>>> >> (GA)
> > > > >>>> >> 2) new projects join ASF and many of them already use GA
> > > > >>>> >>
> > > > >>>> >>
> > > > >>>> >> What was your reason to migrate from Apache Jenkins to GitHub Actions?
> > > > >>>> >> If you want dedicated resources then you will need to manage the CI
> > > > >>>> >> yourself.
> > > > >>>> >> You could use Apache Jenkins/Buildbot with dedicated agents for your
> > > > >>>> >> project.
> > > > >>>> >> Or you could set up your own CI infrastructure with Jenkins, DroneIO,
> > > > >>>> >> ConcourseCI, ...
> > > > >>>> >>
> > > > >>>> >> Yet another option is to move to CircleCI or Cirrus. They are similar to
> > > > >>>> >> TravisCI / GA and less crowded (for now).
> > > > >>>> >>
> > > > >>>> >> Martin
> > > > >>>> >>
> > > > >>>> >>> I appreciate the resources provided to us but that does not resolve the
> > > > >>>> >>> issue of the development being slowed down.
> > > > >>>> >>>
> > > > >>>> >>>
> > > > >>>> >>> On Wed, Apr 7, 2021 at 5:52 PM Greg Stein <gst...@gmail.com> wrote:
> > > > >>>> >>>
> > > > >>>> >>> > On Wed, Apr 7, 2021 at 12:25 AM Hyukjin Kwon <gurwls...@gmail.com>
> > > > >>>> >>> > wrote:
> > > > >>>> >>> >
> > > > >>>> >>> >> Hi all,
> > > > >>>> >>> >>
> > > > >>>> >>> >> I am an Apache Spark PMC,
> > > > >>>> >>> >
> > > > >>>> >>> >
> > > > >>>> >>> > You are a member of the Apache Spark PMC. You are *not* a PMC. Please stop
> > > > >>>> >>> > with that terminology. The Foundation has about 200 PMCs, and you are a
> > > > >>>> >>> > member of one of them. You are NOT a "PMC" .. you're a person. A PMC is a
> > > > >>>> >>> > construct of the Foundation.
> > > > >>>> >>> >
> > > > >>>> >>> > >...
> > > > >>>> >>> >
> > > > >>>> >>> >> I am aware of the limited GitHub Actions resources that are shared
> > > > >>>> >>> >> across all projects in ASF,
> > > > >>>> >>> >> and many projects suffer from it. This issue significantly slows down the
> > > > >>>> >>> >> development cycle of
> > > > >>>> >>> >> other projects, at least Apache Spark.
> > > > >>>> >>> >>
> > > > >>>> >>> >
> > > > >>>> >>> > And the Foundation gets those build minutes for GitHub Actions provided to
> > > > >>>> >>> > us from GitHub and Microsoft, and we are thankful that they provide them to
> > > > >>>> >>> > the Foundation. Maybe it isn't all the build minutes that every group
> > > > >>>> >>> > wants, but that is what we have.
> > > > >>>> >>> > So it is incumbent upon all of us to
> > > > >>>> >>> > figure out how to build more, with fewer minutes.
> > > > >>>> >>> >
> > > > >>>> >>> > Say "thank you" to GitHub, please.
> > > > >>>> >>> >
> > > > >>>> >>> > Regards,
> > > > >>>> >>> > -g
> > > > >>>> >>> >
> > > > >>>> >>>
> > > > >>>> >>
> > > > >>>>
> > > > >>>
> > > > >>> --
> > > > >>> +48 660 796 129
> > > > >>>
> > > > >>
> > > > >

--

*Gavin McDonald*
Systems Administrator
ASF Infrastructure Team