BTW, we are also a bit stuck with self-hosted runners: we just secured some funds, but on closer inspection it is awfully dangerous to run self-hosted runners on GitHub with public repositories, and the official GitHub documentation says we should not do it:
https://docs.github.com/en/free-pro-team@latest/actions/hosting-your-own-runners/about-self-hosted-runners#self-hosted-runner-security-with-public-repositories

So we are a bit stuck now and, honestly, I am not sure we have any viable option. I think some help and guidance from Infra is necessary.

J.

On Tue, Oct 27, 2020 at 9:14 PM Jarek Potiuk <jarek.pot...@polidea.com> wrote:

> I tried to get this info from infrastructure but I honestly have no idea - maybe someone from the INFRA team could let us know about the stats (as was the case with Travis previously, when we got some usage stats for all projects).
>
> I do agree that it is not sustainable at all. I'd love some clarity on that. So far it seems that a few projects started using it months ago and all of a sudden we've started to hit the limits.
>
> But I am afraid no one but the infra team can have any stats on it (maybe even they do not have it).
>
> J.
>
> On Tue, Oct 27, 2020 at 9:09 PM Chesnay Schepler <ches...@apache.org> wrote:
>
>> How many projects are already using GitHub Actions?
>>
>> It seems to be fairly new, and I find it concerning that we are already hitting the limit. If only a few projects are using it currently, then it may be futile to rely on it because it would inevitably collapse if more projects were to use it. Unless there is some project using up most of the allocated minutes, similarly to what is (was?) happening with Travis.
>>
>> Alternatively, maybe GitHub Actions should be reserved for quick checks and not actual CI pipelines.
>>
>> On 10/27/2020 8:53 PM, Jarek Potiuk wrote:
>> > Hello everyone,
>> >
>> > The queues have become unbearable during the last two days. This is not sustainable long-term. I have lost a bit of hope that any kind of optimization will help, but we are trying anyway.
>> >
>> > However, we are still trying :)
>> >
>> > We are just about to merge and verify the PR that implements this "limited matrix tests before approval" solution. We implemented it with Tobiasz, who volunteered to help, and once it works we will try to apply it to Apache Beam as well. When it works we will be happy to share the solution with everyone.
>> >
>> > You can read more on how it works (with screenshot) here:
>> > https://github.com/apache/airflow/pull/11828#issuecomment-717485938
>> >
>> > We could not implement an automated workflow run due to limitations of GitHub Actions (you cannot rerun a successful workflow via the API), but we came up with something even more flexible:
>> >
>> > 1) PRs before approval only run one default combination of matrix tests. In our case this will save 50%-60% of build time for most PRs.
>> > 2) Once a PR gets approved, it gets the "okay to test" label and a comment in the PR: "The PR is ready to run all tests! Please rebase it to latest master or ask committer to re-run it". It also gets an "in-progress" check in the PR, which turns the green "merge" button into a gray one to avoid accidental merges. But a committer can still decide to merge at this point (for small, low-risk changes).
>> > 3) Once the PR gets rebased or re-run, it runs full-matrix tests and everything follows as usual.
>> > 4) We also have special treatment for the case that Allen mentioned earlier - the "small" "doc-only" PRs: after approval, they immediately get the "okay to merge" label, and a "The PR is ready to be merged. No tests are needed!." comment is added by the bot.
>> >
>> > Again - once we find it working, I am happy to describe how to add it to your GitHub Actions and share such information with all other projects using GitHub Actions.
>> >
>> > J.
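The four steps in the message above can be sketched as a small decision function. This is only an illustration under stated assumptions: the `PullRequest` type, the `plan` function, the matrix values and the doc-file suffixes are all hypothetical and are not taken from the Airflow workflows. Approval is modelled as a plain boolean, and the function returns the matrix that the next run (after a rebase or re-run) would use.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

Matrix = Dict[str, List[str]]

# Hypothetical full test matrix; the first value of each component is
# treated as its "default" (an assumption made for this sketch).
FULL_MATRIX: Matrix = {
    "python": ["3.6", "3.7", "3.8"],
    "backend": ["postgres", "mysql", "sqlite"],
}

# Hypothetical definition of a "doc-only" change.
DOC_SUFFIXES = (".md", ".rst")


@dataclass
class PullRequest:
    approved: bool
    changed_files: List[str]


def is_doc_only(pr: PullRequest) -> bool:
    """True when every changed file looks like documentation."""
    return bool(pr.changed_files) and all(
        name.endswith(DOC_SUFFIXES) for name in pr.changed_files
    )


def default_matrix(full: Matrix) -> Matrix:
    """Keep only the default (first) value of each matrix component."""
    return {key: values[:1] for key, values in full.items()}


def plan(pr: PullRequest, full: Matrix = FULL_MATRIX) -> Tuple[Optional[str], Matrix]:
    """Return (label to apply, matrix for the next run)."""
    if not pr.approved:
        # Step 1: before approval, run one default combination only.
        return None, default_matrix(full)
    if is_doc_only(pr):
        # Step 4: approved doc-only PRs need no tests.
        return "okay to merge", {}
    # Steps 2-3: approved PRs run the full matrix on rebase or re-run.
    return "okay to test", full


if __name__ == "__main__":
    print(plan(PullRequest(approved=False, changed_files=["airflow/models.py"])))
    print(plan(PullRequest(approved=True, changed_files=["airflow/models.py"])))
    print(plan(PullRequest(approved=True, changed_files=["README.md"])))
```

Keeping the decision in one pure function makes the build-time saving easy to reason about: an unapproved PR in this sketch runs 1 combination instead of 9.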
>> >
>> >
>> > On Fri, Oct 23, 2020 at 5:29 PM Jarek Potiuk <jarek.pot...@polidea.com> wrote:
>> >
>> >> Started working on this mini-solution for limiting non-approved matrix builds.
>> >>
>> >> I am working on it with a colleague of mine - Tobiasz - who worked on Apache Beam infrastructure, so we might test it on two projects.
>> >>
>> >> I will let you know the progress.
>> >>
>> >> Mini-design doc here:
>> >>
>> >> https://docs.google.com/document/d/16rwyCfyDpKWN-DrLYbhjU0B1D58T1RFYan5ltmw4DQg/edit#
>> >>
>> >> J.
>> >>
>> >>
>> >> On Thu, Oct 22, 2020 at 10:03 PM Jarek Potiuk <jarek.pot...@polidea.com> wrote:
>> >>
>> >>> I believe this problem cannot really be handled by one project, but I have a proposal.
>> >>>
>> >>> I looked at the common pattern we have in the ASF projects and I think there is a way that we can help each other.
>> >>>
>> >>> I think most of the problems come from many PRs being submitted that run a matrix of tests before committers even have time to take a look at them. We discussed how we can approach it and I think I have a proposal that we can all adopt in the ASF projects: something that will be easy to implement and will not impact the process we have. I would love to hear your thoughts about it - before I start implementing it :).
>> >>>
>> >>> My proposal is to create a GitHub Action that makes it possible to run only a subset of "matrix" tests for PRs that are not yet approved by committers. This should be possible using the current GitHub Actions workflows and API.
>> >>> It boils down to:
>> >>> * if a PR is not approved, only a subset of the matrix (the default value for each matrix component) is run
>> >>> * the committers can see the "green" mark of tests passing and make a review
>> >>> * once the PR gets approved, a new "full matrix" check is triggered automatically
>> >>> * all future pushes to the approved PR run the "full matrix" check
>> >>>
>> >>> I think that might significantly reduce the strain on the GA jobs we run, and it should fit very naturally in the typical PR workflow for ASF projects. But I am only guessing now, so I would love to hear what you think.
>> >>>
>> >>> I am willing (together with my colleagues) to implement this action and add it to Apache Airflow to check it. Together with the "cancel-workflow-action" that I developed and that we deployed at Apache Airflow and Apache Beam, I think that might help to keep the CI "pressure" much lower - regardless of whether any of the projects manages to get their credit sponsors. I think I can have a working Action/implementation done over the weekend.
>> >>>
>> >>> More details about the proposal here:
>> >>> https://lists.apache.org/thread.html/r6f6f1420aa6346c9f81bf9d9fff8816e860e49224eb02e25d856c249%40%3Cdev.airflow.apache.org%3E
>> >>>
>> >>> J.
>> >>>
>> >>> On Mon, Oct 19, 2020 at 5:28 PM Jarek Potiuk <jarek.pot...@polidea.com> wrote:
>> >>>
>> >>>> Yep. We still continuously optimize it and we are reaching out to get funding for self-hosted runners. And I think it would be great to see that happening. I am happy to help anyone who needs some help there - I've already been helping Apache Beam with their GitHub Actions settings.
>> >>>>
>> >>>> On Mon, Oct 19, 2020 at 6:12 AM Greg Stein <gst...@gmail.com> wrote:
>> >>>>
>> >>>>> This is some great news, Jarek.
>> >>>>>
>> >>>>> Given that GitHub build minutes are shared, we need more of this kind of work from our many communities.
>> >>>>>
>> >>>>> Thanks,
>> >>>>> Greg
>> >>>>> InfraAdmin, ASF
>> >>>>>
>> >>>>>
>> >>>>> On Sun, Oct 18, 2020 at 2:32 PM Jarek Potiuk <jarek.pot...@polidea.com> wrote:
>> >>>>>
>> >>>>>> Hello Allen,
>> >>>>>>
>> >>>>>> I'd really love to give Yetus a try and see how it can actually make our approach better.
>> >>>>>>
>> >>>>>> I just merged the change I planned (we finally got to that), which implements the final optimisation that you mentioned. In the case of a single .md file change we got the build time down to about 1 minute, most of it being GitHub Actions "workflow" overhead.
>> >>>>>>
>> >>>>>> We got the incremental pre-commit tests down to ~25s.
>> >>>>>>
>> >>>>>> Build here: https://github.com/potiuk/airflow/pull/128/checks. As you can see here:
>> >>>>>>
>> >>>>>> https://github.com/potiuk/airflow/pull/128/checks?check_run_id=1268353637#step:7:98
>> >>>>>>
>> >>>>>> in this case we run only the relevant static checks:
>> >>>>>>
>> >>>>>> - "No-tabs checker"
>> >>>>>> - "Add license for all md files"
>> >>>>>> - "Add TOC for md files."
>> >>>>>> - "Check for merge conflicts"
>> >>>>>> - "Detect Private Key"
>> >>>>>> - "Fix End of Files"
>> >>>>>> - "Trim Trailing Whitespace"
>> >>>>>> - "Check for language that we do not accept as community"
>> >>>>>>
>> >>>>>> All the other checks, image building, and all the extra checks are skipped (automatically, as pre-commit determined them irrelevant).
>> >>>>>>
>> >>>>>> All this, while we keep really comprehensive tests and optimisation of image building for all the "serious steps".
>> >>>>>> I tried to explain the philosophy and some basic assumptions behind our CI in https://github.com/apache/airflow/blob/master/CI.rst#ci-environment - and I'd love to see how this plays together with the Yetus tool.
>> >>>>>>
>> >>>>>> Would it be possible to work together with the Yetus team to see how it can help to further optimise and possibly simplify the setup we have? I'd love to get some cooperation on those. I am nearly done with all the optimisations I planned, and for years (long before my tenure) we have been among the top-3 Apache projects when it comes to CI-time use, so that might be a good one if we can pull together some improvements.
>> >>>>>>
>> >>>>>>
>> >>>>>> J.
>> >>>>>>
>> >>>>>>
>> >>>>>> On Wed, Oct 14, 2020 at 4:41 PM Jarek Potiuk <jarek.pot...@polidea.com> wrote:
>> >>>>>>
>> >>>>>>> Exactly - > dialectic vs. dislectic for example.
>> >>>>>>>
>> >>>>>>> On Wed, Oct 14, 2020 at 4:40 PM Jarek Potiuk <jarek.pot...@polidea.com> wrote:
>> >>>>>>>
>> >>>>>>>> And really sorry about yatus vs. yetus - I am slightly dialectic and when things are not in the dictionary, I tend to make many mistakes. I hope it's not something that people can take as a sign of being "worse", but if you felt offended by that - apologies.
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>> On Wed, Oct 14, 2020 at 4:34 PM Jarek Potiuk <jarek.pot...@polidea.com> wrote:
>> >>>>>>>>
>> >>>>>>>>> Hey Allen,
>> >>>>>>>>>
>> >>>>>>>>> I would be super happy if you could help us to do it properly at Airflow - would you like to work with us and get the yatus configuration that would work for us? I am super happy to try it.
>> >>>>>>>>> Maybe you could open a PR with some basic yatus implementation to start with and we could work together to get it simplified? I would love to learn how to do it.
>> >>>>>>>>>
>> >>>>>>>>> J
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>> On Wed, Oct 14, 2020 at 3:37 PM Allen Wittenauer <a...@effectivemachines.com.invalid> wrote:
>> >>>>>>>>>
>> >>>>>>>>>>> On Oct 13, 2020, at 11:04 PM, Jarek Potiuk <jarek.pot...@polidea.com> wrote:
>> >>>>>>>>>>> This is a logic that we have to implement regardless - whether we use yatus or pre-commit (please correct me if I am wrong).
>> >>>>>>>>>>
>> >>>>>>>>>> I'm not sure about yatus, but for yetus, for the most part, yes, one would likely need to implement custom rules in the personality to exactly duplicate the overly complicated and over engineered airflow setup. The big difference is that one wouldn't be starting from scratch. The difference engine is already there. The file filter is already there. full build vs. PR handling is already there. etc etc etc
>> >>>>>>>>>>
>> >>>>>>>>>>> For all others, this is not a big issue because in total all other pre-commits take 2-3 minutes at best. And if we find that we need to optimize it further we can simply disable the '--all-files' switch for pre-commit and they will only run on the latest commit-changed files (pre-commit will only run the tests related to those changed files). But since they are pretty fast (except pylint/mypy/flake8) we think running them all, for now, is not a problem.
>> >>>>>>>>>> That's what everyone thinks until they start aggregating the time across all changes...
>> >>>>>>>>>>
>> >>>>>>>>>>
>> >>>>>>>>> --
>> >>>>>>>>>
>> >>>>>>>>> Jarek Potiuk
>> >>>>>>>>> Polidea <https://www.polidea.com/> | Principal Software Engineer
>> >>>>>>>>>
>> >>>>>>>>> M: +48 660 796 129 <+48660796129>
>> >>>>>>>>> [image: Polidea] <https://www.polidea.com/>