Re: GitHub Actions Concurrency Limits for Apache projects

Jarek Potiuk Tue, 27 Oct 2020 13:14:50 -0700

I tried to get this info from infrastructure but I honestly have no idea -
maybe someone from the INFRA team could let us know about the stats (with
Travis we previously got some usage stats for all projects).

I do agree that it is not sustainable at all. I'd love some clarity on
that. So far it seems that a few projects started using it months ago and
all of a sudden we've started to hit the limits.

But I am afraid no one but the infra team can have any stats on it (maybe
even they do not have them).

J.


On Tue, Oct 27, 2020 at 9:09 PM Chesnay Schepler <ches...@apache.org> wrote:

> How many projects are already using GitHub Actions?
>
> It seems to be fairly new, and I find it concerning that we are already
> hitting the limit. If only a few projects are using it currently, then it
> may be futile to rely on it because it would inevitably collapse if more
> projects were to use it.
> Unless there is some project using up most of the allocated minutes,
> similarly to what is (was?) happening with Travis.
>
> Alternatively, maybe GitHub Actions should be reserved for quick checks
> and not actual CI pipelines.
>
> On 10/27/2020 8:53 PM, Jarek Potiuk wrote:
> > Hello everyone,
> >
> > The queues have become unbearable during the last two days. This is not
> > sustainable long-term. I lost a bit of hope that any kind of optimization
> > will help, but we are still trying :)
> >
> > We are just about to merge and verify the PR that implements this
> > "limited matrix tests before approval" solution. We implemented it with
> > Tobiasz, who volunteered to help, and once it works we will try to apply
> > it to Apache Beam as well. When it works we will be happy to share the
> > solution with everyone.
> >
> > You can read more on how it works (with screenshot) here:
> > https://github.com/apache/airflow/pull/11828#issuecomment-717485938
> >
> > We could not implement an automated workflow run due to limitations of
> > GitHub Actions (you cannot rerun a successful workflow via the API), but
> > we came up with something even more flexible:
> >
> > 1) PRs before approval only run one default combination of matrix tests.
> > In our case this will save 50%-60% of build time for most PRs.
> > 2) Once a PR gets approved, it gets the "okay to test" label and the
> > comment "The PR is ready to run all tests! Please rebase it to latest
> > master or ask committer to re-run it". It also gets an "in-progress" check
> > which turns the green "merge" button into a gray one to avoid accidental
> > merges. A committer can still decide to merge at this point (for small,
> > low-risk changes).
> > 3) Once the PR gets rebased or re-run, it runs the full-matrix tests and
> > everything follows as usual.
> > 4) The "small", "doc-only" PRs that Allen mentioned earlier get special
> > treatment: after approval they immediately get the "okay to merge" label,
> > and the bot adds the comment "The PR is ready to be merged. No tests are
> > needed!."
> >
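The four steps above can be read as a small label-driven decision. What
follows is only a rough sketch of that reading, not the code from the PR:
the function names, the "default"/"full" mode strings, the example file
paths and the suffix-based definition of "doc-only" are all assumptions
made for illustration. Only the two label names come from the description.

```python
# Assumption: a PR counts as "doc-only" when every changed file is a doc file.
DOC_SUFFIXES = (".md", ".rst")


def is_doc_only(changed_files):
    """True when the PR touches at least one file and all of them are docs."""
    return bool(changed_files) and all(
        name.endswith(DOC_SUFFIXES) for name in changed_files
    )


def label_on_approval(changed_files):
    """Label the bot would add when a committer approves (steps 2 and 4)."""
    return "okay to merge" if is_doc_only(changed_files) else "okay to test"


def matrix_mode(labels):
    """Which matrix a workflow run should use (steps 1 and 3)."""
    return "full" if "okay to test" in labels else "default"


labels = set()
print(matrix_mode(labels))  # before approval: default

# Hypothetical code change gets approved, then the PR is rebased or re-run.
labels.add(label_on_approval(["airflow/models/dag.py"]))
print(matrix_mode(labels))  # after approval: full

print(label_on_approval(["README.md"]))  # okay to merge
```

Keeping the state in a label means the workflow itself stays stateless: each
run only has to look at the labels on the PR to decide which matrix to use.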
> > Again - once we find it working, I am happy to describe how to add it to
> > your GitHub Actions and share such information with all other projects
> > using GitHub Actions.
> >
> > J.
> >
> >
> > On Fri, Oct 23, 2020 at 5:29 PM Jarek Potiuk <jarek.pot...@polidea.com>
> > wrote:
> >
> >> Started working on this mini-solution for limiting non-approved
> >> matrix builds.
> >>
> >> I am working on it with a colleague of mine -  Tobiasz - who worked on
> >> Apache Beam infrastructure, so we might test it on two projects.
> >>
> >> I will let you know the progress
> >>
> >> Mini-design doc here:
> >>
> >> https://docs.google.com/document/d/16rwyCfyDpKWN-DrLYbhjU0B1D58T1RFYan5ltmw4DQg/edit#
> >>
> >> J.
> >>
> >>
> >> On Thu, Oct 22, 2020 at 10:03 PM Jarek Potiuk <jarek.pot...@polidea.com>
> >> wrote:
> >>
> >>>
> >>> I believe this problem cannot be really handled by one project, but I
> >>> have a proposal.
> >>>
> >>> I looked at the common pattern we have in the ASF projects and I think
> >>> there is a way that we can help each other.
> >>>
> >>> I think most of the problems come from many PRs submitted that run a
> >>> matrix of tests before committers even have time to take a look at
> >>> them. We discussed how we can approach it and I think I have a proposal
> >>> that we can all adopt in the ASF projects. Something that will be easy
> >>> to implement and will not impact the process we have. I would love to
> >>> hear your thoughts about it - before I start implementing it :).
> >>>
> >>> My proposal is to create a GitHub Action that makes it possible to run
> >>> only a subset of "matrix" tests for PRs that are not yet approved by
> >>> committers. This should be possible using the current GitHub Actions
> >>> workflows and API. It boils down to:
> >>> * if the PR is not approved, only a subset of the matrix (the default
> >>> value for each matrix component) is run
> >>> * the committers can see the "green" mark of tests passing and make a
> >>> review
> >>> * once the PR gets approved, a new "full matrix" check is triggered
> >>> automatically
> >>> * all future pushes to the approved PR run the "full matrix" check
> >>>
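To make "default value for each matrix component" concrete, here is a
minimal sketch of how the reduced and full matrices could be expanded. The
matrix contents are made up for illustration (they are not the real Airflow
matrix), and treating the first value of each component as its default is
an assumption of this sketch.

```python
from itertools import product

# Hypothetical matrix: component names and values are illustrative only.
# Assumption: the first value listed for each component is its default.
MATRIX = {
    "python": ["3.6", "3.7", "3.8"],
    "backend": ["postgres", "mysql", "sqlite"],
}


def expand(matrix):
    """Expand a matrix definition into one dict per job combination."""
    keys = list(matrix)
    return [
        dict(zip(keys, combo))
        for combo in product(*(matrix[key] for key in keys))
    ]


def select_jobs(matrix, approved):
    """Full matrix for approved PRs, defaults only for everything else."""
    if approved:
        return expand(matrix)
    defaults_only = {key: values[:1] for key, values in matrix.items()}
    return expand(defaults_only)


print(len(select_jobs(MATRIX, approved=False)))  # 1
print(len(select_jobs(MATRIX, approved=True)))   # 9
```

With this made-up matrix an unapproved PR starts one job instead of nine;
the actual saving depends on how many of a project's jobs are matrix jobs.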
> >>> I think that might significantly reduce the strain on GA jobs we run,
> >>> and it should very naturally fit in the typical PR workflow for ASF
> >>> projects. But I am only guessing now, so I would love to hear what you
> >>> think.
> >>>
> >>> I am willing (together with my colleagues) to implement this action and
> >>> add it to Apache Airflow to check it. Together with the
> >>> "cancel-workflow-action" that I developed and that we deployed at Apache
> >>> Airflow and Apache Beam, I think that might help to keep the CI
> >>> "pressure" much lower - regardless of whether any of the projects
> >>> manages to get credit sponsors. I think I can have a working
> >>> Action/implementation done over the weekend.
> >>>
> >>> More details about the proposal here:
> >>>
> >>> https://lists.apache.org/thread.html/r6f6f1420aa6346c9f81bf9d9fff8816e860e49224eb02e25d856c249%40%3Cdev.airflow.apache.org%3E
> >>>
> >>> J.
> >>>
> >>> On Mon, Oct 19, 2020 at 5:28 PM Jarek Potiuk <jarek.pot...@polidea.com>
> >>> wrote:
> >>>
> >>>> Yep. We still continuously optimize it and we are reaching out to get
> >>>> funding for self-hosted runners. And I think it would be great to see
> >>>> that happening. I am happy to help anyone who needs some help there -
> >>>> I've already been helping Apache Beam with their GitHub Actions
> >>>> settings.
> >>>>
> >>>> On Mon, Oct 19, 2020 at 6:12 AM Greg Stein <gst...@gmail.com> wrote:
> >>>>
> >>>>> This is some great news, Jarek.
> >>>>>
> >>>>> Given that GitHub build minutes are shared, we need more of this
> >>>>> kind of work from our many communities.
> >>>>>
> >>>>> Thanks,
> >>>>> Greg
> >>>>> InfraAdmin, ASF
> >>>>>
> >>>>>
> >>>>> On Sun, Oct 18, 2020 at 2:32 PM Jarek Potiuk <jarek.pot...@polidea.com>
> >>>>> wrote:
> >>>>>
> >>>>>> Hello Allen,
> >>>>>>
> >>>>>> I'd really love to give Yetus a try and see how it can actually make
> >>>>>> our approach better.
> >>>>>>
> >>>>>> I just merged the change I planned (finally we got to that), which
> >>>>>> implements the final optimisation that you mentioned. In the case of
> >>>>>> a single .md file change we got the build time down to about 1
> >>>>>> minute, most of it being GitHub Actions "workflow" overhead.
> >>>>>>
> >>>>>> We got the incremental pre-commit tests down to ~25s.
> >>>>>>
> >>>>>> Build here: https://github.com/potiuk/airflow/pull/128/checks. As
> >>>>>> you can see here:
> >>>>>>
> >>>>>> https://github.com/potiuk/airflow/pull/128/checks?check_run_id=1268353637#step:7:98
> >>>>>> in this case we run only the relevant static checks:
> >>>>>>
> >>>>>>     - "No-tabs checker"
> >>>>>>     - "Add license for all md files"
> >>>>>>     - "Add TOC for md files."
> >>>>>>     - "Check for merge conflicts"
> >>>>>>     - "Detect Private Key"
> >>>>>>     - "Fix End of Files"
> >>>>>>     - "Trim Trailing Whitespace"
> >>>>>>     - "Check for language that we do not accept as community",
> >>>>>>
> >>>>>> All the other checks, image building, and all the extra checks are
> >>>>>> skipped (automatically, as pre-commit determined them irrelevant).
> >>>>>>
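The "only the relevant checks run" behaviour can be pictured as matching
changed file names against a per-check pattern, in the spirit of the `files`
regex that pre-commit hooks can declare. This is only a toy sketch: the
check names and patterns below are hypothetical and are not taken from the
Airflow pre-commit configuration.

```python
import re

# Hypothetical checks and patterns, for illustration only.
CHECKS = {
    "no-tabs": r".*",        # applies to every file
    "md-license": r"\.md$",  # markdown files only
    "md-toc": r"\.md$",
    "mypy": r"\.py$",        # python files only
    "flake8": r"\.py$",
}


def relevant_checks(changed_files):
    """Return the checks whose pattern matches at least one changed file."""
    return [
        name
        for name, pattern in CHECKS.items()
        if any(re.search(pattern, path) for path in changed_files)
    ]


print(relevant_checks(["README.md"]))
# ['no-tabs', 'md-license', 'md-toc']
```

For a single .md change the Python-only checks never start, which is the
effect described above for the one-minute build.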
> >>>>>> All this, while we keep really comprehensive tests and optimisation
> >>>>>> of image building for all the "serious steps". I tried to explain the
> >>>>>> philosophy and some basic assumptions behind our CI in
> >>>>>> https://github.com/apache/airflow/blob/master/CI.rst#ci-environment
> >>>>>> - and I'd love to try to see how this plays together with the Yetus
> >>>>>> tool.
> >>>>>>
> >>>>>> Would it be possible to work together with the Yetus team on trying
> >>>>>> to see how it can help to further optimise and possibly simplify the
> >>>>>> setup we have? I'd love to get some cooperation on those. I am nearly
> >>>>>> done with all the optimisations I planned, and we have been for years
> >>>>>> (long before my tenure) among the top-3 Apache projects when it comes
> >>>>>> to CI-time use, so that might be a good one if we can pull together
> >>>>>> some improvements.
> >>>>>>
> >>>>>>
> >>>>>> J.
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> On Wed, Oct 14, 2020 at 4:41 PM Jarek Potiuk <jarek.pot...@polidea.com>
> >>>>>> wrote:
> >>>>>>
> >>>>>>> Exactly - > dialectic vs. dislectic for example.
> >>>>>>>
> >>>>>>> On Wed, Oct 14, 2020 at 4:40 PM Jarek Potiuk <jarek.pot...@polidea.com>
> >>>>>>> wrote:
> >>>>>>>
> >>>>>>>> And really sorry about yatus vs. yetus - I am slightly dialectic
> >>>>>>>> and when things are not in the dictionary, I tend to do many
> >>>>>>>> mistakes. I hope it's not something that people can take as a sign
> >>>>>>>> of being "worse", but if you felt offended by that - apologies.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On Wed, Oct 14, 2020 at 4:34 PM Jarek Potiuk <jarek.pot...@polidea.com>
> >>>>>>>> wrote:
> >>>>>>>>
> >>>>>>>>> Hey Allen,
> >>>>>>>>>
> >>>>>>>>> I would be super happy if you could help us to do it properly at
> >>>>>>>>> Airflow - would you like to work with us and get the yatus
> >>>>>>>>> configuration that would work for us? I am super happy to try it.
> >>>>>>>>> Maybe you could open a PR with some basic yatus implementation to
> >>>>>>>>> start with and we could work together to get it simplified? I
> >>>>>>>>> would love to learn how to do it.
> >>>>>>>>>
> >>>>>>>>> J
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> On Wed, Oct 14, 2020 at 3:37 PM Allen Wittenauer
> >>>>>>>>> <a...@effectivemachines.com.invalid> wrote:
> >>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>> On Oct 13, 2020, at 11:04 PM, Jarek Potiuk <jarek.pot...@polidea.com>
> >>>>>>>>>>> wrote:
> >>>>>>>>>>> This is a logic
> >>>>>>>>>>> that we have to implement regardless - whether we use yatus or
> >>>>>>>>>>> pre-commit (please correct me if I am wrong).
> >>>>>>>>>>          I'm not sure about yatus, but for yetus, for the most
> >>>>>>>>>> part, yes, one would likely need to implement custom rules in the
> >>>>>>>>>> personality to exactly duplicate the overly complicated and over
> >>>>>>>>>> engineered airflow setup.  The big difference is that one
> >>>>>>>>>> wouldn't be starting from scratch. The difference engine is
> >>>>>>>>>> already there. The file filter is already there. full build vs.
> >>>>>>>>>> PR handling is already there. etc etc etc
> >>>>>>>>>>
> >>>>>>>>>>> For all others, this is not a big issue because in total all
> >>>>>>>>>>> other pre-commits take 2-3 minutes at best. And if we find that
> >>>>>>>>>>> we need to optimize it further we can simply disable the
> >>>>>>>>>>> '--all-files' switch for pre-commit and they will only run on
> >>>>>>>>>>> the latest commit-changed files (pre-commit will only run the
> >>>>>>>>>>> tests related to those changed files). But since they are pretty
> >>>>>>>>>>> fast (except pylint/mypy/flake8) we think running them all, for
> >>>>>>>>>>> now, is not a problem.
> >>>>>>>>>>          That's what everyone thinks until they start
> >>>>>>>>>> aggregating the time across all changes...
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>> --
> >>>>>>>>>
> >>>>>>>>> Jarek Potiuk
> >>>>>>>>> Polidea <https://www.polidea.com/> | Principal Software Engineer
> >>>>>>>>>
> >>>>>>>>> M: +48 660 796 129 <+48660796129>
> >>>>>>>>> [image: Polidea] <https://www.polidea.com/>
> >>>>>>>>>

-- 

Jarek Potiuk
Polidea <https://www.polidea.com/> | Principal Software Engineer

M: +48 660 796 129 <+48660796129>
[image: Polidea] <https://www.polidea.com/>
