Merging the PR is blocked by https://github.com/apache/pulsar/issues/14951 .

Pulsar SQL doesn't work with Java 11.0.14.1 version. It fails with this error 
message:
Exception in thread "main" java.lang.IllegalArgumentException: Cannot parse 
version 11.0.14.1
        at io.prestosql.server.JavaVersion.parse(JavaVersion.java:76)
        at 
io.prestosql.server.PrestoSystemRequirements.verifyJavaVersion(PrestoSystemRequirements.java:102)
        at 
io.prestosql.server.PrestoSystemRequirements.verifyJvmRequirements(PrestoSystemRequirements.java:45)
        at io.prestosql.server.PrestoServer.run(PrestoServer.java:78)
        at io.prestosql.$gen.Presto_332____20220330_100314_1.run(Unknown Source)
        at io.prestosql.server.PrestoServer.main(PrestoServer.java:72)

I'll apply a workaround to unblock CI. 

-Lari

On 2022/03/30 06:52:38 Lari Hotari wrote:
> Thank you for the reviews and feedback. I have started making the switch to 
> the new refactored Pulsar CI.
> 
> Merging new PRs are blocked until the switch is ready. The reason for this is 
> that I have merged https://github.com/apache/pulsar/pull/14939 preparing for 
> merging https://github.com/apache/pulsar/pull/14819 .
> 
> The GitHub Actions "required checks" change in the refactored Pulsar CI and 
> there can be only one effective set of "required checks" for a branch. 
> 
> After the new Pulsar CI workflow PR has been merged, each in-progress PR has 
> to be closed & immediately reopened to pick up the new workflow and the PR 
> build has to run through the new workflow. Another way to pick up the new 
> workflow is to rebase the PR (or merge master branch changes to it).
> 
> Please let me know if you experience any issues with the new Pulsar CI 
> workflow. I'll be on the #testing channel on Pulsar Slack too.
> 
> -Lari
> 
> On 2022/03/29 15:43:31 Michael Marshall wrote:
> > Great work, Lari! It's great news that GitHub's new feature helps this
> > valuable work move forward. I look forward to seeing your PR merged,
> > and I am happy to help resolve any issues that might pop up.
> > 
> > Thanks,
> > Michael
> > 
> > On Tue, Mar 29, 2022 at 7:55 AM Lari Hotari <lhot...@apache.org> wrote:
> > >
> > > The PR has sufficient reviews, and I'll proceed with merging it today or 
> > > tomorrow.
> > > Please provide feedback now if you want to do that before the PR is 
> > > merged.
> > >
> > > Thanks!
> > >
> > > -Lari
> > >
> > > On 2022/03/28 20:05:14 Lari Hotari wrote:
> > > > The PR https://github.com/apache/pulsar/pull/14819 is now ready for 
> > > > review.
> > > >
> > > > The changes in the PR now use GitHub Actions Artifacts for sharing 
> > > > binary files (such as docker images) between the build steps. This 
> > > > saves a lot of GitHub Actions VM resources since the docker images are 
> > > > built once and shared in downstream jobs.
> > > > GitHub Actions Artifacts are meant to be used for sharing data between 
> > > > the jobs in a GitHub Actions workflow [1].
> > > >
> > > > I'm looking forward to your review and feedback on 
> > > > https://github.com/apache/pulsar/pull/14819 .
> > > >
> > > > BR,
> > > >
> > > > -Lari
> > > >
> > > > References:
> > > > [1] GitHub Actions: Storing workflow data as artifacts - 
> > > > https://docs.github.com/en/actions/using-workflows/storing-workflow-data-as-artifacts
> > > >
> > > > On 2022/03/23 10:46:10 Lari Hotari wrote:
> > > > > I have submitted the PR for refactoring the apache/pulsar GitHub 
> > > > > Actions based CI. Please review 
> > > > > https://github.com/apache/pulsar/pull/14819 .
> > > > >
> > > > > BR,
> > > > > -Lari
> > > > >
> > > > > On 2022/03/22 13:38:36 Enrico Olivelli wrote:
> > > > > > Lari,
> > > > > >
> > > > > > Il Mar 22 Mar 2022, 14:32 Lari Hotari <lhot...@apache.org> ha 
> > > > > > scritto:
> > > > > >
> > > > > > > I have resumed work to improve our GitHub Actions based Pulsar CI.
> > > > > > >
> > > > > > > Last year, I worked on a proof-of-concept which significantly 
> > > > > > > reduced the
> > > > > > > resource consumption and improved the usability of the build by 
> > > > > > > combining
> > > > > > > multiple workflows into a single larger workflow.
> > > > > > >
> > > > > > > The showstopper a year ago was the lack of being able to re-run a 
> > > > > > > single
> > > > > > > failed job in a larger workflow.
> > > > > > > GitHub has since then delivered this feature and no showstoppers 
> > > > > > > are
> > > > > > > present.
> > > > > > >
> > > > > > > I have been posting updates to
> > > > > > > https://github.com/apache/pulsar/issues/14401 "Speed up CI 
> > > > > > > workflows"
> > > > > > > about the progress.
> > > > > > > I have rebased the changes from last year's PoC, and I'm 
> > > > > > > finalizing and
> > > > > > > testing the changes in my fork under
> > > > > > > https://github.com/lhotari/pulsar/pull/59 . I'll send a PR to
> > > > > > > apache/pulsar, when the refactoring is ready.
> > > > > > >
> > > > > >
> > > > > > This is great news !
> > > > > >
> > > > > > Looking forward to your patch
> > > > > >
> > > > > > Enrico
> > > > > >
> > > > > >
> > > > > >
> > > > > > > -Lari
> > > > > > >
> > > > > > > On 2021/03/16 01:10:52 Sijie Guo wrote:
> > > > > > > > > The prototype has demonstrated about 60% reduction in
> > > > > > > > resource consumption.
> > > > > > > >
> > > > > > > > It is hard to quantify. Merging them into one large workflow 
> > > > > > > > can result
> > > > > > > in
> > > > > > > > more failures. Re-running those failures can consume resources 
> > > > > > > > as well.
> > > > > > > >
> > > > > > > > > Isn't it urgent to resolve it?
> > > > > > > >
> > > > > > > > I think we are in a stage that gives us breathing room to fix 
> > > > > > > > flaky tests
> > > > > > > > and solve other problems, no?
> > > > > > > > I don't mean we stop the effort here. I mean we have other 
> > > > > > > > enhancements
> > > > > > > > that we can do to improve the situation.
> > > > > > > > Once we get into a position where the flakiness is reduced, we 
> > > > > > > > can merge
> > > > > > > > them into one workflow.
> > > > > > > >
> > > > > > > > - Sijie
> > > > > > > >
> > > > > > > >
> > > > > > > > On Mon, Mar 15, 2021 at 2:48 AM Lari Hotari <l...@hotari.net> 
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > Thanks for the feedback Sijie.
> > > > > > > > >
> > > > > > > > > > We are using a lazy consensus approach. Typically if there 
> > > > > > > > > > is no
> > > > > > > > > objection,
> > > > > > > > > > please go ahead and not need to wait for approval.
> > > > > > > > > > If people raise concerns, please address the concerns.
> > > > > > > > >
> > > > > > > > > You and Ali have raised concerns about changing the existing 
> > > > > > > > > GitHub
> > > > > > > Actions
> > > > > > > > > workflows in a way where multiple workflows would be combined 
> > > > > > > > > to a
> > > > > > > single
> > > > > > > > > workflow. Before proceeding, there is a need to address the 
> > > > > > > > > concerns.
> > > > > > > We
> > > > > > > > > might end up with a completely different type of solution of 
> > > > > > > > > what has
> > > > > > > been
> > > > > > > > > proposed initially. :)
> > > > > > > > >
> > > > > > > > > > Yes. So I am in favor of addressing flaky tests than 
> > > > > > > > > > merging all
> > > > > > > > > workflows
> > > > > > > > > > into one giant workflow.
> > > > > > > > >
> > > > > > > > > I agree that addressing flaky tests is favorable. The main 
> > > > > > > > > reason for
> > > > > > > PIP
> > > > > > > > > "Changes to GitHub Actions based Pulsar CI" is to
> > > > > > > > > 1) Reduce GitHub Action Runner resource consumption of Pulsar 
> > > > > > > > > PR builds
> > > > > > > > > 2) Reduce lead times for Pull Request feedback
> > > > > > > > > We cannot ignore these problems. If we don't change anything, 
> > > > > > > > > the
> > > > > > > problems
> > > > > > > > > won't get fixed. The prototype has demonstrated about 60% 
> > > > > > > > > reduction in
> > > > > > > > > resource consumption. Measuring the lead times hasn't been 
> > > > > > > > > done in the
> > > > > > > > > prototype, but since the reason for long lead times has been 
> > > > > > > > > long build
> > > > > > > > > queues due to excessive resource consumption, it's likely 
> > > > > > > > > that the lead
> > > > > > > > > times would be reduced.
> > > > > > > > >
> > > > > > > > > I know that switching to a single workflow isn't the only 
> > > > > > > > > solution to
> > > > > > > the
> > > > > > > > > above problems. I had a discussion with Ali. He recommended 
> > > > > > > > > reducing
> > > > > > > the
> > > > > > > > > modules in Pulsar repository (PIP-62), reducing the docker 
> > > > > > > > > container
> > > > > > > size
> > > > > > > > > and improving the Pulsar Broker unit test harness so that 
> > > > > > > > > tests using
> > > > > > > it
> > > > > > > > > would be less flaky and that it would be easier to fix the 
> > > > > > > > > issues in
> > > > > > > > > failing test when there would be better information about 
> > > > > > > > > what was the
> > > > > > > > > state problem that caused the test to fail.
> > > > > > > > >
> > > > > > > > > As mentioned in the earlier email about the optimizations in 
> > > > > > > > > the
> > > > > > > Pulsar CI
> > > > > > > > > refactoring prototype, the main benefits come from reusing 
> > > > > > > > > binary
> > > > > > > artifacts
> > > > > > > > > from previous build stages so that each job doesn't have to 
> > > > > > > > > build
> > > > > > > > > everything from scratch. This becomes irrelevant when the 
> > > > > > > > > build is very
> > > > > > > > > fast and there isn't a benefit of reusing artifacts.
> > > > > > > > > This means that it's possible to resolve the resource 
> > > > > > > > > consumption
> > > > > > > problem
> > > > > > > > > of Pulsar PR builds in the way that Ali is recommending, 
> > > > > > > > > without
> > > > > > > switching
> > > > > > > > > from multiple workflows to a single workflow that can reuse 
> > > > > > > > > binary
> > > > > > > > > artifacts in the build.
> > > > > > > > >
> > > > > > > > > > Hence I am +1 to "changes to flaky test handing" and 
> > > > > > > > > > suggest focusing
> > > > > > > > > more
> > > > > > > > > > on solving flaky tests.
> > > > > > > > > > Consider merging them into one workflow when the tests are 
> > > > > > > > > > in a
> > > > > > > better
> > > > > > > > > > situation.
> > > > > > > > >
> > > > > > > > > Makes sense for minimizing the risk of change, but we cannot 
> > > > > > > > > just wait
> > > > > > > for
> > > > > > > > > things to fix themselves.
> > > > > > > > > How long will other Apache projects tolerate the resource 
> > > > > > > > > consumption
> > > > > > > > > issues Pulsar is causing in the shared GitHub Actions Runner 
> > > > > > > > > VM quota?
> > > > > > > For
> > > > > > > > > example,
> > > > > > > https://github.com/apache/pulsar/pull/9159#issuecomment-766915396
> > > > > > > > > .
> > > > > > > > > Isn't it urgent to resolve it?
> > > > > > > > >
> > > > > > > > > I'll revisit the plan for PIP  "Changes to GitHub Actions 
> > > > > > > > > based Pulsar
> > > > > > > CI"
> > > > > > > > > based on the community feedback in the upcoming days. That 
> > > > > > > > > might mean
> > > > > > > that
> > > > > > > > > the current solution is pivoted. The goal is to solve the 
> > > > > > > > > problems of
> > > > > > > high
> > > > > > > > > resource consumption and long lead time for PR build in 
> > > > > > > > > Pulsar CI.
> > > > > > > Please
> > > > > > > > > continue to provide feedback so that we get a revisited plan 
> > > > > > > > > together
> > > > > > > that
> > > > > > > > > addresses these problems.Thank you!
> > > > > > > > >
> > > > > > > > > BR,
> > > > > > > > > -Lari
> > > > > > > > >
> > > > > > > > > On Fri, Mar 12, 2021 at 11:06 PM Sijie Guo 
> > > > > > > > > <guosi...@gmail.com> wrote:
> > > > > > > > >
> > > > > > > > > > > *Sijie, how far are we from getting the draft PIP 
> > > > > > > > > > > "Changes to
> > > > > > > GitHub
> > > > > > > > > > Actions based Pulsar CI" into an actual PIP that gets put 
> > > > > > > > > > on the wiki
> > > > > > > > > > page https://github.com/apache/pulsar/wiki
> > > > > > > > > > <https://github.com/apache/pulsar/wiki> ?*
> > > > > > > > > >
> > > > > > > > > > I see what you referred to before now. This can be easily 
> > > > > > > > > > done. I
> > > > > > > (or any
> > > > > > > > > > other committer) can do it for you.
> > > > > > > > > >
> > > > > > > > > > There is no real blocker for you to continue work even 
> > > > > > > > > > there are
> > > > > > > concerns
> > > > > > > > > > or it is not listed in the PIP.
> > > > > > > > > > We are using a lazy consensus approach. Typically if there 
> > > > > > > > > > is no
> > > > > > > > > objection,
> > > > > > > > > > please go ahead and not need to wait for approval.
> > > > > > > > > > If people raise concerns, please address the concerns.
> > > > > > > > > >
> > > > > > > > > > > The reason why
> > > > > > > > > > re-runs happen currently is because of the high flakiness 
> > > > > > > > > > of tests.
> > > > > > > > > >
> > > > > > > > > > Yes. So I am in favor of addressing flaky tests than 
> > > > > > > > > > merging all
> > > > > > > > > workflows
> > > > > > > > > > into one giant workflow.
> > > > > > > > > > It is not about "No pain, no gain". The community has 
> > > > > > > > > > suffered a lot
> > > > > > > with
> > > > > > > > > > giant workflow before.
> > > > > > > > > > There were a lot of committers and contributors working 
> > > > > > > > > > hard to
> > > > > > > split one
> > > > > > > > > > giant workflow into multiple
> > > > > > > > > > current workflows. Unless there is really strong evidence 
> > > > > > > > > > that
> > > > > > > merging
> > > > > > > > > them
> > > > > > > > > > back to one will improve
> > > > > > > > > > the entire CI experience, I will still have concerns about 
> > > > > > > > > > one giant
> > > > > > > > > > workflow approach.
> > > > > > > > > >
> > > > > > > > > > Hence I am +1 to "changes to flaky test handing" and 
> > > > > > > > > > suggest focusing
> > > > > > > > > more
> > > > > > > > > > on solving flaky tests.
> > > > > > > > > > Consider merging them into one workflow when the tests are 
> > > > > > > > > > in a
> > > > > > > better
> > > > > > > > > > situation.
> > > > > > > > > >
> > > > > > > > > > >  This solution would also require disabling
> > > > > > > > > > required status checks
> > > > > > > > > >
> > > > > > > > > > I don't think it is a good idea to disable status checks. 
> > > > > > > > > > We can
> > > > > > > consider
> > > > > > > > > > running "dark mode" but it will just overload the action 
> > > > > > > > > > quota.
> > > > > > > > > >
> > > > > > > > > > Another alternative is to mirror the pull requests into 
> > > > > > > > > > another
> > > > > > > Github
> > > > > > > > > > account to test that and get more concrete statistics on the
> > > > > > > flakiness
> > > > > > > > > rate
> > > > > > > > > > of one workflow approach.
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > On Fri, Mar 12, 2021 at 1:57 AM Lari Hotari 
> > > > > > > > > > <l...@hotari.net> wrote:
> > > > > > > > > >
> > > > > > > > > > > Thanks for the feedback, Sijie.
> > > > > > > > > > >
> > > > > > > > > > > > The "Fail fast" approach is great. That would be super 
> > > > > > > > > > > > helpful if
> > > > > > > > > there
> > > > > > > > > > > are
> > > > > > > > > > > > multiple workflows and each workflow is retryable.
> > > > > > > > > > > > However, I am not sure how much it will help if you run 
> > > > > > > > > > > > all
> > > > > > > workflows
> > > > > > > > > > in
> > > > > > > > > > > > one giant workflow. Or is it making things worse?
> > > > > > > > > > >
> > > > > > > > > > > We can reduce the need for re-running workflow runs. The 
> > > > > > > > > > > reason why
> > > > > > > > > > > re-runs happen currently is because of high flakiness of 
> > > > > > > > > > > tests.
> > > > > > > > > > > Addressing flakiness continues to be top-priority. Now 
> > > > > > > > > > > that the
> > > > > > > Pulsar
> > > > > > > > > CI
> > > > > > > > > > > workflow prototype is finished, I'll be focusing more in 
> > > > > > > > > > > the other
> > > > > > > > > draft
> > > > > > > > > > > PIP, "Changes to flaky test handling".
> > > > > > > > > > > We as a community should address the critical problem 
> > > > > > > > > > > that the
> > > > > > > current
> > > > > > > > > > > retry solution has: it can mask bugs in production code 
> > > > > > > > > > > and make
> > > > > > > the
> > > > > > > > > > build
> > > > > > > > > > > pass and allow changes to be merged that cause 
> > > > > > > > > > > regressions.
> > > > > > > > > > > It's a false sense of security what the green builds 
> > > > > > > > > > > after all the
> > > > > > > > > > retries
> > > > > > > > > > > bring us. Bringing Pulsar to the next level in stability 
> > > > > > > > > > > requires
> > > > > > > > > > > addressing this.
> > > > > > > > > > >
> > > > > > > > > > > If something doesn't work, it can be adapted and 
> > > > > > > > > > > improved. Changes
> > > > > > > can
> > > > > > > > > be
> > > > > > > > > > > rolled back and revisited when things go worse. We also 
> > > > > > > > > > > need a
> > > > > > > leap of
> > > > > > > > > > > faith.
> > > > > > > > > > > "No pain, no gain", like any change, it will be painful 
> > > > > > > > > > > at first,
> > > > > > > but
> > > > > > > > > we
> > > > > > > > > > > will get over the bump.
> > > > > > > > > > >
> > > > > > > > > > > > Secondly, your test has been done in your folk where 
> > > > > > > > > > > > there are
> > > > > > > not a
> > > > > > > > > > lot
> > > > > > > > > > > of
> > > > > > > > > > > > concurrent pushes and pull requests. I am not sure how 
> > > > > > > > > > > > your
> > > > > > > approach
> > > > > > > > > > will
> > > > > > > > > > > > behave once it is merged into master. Can you simulate 
> > > > > > > > > > > > multiple
> > > > > > > > > > > concurrent
> > > > > > > > > > > > pull requests in your account to prove your approach 
> > > > > > > > > > > > doesn't
> > > > > > > bring
> > > > > > > > > side
> > > > > > > > > > > > effects?
> > > > > > > > > > >
> > > > > > > > > > > One possibility to address this is to introduce the new 
> > > > > > > > > > > workflow
> > > > > > > in a
> > > > > > > > > > mode
> > > > > > > > > > > where you need to opt-in to the new workflow in some way.
> > > > > > > > > > > This was an idea brought up by my colleagues Enrico and 
> > > > > > > > > > > Andrey.
> > > > > > > > > > > It might be possible to configure the existing workflow 
> > > > > > > > > > > and this
> > > > > > > new
> > > > > > > > > > > workflow in a way where some condition (for example 
> > > > > > > > > > > whitelisted
> > > > > > > github
> > > > > > > > > > user
> > > > > > > > > > > name or a certain keyword in the PR title/description) 
> > > > > > > > > > > chooses
> > > > > > > either
> > > > > > > > > one
> > > > > > > > > > > for the Pull request. This solution would also require 
> > > > > > > > > > > disabling
> > > > > > > > > > > required status checks ("Require status checks to pass 
> > > > > > > > > > > before
> > > > > > > merging"
> > > > > > > > > > > feature in GitHub branch protection rules) since the 
> > > > > > > > > > > names of the
> > > > > > > > > checks
> > > > > > > > > > > would be different.
> > > > > > > > > > >
> > > > > > > > > > > > Lastly, can we apply those optimizations to current 
> > > > > > > > > > > > workflows
> > > > > > > without
> > > > > > > > > > > > merging them into one giant workflow?
> > > > > > > > > > >
> > > > > > > > > > > This is what I have been doing. All individual 
> > > > > > > > > > > optimizations have
> > > > > > > > > already
> > > > > > > > > > > been sent as PRs in the last months. I guess there's 
> > > > > > > > > > > 20-30 PRs that
> > > > > > > > > have
> > > > > > > > > > > already been merged.
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > https://github.com/apache/pulsar/pulls?q=is%3Apr+author%3Alhotari+is%3Amerged
> > > > > > > > > > > There's also 2 build related PRs from yesterday which 
> > > > > > > > > > > haven't been
> > > > > > > > > merged
> > > > > > > > > > > yet:
> > > > > > > > > > > Fix Maven download issues (ported from the prototype to 
> > > > > > > > > > > our
> > > > > > > existing
> > > > > > > > > > Pulsar
> > > > > > > > > > > CI): https://github.com/apache/pulsar/pull/9883
> > > > > > > > > > > improve Maven module build order (required for more 
> > > > > > > > > > > efficient
> > > > > > > builds
> > > > > > > > > that
> > > > > > > > > > > selectively build required artifacts):
> > > > > > > > > > > https://github.com/apache/pulsar/pull/9882
> > > > > > > > > > >
> > > > > > > > > > > There aren't many optimizations left that could be ported 
> > > > > > > > > > > from the
> > > > > > > > > > > prototype to the existing workflow. There are a few, but 
> > > > > > > > > > > the
> > > > > > > impact is
> > > > > > > > > > > minor.
> > > > > > > > > > > The reason for this is that the optimization with the 
> > > > > > > > > > > greatest
> > > > > > > impact
> > > > > > > > > are
> > > > > > > > > > > the ones that build a binary artifacts (maven libs, 
> > > > > > > > > > > docker images)
> > > > > > > once
> > > > > > > > > > and
> > > > > > > > > > > share it with the downstream jobs in the pipeline.
> > > > > > > > > > > Applying this type of solution has certain challenges 
> > > > > > > > > > > when there
> > > > > > > are
> > > > > > > > > > > multiple separate workflow. Sharing binary artifacts to 
> > > > > > > > > > > other
> > > > > > > workflows
> > > > > > > > > > > would require that the workflow to reuse the artifacts 
> > > > > > > > > > > gets
> > > > > > > triggered
> > > > > > > > > by
> > > > > > > > > > > the workflow that produced the artifacts. This wouldn't 
> > > > > > > > > > > be secure
> > > > > > > or
> > > > > > > > > > > practical for handling pull requests. Triggering a 
> > > > > > > > > > > workflow
> > > > > > > explicitly
> > > > > > > > > > > would require a token from the main repository and using 
> > > > > > > > > > > that for
> > > > > > > pull
> > > > > > > > > > > request builds would be a serious security vulnerability. 
> > > > > > > > > > > (more
> > > > > > > details
> > > > > > > > > > > about the GitHub Actions security model in
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > https://securitylab.github.com/research/github-actions-preventing-pwn-requests
> > > > > > > > > > > )
> > > > > > > > > > >
> > > > > > > > > > > *Sijie, how far are we from getting the draft PIP 
> > > > > > > > > > > "Changes to
> > > > > > > GitHub
> > > > > > > > > > > Actions based Pulsar CI" into an actual PIP that gets put 
> > > > > > > > > > > on the
> > > > > > > wiki
> > > > > > > > > > > page https://github.com/apache/pulsar/wiki
> > > > > > > > > > > <https://github.com/apache/pulsar/wiki> ?*
> > > > > > > > > > >
> > > > > > > > > > > -Lari
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > On Fri, Mar 12, 2021 at 10:46 AM Sijie Guo 
> > > > > > > > > > > <guosi...@gmail.com>
> > > > > > > wrote:
> > > > > > > > > > >
> > > > > > > > > > > > This is good progress. However, my main concern is 
> > > > > > > > > > > > still merging
> > > > > > > all
> > > > > > > > > > > > workflows into one giant workflow.
> > > > > > > > > > > >
> > > > > > > > > > > > The "Fail fast" approach is great. That would be super 
> > > > > > > > > > > > helpful if
> > > > > > > > > there
> > > > > > > > > > > are
> > > > > > > > > > > > multiple workflows and each workflow is retryable.
> > > > > > > > > > > > However, I am not sure how much it will help if you run 
> > > > > > > > > > > > all
> > > > > > > workflows
> > > > > > > > > > in
> > > > > > > > > > > > one giant workflow. Or is it making things worse?
> > > > > > > > > > > >
> > > > > > > > > > > > Secondly, your test has been done in your folk where 
> > > > > > > > > > > > there are
> > > > > > > not a
> > > > > > > > > > lot
> > > > > > > > > > > of
> > > > > > > > > > > > concurrent pushes and pull requests. I am not sure how 
> > > > > > > > > > > > your
> > > > > > > approach
> > > > > > > > > > will
> > > > > > > > > > > > behave once it is merged into master. Can you simulate 
> > > > > > > > > > > > multiple
> > > > > > > > > > > concurrent
> > > > > > > > > > > > pull requests in your account to prove your approach 
> > > > > > > > > > > > doesn't
> > > > > > > bring
> > > > > > > > > side
> > > > > > > > > > > > effects?
> > > > > > > > > > > >
> > > > > > > > > > > > Lastly, can we apply those optimizations to current 
> > > > > > > > > > > > workflows
> > > > > > > without
> > > > > > > > > > > > merging them into one giant workflow?
> > > > > > > > > > > >
> > > > > > > > > > > > - Sijie
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > On Fri, Mar 12, 2021 at 12:30 AM Lari Hotari 
> > > > > > > > > > > > <l...@hotari.net>
> > > > > > > > > wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > Thanks for the feedback Michael.
> > > > > > > > > > > > >
> > > > > > > > > > > > > > I left a question on the doc about how concurrent 
> > > > > > > > > > > > > > runs
> > > > > > > affect the
> > > > > > > > > > > > > > repository's 5 GB cache limit.
> > > > > > > > > > > > >
> > > > > > > > > > > > > This is a good question. There isn't a clear answer 
> > > > > > > > > > > > > in the
> > > > > > > GitHub
> > > > > > > > > > > Actions
> > > > > > > > > > > > > Cache documentation.
> > > > > > > > > > > > >
> > > > > > > > > > > > > The documentation is
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > https://docs.github.com/en/actions/guides/caching-dependencies-to-speed-up-workflows
> > > > > > > > > > > > > .
> > > > > > > > > > > > > Based on this document and some testing, I have made 
> > > > > > > > > > > > > these
> > > > > > > > > > conclusions:
> > > > > > > > > > > > > For GitHub Actions Cache, pull requests get executed 
> > > > > > > > > > > > > in the
> > > > > > > context
> > > > > > > > > > of
> > > > > > > > > > > > the
> > > > > > > > > > > > > forked repository.
> > > > > > > > > > > > > The workflow triggered by a pull request event can 
> > > > > > > > > > > > > only update
> > > > > > > it's
> > > > > > > > > > own
> > > > > > > > > > > > > cache. It has read-only access to upstream caches.
> > > > > > > > > > > > > If there's a cache miss, the entry will get written 
> > > > > > > > > > > > > to the
> > > > > > > cache of
> > > > > > > > > > the
> > > > > > > > > > > > > forked repository. If the PR could to write to the 
> > > > > > > > > > > > > upstream
> > > > > > > cache,
> > > > > > > > > it
> > > > > > > > > > > > would
> > > > > > > > > > > > > be a security issue since this would be vulnerable to 
> > > > > > > > > > > > > cache
> > > > > > > > > poisoning
> > > > > > > > > > > > > attacks. Each repository has a 5GB quota for writes. 
> > > > > > > > > > > > > The
> > > > > > > entries
> > > > > > > > > are
> > > > > > > > > > > kept
> > > > > > > > > > > > > up to 7 days.
> > > > > > > > > > > > > The performance is fairly good. Loading docker images 
> > > > > > > > > > > > > from the
> > > > > > > > > > > repository
> > > > > > > > > > > > > happens about 15MB/s. Writing is 2-3x slower, about 
> > > > > > > > > > > > > 5-7MB/s.
> > > > > > > (the
> > > > > > > > > > > > > performance of the GHA cache is most likely higher 
> > > > > > > > > > > > > since this
> > > > > > > is
> > > > > > > > > the
> > > > > > > > > > > > > throughput for docker load / docker save)
> > > > > > > > > > > > >
> > > > > > > > > > > > > If a single repository has a lot of concurrent jobs, 
> > > > > > > > > > > > > it could
> > > > > > > start
> > > > > > > > > > > > > evicting caches.
> > > > > > > > > > > > > However that isn't likely to happen with the way 
> > > > > > > > > > > > > Pulsar is
> > > > > > > > > developed
> > > > > > > > > > > > since
> > > > > > > > > > > > > pull requests are created from personal forks.
> > > > > > > > > > > > >
> > > > > > > > > > > > > > I also think it could be helpful to explicitly 
> > > > > > > > > > > > > > document, or
> > > > > > > > > > reference
> > > > > > > > > > > > > > github documentation, on how failure will affect 
> > > > > > > > > > > > > > the DAG. I'm
> > > > > > > > > > > assuming
> > > > > > > > > > > > > that
> > > > > > > > > > > > > > if an action fails, its parallel peer actions will 
> > > > > > > > > > > > > > run to
> > > > > > > > > > completion,
> > > > > > > > > > > > and
> > > > > > > > > > > > > > that the rest of the remaining stages will get 
> > > > > > > > > > > > > > canceled, but
> > > > > > > I
> > > > > > > > > > > haven't
> > > > > > > > > > > > > > worked with github actions before.
> > > > > > > > > > > > >
> > > > > > > > > > > > > For matrix jobs, "fail fast" is the default, which 
> > > > > > > > > > > > > cancels all
> > > > > > > jobs
> > > > > > > > > > in
> > > > > > > > > > > > the
> > > > > > > > > > > > > matrix if one fails. Other parallel flows would run to
> > > > > > > completion
> > > > > > > > > by
> > > > > > > > > > > > > default.
> > > > > > > > > > > > > In the prototype, I have added a Github script step 
> > > > > > > > > > > > > to each
> > > > > > > job to
> > > > > > > > > > > cancel
> > > > > > > > > > > > > the complete workflow when a failure occurs.
> > > > > > > > > > > > > Here's an example:
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > https://github.com/lhotari/pulsar/blob/lh-refactor-pulsar-ci-with-retries/.github/workflows/pulsar-ci.yaml#L281-L289
> > > > > > > > > > > > >
> > > > > > > > > > > > > The prototype follows a "fail fast" design. When a 
> > > > > > > > > > > > > failure
> > > > > > > occurs,
> > > > > > > > > > fail
> > > > > > > > > > > > > fast and don't continue with other jobs.
> > > > > > > > > > > > > The benefit of this is that it reduces resource 
> > > > > > > > > > > > > consumption.
> > > > > > > This
> > > > > > > > > > helps
> > > > > > > > > > > > > keep the build queue short.
> > > > > > > > > > > > > When the build queue is short, developers get quick 
> > > > > > > > > > > > > feedback
> > > > > > > from
> > > > > > > > > CI.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Documenting all details in the PIP document isn't 
> > > > > > > > > > > > > practical.
> > > > > > > > > > > > > *I'm hoping to start a separate document on low level 
> > > > > > > > > > > > > details
> > > > > > > when
> > > > > > > > > > > there
> > > > > > > > > > > > is
> > > > > > > > > > > > > a high level acceptance of the proposed "Changes to 
> > > > > > > > > > > > > GitHub
> > > > > > > Actions
> > > > > > > > > > > based
> > > > > > > > > > > > > Pulsar CI".*
> > > > > > > > > > > > > Together we can make this happen. We need decisions 
> > > > > > > > > > > > > too. This
> > > > > > > > > > proposal
> > > > > > > > > > > > > cannot stay as a draft forever.
> > > > > > > > > > > > > I'm looking forward to hearing from the Pulsar 
> > > > > > > > > > > > > community,
> > > > > > > Pulsar
> > > > > > > > > > > > > committer and Pulsar PMC members how to take this 
> > > > > > > > > > > > > forward.
> > > > > > > > > > > > >
> > > > > > > > > > > > > BR,
> > > > > > > > > > > > > -Lari
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Fri, Mar 12, 2021 at 8:06 AM Michael Marshall <
> > > > > > > > > > > mikemars...@gmail.com>
> > > > > > > > > > > > > wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > > This will be a great improvement. I read through 
> > > > > > > > > > > > > > the PIP, and
> > > > > > > > > > > overall,
> > > > > > > > > > > > it
> > > > > > > > > > > > > > looks good to me.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > I left a question on the doc about how concurrent 
> > > > > > > > > > > > > > runs
> > > > > > > affect the
> > > > > > > > > > > > > > repository's 5 GB cache limit.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > I also think it could be helpful to explicitly 
> > > > > > > > > > > > > > document, or
> > > > > > > > > > reference
> > > > > > > > > > > > > > github documentation, on how failure will affect 
> > > > > > > > > > > > > > the DAG. I'm
> > > > > > > > > > > assuming
> > > > > > > > > > > > > that
> > > > > > > > > > > > > > if an action fails, its parallel peer actions will 
> > > > > > > > > > > > > > run to
> > > > > > > > > > completion,
> > > > > > > > > > > > and
> > > > > > > > > > > > > > that the rest of the remaining stages will get 
> > > > > > > > > > > > > > canceled, but
> > > > > > > I
> > > > > > > > > > > haven't
> > > > > > > > > > > > > > worked with github actions before.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Thanks for all of the work you've put in so far.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On Thu, Mar 11, 2021 at 6:37 PM Yuva raj 
> > > > > > > > > > > > > > <uvar...@gmail.com>
> > > > > > > > > > wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > This is great news. Thanks Hari , Mateo and pulsar
> > > > > > > community
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > On Fri, Mar 12, 2021, 2:04 AM Lari Hotari <
> > > > > > > > > lari.hot...@sagire.fi
> > > > > > > > > > >
> > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Dear Pulsar community members,
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > The work on "Changes to GitHub Actions based 
> > > > > > > > > > > > > > > > Pulsar CI"
> > > > > > > has
> > > > > > > > > > gone
> > > > > > > > > > > > > > forward
> > > > > > > > > > > > > > > > based on your feedback. Here are some updates 
> > > > > > > > > > > > > > > > about the
> > > > > > > work.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > The draft PIP proposal document is here:
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > https://docs.google.com/document/d/1FNEWD3COdnNGMiryO9qBUW_83qtzAhqjDI5wwmPD-YE/edit#heading=h.f53rkcu20sry
> > > > > > > > > > > > > > > > There's a *detailed status update in the 
> > > > > > > > > > > > > > > > document about a
> > > > > > > > > > > prototype
> > > > > > > > > > > > > for
> > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > refactored Pulsar CI GitHub Actions based 
> > > > > > > > > > > > > > > > workflow*.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Thanks for all the suggestions and feedback by 
> > > > > > > > > > > > > > > > now. A
> > > > > > > lot of
> > > > > > > > > > > > > > improvements
> > > > > > > > > > > > > > > > have been made by the Pulsar contributors to 
> > > > > > > > > > > > > > > > overcome the
> > > > > > > > > > > technical
> > > > > > > > > > > > > > > > obstacles.
> > > > > > > > > > > > > > > > Special thanks go to Matteo for reducing the 
> > > > > > > > > > > > > > > > sizes of
> > > > > > > docker
> > > > > > > > > > > > images.
> > > > > > > > > > > > > A
> > > > > > > > > > > > > > > lot
> > > > > > > > > > > > > > > > of small improvements have been made to the 
> > > > > > > > > > > > > > > > Pulsar maven
> > > > > > > > > build
> > > > > > > > > > to
> > > > > > > > > > > > > > enable
> > > > > > > > > > > > > > > > the new refactored GitHub Actions workflow. 
> > > > > > > > > > > > > > > > Thank you
> > > > > > > for all
> > > > > > > > > > PR
> > > > > > > > > > > > > > reviews
> > > > > > > > > > > > > > > > and feedback.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > The main goal of the "Changes to GitHub Actions 
> > > > > > > > > > > > > > > > based
> > > > > > > Pulsar
> > > > > > > > > > CI"
> > > > > > > > > > > > work
> > > > > > > > > > > > > > has
> > > > > > > > > > > > > > > > been to *reduce the resource consumption of the 
> > > > > > > > > > > > > > > > Pulsar CI
> > > > > > > > > build
> > > > > > > > > > > and
> > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > speed up Pulsar development by improving the 
> > > > > > > > > > > > > > > > developer
> > > > > > > > > > > > productivity*
> > > > > > > > > > > > > > when
> > > > > > > > > > > > > > > > less time is wasted in waiting for Pulsar CI 
> > > > > > > > > > > > > > > > build
> > > > > > > feedback.
> > > > > > > > > > The
> > > > > > > > > > > > > > > prototype
> > > > > > > > > > > > > > > > demonstrates these improvements.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > As you can see from the email from Jan 28 
> > > > > > > > > > > > > > > > below, *the
> > > > > > > > > resource
> > > > > > > > > > > > > > > consumption
> > > > > > > > > > > > > > > > was 19 hrs 36 minutes* for a single pull 
> > > > > > > > > > > > > > > > request that was
> > > > > > > > > > > observed
> > > > > > > > > > > > > when
> > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > work began.
> > > > > > > > > > > > > > > > Now, with the prototype of the refactored 
> > > > > > > > > > > > > > > > Pulsar CI
> > > > > > > build,
> > > > > > > > > the
> > > > > > > > > > > > > resource
> > > > > > > > > > > > > > > > consumption is *7 hrs 9 minutes.*
> > > > > > > > > > > > > > > > *This is about 60% reduction in resource 
> > > > > > > > > > > > > > > > consumption.*
> > > > > > > The
> > > > > > > > > > whole
> > > > > > > > > > > > > > pipeline
> > > > > > > > > > > > > > > > completes in 75-100 minutes.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Here's a breakdown of the duration (resource
> > > > > > > consumption) of
> > > > > > > > > > each
> > > > > > > > > > > > > build
> > > > > > > > > > > > > > > job
> > > > > > > > > > > > > > > > in the refactored workflow:
> > > > > > > > > > > > > > > > Workflow Job seconds h:mm:ss
> > > > > > > > > > > > > > > > Pulsar CI Changed files check 4 0:00:04
> > > > > > > > > > > > > > > > Pulsar CI Go 1.11 Functions 155 0:02:35
> > > > > > > > > > > > > > > > Pulsar CI Go 1.12 Functions 166 0:02:46
> > > > > > > > > > > > > > > > Pulsar CI Go 1.13 Functions 113 0:01:53
> > > > > > > > > > > > > > > > Pulsar CI Go 1.14 Functions 96 0:01:36
> > > > > > > > > > > > > > > > Pulsar CI Build on MacOS 1017 0:16:57
> > > > > > > > > > > > > > > > Pulsar CI Build and License check 346 0:05:46
> > > > > > > > > > > > > > > > Pulsar CI Build Pulsar CPP and Python clients 
> > > > > > > > > > > > > > > > 683 0:11:23
> > > > > > > > > > > > > > > > Pulsar CI Build Pulsar java-test-image docker 
> > > > > > > > > > > > > > > > image 405
> > > > > > > > > 0:06:45
> > > > > > > > > > > > > > > > Pulsar CI CI - Unit - Other 1580 0:26:20
> > > > > > > > > > > > > > > > Pulsar CI CI - Unit - Brokers - Broker Group 1 
> > > > > > > > > > > > > > > > 968
> > > > > > > 0:16:08
> > > > > > > > > > > > > > > > Pulsar CI CI - Unit - Brokers - Broker Group 2 
> > > > > > > > > > > > > > > > 2223
> > > > > > > 0:37:03
> > > > > > > > > > > > > > > > Pulsar CI CI - Unit - Brokers - Client Api 1652 
> > > > > > > > > > > > > > > > 0:27:32
> > > > > > > > > > > > > > > > Pulsar CI CI - Unit - Brokers - Client Impl 916 
> > > > > > > > > > > > > > > > 0:15:16
> > > > > > > > > > > > > > > > Pulsar CI CI - Unit - Brokers - Other 522 
> > > > > > > > > > > > > > > > 0:08:42
> > > > > > > > > > > > > > > > Pulsar CI CI - Unit - Proxy 331 0:05:31
> > > > > > > > > > > > > > > > Pulsar CI Build Pulsar docker image 2343 0:39:03
> > > > > > > > > > > > > > > > Pulsar CI CI - Integration - Shade 414 0:06:54
> > > > > > > > > > > > > > > > Pulsar CI CI - Integration - Backwards 
> > > > > > > > > > > > > > > > Compatibility 849
> > > > > > > > > > 0:14:09
> > > > > > > > > > > > > > > > Pulsar CI CI - Integration - Cli 1490 0:24:50
> > > > > > > > > > > > > > > > Pulsar CI CI - Integration - Messaging 857 
> > > > > > > > > > > > > > > > 0:14:17
> > > > > > > > > > > > > > > > Pulsar CI CI - Integration - Schema 468 0:07:48
> > > > > > > > > > > > > > > > Pulsar CI CI - Integration - Standalone 286 
> > > > > > > > > > > > > > > > 0:04:46
> > > > > > > > > > > > > > > > Pulsar CI CI - Integration - Transaction 362 
> > > > > > > > > > > > > > > > 0:06:02
> > > > > > > > > > > > > > > > Pulsar CI CI - System - Function State 699 
> > > > > > > > > > > > > > > > 0:11:39
> > > > > > > > > > > > > > > > Pulsar CI CI - System - Tiered FileSystem 779 
> > > > > > > > > > > > > > > > 0:12:59
> > > > > > > > > > > > > > > > Pulsar CI CI - System - Tiered JCloud 529 
> > > > > > > > > > > > > > > > 0:08:49
> > > > > > > > > > > > > > > > Pulsar CI CI - System - Pulsar Connectors - 
> > > > > > > > > > > > > > > > Thread 1795
> > > > > > > > > 0:29:55
> > > > > > > > > > > > > > > > Pulsar CI CI - System - Pulsar Connectors - 
> > > > > > > > > > > > > > > > Process 2312
> > > > > > > > > > 0:38:32
> > > > > > > > > > > > > > > > Pulsar CI CI - System - Sql 1377 0:22:57
> > > > > > > > > > > > > > > > *Total resource consumption*
> > > > > > > > > > > > > > > > 7:08:57
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > GitHub Actions doesn't support restarting a 
> > > > > > > > > > > > > > > > single job (
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > https://github.community/t/ability-to-rerun-just-a-single-job-in-a-workflow/17234
> > > > > > > > > > > > > > > > ).
> > > > > > > > > > > > > > > > However, this is not a showstopper since there 
> > > > > > > > > > > > > > > > are ways
> > > > > > > to
> > > > > > > > > > > address
> > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > issues that cause flakiness.
> > > > > > > > > > > > > > > > There is a separate PIP for changing the way to 
> > > > > > > > > > > > > > > > handle
> > > > > > > flaky
> > > > > > > > > > > tests.
> > > > > > > > > > > > > You
> > > > > > > > > > > > > > > can
> > > > > > > > > > > > > > > > find the link to that in the "Changes to GitHub 
> > > > > > > > > > > > > > > > Actions
> > > > > > > based
> > > > > > > > > > > > Pulsar
> > > > > > > > > > > > > > CI"
> > > > > > > > > > > > > > > > document's header.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > *Some requests for the Pulsar community:*
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > 1) *Please take a look at the updated PIP 
> > > > > > > > > > > > > > > > document*:
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > https://docs.google.com/document/d/1FNEWD3COdnNGMiryO9qBUW_83qtzAhqjDI5wwmPD-YE/edit#heading=h.f53rkcu20sry
> > > > > > > > > > > > > > > > . *It also contains more details of the 
> > > > > > > > > > > > > > > > prototype that
> > > > > > > has
> > > > > > > > > been
> > > > > > > > > > > > > > > > successfully completed.*
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > 2) *Please share your feedback and suggest a way
> > > > > > > forward.*
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > *Thank you for your help!*
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > BR, Lari
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > On Thu, Jan 28, 2021 at 7:13 PM Lari Hotari <
> > > > > > > > > > > lari.hot...@sagire.fi
> > > > > > > > > > > > >
> > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Dear Pulsar community members,
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Currently, the Pulsar GitHub Actions 
> > > > > > > > > > > > > > > > > workflows are
> > > > > > > > > consuming
> > > > > > > > > > > the
> > > > > > > > > > > > > > > majority
> > > > > > > > > > > > > > > > > of the shared pool of resources allocated for
> > > > > > > > > > > github.com/apache
> > > > > > > > > > > > > > > > projects.
> > > > > > > > > > > > > > > > > Other Apache projects have been impacted and 
> > > > > > > > > > > > > > > > > there is a
> > > > > > > > > > demand
> > > > > > > > > > > to
> > > > > > > > > > > > > > > improve
> > > > > > > > > > > > > > > > > the Pulsar CI
> > > > > > > > > > > > > > > > > <
> > > > > > > > > > > >
> > > > > > > https://github.com/apache/pulsar/pull/9159#issuecomment-766915396
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > asap.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > In GitHub Actions Runners, the unit of 
> > > > > > > > > > > > > > > > > resources is the
> > > > > > > > > time
> > > > > > > > > > > > that a
> > > > > > > > > > > > > > > > Runner
> > > > > > > > > > > > > > > > > is occupied. I observed the workflow runs for 
> > > > > > > > > > > > > > > > > handling
> > > > > > > a
> > > > > > > > > > single
> > > > > > > > > > > > > Pull
> > > > > > > > > > > > > > > > > Request (in my personal fork) and these were 
> > > > > > > > > > > > > > > > > the
> > > > > > > running
> > > > > > > > > > > > durations:
> > > > > > > > > > > > > > > > > Workflow name Duration
> > > > > > > > > > > > > > > > > CI - Build - MacOS 0:17:23
> > > > > > > > > > > > > > > > > CI - Go Functions style check 0:02:38
> > > > > > > > > > > > > > > > > CI - Unit - Brokers - Other 0:15:40
> > > > > > > > > > > > > > > > > CI - Unit - Brokers - Client Impl 0:16:28
> > > > > > > > > > > > > > > > > CI - Misc 0:16:51
> > > > > > > > > > > > > > > > > CI - Unit - Proxy 0:14:23
> > > > > > > > > > > > > > > > > CI - Go Functions Tests 0:22:08
> > > > > > > > > > > > > > > > > CI - CPP, Python Tests 0:23:30
> > > > > > > > > > > > > > > > > CI - Unit 0:42:11
> > > > > > > > > > > > > > > > > CI - Integration - Sql 1:00:13
> > > > > > > > > > > > > > > > > CI - Integration - Tiered JCloud 1:00:18
> > > > > > > > > > > > > > > > > CI - Integration - Tiered FileSystem 1:00:13
> > > > > > > > > > > > > > > > > CI - Integration - Function State 1:00:12
> > > > > > > > > > > > > > > > > CI - Integration - Cli 1:10:22
> > > > > > > > > > > > > > > > > CI - Integration - Transaction 1:16:34
> > > > > > > > > > > > > > > > > CI - Integration - Process 1:11:23
> > > > > > > > > > > > > > > > > CI - Shade - Test 1:15:45
> > > > > > > > > > > > > > > > > CI - Unit - Brokers - Client Api 0:26:13
> > > > > > > > > > > > > > > > > CI - Unit - Brokers - Broker Group 2 0:35:05
> > > > > > > > > > > > > > > > > CI - Integration - Standalone 0:45:29
> > > > > > > > > > > > > > > > > CI - Integration - Messaging 1:00:23
> > > > > > > > > > > > > > > > > CI - Integration - Thread 1:00:19
> > > > > > > > > > > > > > > > > CI - Integration - Backwards Compatibility 
> > > > > > > > > > > > > > > > > 1:00:19
> > > > > > > > > > > > > > > > > CI - Integration - Schema 1:00:19
> > > > > > > > > > > > > > > > > CI - Unit - Brokers - Broker Group 1 2:02:31
> > > > > > > > > > > > > > > > > TOTAL 19:36:50
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > *In this case, the total resource consumption 
> > > > > > > > > > > > > > > > > of GitHub
> > > > > > > > > > Actions
> > > > > > > > > > > > > > Runners
> > > > > > > > > > > > > > > > is
> > > > > > > > > > > > > > > > > 19 hours 36 minutes 50 seconds for a single 
> > > > > > > > > > > > > > > > > pull
> > > > > > > request to
> > > > > > > > > > > > > > > > apache/pulsar.*
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Since GitHub Actions Runner resource pool 
> > > > > > > > > > > > > > > > > utilization
> > > > > > > is
> > > > > > > > > very
> > > > > > > > > > > > high,
> > > > > > > > > > > > > > > this
> > > > > > > > > > > > > > > > > leads to the build queue to grow and take a 
> > > > > > > > > > > > > > > > > long time
> > > > > > > to
> > > > > > > > > > > process.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > I have been looking for ways to improve the 
> > > > > > > > > > > > > > > > > Pulsar CI
> > > > > > > for
> > > > > > > > > the
> > > > > > > > > > > > last
> > > > > > > > > > > > > 3
> > > > > > > > > > > > > > > > > months. During this period I worked on a few
> > > > > > > experiments.
> > > > > > > > > The
> > > > > > > > > > > > > > learnings
> > > > > > > > > > > > > > > > > from the past experiments are documented at a 
> > > > > > > > > > > > > > > > > high
> > > > > > > level in
> > > > > > > > > > the
> > > > > > > > > > > > > > > following
> > > > > > > > > > > > > > > > > draft PIP document.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > *The draft PIP "Changes to GitHub Actions 
> > > > > > > > > > > > > > > > > based Pulsar
> > > > > > > CI"
> > > > > > > > > > > > document
> > > > > > > > > > > > > > is
> > > > > > > > > > > > > > > a
> > > > > > > > > > > > > > > > > Google doc:*
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > https://docs.google.com/document/d/1FNEWD3COdnNGMiryO9qBUW_83qtzAhqjDI5wwmPD-YE/edit?usp=sharing
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > *Please participate* so that we get the plan 
> > > > > > > > > > > > > > > > > adjusted
> > > > > > > based
> > > > > > > > > > on
> > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > feedback asap. If there's already a similar 
> > > > > > > > > > > > > > > > > effort
> > > > > > > > > ongoing, I
> > > > > > > > > > > > hope
> > > > > > > > > > > > > we
> > > > > > > > > > > > > > > can
> > > > > > > > > > > > > > > > > join efforts.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > *Let's fix Pulsar CI!*
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > BR, Lari
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > 
> 

Reply via email to