Lari, I would start a new thread with a 'Heads up'. People may not follow this long thread and may miss this piece of information.
Great work! Thank you very much.

Enrico

On Wed, Mar 30, 2022, 18:32 Lari Hotari <lhot...@apache.org> wrote:

> The refactored Pulsar CI workflow PR https://github.com/apache/pulsar/pull/14819 has been merged. It unblocks CI and makes it work again. Pulsar SQL integration tests are disabled temporarily until https://github.com/apache/pulsar/issues/14951 has been addressed.
>
> Please rebase your PR or close/reopen it to pick up changes for the new Pulsar CI workflow. That is necessary so that PRs can be merged.
>
> -Lari
>
> On 2022/03/30 14:08:17 Lari Hotari wrote:
> > Merging the PR is blocked by https://github.com/apache/pulsar/issues/14951 .
> >
> > Pulsar SQL doesn't work with Java version 11.0.14.1. It fails with this error message:
> > Exception in thread "main" java.lang.IllegalArgumentException: Cannot parse version 11.0.14.1
> >   at io.prestosql.server.JavaVersion.parse(JavaVersion.java:76)
> >   at io.prestosql.server.PrestoSystemRequirements.verifyJavaVersion(PrestoSystemRequirements.java:102)
> >   at io.prestosql.server.PrestoSystemRequirements.verifyJvmRequirements(PrestoSystemRequirements.java:45)
> >   at io.prestosql.server.PrestoServer.run(PrestoServer.java:78)
> >   at io.prestosql.$gen.Presto_332____20220330_100314_1.run(Unknown Source)
> >   at io.prestosql.server.PrestoServer.main(PrestoServer.java:72)
> >
> > I'll apply a workaround to unblock CI.
> >
> > -Lari
> >
> > On 2022/03/30 06:52:38 Lari Hotari wrote:
> > > Thank you for the reviews and feedback. I have started making the switch to the new refactored Pulsar CI.
> > >
> > > Merging new PRs is blocked until the switch is ready. The reason for this is that I have merged https://github.com/apache/pulsar/pull/14939 in preparation for merging https://github.com/apache/pulsar/pull/14819 .
> > >
> > > The GitHub Actions "required checks" change in the refactored Pulsar CI, and there can be only one effective set of "required checks" for a branch.
> > >
> > > After the new Pulsar CI workflow PR has been merged, each in-progress PR has to be closed and immediately reopened to pick up the new workflow, and the PR build has to run through the new workflow. Another way to pick up the new workflow is to rebase the PR (or merge master branch changes into it).
> > >
> > > Please let me know if you experience any issues with the new Pulsar CI workflow. I'll be on the #testing channel on Pulsar Slack too.
> > >
> > > -Lari
> > >
> > > On 2022/03/29 15:43:31 Michael Marshall wrote:
> > > > Great work, Lari! It's great news that GitHub's new feature helps this valuable work move forward. I look forward to seeing your PR merged, and I am happy to help resolve any issues that might pop up.
> > > >
> > > > Thanks,
> > > > Michael
> > > >
> > > > On Tue, Mar 29, 2022 at 7:55 AM Lari Hotari <lhot...@apache.org> wrote:
> > > > > The PR has sufficient reviews, and I'll proceed with merging it today or tomorrow.
> > > > > Please provide feedback now if you want to do that before the PR is merged.
> > > > >
> > > > > Thanks!
> > > > >
> > > > > -Lari
> > > > >
> > > > > On 2022/03/28 20:05:14 Lari Hotari wrote:
> > > > > > The PR https://github.com/apache/pulsar/pull/14819 is now ready for review.
> > > > > >
> > > > > > The changes in the PR now use GitHub Actions Artifacts for sharing binary files (such as docker images) between the build steps. This saves a lot of GitHub Actions VM resources since the docker images are built once and shared in downstream jobs.
> > > > > > GitHub Actions Artifacts are meant to be used for sharing data between the jobs in a GitHub Actions workflow [1].
> > > > > >
> > > > > > I'm looking forward to your review and feedback on https://github.com/apache/pulsar/pull/14819 .
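A minimal sketch of the artifact-sharing pattern described in the 2022/03/28 message: one job builds a docker image, saves it to a tarball, and uploads it as a workflow artifact; a downstream job downloads and loads it instead of rebuilding. The workflow name, job names, image tag, and artifact name are hypothetical, and this is not the actual workflow from the PR.

```yaml
# Hypothetical workflow sketch; names and commands are illustrative only.
name: ci-artifact-sharing-sketch
on: pull_request

jobs:
  build-image:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # Build the image once (image name and Dockerfile location are assumptions).
      - run: docker build -t example/test-image:ci .
      # Serialize the image to a tarball so it can be stored as an artifact.
      - run: docker save example/test-image:ci | gzip > test-image.tar.gz
      - uses: actions/upload-artifact@v4
        with:
          name: test-image
          path: test-image.tar.gz
          retention-days: 1

  integration-tests:
    needs: build-image        # runs only after the image has been built
    runs-on: ubuntu-latest
    steps:
      - uses: actions/download-artifact@v4
        with:
          name: test-image
      # Load the prebuilt image instead of rebuilding it in this job.
      - run: gunzip -c test-image.tar.gz | docker load
      # Confirm the image is present; real test steps would follow here.
      - run: docker image inspect example/test-image:ci
```

Because the downstream job declares `needs: build-image`, the image is built once per workflow run and every dependent job reuses the same tarball.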
> > > > > >
> > > > > > BR,
> > > > > >
> > > > > > -Lari
> > > > > >
> > > > > > References:
> > > > > > [1] GitHub Actions: Storing workflow data as artifacts - https://docs.github.com/en/actions/using-workflows/storing-workflow-data-as-artifacts
> > > > > >
> > > > > > On 2022/03/23 10:46:10 Lari Hotari wrote:
> > > > > > > I have submitted the PR for refactoring the apache/pulsar GitHub Actions based CI. Please review https://github.com/apache/pulsar/pull/14819 .
> > > > > > >
> > > > > > > BR,
> > > > > > > -Lari
> > > > > > >
> > > > > > > On 2022/03/22 13:38:36 Enrico Olivelli wrote:
> > > > > > > > Lari,
> > > > > > > >
> > > > > > > > On Tue, Mar 22, 2022, 14:32 Lari Hotari <lhot...@apache.org> wrote:
> > > > > > > >
> > > > > > > > > I have resumed work to improve our GitHub Actions based Pulsar CI.
> > > > > > > > >
> > > > > > > > > Last year, I worked on a proof-of-concept which significantly reduced the resource consumption and improved the usability of the build by combining multiple workflows into a single larger workflow.
> > > > > > > > >
> > > > > > > > > The showstopper a year ago was the inability to re-run a single failed job in a larger workflow.
> > > > > > > > > GitHub has since delivered this feature, and no showstoppers remain.
> > > > > > > > >
> > > > > > > > > I have been posting updates on the progress to https://github.com/apache/pulsar/issues/14401 "Speed up CI workflows".
> > > > > > > > > I have rebased the changes from last year's PoC, and I'm finalizing and testing the changes in my fork under https://github.com/lhotari/pulsar/pull/59 . I'll send a PR to apache/pulsar when the refactoring is ready.
> > > > > > > >
> > > > > > > > This is great news!
> > > > > > > >
> > > > > > > > Looking forward to your patch
> > > > > > > >
> > > > > > > > Enrico
> > > > > > > >
> > > > > > > > > -Lari
> > > > > > > > >
> > > > > > > > > On 2021/03/16 01:10:52 Sijie Guo wrote:
> > > > > > > > > > > The prototype has demonstrated about 60% reduction in resource consumption.
> > > > > > > > > >
> > > > > > > > > > It is hard to quantify. Merging them into one large workflow can result in more failures. Re-running those failures can consume resources as well.
> > > > > > > > > >
> > > > > > > > > > > Isn't it urgent to resolve it?
> > > > > > > > > >
> > > > > > > > > > I think we are in a stage that gives us breathing room to fix flaky tests and solve other problems, no?
> > > > > > > > > > I don't mean we stop the effort here. I mean we have other enhancements that we can do to improve the situation.
> > > > > > > > > > Once we get into a position where the flakiness is reduced, we can merge them into one workflow.
> > > > > > > > > >
> > > > > > > > > > - Sijie
> > > > > > > > > >
> > > > > > > > > > On Mon, Mar 15, 2021 at 2:48 AM Lari Hotari <l...@hotari.net> wrote:
> > > > > > > > > >
> > > > > > > > > > > Thanks for the feedback Sijie.
> > > > > > > > > > >
> > > > > > > > > > > > We are using a lazy consensus approach. Typically if there is no objection, please go ahead; there is no need to wait for approval.
> > > > > > > > > > > > If people raise concerns, please address the concerns.
> > > > > > > > > > >
> > > > > > > > > > > You and Ali have raised concerns about changing the existing GitHub Actions workflows in a way where multiple workflows would be combined into a single workflow.
> > > > > > > > > > > Before proceeding, there is a need to address the concerns. We might end up with a completely different type of solution from what has been proposed initially. :)
> > > > > > > > > > >
> > > > > > > > > > > > Yes. So I am in favor of addressing flaky tests rather than merging all workflows into one giant workflow.
> > > > > > > > > > >
> > > > > > > > > > > I agree that addressing flaky tests is favorable. The main reasons for the PIP "Changes to GitHub Actions based Pulsar CI" are to
> > > > > > > > > > > 1) Reduce GitHub Action Runner resource consumption of Pulsar PR builds
> > > > > > > > > > > 2) Reduce lead times for Pull Request feedback
> > > > > > > > > > > We cannot ignore these problems. If we don't change anything, the problems won't get fixed. The prototype has demonstrated about 60% reduction in resource consumption. Measuring the lead times hasn't been done in the prototype, but since the reason for long lead times has been long build queues due to excessive resource consumption, it's likely that the lead times would be reduced.
> > > > > > > > > > >
> > > > > > > > > > > I know that switching to a single workflow isn't the only solution to the above problems. I had a discussion with Ali.
> > > > > > > > > > > He recommended reducing the modules in the Pulsar repository (PIP-62), reducing the docker container size, and improving the Pulsar Broker unit test harness so that tests using it would be less flaky and so that failing tests would be easier to fix, because there would be better information about the state problem that caused the test to fail.
> > > > > > > > > > >
> > > > > > > > > > > As mentioned in the earlier email about the optimizations in the Pulsar CI refactoring prototype, the main benefits come from reusing binary artifacts from previous build stages so that each job doesn't have to build everything from scratch. This becomes irrelevant when the build is very fast and there isn't a benefit from reusing artifacts.
> > > > > > > > > > > This means that it's possible to resolve the resource consumption problem of Pulsar PR builds in the way that Ali is recommending, without switching from multiple workflows to a single workflow that can reuse binary artifacts in the build.
> > > > > > > > > > >
> > > > > > > > > > > > Hence I am +1 to "changes to flaky test handling" and suggest focusing more on solving flaky tests.
> > > > > > > > > > > > Consider merging them into one workflow when the tests are in a better situation.
> > > > > > > > > > >
> > > > > > > > > > > Makes sense for minimizing the risk of change, but we cannot just wait for things to fix themselves.
> > > > > > > > > > > How long will other Apache projects tolerate the resource consumption issues Pulsar is causing in the shared GitHub Actions Runner VM quota? For example, https://github.com/apache/pulsar/pull/9159#issuecomment-766915396 .
> > > > > > > > > > > Isn't it urgent to resolve it?
> > > > > > > > > > >
> > > > > > > > > > > I'll revisit the plan for PIP "Changes to GitHub Actions based Pulsar CI" based on the community feedback in the upcoming days. That might mean that the current solution is pivoted. The goal is to solve the problems of high resource consumption and long lead time for PR builds in Pulsar CI. Please continue to provide feedback so that we get a revisited plan together that addresses these problems. Thank you!
> > > > > > > > > > >
> > > > > > > > > > > BR,
> > > > > > > > > > > -Lari
> > > > > > > > > > >
> > > > > > > > > > > On Fri, Mar 12, 2021 at 11:06 PM Sijie Guo <guosi...@gmail.com> wrote:
> > > > > > > > > > >
> > > > > > > > > > > > > *Sijie, how far are we from getting the draft PIP "Changes to GitHub Actions based Pulsar CI" into an actual PIP that gets put on the wiki page https://github.com/apache/pulsar/wiki ?*
> > > > > > > > > > > >
> > > > > > > > > > > > I see what you referred to before now. This can be easily done. I (or any other committer) can do it for you.
> > > > > > > > > > > >
> > > > > > > > > > > > There is no real blocker for you to continue work even if there are concerns or it is not listed in the PIP.
> > > > > > > > > > > > We are using a lazy consensus approach. Typically if there is no objection, please go ahead; there is no need to wait for approval.
> > > > > > > > > > > > If people raise concerns, please address the concerns.
> > > > > > > > > > > >
> > > > > > > > > > > > > The reason why re-runs happen currently is because of the high flakiness of tests.
> > > > > > > > > > > >
> > > > > > > > > > > > Yes. So I am in favor of addressing flaky tests rather than merging all workflows into one giant workflow.
> > > > > > > > > > > > It is not about "No pain, no gain". The community has suffered a lot with a giant workflow before.
> > > > > > > > > > > > There were a lot of committers and contributors working hard to split one giant workflow into the multiple current workflows. Unless there is really strong evidence that merging them back into one will improve the entire CI experience, I will still have concerns about the one giant workflow approach.
> > > > > > > > > > > >
> > > > > > > > > > > > Hence I am +1 to "changes to flaky test handling" and suggest focusing more on solving flaky tests.
> > > > > > > > > > > > Consider merging them into one workflow when the tests are in a better situation.
> > > > > > > > > > > >
> > > > > > > > > > > > > This solution would also require disabling required status checks
> > > > > > > > > > > >
> > > > > > > > > > > > I don't think it is a good idea to disable status checks. We can consider running "dark mode" but it will just overload the action quota.
> > > > > > > > > > > >
> > > > > > > > > > > > Another alternative is to mirror the pull requests into another GitHub account to test that and get more concrete statistics on the flakiness rate of the one-workflow approach.
> > > > > > > > > > > >
> > > > > > > > > > > > On Fri, Mar 12, 2021 at 1:57 AM Lari Hotari <l...@hotari.net> wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > Thanks for the feedback, Sijie.
> > > > > > > > > > > > >
> > > > > > > > > > > > > > The "Fail fast" approach is great. That would be super helpful if there are multiple workflows and each workflow is retryable.
> > > > > > > > > > > > > > However, I am not sure how much it will help if you run all workflows in one giant workflow. Or is it making things worse?
> > > > > > > > > > > > >
> > > > > > > > > > > > > We can reduce the need for re-running workflow runs. The reason why re-runs happen currently is because of high flakiness of tests.
> > > > > > > > > > > > > Addressing flakiness continues to be top-priority.
> > > > > > > > > > > > > Now that the Pulsar CI workflow prototype is finished, I'll be focusing more on the other draft PIP, "Changes to flaky test handling".
> > > > > > > > > > > > > We as a community should address the critical problem that the current retry solution has: it can mask bugs in production code and make the build pass and allow changes to be merged that cause regressions.
> > > > > > > > > > > > > The green builds after all the retries give us a false sense of security. Bringing Pulsar to the next level in stability requires addressing this.
> > > > > > > > > > > > >
> > > > > > > > > > > > > If something doesn't work, it can be adapted and improved. Changes can be rolled back and revisited when things get worse. We also need a leap of faith.
> > > > > > > > > > > > > "No pain, no gain": like any change, it will be painful at first, but we will get over the bump.
> > > > > > > > > > > > >
> > > > > > > > > > > > > > Secondly, your test has been done in your fork where there are not a lot of concurrent pushes and pull requests. I am not sure how your approach will behave once it is merged into master.
> > > > > > > > > > > > > > Can you simulate multiple concurrent pull requests in your account to prove your approach doesn't bring side effects?
> > > > > > > > > > > > >
> > > > > > > > > > > > > One possibility to address this is to introduce the new workflow in a mode where you need to opt in to the new workflow in some way.
> > > > > > > > > > > > > This was an idea brought up by my colleagues Enrico and Andrey.
> > > > > > > > > > > > > It might be possible to configure the existing workflow and this new workflow in a way where some condition (for example a whitelisted GitHub user name or a certain keyword in the PR title/description) chooses either one for the pull request. This solution would also require disabling required status checks ("Require status checks to pass before merging" feature in GitHub branch protection rules) since the names of the checks would be different.
> > > > > > > > > > > > >
> > > > > > > > > > > > > > Lastly, can we apply those optimizations to current workflows without merging them into one giant workflow?
> > > > > > > > > > > > >
> > > > > > > > > > > > > This is what I have been doing. All individual optimizations have already been sent as PRs in the last months. I guess there are 20-30 PRs that have already been merged.
> > > > > > > > > > > > > https://github.com/apache/pulsar/pulls?q=is%3Apr+author%3Alhotari+is%3Amerged
> > > > > > > > > > > > > There are also 2 build related PRs from yesterday which haven't been merged yet:
> > > > > > > > > > > > > Fix Maven download issues (ported from the prototype to our existing Pulsar CI): https://github.com/apache/pulsar/pull/9883
> > > > > > > > > > > > > Improve Maven module build order (required for more efficient builds that selectively build required artifacts): https://github.com/apache/pulsar/pull/9882
> > > > > > > > > > > > >
> > > > > > > > > > > > > There aren't many optimizations left that could be ported from the prototype to the existing workflow. There are a few, but the impact is minor.
> > > > > > > > > > > > > The reason for this is that the optimizations with the greatest impact are the ones that build binary artifacts (maven libs, docker images) once and share them with the downstream jobs in the pipeline.
> > > > > > > > > > > > > Applying this type of solution has certain challenges when there are multiple separate workflows. Sharing binary artifacts with other workflows would require that the workflow reusing the artifacts gets triggered by the workflow that produced the artifacts.
> > > > > > > > > > > > > This wouldn't be secure or practical for handling pull requests. Triggering a workflow explicitly would require a token from the main repository, and using that for pull request builds would be a serious security vulnerability. (More details about the GitHub Actions security model in https://securitylab.github.com/research/github-actions-preventing-pwn-requests )
> > > > > > > > > > > > >
> > > > > > > > > > > > > *Sijie, how far are we from getting the draft PIP "Changes to GitHub Actions based Pulsar CI" into an actual PIP that gets put on the wiki page https://github.com/apache/pulsar/wiki ?*
> > > > > > > > > > > > >
> > > > > > > > > > > > > -Lari
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Fri, Mar 12, 2021 at 10:46 AM Sijie Guo <guosi...@gmail.com> wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > > This is good progress. However, my main concern is still merging all workflows into one giant workflow.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > The "Fail fast" approach is great. That would be super helpful if there are multiple workflows and each workflow is retryable.
> > > > > > > > > > > > > > However, I am not sure how much it will help if you run all workflows in one giant workflow. Or is it making things worse?
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Secondly, your test has been done in your fork where there are not a lot of concurrent pushes and pull requests. I am not sure how your approach will behave once it is merged into master. Can you simulate multiple concurrent pull requests in your account to prove your approach doesn't bring side effects?
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Lastly, can we apply those optimizations to current workflows without merging them into one giant workflow?
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > - Sijie
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On Fri, Mar 12, 2021 at 12:30 AM Lari Hotari <l...@hotari.net> wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Thanks for the feedback Michael.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > I left a question on the doc about how concurrent runs affect the repository's 5 GB cache limit.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > This is a good question. There isn't a clear answer in the GitHub Actions Cache documentation.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > The documentation is https://docs.github.com/en/actions/guides/caching-dependencies-to-speed-up-workflows .
> > > > > > > > > > > > > > > Based on this document and some testing, I have drawn these conclusions:
> > > > > > > > > > > > > > > For GitHub Actions Cache, pull requests get executed in the context of the forked repository.
> > > > > > > > > > > > > > > The workflow triggered by a pull request event can only update its own cache. It has read-only access to upstream caches.
> > > > > > > > > > > > > > > If there's a cache miss, the entry will get written to the cache of the forked repository. If the PR could write to the upstream cache, it would be a security issue since this would be vulnerable to cache poisoning attacks. Each repository has a 5GB quota for writes. The entries are kept up to 7 days.
> > > > > > > > > > > > > > > The performance is fairly good. Loading docker images from the repository happens at about 15MB/s. Writing is 2-3x slower, about 5-7MB/s.
> > > > > > > > > > > > > > > (The performance of the GHA cache is most likely higher since this is the throughput for docker load / docker save.)
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > If a single repository has a lot of concurrent jobs, it could start evicting caches.
> > > > > > > > > > > > > > > However, that isn't likely to happen with the way Pulsar is developed since pull requests are created from personal forks.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > I also think it could be helpful to explicitly document, or reference GitHub documentation, on how failure will affect the DAG. I'm assuming that if an action fails, its parallel peer actions will run to completion, and that the rest of the remaining stages will get canceled, but I haven't worked with GitHub Actions before.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > For matrix jobs, "fail fast" is the default, which cancels all jobs in the matrix if one fails. Other parallel flows would run to completion by default.
> > > > > > > > > > > > > > > In the prototype, I have added a GitHub script step to each job to cancel the complete workflow when a failure occurs.
> > > > > > > > > > > > > > > Here's an example:
> > > > > > > > > > > > > > > https://github.com/lhotari/pulsar/blob/lh-refactor-pulsar-ci-with-retries/.github/workflows/pulsar-ci.yaml#L281-L289
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > The prototype follows a "fail fast" design. When a failure occurs, fail fast and don't continue with other jobs.
> > > > > > > > > > > > > > > The benefit of this is that it reduces resource consumption. This helps keep the build queue short.
> > > > > > > > > > > > > > > When the build queue is short, developers get quick feedback from CI.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Documenting all details in the PIP document isn't practical.
> > > > > > > > > > > > > > > *I'm hoping to start a separate document on low level details when there is a high level acceptance of the proposed "Changes to GitHub Actions based Pulsar CI".*
> > > > > > > > > > > > > > > Together we can make this happen. We need decisions too. This proposal cannot stay as a draft forever.
> > > > > > > > > > > > > > > I'm looking forward to hearing from the Pulsar community, Pulsar committers and Pulsar PMC members how to take this forward.
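The fail-fast behaviour described in this message can be sketched as follows, assuming current GitHub Actions syntax: `strategy.fail-fast` (true by default) cancels the remaining jobs of a matrix, and a final step guarded by `if: failure()` cancels the whole run through the REST API. The workflow name, job name, matrix values, and test script are hypothetical; the prototype's actual step is the one at the URL linked in the message and is not reproduced here.

```yaml
# Hypothetical sketch; not the prototype's actual workflow.
name: ci-fail-fast-sketch
on: pull_request

permissions:
  actions: write              # needed to cancel the workflow run
  contents: read

jobs:
  unit-tests:
    runs-on: ubuntu-latest
    strategy:
      fail-fast: true         # default: one failed matrix job cancels its siblings
      matrix:
        group: [group-a, group-b]
    steps:
      - uses: actions/checkout@v4
      # Placeholder test command; the script name is an assumption.
      - run: ./run-tests.sh "${{ matrix.group }}"
      # On failure, cancel the entire run so unrelated jobs stop consuming runners.
      - if: failure()
        uses: actions/github-script@v7
        with:
          script: |
            await github.rest.actions.cancelWorkflowRun({
              owner: context.repo.owner,
              repo: context.repo.repo,
              run_id: context.runId,
            });
```

One caveat to keep in mind: for pull requests opened from forks, the workflow token is typically read-only, so the cancel call may be rejected there; the sketch is an illustration of the design rather than a drop-in step.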
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > BR,
> > > > > > > > > > > > > > > -Lari
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > On Fri, Mar 12, 2021 at 8:06 AM Michael Marshall <mikemars...@gmail.com> wrote:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > This will be a great improvement. I read through the PIP, and overall, it looks good to me.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > I left a question on the doc about how concurrent runs affect the repository's 5 GB cache limit.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > I also think it could be helpful to explicitly document, or reference GitHub documentation, on how failure will affect the DAG. I'm assuming that if an action fails, its parallel peer actions will run to completion, and that the rest of the remaining stages will get canceled, but I haven't worked with GitHub Actions before.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Thanks for all of the work you've put in so far.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > On Thu, Mar 11, 2021 at 6:37 PM Yuva raj <uvar...@gmail.com> wrote:
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > This is great news.
Thanks Hari , Mateo and pulsar community

On Fri, Mar 12, 2021, 2:04 AM Lari Hotari <lari.hot...@sagire.fi> wrote:

Dear Pulsar community members,

The work on "Changes to GitHub Actions based Pulsar CI" has gone forward based on your feedback. Here are some updates about the work.

The draft PIP proposal document is here:
https://docs.google.com/document/d/1FNEWD3COdnNGMiryO9qBUW_83qtzAhqjDI5wwmPD-YE/edit#heading=h.f53rkcu20sry
There's a *detailed status update in the document about a prototype for the refactored Pulsar CI GitHub Actions based workflow*.

Thanks for all the suggestions and feedback by now. A lot of improvements have been made by the Pulsar contributors to overcome the technical obstacles.
Special thanks go to Matteo for reducing the sizes of docker images. A lot of small improvements have been made to the Pulsar maven build to enable the new refactored GitHub Actions workflow. Thank you for all PR reviews and feedback.

The main goal of the "Changes to GitHub Actions based Pulsar CI" work has been to *reduce the resource consumption of the Pulsar CI build and to speed up Pulsar development by improving the developer productivity* when less time is wasted in waiting for Pulsar CI build feedback. The prototype demonstrates these improvements.

As you can see from the email from Jan 28 below, *the resource consumption was 19 hrs 36 minutes* for a single pull request that was observed when the work began.
Now, with the prototype of the refactored Pulsar CI build, the resource consumption is *7 hrs 9 minutes.*
*This is about 60% reduction in resource consumption.* The whole pipeline completes in 75-100 minutes.

Here's a breakdown of the duration (resource consumption) of each build job in the refactored workflow:

Workflow   Job                                         seconds  h:mm:ss
Pulsar CI  Changed files check                               4  0:00:04
Pulsar CI  Go 1.11 Functions                               155  0:02:35
Pulsar CI  Go 1.12 Functions                               166  0:02:46
Pulsar CI  Go 1.13 Functions                               113  0:01:53
Pulsar CI  Go 1.14 Functions                                96  0:01:36
Pulsar CI  Build on MacOS                                 1017  0:16:57
Pulsar CI  Build and License check                         346  0:05:46
Pulsar CI  Build Pulsar CPP and Python clients             683  0:11:23
Pulsar CI  Build Pulsar java-test-image docker image       405  0:06:45
Pulsar CI  CI - Unit - Other                              1580  0:26:20
Pulsar CI  CI - Unit - Brokers - Broker Group 1            968  0:16:08
Pulsar CI  CI - Unit - Brokers - Broker Group 2           2223  0:37:03
Pulsar CI  CI - Unit - Brokers - Client Api               1652  0:27:32
Pulsar CI  CI - Unit - Brokers - Client Impl               916  0:15:16
Pulsar CI  CI - Unit - Brokers - Other                     522  0:08:42
Pulsar CI  CI - Unit - Proxy                               331  0:05:31
Pulsar CI  Build Pulsar docker image                      2343  0:39:03
Pulsar CI  CI - Integration - Shade                        414  0:06:54
Pulsar CI  CI - Integration - Backwards Compatibility      849  0:14:09
Pulsar CI  CI - Integration - Cli                         1490  0:24:50
Pulsar CI  CI - Integration - Messaging                    857  0:14:17
Pulsar CI  CI - Integration - Schema                       468  0:07:48
Pulsar CI  CI - Integration - Standalone                   286  0:04:46
Pulsar CI  CI - Integration - Transaction                  362  0:06:02
Pulsar CI  CI - System - Function State                    699  0:11:39
Pulsar CI  CI - System - Tiered FileSystem                 779  0:12:59
Pulsar CI  CI - System - Tiered JCloud                     529  0:08:49
Pulsar CI  CI - System - Pulsar Connectors - Thread       1795  0:29:55
Pulsar CI  CI - System - Pulsar Connectors - Process      2312  0:38:32
Pulsar CI  CI - System - Sql                              1377  0:22:57
*Total resource consumption*                                    7:08:57

GitHub Actions doesn't support restarting a single job ( https://github.community/t/ability-to-rerun-just-a-single-job-in-a-workflow/17234 ).
However, this is not a showstopper since there are ways to address the issues that cause flakiness.
There is a separate PIP for changing the way to handle flaky tests. You can find the link to that in the "Changes to GitHub Actions based Pulsar CI" document's header.

*Some requests for the Pulsar community:*

1) *Please take a look at the updated PIP document*:
https://docs.google.com/document/d/1FNEWD3COdnNGMiryO9qBUW_83qtzAhqjDI5wwmPD-YE/edit#heading=h.f53rkcu20sry .
*It also contains more details of the prototype that has been successfully completed.*

2) *Please share your feedback and suggest a way forward.*

*Thank you for your help!*

BR, Lari

On Thu, Jan 28, 2021 at 7:13 PM Lari Hotari <lari.hot...@sagire.fi> wrote:

Dear Pulsar community members,

Currently, the Pulsar GitHub Actions workflows are consuming the majority of the shared pool of resources allocated for github.com/apache projects. Other Apache projects have been impacted and there is a demand to improve the Pulsar CI <https://github.com/apache/pulsar/pull/9159#issuecomment-766915396> asap.

In GitHub Actions Runners, the unit of resources is the time that a Runner is occupied. I observed the workflow runs for handling a single Pull Request (in my personal fork) and these were the running durations:

Workflow name                               Duration
CI - Build - MacOS                           0:17:23
CI - Go Functions style check                0:02:38
CI - Unit - Brokers - Other                  0:15:40
CI - Unit - Brokers - Client Impl            0:16:28
CI - Misc                                    0:16:51
CI - Unit - Proxy                            0:14:23
CI - Go Functions Tests                      0:22:08
CI - CPP, Python Tests                       0:23:30
CI - Unit                                    0:42:11
CI - Integration - Sql                       1:00:13
CI - Integration - Tiered JCloud             1:00:18
CI - Integration - Tiered FileSystem         1:00:13
CI - Integration - Function State            1:00:12
CI - Integration - Cli                       1:10:22
CI - Integration - Transaction               1:16:34
CI - Integration - Process                   1:11:23
CI - Shade - Test                            1:15:45
CI - Unit - Brokers - Client Api             0:26:13
CI - Unit - Brokers - Broker Group 2         0:35:05
CI - Integration - Standalone                0:45:29
CI - Integration - Messaging                 1:00:23
CI - Integration - Thread                    1:00:19
CI - Integration - Backwards Compatibility   1:00:19
CI - Integration - Schema                    1:00:19
CI - Unit - Brokers - Broker Group 1         2:02:31
TOTAL                                       19:36:50

*In this case, the total resource consumption of GitHub Actions Runners is 19 hours 36 minutes 50 seconds for a single pull request to apache/pulsar.*

Since GitHub Actions Runner resource pool utilization is very high, this leads to the build queue to grow and take a long time to process.

I have been looking for ways to improve the Pulsar CI for the last 3 months. During this period I worked on a few experiments.
The learnings from the past experiments are documented at a high level in the following draft PIP document.

*The draft PIP "Changes to GitHub Actions based Pulsar CI" document is a Google doc:*
https://docs.google.com/document/d/1FNEWD3COdnNGMiryO9qBUW_83qtzAhqjDI5wwmPD-YE/edit?usp=sharing

*Please participate* so that we get the plan adjusted based on the feedback asap. If there's already a similar effort ongoing, I hope we can join efforts.

*Let's fix Pulsar CI!*

BR, Lari
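
The two totals quoted in this thread (19:36:50 for the original workflows and 7:08:57 for the refactored prototype) can be cross-checked with a few lines of Python. This is only a sketch of the arithmetic behind the "about 60% reduction" figure; the helper name `to_seconds` is made up for illustration.

```python
def to_seconds(hms: str) -> int:
    """Convert an 'h:mm:ss' duration string to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


# Totals as reported in the Jan 28 and Mar 11 emails.
before = to_seconds("19:36:50")
after = to_seconds("7:08:57")

# Share of runner time that is no longer consumed per pull request.
reduction_pct = 100 * (1 - after / before)
print(f"before: {before} s, after: {after} s, reduction: {reduction_pct:.1f}%")
```

With these totals the reduction comes out at roughly 63.6%, which is consistent with the "about 60%" stated in the status update.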