> In short, IIUC, each contributor should: > 1. Follow https://pulsar.apache.org/contributing/#ci-testing-in-your-fork to > 2. Paste the link of the same PR in contributor’s fork to the PR in Apache > repo > > Then a committer should run `/pulsarbot ready-to-test` after the PR in > contributor's private repo passed all tests, right?
Exactly. One small detail: It should be the PR author's responsibility to follow up and request for a review and an approval after the tests pass. If there are later changes in the PR after the "ready-to-test" label has been added, we could simply let the Pulsar CI handle the builds. -Lari On 2022/09/15 10:11:53 Yunze Xu wrote: > Hi Lari, > > This proposal LGTM. But I have some questions about the details. > > In short, IIUC, each contributor should: > 1. Follow https://pulsar.apache.org/contributing/#ci-testing-in-your-fork to > 2. Paste the link of the same PR in contributor’s fork to the PR in Apache > repo > > Then a committer should run `/pulsarbot ready-to-test` after the PR in > contributor's private repo passed all tests, right? > > Thanks, > Yunze > > > > > > On Sep 15, 2022, at 17:54, Lari Hotari <lhot...@apache.org> wrote: > > > > Thanks for the comment. > > > > The question isn't about trusting PRs. > > The CI resource consumption problem is also caused by current committer > > PRs. That's why it > > is necessary to handle all PRs in the same way. > > The benefit of the proposed solution is that we could decide to run some > > light checks automatically before the step that requires the > > "ready-to-check" label. > > > > After I sent the proposal, I found out that the Pulsar committer > > information including the GitHub > > user names is available in JSON format at > > https://whimsy.apache.org/roster/committee/pulsar.json . > > Pulsarbot can use this information for authorizing users who have access to > > "/pulsarbot ready-to-test". > > > > I agree that we can skip adding a separate reviewer role, let's > > simply use https://whimsy.apache.org/roster/committee/pulsar.json as the > > source of truth > > for authorization. > > > > -Lari > > > > On 2022/09/15 09:22:18 tison wrote: > >> Hi Lari, > >> > >> Thanks for starting this discussion. The overall proposal looks good and > >> it's really great that you can spend some time on such a significant > >> infrastructure. > >> > >> One comment here is that we can start with all "authorized" users to > >> trigger the CI in the committer group instead of introducing a new concept > >> "reviewer" - it will be another topic to discuss and I generally prefer > >> more committership to encourage participation instead of a complicated > >> membership structure. > >> > >> Besides, a quick fixup for reducing traffic is setting "Fork pull request > >> workflows from outside collaborators" option[1] as "Require approval for > >> all outside collaborators". This is provided out-of-the-box by GitHub and > >> requires NO development[2]. Although it doesn't restrict people who are > >> already apache org members but are not Pulsar committers, I believe the > >> trust level is acceptable. An INFRA team member will be asked to perform > >> the settings change if we want this. > >> > >> Best, > >> tison. > >> > >> [1] https://github.com/apache/pulsar/settings/actions > >> [2] > >> https://docs.github.com/en/actions/managing-workflow-runs/approving-workflow-runs-from-public-forks > >> > >> > >> Lari Hotari <lhot...@apache.org> 于2022年9月15日周四 16:36写道: > >> > >>> Hi all, > >>> > >>> The GitHub Actions based Pulsar CI has been experiencing issues for > >>> multiple weeks. The condition is currently better, but the resource > >>> shortage issue remains. CI builds will take a long time to complete even > >>> after many optimizations have been made. > >>> > >>> There's a long email thread with some details about the past issues: > >>> https://lists.apache.org/thread/p7rb04vf1mt0kk3v2r7xl9dvb3zkhtxf > >>> > >>> I have filed an issue to GitHub support about the CI issues over a week > >>> ago, and I finally received an answer a few hours ago. However the > >>> GitHub support person didn't reply to my questions at all, but instead > >>> suggested that there's a beta program where it's possible to pay for > >>> more resources. That solution isn't suitable for our case, since it > >>> doesn't seem to be possible to assign GitHub Actions Runner VM resources > >>> only for a specific Apache project. I'll follow up with GitHub support, > >>> but > >>> I don't expect that to resolve our problems in the near term. We need > >>> to make changes in our CI resource consumption. > >>> > >>> In a the-asf Slack thread [1] about Pulsar CI issues, Martin Grigorov > >>> suggested: "Apache Spark project requires that all PRs are executed in > >>> the contributor's GHA quota. Maybe Pulsar can do the same ?!" > >>> > >>> The Apache Spark contributing guide contains details about this in the > >>> "Pull request" section, https://spark.apache.org/contributing.html . > >>> > >>> "Before creating a pull request in Apache Spark, it is important to > >>> check if tests can pass on your branch because our GitHub Actions > >>> workflows automatically run tests for your pull request/following > >>> commits and every run burdens the limited resources of GitHub Actions in > >>> Apache Spark repository. " > >>> > >>> In Pulsar, we will need to do the same. As a solution to this, Tison > >>> suggested that we would not run all tests for the PR unless there's a > >>> "ready-to-test" label on the PR. > >>> > >>> I think this is a good suggestion. We could extend the existing > >>> "pulsarbot" to help with the automation. > >>> > >>> A reviewer could comment "/pulsarbot ready-to-test" on the PR and > >>> pulsarbot would add the label and also restart the CI workflow to make > >>> it proceed and run the tests. > >>> pulsarbot would check for authorized users. One simple > >>> approach would be to add a file ".pulsarci.yaml" in apache/pulsar > >>> repository with the relevant information: > >>> > >>> committer_github_ids: > >>> - committer1 > >>> - committer2 > >>> ... > >>> > >>> ready_to_test: > >>> authorized_github_ids: > >>> - userid1 > >>> - userid2 > >>> ... > >>> > >>> We would have a script to synchronize all Pulsar committers to this file > >>> peridiotically (manual step after there's a new committer). ASF provides > >>> public json files for project members at > >>> https://whimsy.apache.org/public/public_ldap_projects.json , however the > >>> mapping to github user names seems to be missing. That could be done > >>> with a custom script since ASF LDAP contains the github username. > >>> > >>> All Pulsar committers would have access. In addition, there could be other > >>> users that are authorized for using "/pulsarbot ready-to-test". > >>> > >>> This solution would also require changes in the GitHub Actions workflows > >>> so that the workflow is failed in an early step unless there's a > >>> ready-to-test label for the PR. > >>> > >>> With the above solution, we would be able to cut the amount of > >>> unnecessary builds and get the excessive resource consumption issue > >>> under control. The PR authors would be instructed to run initial PR > >>> builds in their own fork and the reviewer should check that this is done > >>> before approving the PR for testing with "/pulsarbot ready-to-test". > >>> > >>> I would suggest proceeding quickly on this matter without separate PIPs > >>> or votes. We could follow the Apache lazy consensus > >>> (https://community.apache.org/committers/lazyConsensus.html) principle > >>> and make this happen if there aren't objections in the next 72 hours. > >>> The improvement suggestions to this proposal would obviously be taken > >>> into account and if someone objects, we wouldn't have reached lazy > >>> consensus and we wouldn't proceed. > >>> > >>> -Lari > >>> > >>> > >>> 1 - > >>> https://the-asf.slack.com/archives/CBX4TSBQ8/p1661849820238809?thread_ts=1661512133.913279&cid=CBX4TSBQ8 > >>> > >> > >