Thanks for driving this, Matthias! +1 for joining the INFRA trial.

> Apache Infra did some experimenting on self-hosted runners in collaboration
> with Apache Airflow (see ashb/runner with releases/pr-security-options branch)
> where they only allow certain groups of users (e.g. committers) to run their
> workflows on self-hosted machines. Any other group would have to rely on
> GitHub’s runners.

This is mentioned in the "Limitations of GitHub Actions in the past" section of the FLIP. Does this also apply to the Apache INFRA setup, or can we expect contributors' runs to be executed there too?

As you mentioned in the FLIP, there are some timeout-related test discrepancies between different setups. Similar discrepancies could manifest themselves between the GitHub runners and the Apache INFRA runners. It would be great if we could have a uniform setup, where tests that pass in the individual CI also pass in the main runner and vice versa. Currently we have such memory-limit-related issues in individual vs. main Azure CI pipelines.

> 2. Disable Flink’s CI bot for PRs if step #1 is considered successful
> 3. Join trial program for ephemeral GHA runners

Due to potential new kinds of instabilities manifesting themselves in the new setup, can we keep both CIs running in parallel and keep relying on the existing one until we are confident in the test stability on the new ephemeral GHA infra (skip 2.)?

Best,
Alex

On Wed, 29 Nov 2023 at 13:42, Xintong Song <tonysong...@gmail.com> wrote:

> Thanks for the efforts, Matthias.
>
> I think it would be helpful if we can at the end migrate the CI to an
> ASF-managed GitHub Action, as long as it provides us a similar computation
> capacity and stability. Given that the proposal is only to start a trial
> and investigate whether the migration is feasible, I don't see much concern
> in this.
>
> I have only one suggestion and one question.
>
> - Regarding the migration plan, I wonder if we should not disable the CI
> bot until we fully decide to migrate to GitHub Actions?
> In case the nightly runs don't really work well, it might be debatable
> whether we should maintain the CI in two places (i.e. PRs on GitHub Actions
> and cron builds on Azure).
>
> - What exactly are the changes that would affect contributors during the
> trial period? Is it only an additional CI report that you can potentially
> just ignore? Or would there be some larger impacts, e.g. you cannot merge a
> PR if the GitHub Action CI is not passed (I don't know, I just made this
> up)?
>
> Best,
>
> Xintong
>
> On Wed, Nov 29, 2023 at 8:07 PM Yuxin Tan <tanyuxinw...@gmail.com> wrote:
>
> > OK, thanks for the update and the explanations.
> >
> > Best,
> > Yuxin
> >
> > On Wed, Nov 29, 2023 at 15:43, Matthias Pohl <matthias.p...@aiven.io.invalid> wrote:
> >
> > > > According to the FLIP, the new tests will support arm env.
> > > > I believe that's good news for arm users. I have a minor
> > > > question here. Will it be a blocker before migrating the new
> > > > tests? If not, when can we expect arm environment
> > > > support to be implemented? Thanks.
> > >
> > > Thanks for your feedback, everyone.
> > >
> > > About the ARM support. I want to underline that this FLIP is not about
> > > migrating to GitHub Actions but to start a trial run in the Apache Flink
> > > repository. That would allow us to come up with a proper decision whether
> > > GitHub Actions is what we want. I admit that the title is a bit
> > > "clickbaity". I updated the FLIP's title and its Motivation to make things
> > > clear.
> > >
> > > The FLIP suggests starting a trial period until 1.19 is released to try
> > > things out. A proper decision on whether we want to migrate would be made
> > > at the end of the 1.19 release cycle.
> > >
> > > About the ARM support: This related content of the FLIP is entirely based
> > > on documentation from Apache INFRA's side. INFRA seems to offer this ARM
> > > support for their ephemeral runners.
> > > The ephemeral runners are in the testing stage, i.e. these runners are
> > > still experimental. Apache INFRA asks Apache projects to join this test.
> > >
> > > Whether the ARM support is actually possible to achieve within Flink is
> > > something we have to figure out as part of the trial run. One conclusion of
> > > the trial run could be that we still move ahead with GHA but don't use arm
> > > machines due to some blocking issues.
> > >
> > > Matthias
> > >
> > > On Wed, Nov 29, 2023 at 4:46 AM Yuxin Tan <tanyuxinw...@gmail.com> wrote:
> > >
> > > > Hi, Matthias,
> > > >
> > > > Thanks for driving this.
> > > > +1 from my side.
> > > >
> > > > According to the FLIP, the new tests will support arm env.
> > > > I believe that's good news for arm users. I have a minor
> > > > question here. Will it be a blocker before migrating the new
> > > > tests? If not, when can we expect arm environment
> > > > support to be implemented? Thanks.
> > > >
> > > > Best,
> > > > Yuxin
> > > >
> > > > On Wed, Nov 29, 2023 at 03:09, Márton Balassi <balassi.mar...@gmail.com> wrote:
> > > >
> > > > > Thanks, Matthias. Big +1 from me.
> > > > >
> > > > > On Tue, Nov 28, 2023 at 5:30 PM Matthias Pohl
> > > > > <matthias.p...@aiven.io.invalid> wrote:
> > > > >
> > > > > > Thanks for the pointer. I'm planning to join that meeting.
> > > > > >
> > > > > > On Tue, Nov 28, 2023 at 4:16 PM Etienne Chauchot <echauc...@apache.org>
> > > > > > wrote:
> > > > > >
> > > > > > > Hi all,
> > > > > > >
> > > > > > > FYI there is the ASF infra roundtable soon. One of the subjects for this
> > > > > > > session is GitHub Actions. It could be worth passing by:
> > > > > > >
> > > > > > > December 6th, 2023 at 1700 UTC on the #Roundtable channel on Slack.
> > > > > > >
> > > > > > > For information about the roundtables, and about how to join,
> > > > > > > see: https://infra.apache.org/roundtable.html
> > > > > > >
> > > > > > > Best
> > > > > > >
> > > > > > > Etienne
> > > > > > >
> > > > > > > On 24/11/2023 at 14:16, Maximilian Michels wrote:
> > > > > > > > Thanks for reviving the efforts here Matthias! +1 for the transition
> > > > > > > > to GitHub Actions.
> > > > > > > >
> > > > > > > > As for ASF Infra Jenkins, it works fine. Jenkins is extremely
> > > > > > > > feature-rich. Not sure about the spare capacity though. I know that
> > > > > > > > for Apache Beam, Google donated a bunch of servers to get additional
> > > > > > > > build capacity.
> > > > > > > >
> > > > > > > > -Max
> > > > > > > >
> > > > > > > > On Thu, Nov 23, 2023 at 10:30 AM Matthias Pohl
> > > > > > > > <matthias.p...@aiven.io.invalid> wrote:
> > > > > > > >> Btw. even though we've been focusing on GitHub Actions with this FLIP, I'm
> > > > > > > >> curious whether somebody has experience with Apache Infra's Jenkins
> > > > > > > >> deployment. The discussion I found about Jenkins [1] is quite outdated
> > > > > > > >> (2014). I haven't worked with it myself but could imagine that there are
> > > > > > > >> some features provided through plugins which are missing in GitHub Actions.
> > > > > > > >>
> > > > > > > >> [1] https://lists.apache.org/thread/vs81xdhn3q777r7x9k7wd4dyl9kvoqn4
> > > > > > > >>
> > > > > > > >> On Tue, Nov 21, 2023 at 4:19 PM Matthias Pohl <matthias.p...@aiven.io>
> > > > > > > >> wrote:
> > > > > > > >>
> > > > > > > >>> That's a valid point. I updated the FLIP accordingly:
> > > > > > > >>>
> > > > > > > >>>> Currently, the secrets (e.g.
> > > > > > > >>>> for S3 access tokens) are maintained by
> > > > > > > >>>> certain PMC members with access to the corresponding configuration in the
> > > > > > > >>>> Azure CI project. This responsibility will be moved to Apache Infra. They
> > > > > > > >>>> are in charge of handling secrets in the Apache organization. As a
> > > > > > > >>>> consequence, updating secrets is becoming a bit more complicated. This can
> > > > > > > >>>> be still considered an improvement from a legal standpoint because the
> > > > > > > >>>> responsibility is transferred from an individual company (i.e. Ververica
> > > > > > > >>>> who's the maintainer of the Azure CI project) to the Apache Foundation.
> > > > > > > >>>
> > > > > > > >>> On Tue, Nov 21, 2023 at 3:37 PM Martijn Visser <martijnvis...@apache.org>
> > > > > > > >>> wrote:
> > > > > > > >>>
> > > > > > > >>>> Hi Matthias,
> > > > > > > >>>>
> > > > > > > >>>> Thanks for the write-up and for the efforts on this. I really hope
> > > > > > > >>>> that we can move away from Azure towards GHA for a better integration
> > > > > > > >>>> as well (directly seeing if a PR can be merged due to CI passing for
> > > > > > > >>>> example).
> > > > > > > >>>>
> > > > > > > >>>> The one thing I'm missing in the FLIP is how we would setup the
> > > > > > > >>>> secrets for the nightly runs (for the S3 tests, potential tests with
> > > > > > > >>>> external services etc). My guess is we need to provide the secret to
> > > > > > > >>>> ASF Infra and then we would be able to refer to them in a pipeline?
> > > > > > > >>>>
> > > > > > > >>>> Best regards,
> > > > > > > >>>>
> > > > > > > >>>> Martijn
> > > > > > > >>>>
> > > > > > > >>>> On Tue, Nov 21, 2023 at 3:05 PM Matthias Pohl
> > > > > > > >>>> <matthias.p...@aiven.io.invalid> wrote:
> > > > > > > >>>>> I realized that I mixed up FLIP IDs. FLIP-395 is already reserved [1]. I
> > > > > > > >>>>> switched to FLIP-396 [2] for the sake of consistency. 8)
> > > > > > > >>>>>
> > > > > > > >>>>> [1] https://lists.apache.org/thread/wjd3nbvg6nt93lb0sd52f0lzls6559tv
> > > > > > > >>>>> [2] https://cwiki.apache.org/confluence/display/FLINK/FLIP-396%3A+Migration+to+GitHub+Actions
> > > > > > > >>>>>
> > > > > > > >>>>> On Tue, Nov 21, 2023 at 2:58 PM Matthias Pohl <matthias.p...@aiven.io>
> > > > > > > >>>>> wrote:
> > > > > > > >>>>>
> > > > > > > >>>>>> Hi everyone,
> > > > > > > >>>>>>
> > > > > > > >>>>>> The Flink community discussed migrating from Azure CI to GitHub Actions
> > > > > > > >>>>>> quite some time ago [1]. The efforts around that stalled due to limitations
> > > > > > > >>>>>> around self-hosted runner support from Apache Infra’s side. There were some
> > > > > > > >>>>>> recent developments on that topic. Apache Infra is experimenting with
> > > > > > > >>>>>> ephemeral runners now which might enable us to move ahead with GitHub
> > > > > > > >>>>>> Actions.
> > > > > > > >>>>>>
> > > > > > > >>>>>> The goal is to join the trial phase for ephemeral runners and experiment
> > > > > > > >>>>>> with our CI workflows in terms of stability and performance.
> > > > > > > >>>>>> At the end we
> > > > > > > >>>>>> can decide whether we want to abandon Azure CI and move to GitHub Actions
> > > > > > > >>>>>> or stick to the former one.
> > > > > > > >>>>>>
> > > > > > > >>>>>> Nico Weidner and Chesnay laid the groundwork on this topic in the past. I
> > > > > > > >>>>>> picked up the work they did and continued experimenting with it in my own
> > > > > > > >>>>>> fork XComp/flink [2] the past few weeks. The workflows are in a state where
> > > > > > > >>>>>> I think that we can start moving the relevant code into Flink’s repository.
> > > > > > > >>>>>> Example runs for the basic workflow [3] and the extended (nightly) workflow
> > > > > > > >>>>>> [4] are provided.
> > > > > > > >>>>>>
> > > > > > > >>>>>> This will bring a few more changes to the Flink contributors. That is why
> > > > > > > >>>>>> I wanted to bring this discussion to the mailing list first. I did a
> > > > > > > >>>>>> write-up on (hopefully) all related topics in FLIP-395 [5].
> > > > > > > >>>>>>
> > > > > > > >>>>>> I’m looking forward to your feedback.
> > > > > > > >>>>>>
> > > > > > > >>>>>> Matthias
> > > > > > > >>>>>>
> > > > > > > >>>>>> [1] https://lists.apache.org/thread/vcyx2nx0mhklqwm827vgykv8pc54gg3k
> > > > > > > >>>>>> [2] https://github.com/XComp/flink/actions
> > > > > > > >>>>>> [3] https://github.com/XComp/flink/actions/runs/6926309782
> > > > > > > >>>>>> [4] https://github.com/XComp/flink/actions/runs/6927443941
> > > > > > > >>>>>> [5] https://cwiki.apache.org/confluence/display/FLINK/FLIP-395%3A+Migration+to+GitHub+Actions
> > > > > > > >>>>>>
> > > > > > > >>>>>> --
> > > > > > > >>>>>>
> > > > > > > >>>>>> *Matthias Pohl*
> > > > > > > >>>>>> Opensource Software Engineer, *Aiven*
> > > > > > > >>>>>> matthias.p...@aiven.io <i...@aiven.io> | +49 170 9869525
> > > > > > > >>>>>> aiven.io <https://www.aiven.io>
> > > > > > > >>>>>> *Aiven Deutschland GmbH*
> > > > > > > >>>>>> Alexanderufer 3-7, 10117 Berlin
> > > > > > > >>>>>> Geschäftsführer: Oskari Saarenmaa & Hannu Valtonen
> > > > > > > >>>>>> Amtsgericht Charlottenburg, HRB 209739 B