Hello Chesnay,

I think your expectations are a bit too high, and I am not sure why.
Not sure who you talked to at Airflow, but we always underline and stress and warn that our solution is really "experimental" and "works for us", because we invested an awful (and I mean awful) lot of time in making it secure FOR US ONLY, and anyone doing the same must be prepared for a similar effort. We have very specific workflows and very custom ways of dealing with our build, and by all means we are quite far from "it just works". We do help the Beam team a bit to make it more "reusable", but this is still an awful lot of work (most of which the Beam team did). My personal impression is that investing more in it to make it 'generally working' before GitHub fixes all the security issues is a waste of time.

Since you refer to https://cwiki.apache.org/confluence/display/BUILDS/GitHub+Actions+status, I am quite surprised you had high expectations. There is this very sentence at the very top, in the summary:

- If you want to use GitHub Actions, consider using your own self-hosted runner, but only if you can afford to build and maintain your own self-hosted infrastructure (this is not an easy task due to security limitations of the official GitHub Actions runners).

Have you done any of this? Have you built your custom infrastructure with security in mind? Do you know what it involves? If not - then forget it. It's insecure by design.

If you read further down the document you will find another sentence:

One of the solutions that might be sustainable is to deploy self-hosted runners if your project has some infrastructure money (from stakeholders/sponsors) they can spend. We have money in Airflow (from the AWS Open-Source initiative and Astronomer; also Google promised to donate some GCP time). This is, however, (currently) inherently insecure. With the "PR-s from forks" approach of Apache projects, the current model of GitHub Runners is not secure by default.
In fact, there is a recommendation from GitHub to NEVER use self-hosted runners for public repositories. Apache Airflow team forked the Runner and we are working on hardening the Self-hosted runners from GitHub, and we set up auto-scaling runners in our donated infrastructure (PMC member of Airflow - Ash Berlin-Taylor is working on it), but this is a big project on its own.

Just to extract a bit from there: "there is a recommendation from GitHub to NEVER use self-hosted runners for public repositories" and "inherently insecure".

This links to the official GitHub recommendation regarding self-hosted runners - which is (still, after 2 years) DON'T USE IT:
https://docs.github.com/en/actions/hosting-your-own-runners/about-self-hosted-runners#self-hosted-runner-security-with-public-repositories

If the above does not scare you then I am not sure what would :)

I am closely monitoring new features of GitHub Actions and, while they have improved a lot since then, this aspect has remained largely unchanged. They recently introduced some mechanisms that might change things, but (at least for now) they are not good enough for us.

J.

On Wed, Apr 6, 2022 at 11:56 AM Chesnay Schepler <ches...@apache.org> wrote:
>
> Did you find some documentation somewhere that we might have said
> otherwise?
>
> We knew that Airflow is using them and thus thought it would be fine.
> We also had a chat with the Airflow folks and IIRC it also wasn't
> mentioned.
>
> There were several tickets where other projects requested token where no
> limitation was mentioned:
> * Arrow; token was provided:
>   https://issues.apache.org/jira/browse/INFRA-19875
> * Beam: https://issues.apache.org/jira/browse/INFRA-22840
> * Zeppelin: https://issues.apache.org/jira/browse/INFRA-22674
> And in fact our own latest request for 2 tokens was also granted in
> https://issues.apache.org/jira/browse/INFRA-23086. The alarm bells only
> went off when we requested more tokens.
>
> Then we have https://infra.apache.org/self-hosted-runners.html which
> states "Apache permits projects to use self-hosted runners [but does
> not recommend them]."
>
> At last, we have
> https://cwiki.apache.org/confluence/display/BUILDS/GitHub+Actions+status
> (admittedly not an official INFRA resource, but it is linked in some
> INFRA tickets / discussions), which again lists self-hosted runners as
> an option (while listing /caveats/).
>
> TL;DR: There was plenty of information from which one would conclude
> that self-hosted runners are allowed, and no information to the contrary.
>
> On 06/04/2022 11:43, Gavin McDonald wrote:
> > Hi.
> >
> > On Wed, Apr 6, 2022 at 11:31 AM Chesnay Schepler <ches...@apache.org>
> > wrote:
> >
> >> Hello,
> >>
> >> In https://issues.apache.org/jira/browse/INFRA-23086 it was mentioned
> >> that a security audit of self-hosted runners for github actions is being
> >> conducted at the moment, and that until this is complete no significant
> >> number of self-hosted runners can be set up.
> >> This came as a bit of a surprise to us (the Flink project); we wanted to
> >> complete our migration to github actions within the next 2-3 weeks,
> >> which is now effectively blocked.
> >>
> > I wanted to ask about this part, why was it a surprise?
> >
> > Self Hosted Github Runners has never been approved for general projects
> > use at the moment. Did you find some documentation somewhere that we
> > might have said otherwise?
> >
> > We are still evaluating a safe and secure way in which we can deploy self
> > hosted runners at the ASF. Currently Airflow are the only approved
> > project, and we are working with Beam to ensure the same level of
> > security if not better. the result of this experiment will determine
> > when we can open up self hosted runners for all projects.
> >
> > 2 to 3 weeks MIGHT be do-able but I'll let you know, still working with
> > Beam currently.
> >
> >> I wanted to ask whether there is some form of ETA on when this audit is
> >> complete.
> >>
> >> Regards,
> >> Chesnay