Hello Chesnay,

I think your expectations are a bit too high, and I am not sure why.

Not sure who you talked to at Airflow, but we always underline, stress,
and warn that our solution is really "experimental" and "works for us"
because we invested an awful (and I mean awful) lot of time in making it
secure FOR US ONLY, and anyone doing the same must be prepared for a
similar effort. We have very specific workflows and very custom ways of
dealing with our build, and by any measure we are quite far from "it just
works". We do help the Beam team a bit to make it more "reusable", but this
is still an awful lot of work (that mostly the Beam team did). I personally
have the impression that investing more in it before GitHub fixes all the
security issues to make it 'generally working' is a waste of time.

You refer to
https://cwiki.apache.org/confluence/display/BUILDS/GitHub+Actions+status
so I am quite surprised you had high expectations. There is this very
sentence at the very top, in the summary:

   - If you want to use GitHub Actions, consider using your own self-hosted
   runner, but only if you can afford to build and maintain your own
   self-hosted infrastructure (this is not an easy task due to security
   limitations of the official GitHub Actions runners).


Have you done any of this? Have you built your custom infrastructure with
security in mind? Do you know what it involves? If not, then forget
it. It's insecure by design.

If you read further down the document you will find another sentence:

One of the solutions that might be sustainable is to deploy self-hosted
runners if your project has some infrastructure money (from
stakeholders/sponsors) they can spend. We have money in Airflow (from the
AWS Open-Source initiative and Astronomer; also Google promised to donate
some GCP time). This is, however, (currently) inherently insecure. With the
"PR-s from forks" approach of Apache projects, the current model of GitHub
Runners is not secure by default. In fact, there is a recommendation from
GitHub to NEVER use self-hosted runners for public repositories. Apache
Airflow team forked the Runner and we are working on hardening the
Self-hosted runners from GitHub, and we set up auto-scaling runners in our
donated infrastructure (PMC member of Airflow - Ash Berlin-Taylor is
working on it), but this is a big project on its own.

Just to extract a bit from there: "there is a recommendation from GitHub to
NEVER use self-hosted runners for public repositories" and "inherently
insecure".

This links to the official GitHub recommendations regarding using
self-hosted runners, which are (still, after 2 years) DON'T USE IT:
https://docs.github.com/en/actions/hosting-your-own-runners/about-self-hosted-runners#self-hosted-runner-security-with-public-repositories

If the above does not scare you then I am not sure what would :)

I am closely monitoring new features of GitHub Actions, and while they
have improved a lot since then, this aspect has remained largely unchanged.
They recently introduced some features that might change things, but for
now they are not good enough, at least for us.

J.


On Wed, Apr 6, 2022 at 11:56 AM Chesnay Schepler <ches...@apache.org> wrote:

>  > Did you find some documentation somewhere that we might have said
> otherwise?
>
> We knew that Airflow is using them and thus thought it would be fine.
> We also had a chat with the Airflow folks and IIRC it also wasn't
> mentioned.
>
> There were several tickets where other projects requested tokens and no
> limitation was mentioned:
> * Arrow; token was provided:
> https://issues.apache.org/jira/browse/INFRA-19875
> * Beam: https://issues.apache.org/jira/browse/INFRA-22840
> * Zeppelin: https://issues.apache.org/jira/browse/INFRA-22674
> And in fact our own latest request for 2 tokens was also granted in
> https://issues.apache.org/jira/browse/INFRA-23086. The alarm bells only
> went off when we requested more tokens.
>
> Then we have https://infra.apache.org/self-hosted-runners.html which
> states "Apache permits projects to use self-hosted runners [but does
> not recommend them]."
>
> Lastly, we have
> https://cwiki.apache.org/confluence/display/BUILDS/GitHub+Actions+status
> (admittedly not an official INFRA resource, but it is linked in some
> INFRA tickets / discussions), which again lists self-hosted runners as
> an option (while listing /caveats/).
>
> TL;DR: There was plenty of information from which one would conclude
> that self-hosted runners are allowed, and no information to the contrary.
>
>
> On 06/04/2022 11:43, Gavin McDonald wrote:
> > Hi.
> >
> > On Wed, Apr 6, 2022 at 11:31 AM Chesnay Schepler <ches...@apache.org> wrote:
> >
> >> Hello,
> >>
> >> In https://issues.apache.org/jira/browse/INFRA-23086 it was mentioned
> >> that a security audit of self-hosted runners for github actions is being
> >> conducted at the moment, and that until this is complete no significant
> >> number of self-hosted runners can be set up.
> >> This came as a bit of a surprise to us (the Flink project); we wanted to
> >> complete our migration to github actions within the next 2-3 weeks,
> >> which is now effectively blocked.
> >>
> > I wanted to ask about this part, why was it a surprise?
> >
> > Self-hosted GitHub runners have not been approved for general project
> > use at the moment. Did you find some documentation somewhere that we
> > might have said otherwise?
> >
> > We are still evaluating a safe and secure way in which we can deploy
> > self-hosted runners at the ASF. Currently Airflow is the only approved
> > project, and we are working with Beam to ensure the same level of
> > security, if not better. The result of this experiment will determine
> > when we can open up self-hosted runners for all projects.
> >
> > 2 to 3 weeks MIGHT be do-able but I'll let you know, still working with
> > Beam currently.
> >
> >
> >> I wanted to ask whether there is some form of ETA on when this audit is
> >> complete.
> >>
> >> Regards,
> >> Chesnay
> >>
> >>
> >>
> >>
>
