This uses https://github.com/actions/runner/pull/783 to avoid having untrusted users run code on our hosts (security is based on the actor of the commit: committers' PRs and direct pushes are allowed to run builds on self-hosted runners), combined with a GitHub Application, AWS Lambda and an AWS Auto-Scaling Group to auto-scale the runner instances.
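To give a rough idea of how the Lambda piece fits together, here is a minimal, hypothetical sketch - the webhook event fields, the ASG name and the one-runner-per-queued-job policy are illustrative assumptions, not our actual implementation:

# Hypothetical sketch only - not the actual Airflow/Astronomer implementation.
# Assumes the GitHub App delivers "workflow_job" webhooks to this Lambda via
# API Gateway, and that the runner instances live in one Auto-Scaling Group.
# Webhook signature verification is omitted for brevity.
import json
import os

import boto3

ASG_NAME = os.environ.get("RUNNER_ASG_NAME", "github-actions-runners")  # assumed name
asg = boto3.client("autoscaling")


def handler(event, context):
    payload = json.loads(event["body"])
    action = payload.get("action")

    group = asg.describe_auto_scaling_groups(
        AutoScalingGroupNames=[ASG_NAME]
    )["AutoScalingGroups"][0]
    desired = group["DesiredCapacity"]

    if action == "queued":
        # A job is waiting for a runner - add one instance, up to the ASG maximum.
        desired = min(desired + 1, group["MaxSize"])
    elif action == "completed":
        # A job finished - let the pool shrink back towards the ASG minimum.
        desired = max(desired - 1, group["MinSize"])
    else:
        return {"statusCode": 204}

    asg.set_desired_capacity(
        AutoScalingGroupName=ASG_NAME,
        DesiredCapacity=desired,
        HonorCooldown=False,
    )
    return {"statusCode": 200, "body": json.dumps({"desired": desired})}

Note that the actor-based security gate is not in the Lambda at all - it comes from the patched runner (the PR linked above), which is what decides whether a given PR's jobs may run on our hosts.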
On Mon, 8 Feb 2021 at 09:58, Antoine Pitrou <anto...@python.org> wrote:

> Hi Jarek,
>
> Thank you for the document. Could you tell us more about the "custom security layer" that you implemented?
>
> Regards
>
> Antoine.
>
> On 08/02/2021 at 01:44, Jarek Potiuk wrote:
> > For anyone following this thread - an update on the progress we have made in Airflow on building self-hosted infrastructure for GitHub Actions.
> >
> > Ash from Airflow is really close to finalizing the work on a nice auto-scaling framework for self-hosted workers, but we also checked what is the best value for money we can get.
> >
> > I've run some analysis on the performance and tested my hypothesis (based on earlier experiences) about significant optimisations we can get.
> >
> > I've finished my analysis of potential optimizations we can get on our CI with the self-hosted runners that Ash created. I did some performance testing and a (very crude) comparison of the "traditional approach" - 2-CPU instances with local SSDs running the tests - with something I have already tested several times on various CI arrangements: running tests on high-memory instances (8 CPU, 64 GB memory) with everything (including the docker engine) in "tmpfs" - a huge ramdisk.
> > Seems that 1h 20 minutes of test running can be decreased 8x (!) using this approach (and parallelising some tests), at the same time decreasing the cost 2x (!). Yep. You heard right. We can have faster builds this way and pay less for that. Seems that we will be able to decrease the time to run all tests for one combination from 1h 20 minutes to 10 minutes.
> > This is possible because Ash and his team did a great job setting up auto-scaling EC2 instance runners on our Amazon EC2 account (we have credits from Amazon to run those jobs - also Astronomer offered a donation to keep it running). Seems that by utilizing it we can not only pay less but also get much faster builds.
> >
> > If you are interested - my document is here. Open for comments - happy to add you as editors if you want (just send me your gmail address privately). It is rather crude, I had no time to put more effort into it due to some significant changes in my company, but it should be easy to compare the values and see the actual improvements we can get. There are likely a few shortcuts there and some of the numbers are "back-of-the-envelope", and we are going to validate them further when we implement all the optimisations, but the conclusions should be pretty sound.
> >
> > https://docs.google.com/document/d/1ZZeZ4BYMNX7ycGRUKAXv0s6etz1g-90Onn5nRQQHOfE/edit#
> >
> > J.
> >
> > On Fri, Jan 8, 2021 at 10:02 PM Jarek Potiuk <ja...@potiuk.com> wrote:
> >
> >>> We should be able to make an efficient query via GraphQL API, right? I found the REST API for actions to be a little underwhelming.
> >>
> >> That was the first thing I checked when we started looking at the stats. Unfortunately, last time I checked (and I even opened an issue about it with GitHub support) there was no GitHub Actions GraphQL API.
> >>
> >> I got a GH support answer: "Yeah, we know the GH API does not have GraphQL support yet, sorry". I think it has not changed since.
> >>
> >>> We have tried to make our builds faster with more caching, but it's not easy since it's an embedded systems project: we need to target a lot of configurations and most changes impact all builds.
> >>
> >> Indeed, I know how much of my time was spent on optimising Airflow GH usage. I think we eventually decreased the usage 10x or more. But it never helped, because currently anyone - even accidentally - can block all the slots in almost no time at all. We have no organisation-wide way to prevent this, and this is the problem.
> >>
> >> Right now I could:
> >> a) mine cryptocurrency using PRs to any Apache project
> >> b) block the queue for everyone
> >>
> >> I do not even have to be an Apache committer to do that. It's enough to open one well-crafted PR that spins off 180 jobs that run for 6 hours. It's super-flawed.
> >>
> >>> We too would like to take advantage of our own runners, but more for the ability to do Hardware In the Loop testing, and have avoided it for the reasons already mentioned.
> >>
> >> Self-hosted runners for now seem to be the only "reasonable" option, but the security issues with the current runner are not allowing us to do it.
> >>
> >>> --Brennan
> >>
> >> --
> >> +48 660 796 129
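For anyone wondering what the "everything in tmpfs" trick from the quoted analysis looks like in practice: the idea is simply to mount a large ramdisk and move the Docker daemon's data-root onto it, so image layers, containers and test I/O never touch disk. A rough, hypothetical sketch - the mount point, the 48g size and the systemd restart are assumptions on my part, not the exact setup measured in the document:

# Hypothetical sketch of the "docker engine in tmpfs" setup described above.
# Run as root on a high-memory instance (the comparison used 8 CPU / 64 GB machines);
# this overwrites /etc/docker/daemon.json, so merge with existing settings in real use.
import json
import subprocess

RAMDISK = "/mnt/ramdisk"            # assumed mount point
DOCKER_ROOT = RAMDISK + "/docker"   # Docker data-root moved onto the ramdisk

# 1. Mount a large tmpfs ramdisk.
subprocess.run(["mkdir", "-p", RAMDISK], check=True)
subprocess.run(["mount", "-t", "tmpfs", "-o", "size=48g", "tmpfs", RAMDISK], check=True)

# 2. Point the Docker daemon at the ramdisk and restart it.
subprocess.run(["mkdir", "-p", DOCKER_ROOT], check=True)
with open("/etc/docker/daemon.json", "w") as f:
    json.dump({"data-root": DOCKER_ROOT}, f)
subprocess.run(["systemctl", "restart", "docker"], check=True)

The obvious trade-off is that it needs a lot of RAM - which is why the comparison above used 64 GB instances - but for I/O-heavy test suites it takes the disk out of the equation entirely.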