Hi,

Following up on this thread, I am going to try to coordinate setting up an
instance of the self-hosted runners for arm64 on the Arrow repository.

There was a question about using Travis CI on Crossbow for those jobs. That
could be a possibility but I think there are some benefits to the proposed
solution:
- Having those runners on the Arrow repo will allow us to run these jobs on
a per-PR basis, as we do today, instead of as external ad hoc tests.
- Moving those jobs to GHA would be beneficial for maintenance purposes:
that's where the majority of our CI is hosted, and it would let us get rid
of a CI system (Travis).
- We are already lacking resources on arm64 CI on Crossbow. We had to
remove libarrow-flight-dev packages built for arm64. See:
https://github.com/apache/arrow/issues/33934
- Finding a solution that allows us to increase the number of runners on
the Arrow repo and run the CI from the Arrow repo would be beneficial not
only for those jobs but also for extra CI capacity if/when we need it.

Regarding the s390x jobs, there is some Apache INFRA CI on Jenkins that
could be used if we can't find an alternative. I've asked on ASF Slack for
more information about that. Here are a couple of examples of s390x builds
in other Apache projects:
https://github.com/apache/camel/blob/e7825a48c9f3d1202333c4f311330be55ff30257/Jenkinsfile.s390x#L20
https://github.com/apache/activemq/blob/c58286487d08d155496e571db649f047bd979630/Jenkinsfile#L45

To be honest, adding a new CI system doesn't seem ideal, but if we can't
find other s390x hosts and we want to keep testing s390x on CI, I can't
think of other options.

Kind regards,
Raúl

On Thu, 22 Dec 2022 at 22:20, Sutou Kouhei (<k...@clear-code.com>)
wrote:

> Hi,
>
> We can keep using Travis CI via Crossbow by the following
> approach:
> https://github.com/apache/arrow/pull/14751
>
> Travis CI for https://github.com/ursacomputing/crossbow is
> sponsored by Voltron Data (not ASF) for arm64 Linux
> packages.
>
> How about using the approach for s390x?
>
>
> Thanks,
> --
> kou
>
> In <canva0dgp8ifmdno8a7o8msbwvtl6kgprqcmsk0nncqveqqt...@mail.gmail.com>
>   "Re: [DISC] Self-Hosted Runners for Arrow" on Fri, 16 Dec 2022 19:26:36
> +0100,
>   Jacob Wujciak <ja...@voltrondata.com.INVALID> wrote:
>
> > No news with regards to arrow specific S390x machines but apparently IBM
> > has donated a number of S390x VMs to the ASF which we should be able to
> > use but I have not had the time yet to investigate this option.
> >
> >
> > Matt Topol <zotthewiz...@gmail.com> wrote on Fri, 16 Dec 2022, 17:01:
> >
> >> These are awesome! Has there been any luck in reaching out to IBM to see if
> >> they could donate one or more s390x VMs to use as runners for testing the
> >> s390x builds? That is probably my only concern with Travis going away at
> >> EOY, since we don't have a way currently to test those builds on GH
> >> Actions.
> >>
> >> --Matt
> >>
> >> On Fri, Dec 16, 2022 at 8:46 AM Jacob Wujciak
> >> <ja...@voltrondata.com.invalid>
> >> wrote:
> >>
> >> > I would like to propose the addition of a self-hosted runner system to the
> >> > arrow repository to add speciality runners (arm64 and CUDA). This will
> >> > allow us to compensate for the arm64 jobs that previously ran on Travis,
> >> > which will be turned off EOY[1].
> >> >
> >> > The migration to GitHub Issues will require a significant extension of our
> >> > existing “comment bot” workflows (e.g. assigning and labeling issues for
> >> > non-committers, see [3]). With such a system we could add reserved runners
> >> > that only pick up these “comment bot” jobs to guarantee a smooth developer
> >> > experience, regardless of the state of the ASF CI resources.
> >> >
> >> > As the allocation of GitHub-hosted runners for the Apache Software
> >> > Foundation was recently increased, the queue times are currently low, but
> >> > this will inevitably change, and such a system would enable us to react
> >> > quickly to such changes by adding new Windows and Linux nodes without any
> >> > need for INFRA intervention.
> >> >
> >> > We at Voltron Data have been working on a Kubernetes-based system to deploy
> >> > auto-scaling ephemeral GitHub runners that can be seamlessly added to the
> >> > arrow repository via a GitHub App. As the runners are ephemeral (each job
> >> > is run in an isolated environment that is destroyed once the job is done),
> >> > the usual security issues with self-hosted runners do not apply [2].
> >> >
> >> > Voltron Data has open sourced the necessary Infrastructure as Code [4].
> >> > This makes it possible for other interested parties to donate CI capacity
> >> > to arrow or other ASF projects by cloning the IaC and setting up and
> >> > maintaining their own instance of the system. Voltron Data will set up and
> >> > maintain one instance of the system.
> >> >
> >> > The Dockerfiles for the runners will be added to the main arrow repo to
> >> > facilitate easy changes and updates to the runner configuration for the
> >> > community.
> >> >
> >> > Best,
> >> > Jacob
> >> >
> >> > [1]: https://cwiki.apache.org/confluence/display/INFRA/Travis+Migrations
> >> >
> >> > [2]: https://docs.github.com/en/actions/hosting-your-own-runners/about-self-hosted-runners#self-hosted-runner-security
> >> >
> >> > [3]: https://github.com/apache/arrow/actions/workflows/comment_bot.yml
> >> >
> >> > [4]: https://github.com/voltrondata-labs/gha-controller-infra
> >> >
> >>
>
