On 2019/01/06 18:32:24, Allen Wittenauer <a...@effectivemachines.com.INVALID>
wrote:
>
> a) The ASF has been running untrusted code since before Github existed. From
> my casual watching of Jenkins, most of the change code we run doesn’t come
> from Github PRs. Any solution absolutely needs to consider what happens in a
> JIRA-based patch file world. [footnote 1,2]
There is a big difference between things like JIRA patches and pull requests on
GitHub.
A pull request from GitHub *can* be checked out correctly without human
intervention. Patch files require knowing the exact base and how much of the
prefix to strip off.
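To illustrate the prefix problem, here is a minimal Python sketch of the guesswork a patch-file bot has to do. It assumes a unified diff and a set of file paths already present in the checkout. The names `candidate_paths`, `guess_strip_level` and `known_files` are hypothetical, not part of Jenkins or Yetus, and the sketch does not address the harder problem of finding the right base revision.

```python
from pathlib import PurePosixPath


def candidate_paths(patch_text):
    """Collect target paths from '+++ ' lines that directly follow a '--- ' line."""
    paths = []
    lines = patch_text.splitlines()
    for prev, line in zip(lines, lines[1:]):
        # Heuristic: a file header is a '--- ' line followed by a '+++ ' line.
        if prev.startswith("--- ") and line.startswith("+++ "):
            # Drop the marker and any tab-separated timestamp.
            path = line[4:].split("\t")[0].strip()
            if path != "/dev/null":
                paths.append(path)
    return paths


def guess_strip_level(patch_text, known_files):
    """Return the smallest strip level at which every patched path is a known file.

    Returns None when no level matches, for example when the patch adds
    new files or was made against a different tree.
    """
    paths = candidate_paths(patch_text)
    if not paths:
        return None
    max_depth = max(len(PurePosixPath(p).parts) for p in paths)
    for level in range(max_depth):
        stripped = ["/".join(PurePosixPath(p).parts[level:]) for p in paths]
        if all(s in known_files for s in stripped):
            return level
    return None
```

A pull request needs none of this: the base branch and the head commit are recorded with the request itself.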
Further, the ASF does not control GitHub signups (a good thing for growing the
community, but it means a correspondingly lower level of assumed trust).
I have seen GitHub PRs against other public repos that attempted drive-by
bitcoin mining by modifying a unit test to run the miner. (And those are just
the attack vectors that are public and hence safe to disclose.)
The GitHub PR is a potent attack vector (which is why Jenkins provides some
tools to help... the tools could be better, but my day job has me focused on
non-Jenkins stuff right now... I haven't been paid to work on Jenkins stuff for
18 months).
>
> b) Making everything get reviewed by a committer before executing is a
> non-starter. For large communities, precommit testing acts as a way for
> contributors to get feedback prior to a committer even getting involved.
> This allows for change iteration prior to another human spending time on it.
> But the secondary effect is that it acts as a funnel: if a project gets
> thousands of change requests a year [footnote 3], it’s now trivial for
> committers to focus their energy on the ones that are closest to commit.
>
Well, there are alternatives:
1. We can leverage services that are non-ASF to run pre-validation, thereby
letting those services donate their CPU time to the ASF and letting the ASF
gain from that usage. Some examples:
a) TravisCI could be used to run PR round 1 verification (I suspect CodeShip
could also be convinced to provide some CPU time... but as that is a service
provided by my employer, CloudBees, I cannot say for sure... I can provide
the contact to ask if ASF projects are interested);
b) Perhaps Microsoft would donate some CPU time on Azure for builds (as
they currently do for the Jenkins project), which would let the ASF use
disposable one-time build agents on Azure for building.
2. We could have some tooling to inspect diffs and permit builds where the
diffs are "trivial" (hence low risk).
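As a rough idea of what option 2 might look like, here is a minimal Python sketch that treats a unified diff as "trivial" only if every touched file has a documentation-like suffix and the change is small. The policy values (`SAFE_SUFFIXES`, `MAX_CHANGED_LINES`) and the function name `is_trivial_diff` are assumptions made up for illustration, not an existing tool. Real tooling would want the diff produced by the VCS itself rather than submitted text, and would need per-project rules, since even documentation can feed into a build.

```python
import re
from pathlib import PurePosixPath

# Hypothetical policy values; each project would choose its own.
SAFE_SUFFIXES = {".md", ".txt", ".rst"}
MAX_CHANGED_LINES = 50

# Matches hunk headers such as "@@ -1,2 +1,3 @@"; a missing count means 1.
HUNK_RE = re.compile(r"^@@ -\d+(?:,(\d+))? \+\d+(?:,(\d+))? @@")

# Git extended headers that this sketch refuses to reason about.
UNSUPPORTED = ("old mode", "new mode", "rename ", "copy ",
               "Binary files", "GIT binary patch")


def is_trivial_diff(diff_text):
    """Return True only for small diffs that touch doc-like files exclusively."""
    old_left = new_left = 0  # lines still expected in the current hunk
    changed = 0
    paths_seen = 0
    for line in diff_text.splitlines():
        if old_left > 0 or new_left > 0:
            # Inside a hunk: consume body lines by their leading marker.
            if line.startswith("+"):
                new_left -= 1
                changed += 1
            elif line.startswith("-"):
                old_left -= 1
                changed += 1
            elif not line.startswith("\\"):  # skip "\ No newline at end of file"
                old_left -= 1
                new_left -= 1
            continue
        match = HUNK_RE.match(line)
        if match:
            old_left = int(match.group(1) or 1)
            new_left = int(match.group(2) or 1)
        elif line.startswith(UNSUPPORTED):
            return False  # be conservative: let a human look at it
        elif line.startswith(("--- ", "+++ ")):
            path = line[4:].split("\t")[0].strip()
            if path == "/dev/null":
                continue
            paths_seen += 1
            if PurePosixPath(path).suffix.lower() not in SAFE_SUFFIXES:
                return False
    return paths_seen > 0 and changed <= MAX_CHANGED_LINES
```

Anything the check rejects, such as the modified unit test in the mining example, would wait for a committer or go to a disposable agent instead of running on shared hardware.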
> c) We’ve needed disposable environments (what Stephen Connolly called
> throwaway hardware and is similar to what Dominik Psenner talked about wrt
> gitlab runners) for a while. When INFRA enabled multiple executors per node
> (which they did for good reasons), it triggered an avalanche of problems:
> maven’s lack of repo locking, noisy neighbors, Jenkins’ problems galore
> (security and DoS which still exist today!), systemd’s cgroup limitations,
> and a whole lot more. Getting security out of them is really just extra at
> this point.
Disposable environments are critical for CI in this day and age, IMHO.
>
> ====
>
> 1 - With the forced move to gitbox, this may change, but time will tell.
>
> 2 - FWIW: Gavin and I have been playing with Jenkins’ JIRA Trigger Plugin
> and finding that it’s got some significant weaknesses and needs a lot of
> support code to make viable. This means we’ll likely be sticking with some
> form of Yetus’ precommit-admin for a while longer. :( So the bright side
> here is that at least the ASF owns the code to make it happen.
>
> 3 - Some perspective: Hadoop generated ~6500 JIRAs with patch files attached
> last year alone for the nearly 15 or so active committers to review. If half
> of the issues had the initial patch plus a single iteration, that’s 13,000
> patches that got tested on Jenkins.