On Mon, Jul 13, 2015 at 7:24 PM, Jochen Wiedmann
<jochen.wiedm...@gmail.com> wrote:
> Hi,
>
> I am writing as one of the Mentors of the AsterixDB podling.
>
> It recently came to my attention, that there are, in fact, multiple
> Git repositories, which are used by the project, one of them being
> located externally of the ASF. I understand the structure to be like
> this:
>

This is a severe problem and needs to be rectified promptly.
How are commits migrating from the external repository to the ASF repository?
Blind mirroring of content typically loses provenance information
that is important to us.



>   +--------+  Commits   +----------------+  Mirrors   +-----------+
>   | Gerrit | ---------> | Git (External) | ---------> | Git (ASF) |
>   +--------+            +----------------+            +-----------+
>
> The structure is made like this, because the project members desire
> that no commits can enter without a review, which is done in Gerrit. [2]
> (In the past, this was ensured by a commit hook in the external
> repository. That commit hook possibly still exists, but it doesn't
> prevent code from entering the ASF repository directly without a
> review. This lack of security is currently being discussed by the
> podling's project members.)
>
> I understand the desire, and, to me, it makes sense. OTOH,  I suspect
> that this issue might affect a successful incubation. Hence this mail.
>

Agreed. This needs to be rectified rapidly.


> As Git is slowly gaining ground within the ASF, I'd suggest that a
> possible resolution might be to have a Gerrit instance within the ASF.
> Given how Github pull requests are already discussed by many projects,
> I can imagine that many projects would like to adopt a similar policy.
>

Git is very widely used - slightly over 1/2 of the active projects at
the ASF are using it now as their primary VCS.

That said, we've explored Gerrit a number of times, most recently in
December. Just for frame of reference, I was very much in favor of
Gerrit. I thought that there were a number of projects that also wanted
it - but many of those changed their minds over time. In the end we
discovered that there are a number of challenges:

First, Gerrit wants what is best described as exclusive access to git
repositories. It tends to want them on a local filesystem, with Gerrit
essentially acting as gatekeeper for commits. This isn't inherently a
problem if all repos are treated this way, but we don't have all
projects wanting that.

Second, Gerrit wants every patch author authenticated against a common
authn backend. This would mean folks would need accounts in LDAP. When
we last explored this, our LDAP infrastructure was incapable of handling
what would have been explosive growth in the number of accounts and
authentication requests. We've since made the infrastructure much more
robust and resilient, but deploying Gerrit would essentially require
us to have a self-service account creation service, and that's a lot
of work.

At the moment, Infrastructure doesn't see enough demand from projects
requesting Gerrit to make the tremendous investment required. We've
also noticed a number of trends in projects who were interested in
this:

The oldest strategy comes from Hadoop, which has every patch
submitted to Jira. Each patch is automatically detected and triggers a
pre-commit test job, with Jenkins reporting the results of the tests
back to Jira.

We have Reviewboard [0], which is Gerrit-like, without the problems
listed above.

More recently, we have folks making heavy use of GitHub pull requests,
and two primary technologies are being used there.

1. GitHub pull request builder: Jenkins watches for pull requests
against the GH mirror of the repo, automatically picks up the job,
and then reports the success or failure of that job in the pull
request. [1]

2. TravisCI - The ASF has a paid account with TravisCI and has 30
concurrent builders. Like the GitHub pull request builder, it watches
for pull requests against a repository, runs tests, and then reports
the results on the pull request. [2]

Obviously, there's no automatic merge, or even technical enforcement.
However, most projects are able to use social enforcement (and reverts
if necessary) to ensure that folks aren't committing directly; and
automatic merges would be disallowed anyway since a committer needs to
make an explicit decision to commit.


[0] http://reviews.apache.org
[1] https://blogs.apache.org/infra/entry/github_pull_request_builds_now
[2] https://blogs.apache.org/infra/entry/apache_gains_additional_travis_ci


--David
