On Thu, Oct 10, 2019 at 12:22 AM Jacques Nadeau <jacq...@apache.org> wrote:
>
> I'm not dismissing that there are issues but I also don't feel like there
> has been constant discussion for months on the list that INFRA is not being
> responsive to Arrow community requests. It seems like you might be saying
> one of two things (or both?):
>
> 1) The Arrow infrastructure requirements are vastly different than other
> projects. Because of Arrow's specialized requirements, we need things that
> no other project needs.
> 2) There are many projects that want CircleCI, Buildkite and Azure
> pipelines but Infrastructure is not responsive. This is putting a big
> damper on the success of the Arrow project.

Yes, I'm saying both of these things.

1. Yes, Arrow is special -- validating the project requires running a
dozen or more different builds (with dozens more nightly builds) that
test different parts of the project: different language components, a
large and diverse packaging matrix, interproject integration tests,
and integration with external projects (e.g. Apache Spark and others).

2. Yes, the limited GitHub App availability is hurting us.

I'm OK to place this concern in the "Community Health" section and
spend more time building a comprehensive case about how Infra's
conservatism around Apps is causing us to work with one hand tied
behind our back. I know that I'm not the only one who is unhappy, but
I'll let the others speak for themselves.

> For each of these, if we're asking the board to do something, we should say
> more and more clearly. Sure, CI is a pain in the Arrow project's a**. I
> also agree that community health is impacted by the challenge to merge
> things. I also share the perspective that the foundation has been slow to
> adopt new technologies and has been way too religious about svn. However, if
> we're asking the board to do something, what is it?

Allow GitHub Apps that do not require write access to the code itself,
and set up appropriate checks and balances to ensure that the
Foundation's IP provenance webhooks are preserved.

> Looking at the two things you might be saying...
> If 1, are we confident in that? Many other projects have pretty complex
> build matrices I think. (I haven't thought about this and evaluated the
> other projects...maybe it is true.) If 1, we should clarify why we think
> we're different. If that is the case, what are we asking for from the board?
>
> If 2, and you are proposing throwing stones at INFRA, we should back it up
> with INFRA tickets and numbers (e.g. how many projects have wanted these
> things and for how long). We should reference multiple threads on the INFRA
> mailing list where we voiced certain concerns and many other people voiced
> similar concerns and INFRA turned a deaf ear or blind eye (maybe these
> exist, I haven't spent much time on the INFRA list lately). As it stands,
> the one ticket referenced in this thread is a ticket that has only one
> project asking for a new integration that has been open for less than a
> week. That may be annoying but it doesn't seem like something that has
> gotten to the level that we need to get the board's help.
>
> In a nutshell, I agree that this is impacting the health and growth of the
> project but think we should cover that in the community health section of
> the report. I'm less a fan of saying this is an issue the board needs to
> help us solve unless it has been a constant point of pain that we've
> attempted to elevate multiple times in infra forums and experienced
> unreasonable responses. The board is a blunt instrument and should only be
> used when we have depleted every other avenue for resolution.
>

Yes, I'm happy to spend more time building a comprehensive case before
escalating it to the board level. However, Apache Arrow is a high
profile project and it is not a good look to have a PMC in a
fast-growing project growing disgruntled with the Foundation's
policies in this way. We've been struggling visibly for a long time
with our CI scalability, and I think we should have all the options on
the table to utilize GitHub-integrated tools to help us find a way out
of the mess that we are in.

>
> On Wed, Oct 9, 2019 at 9:44 PM Wes McKinney <wesmck...@gmail.com> wrote:
>
> > hi Jacques,
> >
> > I think we need to share the concerns that many PMC members have over
> > the constraints that INFRA is placing on us. Can we rephrase the
> > concern in a way that is more helpful?
> >
> > Firstly, I respect and appreciate the ASF's desire to limit write
> > access to committers only from an IP provenance perspective. I
> > understand that GitHub webhooks are used to log actions taken in
> > repositories to secure IP provenance. I do not think a third party
> > application should be given the ability to commit or modify a
> > repository -- all write operations on the .git repository should be
> > initiated by committers.
> >
> > However, GitHub is the main platform for producing open source
> > software, and tools are being created to help produce open source more
> > efficiently. It is frustrating for us to not be able to take advantage
> > of the tools that are available to everyone else on GitHub. I brought
> > up the recent request about Buildkite as being representative of this
> > (after learning that Google has been making a lot of use of it), but
> > we have previously been denied use of CircleCI and Azure Pipelines
> > since those services require even more permissions (AFAIK) than in the
> > case of Buildkite. From our use in
> > https://github.com/ursa-labs/crossbow CircleCI and Azure seem to be a
> > lot better than Travis CI and Appveyor.
> >
> > I think the ASF is going to face an existential crisis in the near
> > future whether it wants to live in 2020 or 2000. It feels like GitHub
> > is treated somewhat as ersatz SVN "because people want to use git +
> > GitHub instead of SVN"
> >
> > In the same way that the cloud revolutionized software startups,
> > enabling small groups of developers to build large SaaS applications,
> > the same kind of leverage is becoming available to open source
> > developers to set up infrastructure to automate and scale open source
> > projects. I think projects considering joining the Foundation are
> > going to look at these issues around App usage and decide that they
> > would rather be in control of their own infrastructure.
> >
> > I can set aside even more time and money from my non-profit
> > organization's modest budget to do CI work for Apache Arrow. The
> > amount that we have invested already is very large, and continues to
> > grow. I'm raising these issues because as a Member of the Foundation I'm
> > concerned that fast-growing projects like ours are not being
> > adequately served by INFRA, and we probably aren't the only project
> > that will face these issues. All that is needed is for INFRA to let us
> > use third party GitHub Apps and monitor any potentially destructive
> > actions that they may take, such as modifying unrelated repository
> > webhooks related to IP provenance.
> >
> > - Wes
> >
> > On Wed, Oct 9, 2019 at 9:33 PM Jacques Nadeau <jacq...@apache.org> wrote:
> > >
> > > I think we need to be more direct in listing issues for the board.
> > >
> > > What have we done? What do we want them to do?
> > >
> > > In general, any large org is going to be slow to add new deep integrations
> > > into GitHub. I don't think we should expect Apache to be any different (it
> > > took several years before we could merge things through github for
> > > example). If I were on the INFRA side, I think I would look and see how
> > > many different people are asking for BuildKite before considering
> > > integration. It seems like we only opened the JIRA 6 days ago and no other
> > > projects have requested access to this?
> > >
> > > I'm not clear why this is a board issue. What do we think the board can do
> > > for us that we can't solve ourselves and need them to solve? Remember, a
> > > board solution to a problem is typically very removed from what matters to
> > > individuals on a project.
> > >
> > >
> > >
> > >
> > >
> > >
> > > On Tue, Oct 8, 2019 at 7:03 AM Wes McKinney <wesmck...@gmail.com> wrote:
> > >
> > > > New draft
> > > >
> > > > ## Description:
> > > > The mission of Apache Arrow is the creation and maintenance of software related
> > > > to columnar in-memory processing and data interchange
> > > >
> > > > ## Issues:
> > > >
> > > > * We are struggling with Continuous Integration scalability as the project has
> > > >   definitely outgrown what Travis CI and Appveyor can do for us. Some
> > > >   contributors have shown reluctance to submit patches they aren't sure about
> > > >   because they don't want to pile on the build queue. We are exploring
> > > >   alternative solutions such as Buildbot, Buildkite, and GitHub Actions to
> > > >   provide a path to migrate away from Travis CI / Appveyor. In our request to
> > > >   Infrastructure INFRA-19217, some of us were alarmed to find that a CI/CD
> > > >   service like Buildkite may not be able to be connected to the @apache GitHub
> > > >   account on account of requiring admin access to repository webhooks, but no
> > > >   ability to modify source code. There are workarounds (building custom OAuth
> > > >   bots) that could enable us to use Buildkite, but it would require extra
> > > >   development and result in a less refined experience for community members.
> > > >
> > > > ## Membership Data:
> > > > * Apache Arrow was founded 2016-01-19 (4 years ago)
> > > > * There are currently 48 committers and 28 PMC members in this project.
> > > > * The Committer-to-PMC ratio is roughly 3:2.
> > > >
> > > > Community changes, past quarter:
> > > > - Micah Kornfield was added to the PMC on 2019-08-21
> > > > - Sebastien Binet was added to the PMC on 2019-08-21
> > > > - Ben Kietzman was added as committer on 2019-09-07
> > > > - David Li was added as committer on 2019-08-30
> > > > - Kenta Murata was added as committer on 2019-09-05
> > > > - Neal Richardson was added as committer on 2019-09-05
> > > > - Praveen Kumar was added as committer on 2019-07-14
> > > >
> > > > ## Project Activity:
> > > >
> > > > * The project has just made a 0.15.0 release.
> > > > * We are discussing ways to make the Arrow libraries as accessible as possible
> > > >   to downstream projects for minimal use cases while allowing the development
> > > >   of more comprehensive "standard libraries" with larger dependency stacks in
> > > >   the project
> > > > * We plan to make a 1.0.0 release as our next major release, at which time we
> > > >   will declare that the Arrow binary protocol is stable with forward and
> > > >   backward compatibility guarantees
> > > >
> > > > ## Community Health:
> > > >
> > > > * The community is overall healthy, with the aforementioned concerns around CI
> > > >   scalability. New contributors frequently take notice of the long build queue
> > > >   times when submitting pull requests.
> > > >
> > > > On Tue, Oct 8, 2019 at 8:58 AM Wes McKinney <wesmck...@gmail.com> wrote:
> > > > >
> > > > > Yes, I agree with raising the issue to the board.
> > > > >
> > > > > On Tue, Oct 8, 2019 at 8:31 AM Antoine Pitrou <anto...@python.org> wrote:
> > > > > >
> > > > > >
> > > > > > I agree.  Especially given that the constraints imposed by Infra don't
> > > > > > help solve the problem.
> > > > > >
> > > > > > Regards
> > > > > >
> > > > > > Antoine.
> > > > > >
> > > > > >
> > > > > > On 08/10/2019 at 15:02, Uwe L. Korn wrote:
> > > > > > > I'm not sure what qualifies for "board attention" but it seems that
> > > > > > > CI is a critical problem in Apache projects, not just Arrow. Should we
> > > > > > > raise that?
> > > > > > >
> > > > > > > Uwe
> > > > > > >
> > > > > > > On Tue, Oct 8, 2019, at 12:00 AM, Wes McKinney wrote:
> > > > > > >> Here is a start for our Q3 board report
> > > > > > >>
> > > > > > >> ## Description:
> > > > > > >> The mission of Apache Arrow is the creation and maintenance of software related
> > > > > > >> to columnar in-memory processing and data interchange
> > > > > > >>
> > > > > > >> ## Issues:
> > > > > > >> There are no issues requiring board attention at this time
> > > > > > >>
> > > > > > >> ## Membership Data:
> > > > > > >> * Apache Arrow was founded 2016-01-19 (4 years ago)
> > > > > > >> * There are currently 48 committers and 28 PMC members in this project.
> > > > > > >> * The Committer-to-PMC ratio is roughly 3:2.
> > > > > > >>
> > > > > > >> Community changes, past quarter:
> > > > > > >> - Micah Kornfield was added to the PMC on 2019-08-21
> > > > > > >> - Sebastien Binet was added to the PMC on 2019-08-21
> > > > > > >> - Ben Kietzman was added as committer on 2019-09-07
> > > > > > >> - David Li was added as committer on 2019-08-30
> > > > > > >> - Kenta Murata was added as committer on 2019-09-05
> > > > > > >> - Neal Richardson was added as committer on 2019-09-05
> > > > > > >> - Praveen Kumar was added as committer on 2019-07-14
> > > > > > >>
> > > > > > >> ## Project Activity:
> > > > > > >>
> > > > > > >> * The project has just made a 0.15.0 release.
> > > > > > >> * We are discussing ways to make the Arrow libraries as accessible as possible
> > > > > > >>   to downstream projects for minimal use cases while allowing the development
> > > > > > >>   of more comprehensive "standard libraries" with larger dependency stacks in
> > > > > > >>   the project
> > > > > > >> * We plan to make a 1.0.0 release as our next major release, at which time we
> > > > > > >>   will declare that the Arrow binary protocol is stable with forward and
> > > > > > >>   backward compatibility guarantees
> > > > > > >> * We are struggling with Continuous Integration scalability as the project has
> > > > > > >>   definitely outgrown what Travis CI and Appveyor can do for us. We are
> > > > > > >>   exploring alternative solutions such as Buildbot, Buildkite (see
> > > > > > >>   INFRA-19217), and GitHub Actions to provide a path to migrate away from
> > > > > > >>   Travis CI / Appveyor
> > > > > > >>
> > > > > > >> ## Community Health:
> > > > > > >>
> > > > > > >> * The community is overall healthy, with the aforementioned concerns around CI
> > > > > > >>   scalability. New contributors frequently take notice of the long build queue
> > > > > > >>   times when submitting pull requests.
> > > > > > >>
> > > >
> >
