Arg... accidental send before ready.

What do you think about the statement below for community health? Does it
fairly capture the concerns/perspective?

On Thu, Oct 10, 2019 at 10:24 AM Jacques Nadeau <jacq...@apache.org> wrote:

> Many contributors are struggling with the slowness of pre-commit CI. Arrow
> has a large number of different platforms and components and a complex
> build matrix. As new commits come in, they frequently take a long time to
> complete. The community is trying several ways to solve this. Some of those
> have been:
>
>    - Try to use CircleCI, rejected in INFRA-15964
>    <https://issues.apache.org/jira/browse/INFRA-15964>
>    - Try to use Azure Pipelines, rejected in INFRA-17030
>    - Try to resolve issues with Travis CI capacity: INFRA-18533
>    <https://issues.apache.org/jira/browse/INFRA-18533>,
>    https://s.apache.org/ci-capacity (no resolution beyond "find
>    donations")
>    - The creation of new infrastructure design (in progress but a huge
>    amount of thankless work)
>
>
> There is bubbling frustration in the community around the GitHub repo
> rules for using third party services. This is especially challenging when
> there are free solutions to relieve the community pressure but the
> community is unable to access these resources. This frustration is greatest
> among people who work on many OSS projects which don't have
> such restrictive rules around GitHub.
>
> On Thu, Oct 10, 2019 at 5:36 AM Wes McKinney <wesmck...@gmail.com> wrote:
>
>> Here is a rejection of CircleCI more than 18 months ago
>>
>> https://issues.apache.org/jira/browse/INFRA-15964
>>
>> On Thu, Oct 10, 2019 at 4:33 AM Antoine Pitrou <anto...@python.org>
>> wrote:
>> >
>> >
>> > For the record, here is the ticket for Azure Pipelines integration:
>> > https://issues.apache.org/jira/browse/INFRA-17030
>> >
>> > I opened an issue back in May about the Travis-CI capacity situation:
>> > https://issues.apache.org/jira/browse/INFRA-18533
>> >
>> > Apparently CI capacity has been a "hot topic as of late":
>> >
>> https://lists.apache.org/thread.html/af52e2a3e865c01596d46374e8b294f2740587dbd59d85e132429b6c@%3Cbuilds.apache.org%3E
>> >
>> > (I didn't know this list -- bui...@apache.org -- existed, by the way)
>> >
>> > Regards
>> >
>> > Antoine.
>> >
>> >
>> > On 10/10/2019 at 07:34, Wes McKinney wrote:
>> > > On Thu, Oct 10, 2019 at 12:22 AM Jacques Nadeau <jacq...@apache.org>
>> wrote:
>> > >>
>> > >> I'm not dismissing that there are issues but I also don't feel like
>> there
>> > >> has been constant discussion for months on the list that INFRA is
>> not being
>> > >> responsive to Arrow community requests. It seems like you might be
>> > >> saying one of two things (or both?):
>> > >>
>> > >> 1) The Arrow infrastructure requirements are vastly different than
>> other
>> > >> projects. Because of Arrow's specialized requirements, we need
>> things that
>> > >> no other project needs.
>> > >> 2) There are many projects that want CircleCI, Buildkite and Azure
>> > >> Pipelines but Infrastructure is not responsive. This is putting a big
>> > >> damper on the success of the Arrow project.
>> > >
>> > > Yes, I'm saying both of these things.
>> > >
>> > > 1. Yes, Arrow is special -- validating the project requires running a
>> > > dozen or more different builds (with dozens more nightly builds) that
>> > > test different parts of the project. Different language components, a
>> > > large and diverse packaging matrix, and interproject integration tests
>> > > and integration with external projects (e.g. Apache Spark and others).
>> > >
>> > > 2. Yes, the limited GitHub App availability is hurting us.
>> > >
>> > > I'm OK to place this concern in the "Community Health" section and
>> > > spend more time building a comprehensive case about how Infra's
>> > > conservatism around Apps is causing us to work with one hand tied
>> > > behind our back. I know that I'm not the only one who is unhappy, but
>> > > I'll let the others speak for themselves.
>> > >
>> > >> For each of these, if we're asking the board to do something, we
>> should say
>> > >> more and more clearly. Sure, CI is a pain in the Arrow project's
>> a**. I
>> > >> also agree that community health is impacted by the challenge to
>> merge
>> > >> things. I also share the perspective that the foundation has been
>> slow to
>> > >> adopt new technologies and has been way too religious about svn.
>> However, if
>> > >> we're asking the board to do something, what is it?
>> > >
>> > > Allow GitHub Apps that do not require write access to the code itself,
>> > > set up appropriate checks and balances to ensure that the Foundation's
>> > > IP provenance webhooks are preserved.
>> > >
>> > >> Looking at the two things you might be saying...
>> > >> If 1, are we confident in that? Many other projects have pretty
>> complex
>> > >> build matrices I think. (I haven't thought about this and evaluated
>> the
>> > >> other projects...maybe it is true.) If 1, we should clarify why we
>> think
>> > >> we're different. If that is the case, what are we asking for from the
>> board?
>> > >>
>> > >> If 2, and you are proposing throwing stones at INFRA, we should back
>> it up
>> > >> with INFRA tickets and numbers (e.g. how many projects have wanted
>> these
>> > >> things and for how long). We should reference multiple threads on
>> the INFRA
>> > >> mailing list where we voiced certain concerns and many other people
>> voiced
>> > >> similar concerns and INFRA turned a deaf ear or blind eye (maybe
>> these
>> > >> exist, I haven't spent much time on the INFRA list lately). As it
>> stands,
>> > >> the one ticket referenced in this thread is a ticket that has only
>> one
>> > >> project asking for a new integration that has been open for less
>> than a
>> > >> week. That may be annoying but it doesn't seem like something that
>> has
>> > >> gotten to the level that we need to get the board's help.
>> > >>
>> > >> In a nutshell, I agree that this is impacting the health and growth
>> of the
>> > >> project but think we should cover that in the community health
>> section of
>> > >> the report. I'm less a fan of saying this is an issue the board
>> needs to
>> > >> help us solve unless it has been a constant point of pain that we've
>> > >> attempted to elevate multiple times in infra forums and experienced
>> > >> unreasonable responses. The board is a blunt instrument and should
>> only be
>> > >> used when we have depleted every other avenue for resolution.
>> > >>
>> > >
>> > > Yes, I'm happy to spend more time building a comprehensive case before
>> > > escalating it to the board level. However, Apache Arrow is a high
>> > > profile project and it is not a good look to have a PMC in a
>> > > fast-growing project growing disgruntled with the Foundation's
>> > > policies in this way. We've been struggling visibly for a long time
>> > > with our CI scalability, and I think we should have all the options on
>> > > the table to utilize GitHub-integrated tools to help us find a way out
>> > > of the mess that we are in.
>> > >
>> > >>
>> > >> On Wed, Oct 9, 2019 at 9:44 PM Wes McKinney <wesmck...@gmail.com>
>> wrote:
>> > >>
>> > >>> hi Jacques,
>> > >>>
>> > >>> I think we need to share the concerns that many PMC members have
>> over
>> > >>> the constraints that INFRA is placing on us. Can we rephrase the
>> > >>> concern in a way that is more helpful?
>> > >>>
>> > >>> Firstly, I respect and appreciate the ASF's desire to limit write
>> > >>> access to committers only from an IP provenance perspective. I
>> > >>> understand that GitHub webhooks are used to log actions taken in
>> > >>> repositories to secure IP provenance. I do not think a third party
>> > >>> application should be given the ability to commit or modify a
>> > >>> repository -- all write operations on the .git repository should be
>> > >>> initiated by committers.
>> > >>>
>> > >>> However, GitHub is the main platform for producing open source
>> > >>> software, and tools are being created to help produce open source
>> more
>> > >>> efficiently. It is frustrating for us to not be able to take
>> advantage
>> > >>> of the tools that are available to everyone else on GitHub. I
>> brought
>> > >>> up the recent request about Buildkite as being representative of
>> this
>> > >>> (after learning that Google has been making a lot of use of it), but
>> > >>> we have previously been denied use of CircleCI and Azure Pipelines
>> > >>> since those services require even more permissions (AFAIK) than in
>> the
>> > >>> case of Buildkite. From our use in
>> > >>> https://github.com/ursa-labs/crossbow CircleCI and Azure seem to
>> be a
>> > >>> lot better than Travis CI and Appveyor
>> > >>>
>> > >>> I think the ASF is going to face an existential crisis in the near
>> > >>> future whether it wants to live in 2020 or 2000. It feels like
>> GitHub
>> > >>> is treated somewhat as ersatz SVN "because people want to use git +
>> > >>> GitHub instead of SVN"
>> > >>>
>> > >>> In the same way that the cloud revolutionized software startups,
>> > >>> enabling small groups of developers to build large SaaS
>> applications,
>> > >>> the same kind of leverage is becoming available to open source
>> > >>> developers to set up infrastructure to automate and scale open
>> source
>> > >>> projects. I think projects considering joining the Foundation are
>> > >>> going to look at these issues around App usage and decide that they
>> > >>> would rather be in control of their own infrastructure.
>> > >>>
>> > >>> I can set aside even more time and money from my non-profit
>> > >>> organization's modest budget to do CI work for Apache Arrow. The
>> > >>> amount that we have invested already is very large, and continues to
>> > >>> grow. I'm raising these issues because as a Member of the Foundation
>> I'm
>> > >>> concerned that fast-growing projects like ours are not being
>> > >>> adequately served by INFRA, and we probably aren't the only project
>> > >>> that will face these issues. All that is needed is for INFRA to let
>> us
>> > >>> use third party GitHub Apps and monitor any potentially destructive
>> > >>> actions that they may take, such as modifying unrelated repository
>> > >>> webhooks related to IP provenance.
>> > >>>
>> > >>> - Wes
>> > >>>
>> > >>> On Wed, Oct 9, 2019 at 9:33 PM Jacques Nadeau <jacq...@apache.org>
>> wrote:
>> > >>>>
>> > >>>> I think we need to be more direct in listing issues for the board.
>> > >>>>
>> > >>>> What have we done? What do we want them to do?
>> > >>>>
>> > >>>> In general, any large org is going to be slow to add new deep
>> > >>> integrations
>> > >>>> into GitHub. I don't think we should expect Apache to be any
>> different
>> > >>> (it
>> > >>>> took several years before we could merge things through github for
>> > >>>> example). If I were on the INFRA side, I think I would look and
>> see how
>> > >>>> many different people are asking for BuildKite before considering
>> > >>>> integration. It seems like we only opened the JIRA 6 days ago and
>> no
>> > >>> other
>> > >>>> projects have requested access to this?
>> > >>>>
>> > >>>> I'm not clear why this is a board issue. What do we think the
>> board can
>> > >>> do
>> > >>>> for us that we can't solve ourselves and need them to solve?
>> Remember, a
>> > >>>> board solution to a problem is typically very removed from what
>> matters
>> > >>> to
>> > >>>> individuals on a project.
>> > >>>>
>> > >>>>
>> > >>>>
>> > >>>>
>> > >>>>
>> > >>>>
>> > >>>> On Tue, Oct 8, 2019 at 7:03 AM Wes McKinney <wesmck...@gmail.com>
>> wrote:
>> > >>>>
>> > >>>>> New draft
>> > >>>>>
>> > >>>>> ## Description:
>> > >>>>> The mission of Apache Arrow is the creation and maintenance of
>> software
>> > >>>>> related
>> > >>>>> to columnar in-memory processing and data interchange
>> > >>>>>
>> > >>>>> ## Issues:
>> > >>>>>
>> > >>>>> * We are struggling with Continuous Integration scalability as the
>> > >>> project
>> > >>>>> has
>> > >>>>>   definitely outgrown what Travis CI and Appveyor can do for us.
>> Some
>> > >>>>>   contributors have shown reluctance to submit patches they
>> aren't sure
>> > >>>>> about
>> > >>>>>   because they don't want to pile on the build queue. We are
>> exploring
>> > >>>>>   alternative solutions such as Buildbot, Buildkite, and GitHub
>> > >>> Actions to
>> > >>>>>   provide a path to migrate away from Travis CI / Appveyor. In our
>> > >>> request
>> > >>>>> to
>> > >>>>>   Infrastructure INFRA-19217, some of us were alarmed to find
>> that a
>> > >>> CI/CD
>> > >>>>>   service like Buildkite may not be able to be connected to the
>> @apache
>> > >>>>> GitHub
>> > >>>>>   account on account of requiring admin access to repository
>> webhooks,
>> > >>> but
>> > >>>>> no
>> > >>>>>   ability to modify source code. There are workarounds (building
>> custom
>> > >>>>> OAuth
>> > >>>>>   bots) that could enable us to use Buildkite, but it would
>> require
>> > >>> extra
>> > >>>>>   development and result in a less refined experience for
>> community
>> > >>>>> members.
>> > >>>>>
>> > >>>>> ## Membership Data:
>> > >>>>> * Apache Arrow was founded 2016-01-19 (4 years ago)
>> > >>>>> * There are currently 48 committers and 28 PMC members in this
>> project.
>> > >>>>> * The Committer-to-PMC ratio is roughly 3:2.
>> > >>>>>
>> > >>>>> Community changes, past quarter:
>> > >>>>> - Micah Kornfield was added to the PMC on 2019-08-21
>> > >>>>> - Sebastien Binet was added to the PMC on 2019-08-21
>> > >>>>> - Ben Kietzman was added as committer on 2019-09-07
>> > >>>>> - David Li was added as committer on 2019-08-30
>> > >>>>> - Kenta Murata was added as committer on 2019-09-05
>> > >>>>> - Neal Richardson was added as committer on 2019-09-05
>> > >>>>> - Praveen Kumar was added as committer on 2019-07-14
>> > >>>>>
>> > >>>>> ## Project Activity:
>> > >>>>>
>> > >>>>> * The project has just made a 0.15.0 release.
>> > >>>>> * We are discussing ways to make the Arrow libraries as
>> accessible as
>> > >>>>> possible
>> > >>>>>   to downstream projects for minimal use cases while allowing the
>> > >>>>> development
>> > >>>>>   of more comprehensive "standard libraries" with larger
>> dependency
>> > >>> stacks
>> > >>>>> in
>> > >>>>>   the project
>> > >>>>> * We plan to make a 1.0.0 release as our next major release, at
>> which
>> > >>> time
>> > >>>>> we
>> > >>>>>   will declare that the Arrow binary protocol is stable with
>> forward
>> > >>> and
>> > >>>>>   backward compatibility guarantees
>> > >>>>>
>> > >>>>> ## Community Health:
>> > >>>>>
>> > >>>>> * The community is overall healthy, with the aforementioned
>> concerns
>> > >>>>> around CI
>> > >>>>>   scalability. New contributors frequently take notice of the long
>> > >>> build
>> > >>>>> queue
>> > >>>>>   times when submitting pull requests.
>> > >>>>>
>> > >>>>> On Tue, Oct 8, 2019 at 8:58 AM Wes McKinney <wesmck...@gmail.com>
>> > >>> wrote:
>> > >>>>>>
>> > >>>>>> Yes, I agree with raising the issue to the board.
>> > >>>>>>
>> > >>>>>> On Tue, Oct 8, 2019 at 8:31 AM Antoine Pitrou <
>> anto...@python.org>
>> > >>>>> wrote:
>> > >>>>>>>
>> > >>>>>>>
>> > >>>>>>> I agree.  Especially given that the constraints imposed by Infra
>> > >>> don't
>> > >>>>>>> help solve the problem.
>> > >>>>>>>
>> > >>>>>>> Regards
>> > >>>>>>>
>> > >>>>>>> Antoine.
>> > >>>>>>>
>> > >>>>>>>
>> > >>>>>>> On 08/10/2019 at 15:02, Uwe L. Korn wrote:
>> > >>>>>>>> I'm not sure what qualifies for "board attention" but it seems
>> > >>> that
>> > >>>>> CI is a critical problem in Apache projects, not just Arrow.
>> Should we
>> > >>>>> raise that?
>> > >>>>>>>>
>> > >>>>>>>> Uwe
>> > >>>>>>>>
>> > >>>>>>>> On Tue, Oct 8, 2019, at 12:00 AM, Wes McKinney wrote:
>> > >>>>>>>>> Here is a start for our Q3 board report
>> > >>>>>>>>>
>> > >>>>>>>>> ## Description:
>> > >>>>>>>>> The mission of Apache Arrow is the creation and maintenance of
>> > >>>>> software related
>> > >>>>>>>>> to columnar in-memory processing and data interchange
>> > >>>>>>>>>
>> > >>>>>>>>> ## Issues:
>> > >>>>>>>>> There are no issues requiring board attention at this time
>> > >>>>>>>>>
>> > >>>>>>>>> ## Membership Data:
>> > >>>>>>>>> * Apache Arrow was founded 2016-01-19 (4 years ago)
>> > >>>>>>>>> * There are currently 48 committers and 28 PMC members in this
>> > >>>>> project.
>> > >>>>>>>>> * The Committer-to-PMC ratio is roughly 3:2.
>> > >>>>>>>>>
>> > >>>>>>>>> Community changes, past quarter:
>> > >>>>>>>>> - Micah Kornfield was added to the PMC on 2019-08-21
>> > >>>>>>>>> - Sebastien Binet was added to the PMC on 2019-08-21
>> > >>>>>>>>> - Ben Kietzman was added as committer on 2019-09-07
>> > >>>>>>>>> - David Li was added as committer on 2019-08-30
>> > >>>>>>>>> - Kenta Murata was added as committer on 2019-09-05
>> > >>>>>>>>> - Neal Richardson was added as committer on 2019-09-05
>> > >>>>>>>>> - Praveen Kumar was added as committer on 2019-07-14
>> > >>>>>>>>>
>> > >>>>>>>>> ## Project Activity:
>> > >>>>>>>>>
>> > >>>>>>>>> * The project has just made a 0.15.0 release.
>> > >>>>>>>>> * We are discussing ways to make the Arrow libraries as
>> > >>> accessible
>> > >>>>> as possible
>> > >>>>>>>>>   to downstream projects for minimal use cases while allowing
>> > >>> the
>> > >>>>> development
>> > >>>>>>>>>   of more comprehensive "standard libraries" with larger
>> > >>> dependency
>> > >>>>> stacks in
>> > >>>>>>>>>   the project
>> > >>>>>>>>> * We plan to make a 1.0.0 release as our next major release,
>> at
>> > >>>>> which time we
>> > >>>>>>>>>   will declare that the Arrow binary protocol is stable with
>> > >>>>> forward and
>> > >>>>>>>>>   backward compatibility guarantees
>> > >>>>>>>>> * We are struggling with Continuous Integration scalability as
>> > >>> the
>> > >>>>> project has
>> > >>>>>>>>>   definitely outgrown what Travis CI and Appveyor can do for
>> > >>> us. We
>> > >>>>> are
>> > >>>>>>>>>   exploring alternative solutions such as Buildbot, Buildkite
>> > >>> (see
>> > >>>>>>>>>   INFRA-19217), and GitHub Actions to provide a path to
>> migrate
>> > >>>>> away from
>> > >>>>>>>>>   Travis CI / Appveyor
>> > >>>>>>>>>
>> > >>>>>>>>> ## Community Health:
>> > >>>>>>>>>
>> > >>>>>>>>> * The community is overall healthy, with the aforementioned
>> > >>>>> concerns around CI
>> > >>>>>>>>>   scalability. New contributors frequently take notice of the
>> > >>> long
>> > >>>>> build queue
>> > >>>>>>>>>   times when submitting pull requests.
>> > >>>>>>>>>
>> > >>>>>
>> > >>>
>>
>
