For the record, here is the ticket for Azure Pipelines integration: https://issues.apache.org/jira/browse/INFRA-17030
I opened an issue back in May about the Travis-CI capacity situation:
https://issues.apache.org/jira/browse/INFRA-18533

Apparently CI capacity has been a "hot topic as of late":
https://lists.apache.org/thread.html/af52e2a3e865c01596d46374e8b294f2740587dbd59d85e132429b6c@%3Cbuilds.apache.org%3E

(I didn't know this list -- bui...@apache.org -- existed, by the way)

Regards

Antoine.


On 10/10/2019 at 07:34, Wes McKinney wrote:
> On Thu, Oct 10, 2019 at 12:22 AM Jacques Nadeau <jacq...@apache.org> wrote:
>>
>> I'm not dismissing that there are issues but I also don't feel like there
>> has been constant discussion for months on the list that INFRA is not being
>> responsive to Arrow community requests. It seems like you might be saying
>> one of two things (or both?)?
>>
>> 1) The Arrow infrastructure requirements are vastly different than other
>> projects. Because of Arrow's specialized requirements, we need things that
>> no other project needs.
>> 2) There are many projects that want CircleCI, Buildkite and Azure
>> pipelines but Infrastructure is not responsive. This is putting a big
>> damper on the success of the Arrow project.
>
> Yes, I'm saying both of these things.
>
> 1. Yes, Arrow is special -- validating the project requires running a
> dozen or more different builds (with dozens more nightly builds) that
> test different parts of the project. Different language components, a
> large and diverse packaging matrix, and interproject integration tests
> and integration with external projects (e.g. Apache Spark and others)
>
> 2. Yes, the limited GitHub App availability is hurting us.
>
> I'm OK to place this concern in the "Community Health" section and
> spend more time building a comprehensive case about how Infra's
> conservatism around Apps is causing us to work with one hand tied
> behind our back. I know that I'm not the only one who is unhappy, but
> I'll let the others speak for themselves.
>
>> For each of these, if we're asking the board to do something, we should say
>> more, and more clearly. Sure, CI is a pain in the Arrow project's a**. I
>> also agree that community health is impacted by the challenge to merge
>> things. I also share the perspective that the foundation has been slow to
>> adopt new technologies and has been way too religious about svn. However, if
>> we're asking the board to do something, what is it?
>
> Allow GitHub Apps that do not require write access to the code itself,
> set up appropriate checks and balances to ensure that the Foundation's
> IP provenance webhooks are preserved.
>
>> Looking at the two things you might be saying...
>> If 1, are we confident in that? Many other projects have pretty complex
>> build matrices I think. (I haven't thought about this and evaluated the
>> other projects...maybe it is true.) If 1, we should clarify why we think
>> we're different. If that is the case, what are we asking for from the board?
>>
>> If 2, and you are proposing throwing stones at INFRA, we should back it up
>> with INFRA tickets and numbers (e.g. how many projects have wanted these
>> things and for how long). We should reference multiple threads on the INFRA
>> mailing list where we voiced certain concerns and many other people voiced
>> similar concerns and INFRA turned a deaf ear or blind eye (maybe these
>> exist, I haven't spent much time on the INFRA list lately). As it stands,
>> the one ticket referenced in this thread is a ticket that has only one
>> project asking for a new integration that has been open for less than a
>> week. That may be annoying but it doesn't seem like something that has
>> gotten to the level that we need to get the board's help.
>>
>> In a nutshell, I agree that this is impacting the health and growth of the
>> project but think we should cover that in the community health section of
>> the report.
>> I'm less a fan of saying this is an issue the board needs to
>> help us solve unless it has been a constant point of pain that we've
>> attempted to elevate multiple times in infra forums and experienced
>> unreasonable responses. The board is a blunt instrument and should only be
>> used when we have depleted every other avenue for resolution.
>>
>
> Yes, I'm happy to spend more time building a comprehensive case before
> escalating it to the board level. However, Apache Arrow is a high
> profile project and it is not a good look to have a PMC in a
> fast-growing project growing disgruntled with the Foundation's
> policies in this way. We've been struggling visibly for a long time
> with our CI scalability, and I think we should have all the options on
> the table to utilize GitHub-integrated tools to help us find a way out
> of the mess that we are in.
>
>>
>> On Wed, Oct 9, 2019 at 9:44 PM Wes McKinney <wesmck...@gmail.com> wrote:
>>
>>> hi Jacques,
>>>
>>> I think we need to share the concerns that many PMC members have over
>>> the constraints that INFRA is placing on us. Can we rephrase the
>>> concern in a way that is more helpful?
>>>
>>> Firstly, I respect and appreciate the ASF's desire to limit write
>>> access to committers only from an IP provenance perspective. I
>>> understand that GitHub webhooks are used to log actions taken in
>>> repositories to secure IP provenance. I do not think a third party
>>> application should be given the ability to commit or modify a
>>> repository -- all write operations on the .git repository should be
>>> initiated by committers.
>>>
>>> However, GitHub is the main platform for producing open source
>>> software, and tools are being created to help produce open source more
>>> efficiently. It is frustrating for us to not be able to take advantage
>>> of the tools that are available to everyone else on GitHub.
>>> I brought up the recent request about Buildkite as being representative of this
>>> (after learning that Google has been making a lot of use of it), but
>>> we have previously been denied use of CircleCI and Azure Pipelines
>>> since those services require even more permissions (AFAIK) than in the
>>> case of Buildkite. From our use in
>>> https://github.com/ursa-labs/crossbow CircleCI and Azure seem to be a
>>> lot better than Travis CI and Appveyor
>>>
>>> I think the ASF is going to face an existential crisis in the near
>>> future whether it wants to live in 2020 or 2000. It feels like GitHub
>>> is treated somewhat as ersatz SVN "because people want to use git +
>>> GitHub instead of SVN"
>>>
>>> In the same way that the cloud revolutionized software startups,
>>> enabling small groups of developers to build large SaaS applications,
>>> the same kind of leverage is becoming available to open source
>>> developers to set up infrastructure to automate and scale open source
>>> projects. I think projects considering joining the Foundation are
>>> going to look at these issues around App usage and decide that they
>>> would rather be in control of their own infrastructure.
>>>
>>> I can set aside even more time and money from my non-profit
>>> organization's modest budget to do CI work for Apache Arrow. The
>>> amount that we have invested already is very large, and continues to
>>> grow. I'm raising these issues because as Member of the Foundation I'm
>>> concerned that fast-growing projects like ours are not being
>>> adequately served by INFRA, and we probably aren't the only project
>>> that will face these issues. All that is needed is for INFRA to let us
>>> use third party GitHub Apps and monitor any potentially destructive
>>> actions that they may take, such as modifying unrelated repository
>>> webhooks related to IP provenance.
>>>
>>> - Wes
>>>
>>> On Wed, Oct 9, 2019 at 9:33 PM Jacques Nadeau <jacq...@apache.org> wrote:
>>>>
>>>> I think we need to be more direct in listing issues for the board.
>>>>
>>>> What have we done? What do we want them to do?
>>>>
>>>> In general, any large org is going to be slow to add new deep integrations
>>>> into GitHub. I don't think we should expect Apache to be any different (it
>>>> took several years before we could merge things through github for
>>>> example). If I were on the INFRA side, I think I would look and see how
>>>> many different people are asking for BuildKite before considering
>>>> integration. It seems like we only opened the JIRA 6 days ago and no other
>>>> projects have requested access to this?
>>>>
>>>> I'm not clear why this is a board issue. What do we think the board can do
>>>> for us that we can't solve ourselves and need them to solve? Remember, a
>>>> board solution to a problem is typically very removed from what matters to
>>>> individuals on a project.
>>>>
>>>>
>>>> On Tue, Oct 8, 2019 at 7:03 AM Wes McKinney <wesmck...@gmail.com> wrote:
>>>>
>>>>> New draft
>>>>>
>>>>> ## Description:
>>>>> The mission of Apache Arrow is the creation and maintenance of software related
>>>>> to columnar in-memory processing and data interchange
>>>>>
>>>>> ## Issues:
>>>>>
>>>>> * We are struggling with Continuous Integration scalability as the project has
>>>>>   definitely outgrown what Travis CI and Appveyor can do for us. Some
>>>>>   contributors have shown reluctance to submit patches they aren't sure about
>>>>>   because they don't want to pile on the build queue. We are exploring
>>>>>   alternative solutions such as Buildbot, Buildkite, and GitHub Actions to
>>>>>   provide a path to migrate away from Travis CI / Appveyor.
>>>>>   In our request to
>>>>>   Infrastructure INFRA-19217, some of us were alarmed to find that a CI/CD
>>>>>   service like Buildkite may not be able to be connected to the @apache GitHub
>>>>>   account on account of requiring admin access to repository webhooks, but no
>>>>>   ability to modify source code. There are workarounds (building custom OAuth
>>>>>   bots) that could enable us to use Buildkite, but it would require extra
>>>>>   development and result in a less refined experience for community members.
>>>>>
>>>>> ## Membership Data:
>>>>> * Apache Arrow was founded 2016-01-19 (4 years ago)
>>>>> * There are currently 48 committers and 28 PMC members in this project.
>>>>> * The Committer-to-PMC ratio is roughly 3:2.
>>>>>
>>>>> Community changes, past quarter:
>>>>> - Micah Kornfield was added to the PMC on 2019-08-21
>>>>> - Sebastien Binet was added to the PMC on 2019-08-21
>>>>> - Ben Kietzman was added as committer on 2019-09-07
>>>>> - David Li was added as committer on 2019-08-30
>>>>> - Kenta Murata was added as committer on 2019-09-05
>>>>> - Neal Richardson was added as committer on 2019-09-05
>>>>> - Praveen Kumar was added as committer on 2019-07-14
>>>>>
>>>>> ## Project Activity:
>>>>>
>>>>> * The project has just made a 0.15.0 release.
>>>>> * We are discussing ways to make the Arrow libraries as accessible as possible
>>>>>   to downstream projects for minimal use cases while allowing the development
>>>>>   of more comprehensive "standard libraries" with larger dependency stacks in
>>>>>   the project
>>>>> * We plan to make a 1.0.0 release as our next major release, at which time we
>>>>>   will declare that the Arrow binary protocol is stable with forward and
>>>>>   backward compatibility guarantees
>>>>>
>>>>> ## Community Health:
>>>>>
>>>>> * The community is overall healthy, with the aforementioned concerns around CI
>>>>>   scalability.
>>>>>   New contributors frequently take notice of the long build queue
>>>>>   times when submitting pull requests.
>>>>>
>>>>> On Tue, Oct 8, 2019 at 8:58 AM Wes McKinney <wesmck...@gmail.com> wrote:
>>>>>>
>>>>>> Yes, I agree with raising the issue to the board.
>>>>>>
>>>>>> On Tue, Oct 8, 2019 at 8:31 AM Antoine Pitrou <anto...@python.org> wrote:
>>>>>>>
>>>>>>>
>>>>>>> I agree. Especially given that the constraints imposed by Infra don't
>>>>>>> help solve the problem.
>>>>>>>
>>>>>>> Regards
>>>>>>>
>>>>>>> Antoine.
>>>>>>>
>>>>>>>
>>>>>>> On 08/10/2019 at 15:02, Uwe L. Korn wrote:
>>>>>>>> I'm not sure what qualifies for "board attention" but it seems that
>>>>>>>> CI is a critical problem in Apache projects, not just Arrow. Should we
>>>>>>>> raise that?
>>>>>>>>
>>>>>>>> Uwe
>>>>>>>>
>>>>>>>> On Tue, Oct 8, 2019, at 12:00 AM, Wes McKinney wrote:
>>>>>>>>> Here is a start for our Q3 board report
>>>>>>>>>
>>>>>>>>> ## Description:
>>>>>>>>> The mission of Apache Arrow is the creation and maintenance of software related
>>>>>>>>> to columnar in-memory processing and data interchange
>>>>>>>>>
>>>>>>>>> ## Issues:
>>>>>>>>> There are no issues requiring board attention at this time
>>>>>>>>>
>>>>>>>>> ## Membership Data:
>>>>>>>>> * Apache Arrow was founded 2016-01-19 (4 years ago)
>>>>>>>>> * There are currently 48 committers and 28 PMC members in this project.
>>>>>>>>> * The Committer-to-PMC ratio is roughly 3:2.
>>>>>>>>>
>>>>>>>>> Community changes, past quarter:
>>>>>>>>> - Micah Kornfield was added to the PMC on 2019-08-21
>>>>>>>>> - Sebastien Binet was added to the PMC on 2019-08-21
>>>>>>>>> - Ben Kietzman was added as committer on 2019-09-07
>>>>>>>>> - David Li was added as committer on 2019-08-30
>>>>>>>>> - Kenta Murata was added as committer on 2019-09-05
>>>>>>>>> - Neal Richardson was added as committer on 2019-09-05
>>>>>>>>> - Praveen Kumar was added as committer on 2019-07-14
>>>>>>>>>
>>>>>>>>> ## Project Activity:
>>>>>>>>>
>>>>>>>>> * The project has just made a 0.15.0 release.
>>>>>>>>> * We are discussing ways to make the Arrow libraries as accessible as possible
>>>>>>>>>   to downstream projects for minimal use cases while allowing the development
>>>>>>>>>   of more comprehensive "standard libraries" with larger dependency stacks in
>>>>>>>>>   the project
>>>>>>>>> * We plan to make a 1.0.0 release as our next major release, at which time we
>>>>>>>>>   will declare that the Arrow binary protocol is stable with forward and
>>>>>>>>>   backward compatibility guarantees
>>>>>>>>> * We are struggling with Continuous Integration scalability as the project has
>>>>>>>>>   definitely outgrown what Travis CI and Appveyor can do for us. We are
>>>>>>>>>   exploring alternative solutions such as Buildbot, Buildkite (see
>>>>>>>>>   INFRA-19217), and GitHub Actions to provide a path to migrate away from
>>>>>>>>>   Travis CI / Appveyor
>>>>>>>>>
>>>>>>>>> ## Community Health:
>>>>>>>>>
>>>>>>>>> * The community is overall healthy, with the aforementioned concerns around CI
>>>>>>>>>   scalability. New contributors frequently take notice of the long build queue
>>>>>>>>>   times when submitting pull requests.
>>>>>>>>>
>>>>>
>>>