Hi Wes,

to contribute an outsider's POV: while it is clear what's expected if you'd like to make a PR, it's not at all clear to me where I would start if I wanted to help with PR reviews without being heavily involved with the community or being a full maintainer. Should I just grab a PR, test it, and comment on the changes? I wouldn't be sure whether I was stepping on someone's toes, tbh. So, in my view it would help if:
* there were some kind of informal reviewer assignment system, i.e. I say
  "I'd like to review this PR" and Wes/Uwe/Antoine reply "sure, give it a
  shot". This would be mentioned prominently in the contributor guide.
* afterwards there were some kind of feedback-to-feedback arrangement,
  although it would of course increase the workload for the existing
  maintainers in the short term.

Cheers,
Dimitri.

On Sun, Jul 1, 2018 at 1:09 AM Donald E. Foss <donald.f...@gmail.com> wrote:

> For what it's worth, this email thread and your summary writeup, Wes, are
> a significant call to action on their own.
>
> I've been passive, not by choice, but by policy. Given the significance
> and need of this project, I'll see what I can do on my side. It will be at
> least a week given the US holiday.
>
> Donald E. Foss
>
>
> On Jun 30, 2018, at 2:15 PM, Marco Neumann <ma...@crepererum.net.INVALID> wrote:
> >
> > Hey,
> >
> > first of all, thanks a lot for your, Uwe's, the mergers' and
> > contributors' work. Now, to the maintainer problem:
> >
> > # Arrow as "a library"
> > One thing that makes Arrow special is that it is not a single library
> > but many (one for each language), and many of them are not only a
> > binding to a C/C++ lib but partly a complete re-implementation of the
> > protocol, e.g.:
> >
> > - C++: one core, but also contains Python specialties
> > - Java: another core
> > - Rust: yet another core
> > - Python: a binding to C++ but also a lot more stuff because of Pandas
> > ...
> >
> > And you two are maintaining all of them, and I doubt that you have the
> > capacity and knowledge to do this at the desired level of quality
> > (which is natural, not a personal issue or offense). So I would call
> > this "pseudo-maintenance", since you're solely the gatekeeper that does
> > some shallow reviewing and has the burden of doing the housekeeping and
> > the merging. So why accept these language bindings in the first place
> > without bringing a core maintainer in place?
> > For example, let's say someone proposes a binding to Haskell now. That
> > should not be accepted as part of the official Apache implementation
> > without a dedicated maintainer (ideally the PR author would be that
> > person, but there may be others who step up).
> >
> > Right now, it might be too late to remove some of the incomplete / WIP
> > implementations that don't have a core maintainer though.
> >
> > # GitHub
> > Another special thing to consider is that Arrow is (ab)using GitHub as
> > a code hosting platform. Even as a contributor, this has obviously
> > uncool consequences:
> >
> > - you have yet another issue hosting system to log in to
> > - there is yet another information channel to keep track of (this ML
> >   for example, which has a semi-informative web interface telling you
> >   that you can only log in using Google but not how to subscribe to
> >   the list)
> > - links to issues don't work in the known magic way
> > - you're merging the PRs by closing them, which is by all means a not
> >   very nice way because it does not reflect the contributors' work in
> >   the project overview and personal profiles, but exactly this is a
> >   large part of the GitHub community (btw: merging PRs without using
> >   GitHub's merge button IS possible, as bors/bors-ng prove)
> >
> > So as a potential maintainer, this is already a bummer, since I know
> > that there are things less comfortable than the system I would get from
> > any normal GitHub or GitLab project.
> >
> > I'm not really sure how to solve this or if it should be solved (read
> > about the laziness aspect in "Contribution VS Maintenance" below)
> >
> > # Time / Payment
> > Yes, this is indeed a big issue. From what I can tell from the open
> > source projects I was involved in, for large contributor crowds you
> > normally have full/half-time positions in place for the core
> > maintainer (look at the Mozilla projects, the Blender Foundation, Gnome
> > / Red Hat).
> > So at one point I think maintaining isn't a part-time / hobby thing
> > anymore (w/o downgrading the hard work of hobby contributors, on the
> > contrary). I don't have a link at hand, but I recall some discussion
> > about GitHub and its importance for hiring (since it acts as a CV)
> > after MS bought it, and some of the responses were "doing all this
> > work in your free time is a privilege of wealthy, mostly-white men",
> > which, without signing this statement in this really bare form,
> > already shows a problem of the open source world.
> >
> > # Contribution VS Maintenance
> > The very "nice" thing about patch/PR contribution is that you do your
> > work and then you can walk away, and it's the maintainer's problem to
> > release the artifact, upgrade/migrate your code and ensure that the
> > tests you've written never break. It's comfortable. Being a maintainer
> > means all the opposite things. And in the end, you get blamed for not
> > supporting certain features (see the open source paragraph here
> > https://blog.ghost.org/5/ ) or for security disasters (remember the
> > OpenSSL disaster).
> >
> > I think together with the previous point this means we have to get
> > companies to pay for that work, and not just dump their features into
> > an OSS repo.
> >
> > # Path to Maintainership
> > So I think (from my narrow point of view!) that many people expect that
> > the path from "outsider" to "maintainer" takes the route over "a lot of
> > patch/PR contributions". If I'm reading your mail right, that is not
> > necessarily the case for Apache projects and I think that's great. The
> > "review PRs" path sounds great, but I think GitHub or any platform I'm
> > aware of doesn't do a good job of getting people to do so. I mean, I
> > see a PR and I can leave a review, but for me it is not really clear
> > what consequences this has (naturally, random people don't have a veto
> > on changes).
> > So I can jump in when I think something is wrong, but I cannot approve
> > a PR. This makes sense, but it poses the question of "how?!". I mean,
> > it is pretty clear how to become a patch/PR contributor, but it is not
> > clear how to become a maintainer, at least not in an easy way. (I'm
> > sure it's written down somewhere).
> >
> > So, overall I think a clear Call for Action at the top of the README
> > could help. Like "Hey, we're looking for maintainers, you could start
> > by reviewing some PRs and after some reviews maintainers will just be
> > the last gatekeeper and after some more time, you can even merge PRs on
> > your own".
> >
> > # My personal contribution
> > Triggered by this call for help, I'll try to get more involved in
> > Python, C++ and Rust reviews.
> >
> > So, these are some thoughts that I hope may help.
> >
> > Thanks again for addressing this issue and your time and passion,
> > Marco
> >
> >> On 2018/06/30 14:57:42, Wes McKinney <w...@gmail.com> wrote:
> >> hi folks,
> >>
> >> Arrow has grown by leaps and bounds over the last 2.5 years. We are
> >> approaching our 2000th patch and on track to surpass 200 unique
> >> contributors by year end.
> >>
> >> All this contribution growth is great, but it has a hidden cost: the
> >> maintenance. The burden of maintaining the project, particularly
> >> reviewing and merging patches, has fallen on a very small number of
> >> people. From the commit logs, we can see how many patches each
> >> committer has merged:
> >>
> >> $ git shortlog -csn d5aa7c46692474376a3c31704cfc4783c86338f2..master
> >>   1289  Wes McKinney
> >>    268  Uwe L. Korn
> >>     74  Korn, Uwe
> >>     54  Antoine Pitrou
> >>     52  Julien Le Dem
> >>     39  Philipp Moritz
> >>     18  Kouhei Sutou
> >>     18  Steven Phillips
> >>     13  Bryan Cutler
> >>     11  Jacques Nadeau
> >>     10  Phillip Cloud
> >>      8  Brian Hulette
> >>      5  Robert Nishihara
> >>      5  adeneche
> >>      4  GitHub
> >>      3  Sidd
> >>      3  siddharth
> >>      1  AbdelHakim Deneche
> >>      1  Your Name Here
> >>
> >> So Uwe and I have merged ~84% of the patches in the project so far.
> >> This isn't a completely accurate reflection of the maintainer burden,
> >> since many others contribute to code reviews and other aspects of
> >> patch maintenance, and you have to be a committer to earn a place on
> >> this list.
> >>
> >> I'm not sure what's the best way to address this problem. The quality
> >> of our code review has declined at times as we struggle to keep up
> >> with the flow of patches -- I don't think this is good. Having the
> >> patch queue pile up isn't great either. Personally, I'm having a
> >> difficult time balancing project maintenance and patch authoring,
> >> particularly in the last 6 months.
> >>
> >> Unfortunately, many people believe that writing patches is the primary
> >> mode of contribution to an open source project. Apache projects
> >> explicitly state that non-patch contributions are valued in earning
> >> karma (committership and PMC membership). We're starting to have more
> >> corporate contributors come out of the woodwork, and while it's great
> >> for contributors to be paid to write patches for the project, they are
> >> rarely given the time and space to contribute meaningfully to
> >> maintenance.
> >>
> >> Any thoughts about how we can grow the maintainership? Somehow we need
> >> to reach ~5-6 core maintainers over the next year.
> >>
> >> Thanks,
> >> Wes