+1 (binding)

I see this as a way to increase transparency and efficiency around a process that already informally exists, with benefits to both new contributors and committers. For new contributors, it makes clear who they should ping about a pending patch. For committers, it's a good reference for who to rope in if they're reviewing a change that touches code they're unfamiliar with. I've often found myself in that situation when doing a review; for me, having this list would be quite helpful.
-Kay

On Thu, Nov 6, 2014 at 10:00 AM, Josh Rosen <rosenvi...@gmail.com> wrote:

> +1 (binding).
>
> (our pull request browsing tool is open-source, by the way; contributions welcome: https://github.com/databricks/spark-pr-dashboard)
>
> On Thu, Nov 6, 2014 at 9:28 AM, Nick Pentreath <nick.pentre...@gmail.com> wrote:
>
> > +1 (binding)
> >
> > On Thu, Nov 6, 2014 at 6:52 PM, Debasish Das <debasish.da...@gmail.com> wrote:
> >
> > > +1
> > > The app to track PRs based on component is a great idea...
> > >
> > > On Thu, Nov 6, 2014 at 8:47 AM, Sean McNamara <sean.mcnam...@webtrends.com> wrote:
> > >
> > >> +1
> > >>
> > >> Sean
> > >>
> > >> On Nov 5, 2014, at 6:32 PM, Matei Zaharia <matei.zaha...@gmail.com> wrote:
> > >>
> > >> > Hi all,
> > >> >
> > >> > I wanted to share a discussion we've been having on the PMC list, as well as call for an official vote on it on a public list. Basically, as the Spark project scales up, we need to define a model to make sure there is still great oversight of key components (in particular internal architecture and public APIs), and to this end I've proposed implementing a maintainer model for some of these components, similar to other large projects.
> > >> >
> > >> > As background on this, Spark has grown a lot since joining Apache. We've had over 80 contributors/month for the past 3 months, which I believe makes us the most active project in contributors/month at Apache, as well as over 500 patches/month. The codebase has also grown significantly, with new libraries for SQL, ML, graphs and more.
> > >> >
> > >> > In this kind of large project, one common way to scale development is to assign "maintainers" to oversee key components, where each patch to that component needs to get sign-off from at least one of its maintainers. Most existing large projects do this -- at Apache, some large ones with this model are CloudStack (the second-most active project overall), Subversion, and Kafka, and other examples include Linux and Python. This is also by-and-large how Spark operates today -- most components have a de-facto maintainer.
> > >> >
> > >> > IMO, adopting this model would have two benefits:
> > >> >
> > >> > 1) Consistent oversight of design for that component, especially regarding architecture and API. This process would ensure that the component's maintainers see all proposed changes and consider them to fit together in a good way.
> > >> >
> > >> > 2) More structure for new contributors and committers -- in particular, it would be easy to look up who’s responsible for each module and ask them for reviews, etc., rather than having patches slip between the cracks.
> > >> >
> > >> > We'd like to start in a light-weight manner, where the model only applies to certain key components (e.g. scheduler, shuffle) and user-facing APIs (MLlib, GraphX, etc.). Over time, as the project grows, we can expand it if we deem it useful. The specific mechanics would be as follows:
> > >> >
> > >> > - Some components in Spark will have maintainers assigned to them, where one of the maintainers needs to sign off on each patch to the component.
> > >> > - Each component with maintainers will have at least 2 maintainers.
> > >> > - Maintainers will be assigned from the most active and knowledgeable committers on that component by the PMC. The PMC can vote to add / remove maintainers, and maintained components, through consensus.
> > >> > - Maintainers are expected to be active in responding to patches for their components, though they do not need to be the main reviewers for them (e.g. they might just sign off on architecture / API). To prevent inactive maintainers from blocking the project, if a maintainer isn't responding in a reasonable time period (say 2 weeks), other committers can merge the patch, and the PMC will want to discuss adding another maintainer.
> > >> >
> > >> > If you'd like to see examples for this model, check out the following projects:
> > >> > - CloudStack: https://cwiki.apache.org/confluence/display/CLOUDSTACK/CloudStack+Maintainers+Guide
> > >> > - Subversion: https://subversion.apache.org/docs/community-guide/roles.html
> > >> >
> > >> > Finally, I wanted to list our current proposal for initial components and maintainers. It would be good to get feedback on other components we might add, but please note that personnel discussions (e.g. "I don't think Matei should maintain *that* component") should only happen on the private list. The initial components were chosen to include all public APIs and the main core components, and the maintainers were chosen from the most active contributors to those modules.
> > >> >
> > >> > - Spark core public API: Matei, Patrick, Reynold
> > >> > - Job scheduler: Matei, Kay, Patrick
> > >> > - Shuffle and network: Reynold, Aaron, Matei
> > >> > - Block manager: Reynold, Aaron
> > >> > - YARN: Tom, Andrew Or
> > >> > - Python: Josh, Matei
> > >> > - MLlib: Xiangrui, Matei
> > >> > - SQL: Michael, Reynold
> > >> > - Streaming: TD, Matei
> > >> > - GraphX: Ankur, Joey, Reynold
> > >> >
> > >> > I'd like to formally call a [VOTE] on this model, to last 72 hours. The [VOTE] will end on Nov 8, 2014 at 6 PM PST.
> > >> >
> > >> > Matei