Alan's proposal sounds like a good idea to me. +1
On Dec 18, 2012 5:36 PM, "Travis Crawford" <traviscrawf...@gmail.com> wrote:
> Alan, I think your proposal sounds great.
>
> --travis
>
> On Tue, Dec 18, 2012 at 1:13 PM, Alan Gates <ga...@hortonworks.com> wrote:
>
> > Carl, speaking just for myself and not as a representative of the HCat PPMC at this point, I am coming to agree with you that HCat integrating with Hive fully makes more sense.
> >
> > However, this makes the committer question even thornier. Travis and Namit, I think the shepherd proposal needs to lay out a clear and time-bounded path to committership for HCat committers. Having HCat committers as second-class Hive citizens for the long run will not be healthy. I propose the following as a starting point for discussion:
> >
> > All active HCat committers (those who have contributed or committed a patch in the last 6 months) will be made committers in the HCat portion only of Hive. In addition those committers will be assigned a particular shepherd who is a current Hive committer and who will be responsible for mentoring them towards full Hive committership. As a part of this mentorship the HCat committer will review patches of other contributors, contribute patches to Hive (both inside and outside of HCatalog), respond to user issues on the mailing lists, etc. It is intended that as a result of this mentorship program HCat committers can become full Hive committers in 6-9 months. No new HCat-only committers will be elected in Hive after this. All Hive committers will automatically also have commit rights on HCatalog.
> >
> > Alan.
> >
> > On Dec 14, 2012, at 10:05 AM, Carl Steinbach wrote:
> >
> >> On a functional level I don't think there is going to be much of a difference between the subproject option proposed by Travis and the other option where HCatalog becomes a TLP.
> >> In both cases HCatalog and Hive will have separate committers, separate code repositories, separate release cycles, and separate project roadmaps. Aside from ASF bureaucracy, I think the only major difference between the two options is that the subproject route will give the rest of the community the false impression that the two projects have coordinated roadmaps and a process to prevent overlapping functionality from appearing in both projects. Consequently, if these are the only two options then I would prefer that HCatalog become a TLP.
> >>
> >> On the other hand, I also agree with many of the sentiments that have already been expressed in this thread, namely that the two projects are closely related and that it would benefit the community at large if the two projects could be brought closer together. Up to this point the major source of pain for the HCatalog team has been the frequent necessity of making changes on both the Hive and HCatalog sides when implementing new features in HCatalog. This situation is compounded by the ASF requirement that release artifacts may not depend on snapshot artifacts from other ASF projects. Furthermore, if Hive adds a dependency on HCatalog then it will be subject to these same problems (in addition to the gross circular dependency!).
> >>
> >> I think the best way to avoid these problems is for HCatalog to become a Hive submodule. In this scenario HCatalog would exist as a subdirectory in the Hive repository and would be distributed as a Hive artifact in future Hive releases. In addition to solving the problems I mentioned earlier, I think this would also help to assuage the concerns of many Hive committers who don't want to see the MetaStore split out into a separate project.
> >>
> >> Thanks.
> >>
> >> Carl
> >>
> >> On Thu, Dec 13, 2012 at 7:59 PM, Namit Jain <nj...@fb.com> wrote:
> >>
> >>> I am fine with this. Any hive committer who wants to volunteer to be a hcat shepherd is welcome.
> >>>
> >>> On 12/14/12 7:01 AM, "Travis Crawford" <traviscrawf...@gmail.com> wrote:
> >>>
> >>>> Thanks for reviving this thread. Reviewing the comments, everyone seems to agree HCatalog makes sense as a Hive subproject. I think that's great news for the Hadoop community.
> >>>>
> >>>> The discussion seems to have turned to one of committer permissions. I agree with the Hive folks' sentiment that it's something that must be earned. That said, I've found it challenging at times getting patches into Hive that would help earn taking on a hive committer responsibility.
> >>>>
> >>>> Proposal: if a couple hive committers can volunteer to be hcat shepherds, we can work with the shepherds when making hive changes in a timely manner. Conversely, we can help shepherd any hive committers who are interested in working more with hcat. There are certainly benefits to cross-committership, and this approach could help each other build a history of meaningful contributions and earn the privilege & responsibility of being committers.
> >>>>
> >>>> Thoughts?
> >>>>
> >>>> --travis
> >>>>
> >>>> On Thu, Dec 13, 2012 at 11:59 AM, Edward Capriolo <edlinuxg...@gmail.com> wrote:
> >>>>
> >>>>> I was initially hesitant about hcatalog, mostly because I imagined we would end up in a spot very similar to this.
> >>>>>
> >>>>> Namely the hcatalog folks are interested in making a metastore to support pig, hive, and map reduce. However I get the impression that many in hive do not care much to have a metastore that caters to everyone. Their needs are only based on what hive needs.
> >>>>> Which I believe is the wrong way to look at this situation.
> >>>>>
> >>>>> I thought to reply to this thread because I have been following this Jira:
> >>>>> https://issues.apache.org/jira/browse/HIVE-3752
> >>>>>
> >>>>> On a high level I do not like this duplication of effort and code. If hive is compatible with hcatalog I do not see why we put off merging the two at all. Hive users would get an immediate benefit if Hive used hcatalog with no apparent downside. Meanwhile we are putting this off and staying in this awkward transition phase.
> >>>>>
> >>>>> Personally, I do not have a problem being a hive committer and not having hcatalog commit. None of the hive work I have done has ever touched the metastore. Also of the thousands of jiras and features we have added only a small portion require metastore changes.
> >>>>>
> >>>>> As long as a couple active users have commit on hive and the suggested hcatalog subproject I do not think not having commit will be a roadblock in moving hive forward.
> >>>>>
> >>>>> On Mon, Dec 3, 2012 at 6:22 PM, Alan Gates <ga...@hortonworks.com> wrote:
> >>>>>
> >>>>>> I am not sure where we are on this discussion. So far those who have chimed in seemed generally positive (Namit, Edward, Clark, Alexander). Namit and I have different visions for what the committership might look like, so I'd like to hear from other Hive PMC members what their view is on this. I have to say from an HCatalog perspective the proposition is much less attractive without some commit rights.
> >>>>>>
> >>>>>> On a related note, people should be aware of these threads in the Incubator list:
> >>>>>>
> >>>>>> http://mail-archives.apache.org/mod_mbox/incubator-general/201211.mbox/%3CCAGU5spdWHNtJxgQ8f%3DnPEXx9xNLjyjOYaFfnSw4EyAjgm1c46w%40mail.gmail.com%3E
> >>>>>>
> >>>>>> http://mail-archives.apache.org/mod_mbox/incubator-general/201211.mbox/%3CCAKQbXgDZj_zMj4qSodXjMHV7xQZxpcY1-35cvq959YKLNd6tJQ%40mail.gmail.com%3E
> >>>>>>
> >>>>>> For those not inclined to read all the mails in the threads I will summarize (though I urge all PMC members of Hive and PPMC members of HCat to read both mail threads because this is highly relevant to what we are discussing). There are two salient points in these threads:
> >>>>>>
> >>>>>> 1) It is not wise to build a subproject that is distinct from the main project in the sense that it has separate community members interested in it. Bertrand, Arun, Chris Mattman, and Greg Stein all spoke against this, and all are long time Apache contributors with a lot of experience. They were all of the opinion that it was reasonable for one project to release separate products.
> >>>>>>
> >>>>>> 2) It is not wise to have committers that have access to parts of a project but not others. Greg and Bertrand argued (and Arun seemed to imply) that splitting up committer lists by sections of the code did not work out well.
> >>>>>>
> >>>>>> These insights cause me to question what we mean by subproject. I had originally envisioned something that looked like Pig and Hive did when they were subprojects of Hadoop. But this violates both 1 and 2 above.
> >>>>>> Given this input from many of the "wise old timers" of Apache I think we should consider what we mean when we say subproject and how tightly we are willing to integrate these projects. Personally I think it makes sense to continue to pursue integration, as I think HCat is really a set of interfaces on top of Hive and it makes sense to coalesce those into one project. I guess this would mean HCat becomes just another set of jars that Hive releases when it releases, rather than a stand-alone entity. But I'm curious to hear what others think.
> >>>>>>
> >>>>>> Alan.
> >>>>>>
> >>>>>> On Nov 14, 2012, at 10:22 PM, Namit Jain wrote:
> >>>>>>
> >>>>>>> The same criteria should be applied to all Hive committers. Only a committer should be able to commit code.
> >>>>>>> I don't think we should bend this rule. Metastore is not a separate project, but an integral part of hive.
> >>>>>>>
> >>>>>>> -namit
> >>>>>>>
> >>>>>>> On 11/12/12 10:32 PM, "Alan Gates" <ga...@hortonworks.com> wrote:
> >>>>>>>
> >>>>>>>> I would suggest looking over the patch history of HCat committers. I think most of them have already contributed a number of patches to the metastore. All are certainly aware of how to run Hive unit tests and have an understanding of how Hive works. So I don't think it's fair to say they would be unsafe with access to the metastore. And the Hive PMC is there to assure this does not happen. If there are issues I am sure they can deal with them.
> >>>>>>>>
> >>>>>>>> Alan.
> >>>>>>>>
> >>>>>>>> On Nov 6, 2012, at 8:06 PM, Namit Jain wrote:
> >>>>>>>>
> >>>>>>>>> Alan, that would not be a good idea.
> >>>>>>>>> Metastore code is part of hive code, and it would be safer if only Hive committers had commit access to that.
> >>>>>>>>>
> >>>>>>>>> On 11/6/12 11:25 PM, "Alan Gates" <ga...@hortonworks.com> wrote:
> >>>>>>>>>
> >>>>>>>>>> On Nov 4, 2012, at 8:35 PM, Namit Jain wrote:
> >>>>>>>>>>
> >>>>>>>>>>> I like the idea of Hcatalog becoming a Hive sub-project. The enhancements/bugs in the serde/metastore areas can indirectly benefit the hive community, and it will be easier for the fix to be in one place. Having said that, I don't see serde/metastore moving out of hive into a separate component. Things are tied too closely together. I am assuming that no new committers would be automatically added to Hive as part of this, and both Hive and HCatalog will continue to have its own committers.
> >>>>>>>>>>
> >>>>>>>>>> One thing in this we'd like to discuss is the HCatalog committers having commit access to the metastore sections of Hive code. That doesn't mean it has to move into HCatalog's code base. But more and more the fixes and changes we're doing in HCatalog are really in Hive's metastore. So we believe it would make sense to give HCat committers access to that component as well as HCat.
> >>>>>>>>>>
> >>>>>>>>>> Alan.
> >>>>>>>>>>
> >>>>>>>>>>> Thanks,
> >>>>>>>>>>> -namit
> >>>>>>>>>>>
> >>>>>>>>>>> On 11/3/12 2:22 AM, "Alan Gates" <ga...@hortonworks.com> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>>> Hello Hive community. It is time for HCatalog to graduate from the Apache Incubator.
> >>>>>>>>>>>> Given the heavy dependence of HCatalog on Hive the HCatalog community agreed it made sense to explore graduating from the Incubator to become a subproject of Hive (see
> >>>>>>>>>>>> http://mail-archives.apache.org/mod_mbox/incubator-hcatalog-user/201209.mbox/%3C08C40723-8D4D-48EB-942B-8EE4327DD84A%40hortonworks.com%3E
> >>>>>>>>>>>> and
> >>>>>>>>>>>> http://mail-archives.apache.org/mod_mbox/incubator-hcatalog-user/201210.mbox/%3CCABN7xTCRM5wXGgJKEko0PmqDXhuAYpK%2BD-H57T29zcSGhkwGQw%40mail.gmail.com%3E
> >>>>>>>>>>>> ). To help both communities understand what HCatalog is and hopes to become we also developed a roadmap that summarizes HCatalog's current features, planned features, and other possible features under discussion:
> >>>>>>>>>>>> https://cwiki.apache.org/confluence/display/HCATALOG/HCatalog+Roadmap
> >>>>>>>>>>>>
> >>>>>>>>>>>> So we are now approaching you to see if there is agreement in the Hive community that HCatalog graduating into Hive would make sense.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Alan.