Re: [DISCUSS] [VOTE] HCatalog to Graduate and become part of Apache Hive

2013-02-24 Thread Ross Gardler
Thanks again Alan.

The fact that there appear to be no members of the Hive PMC active in the
incubator would seem to be the root cause of the need for further
mentoring/incubation post graduation.

Ross

On 23 Feb 2013 17:07, "Alan Gates"  wrote:
>
> I think that's a question for the Hive PMC.  I have some guesses, but it
> seemed more appropriate to let them speak for themselves.  For some history
> you can take a look at
> http://mail-archives.apache.org/mod_mbox/hive-dev/201211.mbox/%3CC648B9DE-2088-465E-8FA1-590D5E192093%40hortonworks.com%3E
> which is the initial discussion between HCat and Hive.  AFAIK none of the
> Hive PMC are on the IPMC, so you may need to mail private@hive or
> dev@hive to get their feedback.
>
> Alan.
>
> On Feb 23, 2013, at 2:43 AM, Ross Gardler wrote:
>
> > Thanks Alan,
> >
> > I'm still wondering why the Hive PMC feel mentoring inside the PMC is
> > appropriate but not the IPMC.
> >
> > Please understand I'm not saying I'm for or against the proposal. I'm
> > trying to understand it so that I can form an opinion as an IPMC member.
> >
> > Ross
> >
> > Sent from a mobile device, please excuse mistakes and brevity
> > On 20 Feb 2013 18:22, "Alan Gates"  wrote:
> >
> >> The project was named Howl when it was proposed, so the proposal is at
> >> http://wiki.apache.org/incubator/HowlProposal
> >>
> >> Alan.
> >>
> >> On Feb 20, 2013, at 2:31 AM, Ross Gardler wrote:
> >>
> >>> I'm brought to this thread by the board report but my response here is as
> >>> an IPMC member. My comment on the board report is quite different, it is
> >>> "I've read the thread on general@ and feel that the IPMC should make a
> >>> clear recommendation to the board in this and similar cases. The IPMC
> >>> discussion seems to be healthy and productive."
> >>>
> >>> So, as an IPMC member I have a few open questions [inline]...
> >>>
> >>>
> >>> On 11 February 2013 18:20, Alan Gates  wrote:
> >>>
> 
> 
> >>>> Also, it has been agreed that each HCatalog committer will be provided
> >>>> with a mentor from the Hive community to help him/her learn the rest of
> >>>> Hive, with the goal of becoming a committer on Hive within six months.  The
> >>>> submodule state is transitionary, not an end point.
> 
> 
> >>> Why was this "mentoring" not done as part of the incubation process since
> >>> building the right community structure for graduation (along with IP
> >>> clearance) is the main role of the incubation process? Was Hive the
> >>> sponsoring project for this proposal? If not why not?
> >>>
> >>> I ask these questions because HCatalog is making a very strong case that
> >>> any other option for graduation is not appropriate. At the same time we are
> >>> being told by the Hive PMC that the mentoring of the committers is
> >>> incomplete since they have insufficient merit within Hive to be trusted
> >>> to be full members of that project.
> >>>
> >>> It also concerns me that in this same month the IPMC board report says
> >>> "The main concern of the incubator continues to be the quality and
> >>> reliability of supervision... The supply of mentoring seems, still, to
> >>> exceed demand."
> >>>
> >>> Why is it that the Hive PMC feels it is able to provide "mentoring"
> >>> within their own PMC through the creation of what some people see as
> >>> an umbrella project, but not here in the IPMC?
> >>>
> >>> Finally, why can't I find the HCatalog proposal in my mail client,
> >>> markmail or the wiki (not had coffee yet, feel free to call me [insert
> >>> adjective])
> >>>
> >>> Ross
> >>
> >>
> >> -
> >> To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
> >> For additional commands, e-mail: general-h...@incubator.apache.org
> >>
> >>


Re: [VOTE] Accept Tez into Incubator

2013-02-24 Thread Arun C Murthy
Thanks to all who voted. Obviously, I'm +1 (binding) on the proposal.

With 14 +1s (10 binding) the vote passes.

I'll start the work to get the podling started.

thanks,
Arun

On Feb 19, 2013, at 8:26 PM, Arun C Murthy wrote:

> Hi Folks,
> 
> Thanks for participating in the discussion. I'd like to call a VOTE for
> acceptance of Apache Tez into the Incubator. I'll let the vote run through
> this weekend (Sun 2/24 6pm PST).
> 
> [ ]  +1 Accept Apache Tez into the Incubator
> [ ]  +0 Don't care.
> [ ]  -1 Don't accept Apache Tez into the Incubator because...
> 
> Full proposal is pasted at the bottom of this email, and the corresponding 
> wiki is http://wiki.apache.org/incubator/TezProposal. 
> 
> Only VOTEs from Incubator PMC members are binding, but all are welcome to 
> express their thoughts.
> 
> Here's my +1 (binding).
> 
> thanks,
> Arun
> 
> PS: From the initial discussion, the only changes are that I've added one new 
> mentor and 2 new committers. All the new additions come from the non-major 
> employer while we continue to strive to further diversify during the 
> incubation. Thanks.
> 
> 
> 
> = Tez =
> 
> == Abstract ==
> Tez is an effort to develop a generic application framework which can be used
> to process arbitrarily complex data-processing tasks and also a re-usable set
> of data-processing primitives which can be used by other projects.
> 
> == Proposal ==
> Tez is a proposal to develop a generic application which can be used to
> process complex data-processing task DAGs and which runs natively on Apache
> Hadoop YARN. YARN is a generic resource-management system on which
> applications like MapReduce already exist. MapReduce is a specific, and
> constrained, DAG - which is not optimal for several frameworks like Apache
> Hive and Apache Pig. Furthermore, we propose to develop a re-usable set of
> libraries of data-processing primitives such as sorting, merging,
> data-shuffling, intermediate data management etc. which are necessary for
> Tez and which we envision can be used directly by other projects.
> 
> == Background ==
> Apache Hadoop MapReduce has emerged as the assembly-language on which other
> frameworks like Apache Pig and Apache Hive have been built. However, it has
> been well accepted that MapReduce produces very constrained task DAGs for each
> job which results in Apache Pig and Apache Hive requiring multiple MapReduce
> jobs for several queries. By providing a more expressive DAG of tasks for a
> job, Tez attempts to provide significantly enhanced data-processing
> capabilities for projects like Apache Pig, Apache Hive, Cascading etc.
> 
> == Rationale ==
> There is an important gap that Tez fills in the Apache Hadoop ecosystem by
> allowing for more expressive task DAGs for data-processing applications such
> as Apache Pig, Apache Hive, Cascading etc.
> 
> With the emergence of Apache Hadoop YARN, there is a strong need for a
> common DAG application which can then be shared by Apache Pig, Apache Hive,
> Cascading etc.
> 
> == Initial Goals ==
> The initial goals for this project are to specify the detailed requirements
> and architecture, and then develop the initial implementation including the
> DAG ApplicationMaster to run natively inside Apache Hadoop YARN. 
> 
> == Current Status ==
> Significant work has been completed to identify the initial requirements and
> define the overall system architecture. There is a patch available in the
> internal Hortonworks git repository which can act as the initial seed. 
> 
> === Meritocracy ===
> We plan to invest in supporting a meritocracy. We will discuss the
> requirements in an open forum. Several companies have already expressed
> interest in this project, and we intend to invite additional developers to
> participate. We will encourage and monitor community participation so that
> privileges can be extended to those that contribute.
> 
> === Community ===
> The need for a generic DAG application for data processing in the open source
> is tremendous, so there is a potential for a very large community. We believe
> that Tez's extensible architecture will further encourage community
> participation. Also, related Apache projects (eg, Pig, Hive) have very large
> and active communities, and we expect that over time Tez will also attract a
> large community.
> 
> === Core Developers ===
> The developers on the initial committers list include people very experienced
> in the Apache Hadoop ecosystem:
> 
>  * Alan Gates 
>  * Arun C Murthy 
>  * Ashutosh Chauhan 
>  * Bikas Saha 
>  * Chris Douglas 
>  * Daryn Sharp 
>  * Devaraj Das 
>  * Gopal Vijayaraghavan 
>  * Gunther Hagleitner 
>  * Hitesh Shah 
>  * Jason Lowe 
>  * Jean Xu 
>  * Jitendra Pandey 
>  * Julien Le Dem 
>  * Kevin Wilfong 
>  * Mike Liddell 
>  * Namit Jain 
>  * Nathan Roberts 
>  * Owen O'Malley 
>  * Robert Evans 
>  * Siddharth Seth 
>  * Tom White 
>  * Thomas Graves 
>  * Vikram Dixit 
>  * Vinod Kumar Vavilap