To Lefty's comment - Yes, anyone can take Apache code and start another
project at will. However, changes made to an existing project as part of
that process, such as what Owen described for ORC in Hive, are certainly
something that the Hive PMC can control or vote on. Nevertheless, that's
not my immediate concern.

To Owen's explanation - Thanks. My major concern is that we seem to be
breaking apart Hive's integrity and making it harder to release and
maintain because of a growing number of external dependencies. Let's say
that Hive depends on a certain version of ORC (as a TLP) and ORC turns out
to have a bug that seriously impacts Hive users. We could not release a
fixed Hive as fast as we can today, since doing so would require the ORC
community to fix the problem and make a release, over which the Hive PMC
has no control. By contrast, while the ORC code lives in Hive, the Hive
community can quickly fix the problem and make a release without waiting
for another project. I'm not sure this move (ORC as a TLP) will benefit
Hive's large user base.

If this is not convincing, let me propose that we also spin off the
metastore as a TLP tomorrow!

Thanks,
Xuefu


On Wed, Apr 8, 2015 at 8:33 AM, Owen O'Malley <omal...@apache.org> wrote:

> On Tue, Apr 7, 2015 at 8:49 PM, Xuefu Zhang <xzh...@cloudera.com> wrote:
>
> > If I understood Allen's #2 comment, we are moving existing ORC code out
> > of Hive and make it a separate project, which I definitely missed.
> >
>
> I'm sorry that wasn't clear. Yes, most of the code that is currently in
> org.apache.hadoop.hive.ql.io.orc will move to the new project.
>
> The biggest change on the Hive side will be to create a new Hive module
> that defines the API that storage formats like ORC need to code against if
> they want high performance integration with Hive's vectorization. I've
> started that jira at https://issues.apache.org/jira/browse/HIVE-10171 .
> Creating this API should help us create a clean interface for storage
> formats that will help ORC and other columnar formats like Trevni or
> Parquet.
>
> Once the ORC project has made its first release, we can create a Hive jira
> to replace the Hive ORC code with a reference to the ORC release jar.
>
>
> > Since existing Hive PMC has governance on the code, I would expect it's
> > still the case even after the spinoff.
> >
>
> No, Apache doesn't allow umbrella projects where one PMC controls
> sub-projects. The reason is that the Apache board has found that
> controlling projects directly instead of indirectly through another PMC
> reduces the problems.
>
> .. Owen
>
