On Mon, Jul 3, 2017 at 6:20 AM, Edward Capriolo <edlinuxg...@gmail.com> wrote:
>
> We already have things in the meta-store not directly tied to language
> features. For example hive metastore has a "retention" property which is
> not actively in use by anything. In reality, we rarely say 'no' or -1 to
> much. Which in part is why I believe our release process is grinding
> slower: we have so many things in flight I do not feel that any one person
> can keep track. You are working on porting the metastore to hbase.
> https://issues.apache.org/jira/browse/HIVE-9452 did you get a -1 or 'No'
> along the way? When I first noticed this I pointed out that someone has
> already ported the metastore to Cassandra
> https://github.com/riptano/brisk/blob/master/src/java/
> src/org/apache/cassandra/hadoop/hive/metastore/SchemaManagerService.java,
> but I was more exciting/rational for this multi-year approach using hbase
> so I let everyone 'have at it'.
>

Your example and mine are not equivalent. The HBase metastore is still a Hive feature, even if some thought it not worthwhile. That is different from people bringing features that will never interest Hive or that Hive could never use (e.g. Dain’s desire for the metastore to support Presto style views).

I forgot to mention the issue these would-be non-Hive contributors have with releases if they contribute their features to the metastore while it’s inside Hive. Is Hive going to do a release just to push out features in the metastore that it doesn’t care about?

You seem to be asserting that doing this doesn’t really help non-Hive based systems that are using or would like to use the metastore. But it is interesting that people from three of those systems have commented in the thread so far, and all are positive (Dmitrias from Impala, Dain from Presto, and Sriharsha from the schema registry project).

> I am going to give a hypothetical but real world situation. Suppose I want
> to add the statement "CREATE permanent macro xyz", this feature I believe
> would cross cut calcite, hive, and hive metastore.
> To build this feature I
> would need to orchestrate the change across 3 separate groups of hive
> 'subcommittees' for lack of a better word. 3 git repos, 3 Jira's 3
> releases. That is not counting if we run into some bug or misfeature (maybe
> with Tez or something else) so that brings in 4-5 releases of upstream to
> add a feature to hive. This does not take into account normal processes
> mess ups. For example say you get the metastore done, but now the people
> doing the calcite/antlr suggest the feature have different syntax because
> they did not read the 3-4 linked tickets when the process started? Now, you
> have to loop back around the process. Finding 1 person in 1 project to
> usher along the feature you want is difficult, having to find and clear
> time with 3 people across three projects is going to be a difficult along
> with then 'pushing' them all to kick out a release so you can finally use
> said feature.
>

I partially agree with you. On the reviews, JIRAs, etc., I don’t think the split adds much, if any, overhead. Hive is a big project and no one person knows all the code anymore. If you wanted to add a permanent macros feature you would need reviews from someone who knows the parser (probably Pengcheng), people who know the optimizer (Jesus, Ashutosh, …), and someone who knows the metastore (me, Thejas, …). And any large feature is going to be implemented over multiple JIRAs, all of which are linkable regardless of whether the JIRAs start with METASTORE- or HIVE-.

I also don’t think it makes the feature disagreement any worse. If the optimizer team absolutely insists it has to have some feature and the metastore team insists that it can’t have that feature, you’re going to have to work through the issue whether they are all in Hive or in two separate projects.

Where I agree the split adds cost is releases. Before your macro feature could go live you need releases from each of the components.
And while in development, the components need to use snapshot versions of the other components. My assertion is that the benefits outweigh this cost.

Alan.