Okay. That's fine. I think supporting an env variable doesn't take much. What about enabling the new code path by default, and allowing users to opt out in case of a serious bug? We would also warn users that the env variable may be discontinued in the future.
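To make the idea concrete, here is a rough sketch of the selection logic, written in Python only for brevity (the real change would live in the hive launcher). It assumes the CLI_USE_BEELINE name from the example earlier in this thread and assumes that only an explicit "false" opts out; choose_cli_impl is a hypothetical helper, not existing Hive code.

```python
import os
import warnings

# Name borrowed from the example earlier in this thread; not a settled name.
ENV_VAR = "CLI_USE_BEELINE"


def choose_cli_impl(environ=None):
    """Return "beeline" (the new default) or "legacy" (explicit opt-out)."""
    env = os.environ if environ is None else environ
    value = env.get(ENV_VAR)
    if value is None:
        # Nothing set: the new beeline-based code path is the default.
        return "beeline"
    # The variable is set, so remind the user that the switch is temporary.
    warnings.warn(
        f"{ENV_VAR} may be discontinued in a future release",
        DeprecationWarning,
        stacklevel=2,
    )
    # Assumption: only an explicit "false" opts out; anything else keeps the default.
    return "legacy" if value.strip().lower() == "false" else "beeline"
```

With this shape, users who never touch the variable get the new code path, and the warning only shows up for those who rely on the switch.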
Thanks,
Xuefu

On Thu, Apr 30, 2015 at 5:13 PM, Thejas Nair <thejas.n...@gmail.com> wrote:

> In most cases with hive, when a major implementation change is made,
> we usually give the user a way to fall back to the older implementation.
> For example, when CBO was added, it was initially not enabled by default,
> and there is still the option of using the non-CBO path. When new hadoop
> major versions are added, we still give users the option of using older
> hadoop versions for some time. Or in the case of jdbc, we allowed users to
> choose between HiveServer1 and 2 for some time. Even with good effort put
> into testing, some corner cases sometimes get missed.
>
> On similar lines, it would be good to let users opt in for a release, and
> then switch the default in the next release. Given that we have been
> making new releases of hive every few months, I don't see this as a
> big issue. I think we should at the minimum allow users to opt out of
> the new implementation for a release or so (if they encounter bugs).
>
> Most of the work is going to be in ensuring the compatibility.
> Supporting a flag to choose the implementation should be relatively
> simpler work. What do you think?
>
> On Thu, Apr 30, 2015 at 4:42 PM, Xuefu Zhang <xzh...@cloudera.com> wrote:
> > Hi Thejas,
> >
> > Thanks for your input. I thought about this, but I don't really feel it
> > necessary to have a "transition" stage. After all, Hive CLI is a command
> > line tool with well-defined command line options. That's the "interface"
> > that we need to support. We are just changing the implementation. Through
> > comprehensive testing, we hope to discover most of the issues.
> >
> > On the other hand, if we have such a transition, there might never be a
> > user bothering to flip the env variable and the transition doesn't really
> > build up more confidence.
> >
> > In addition, if we provide either a transition or switch for every
> > implementation change, wouldn't users be overwhelmed by those transitions
> > or switches?
> >
> > Thoughts?
> >
> > Thanks,
> > Xuefu
> >
> > On Thu, Apr 30, 2015 at 3:10 PM, Thejas Nair <thejas.n...@gmail.com> wrote:
> >
> >> Hi Xuefu,
> >> What is the plan you have in mind for a transition to using beeline
> >> from within hive?
> >> I assume there is going to be some translation from hive cli options
> >> and commands to beeline. Is that right?
> >> Once the translation is in place, how would the switch happen?
> >>
> >> I am thinking that once there is a hive-cli compatible beeline mode,
> >> there can be an option to switch between the beeline and hive cli
> >> codebases.
> >> For example,
> >> In hive version X, when the environment variable CLI_USE_BEELINE=true
> >> is set, the "hive" command uses beeline underneath
> >> (default remains cli codepath, so that users can start experimenting
> >> with "hive" commands beeline mode).
> >> In hive version Y > X, by default the "hive" command starts using beeline
> >> underneath.
> >>
> >> Is something like this what you have in mind?
> >>
> >> Thanks,
> >> Thejas
> >>
> >> On Mon, Apr 27, 2015 at 5:31 PM, Xuefu Zhang <xzh...@cloudera.com> wrote:
> >> > FYI, I have created an uber JIRA for this:
> >> > https://issues.apache.org/jira/browse/HIVE-10511.
> >> >
> >> > Thanks,
> >> > Xuefu
> >> >
> >> > On Mon, Apr 27, 2015 at 4:54 PM, Xuefu Zhang <xzh...@cloudera.com> wrote:
> >> >
> >> >> Yes, Olga. I will create JIRAs to track those.
> >> >>
> >> >> Thanks,
> >> >> Xuefu
> >> >>
> >> >> On Mon, Apr 27, 2015 at 4:51 PM, Olga L. Natkovich <ol...@yahoo-inc.com.invalid> wrote:
> >> >>
> >> >>> We would need to build a test suite that makes sure that the new
> >> >>> implementation is compatible with the old one for users to adopt it.
> >> >>> We would also need some benchmarks to compare performance.
> >> >>> Could you please include this in the proposal as well.
> >> >>> Thanks,
> >> >>> Olga
> >> >>> From: Xuefu Zhang <xzh...@cloudera.com>
> >> >>> To: "dev@hive.apache.org" <dev@hive.apache.org>
> >> >>> Sent: Monday, April 27, 2015 4:46 PM
> >> >>> Subject: Re: [DISCUSS] Deprecating Hive CLI
> >> >>>
> >> >>> Existing implementation of Hive CLI will be replaced, so that the Hive
> >> >>> community doesn't need to maintain two code paths for the same thing.
> >> >>> That's basically what option #2 provides.
> >> >>>
> >> >>> On Mon, Apr 27, 2015 at 4:01 PM, Alexander Pivovarov <apivova...@gmail.com> wrote:
> >> >>>
> >> >>> > Does it mean that existing Hive CLI will be killed?
> >> >>> >
> >> >>> > On Mon, Apr 27, 2015 at 3:46 PM, Xuefu Zhang <xzh...@cloudera.com> wrote:
> >> >>> >
> >> >>> > > To be precise, the proposal is NOT deprecating, but more of changing
> >> >>> > > the implementation of the Hive CLI using beeline, which seems in
> >> >>> > > consensus.
> >> >>> > >
> >> >>> > > On Mon, Apr 27, 2015 at 2:47 PM, Alexander Pivovarov <apivova...@gmail.com> wrote:
> >> >>> > >
> >> >>> > > > I just started the survey on Deprecating Hive CLI. Please share
> >> >>> > > > your opinion.
> >> >>> > > >
> >> >>> > > > Deprecating Hive CLI:
> >> >>> > > > https://www.surveymonkey.com/s/XFHLM57
> >> >>> > > >
> >> >>> > > > Results:
> >> >>> > > > https://www.surveymonkey.com/results/SM-JHYY5DR9/
> >> >>> > > >
> >> >>> > > > On Mon, Apr 27, 2015 at 2:23 PM, Alexander Pivovarov <apivova...@gmail.com> wrote:
> >> >>> > > >
> >> >>> > > > > Xuefu,
> >> >>> > > > >
> >> >>> > > > > I'm just saying that most of the shells (e.g. mysql or accumulo)
> >> >>> > > > > reserve -u for user.
> >> >>> > > > >
> >> >>> > > > > I believe lots of stuff in Hive takes MySQL as an example.
> >> >>> > > > >
> >> >>> > > > > Alex
> >> >>> > > > >
> >> >>> > > > > On Mon, Apr 27, 2015 at 2:14 PM, Xuefu Zhang <xzh...@cloudera.com> wrote:
> >> >>> > > > >
> >> >>> > > > >> Alex,
> >> >>> > > > >>
> >> >>> > > > >> Just to be sure, we are talking about replacing Hive CLI, not the
> >> >>> > > > >> mysql and accumulo command line shells. Thus, I'm not sure this is
> >> >>> > > > >> relevant. Regardless, I think we'd better have some writeup in the
> >> >>> > > > >> proposed uber JIRA so that everyone knows what we are signing up for.
> >> >>> > > > >>
> >> >>> > > > >> Thanks,
> >> >>> > > > >> Xuefu
> >> >>> > > > >>
> >> >>> > > > >> On Mon, Apr 27, 2015 at 12:57 PM, Alexander Pivovarov <apivova...@gmail.com> wrote:
> >> >>> > > > >>
> >> >>> > > > >> > Mysql and accumulo command line shells use -u to pass <user>
> >> >>> > > > >> >
> >> >>> > > > >> > Can beeline use -u as well? Currently -u is reserved for URL?
> >> >>> > > > >> > On Apr 27, 2015 12:42 PM, "Xuefu Zhang" <xzh...@cloudera.com> wrote:
> >> >>> > > > >> >
> >> >>> > > > >> > > Thanks to all for the input. I assume that we have a consensus
> >> >>> > > > >> > > that we'd like to keep Hive as an alias to beeline with embedded
> >> >>> > > > >> > > HS2 and make user transition as smooth as possible by identifying
> >> >>> > > > >> > > gaps and fixing issues. I'm going to create an umbrella JIRA and
> >> >>> > > > >> > > subtasks to track the progress. Please let me know if you have
> >> >>> > > > >> > > further questions.
> >> >>> > > > >> > >
> >> >>> > > > >> > > Thanks,
> >> >>> > > > >> > > Xuefu
> >> >>> > > > >> > >
> >> >>> > > > >> > > On Sat, Apr 25, 2015 at 12:59 AM, Lars Francke <lars.fran...@gmail.com> wrote:
> >> >>> > > > >> > >
> >> >>> > > > >> > > > Yes, well put. It is about usability and "least surprise".
> >> >>> > > > >> > > >
> >> >>> > > > >> > > > So if people wouldn't have to deal with JDBC syntax by default
> >> >>> > > > >> > > > and could use "hive" instead of "beeline" to start that'd be good.
> >> >>> > > > >> > > >
> >> >>> > > > >> > > > On Sat, Apr 25, 2015 at 12:38 AM, Alan Gates <alanfga...@gmail.com> wrote:
> >> >>> > > > >> > > >
> >> >>> > > > >> > > >> If I understand correctly this is an argument about usability, not
> >> >>> > > > >> > > >> functionality. So if Hive still had the CLI but it happened to use
> >> >>> > > > >> > > >> either HS2 or embedded HS2 (depending on configuration) underneath
> >> >>> > > > >> > > >> your concerns would be addressed. Is that correct?
> >> >>> > > > >> > > >>
> >> >>> > > > >> > > >> Alan.
> >> >>> > > > >> > > >>
> >> >>> > > > >> > > >> Lars Francke <lars.fran...@gmail.com>
> >> >>> > > > >> > > >> April 23, 2015 at 15:53
> >> >>> > > > >> > > >> I've been at about 20 different customers in the years since Beeline
> >> >>> > > > >> > > >> has been added. I can only think of a single one that has used
> >> >>> > > > >> > > >> beeline. The instinct is to use "hive", partially because it is easy
> >> >>> > > > >> > > >> to remember and intuitive and because it is easier to use.
> >> >>> > > > >> > > >> I end up googling the stupid JDBC syntax every single time.
> >> >>> > > > >> > > >>
> >> >>> > > > >> > > >> I know this might be a bit "out there" but I propose something else:
> >> >>> > > > >> > > >> 1) Rename (or link) "beeline" to "hive"
> >> >>> > > > >> > > >> 2) Add a "--hiveserver2" (or "--jdbc" or "--beeline") option to the
> >> >>> > > > >> > > >> "hive" command to get the current "beeline", this'd keep the CLI as
> >> >>> > > > >> > > >> default, we could also add a "--legacy" or "--cli" option and make
> >> >>> > > > >> > > >> "hiveserver2/beeline" the default.
> >> >>> > > > >> > > >> 3) Add a "--embedded-hs2" option to the "hive" command to get an
> >> >>> > > > >> > > >> embedded HS2 in Beeline
> >> >>> > > > >> > > >> 4) Add some documentation to beeline reminding people on startup of
> >> >>> > > > >> > > >> beeline on how to connect and how to use embedded mode
> >> >>> > > > >> > > >>
> >> >>> > > > >> > > >> The fact is that the old shell just works for lots of people and
> >> >>> > > > >> > > >> there's just no need for beeline for these people. Also the name is
> >> >>> > > > >> > > >> confusing - especially for non-native speakers. It's not a common
> >> >>> > > > >> > > >> word so it's not easy to remember.
> >> >>> > > > >> > > >>
> >> >>> > > > >> > > >> Alan Gates <alanfga...@gmail.com>
> >> >>> > > > >> > > >> April 23, 2015 at 15:35
> >> >>> > > > >> > > >> Xuefu, thanks for getting this discussion started.
> >> >>> > > > >> > > >> Limiting our code paths is definitely a plus. My inclination would
> >> >>> > > > >> > > >> be to go towards option 2. A few questions:
> >> >>> > > > >> > > >>
> >> >>> > > > >> > > >> 1) Is there any functionality in CLI that's not in beeline?
> >> >>> > > > >> > > >> 2) If I understand correctly option 2 would have an implicit HS2 in
> >> >>> > > > >> > > >> process when a user runs the CLI. Would this be available in option 1
> >> >>> > > > >> > > >> as well?
> >> >>> > > > >> > > >> 3) Are there any performance implications, since now commands have to
> >> >>> > > > >> > > >> hop through a thrift/jdbc loop even in the embedded mode?
> >> >>> > > > >> > > >> 4) If we choose option 2 how backward compatible can we make it? Will
> >> >>> > > > >> > > >> users need to change any scripts they have that use the CLI? Do we
> >> >>> > > > >> > > >> have tests that will make sure of this?
> >> >>> > > > >> > > >>
> >> >>> > > > >> > > >> Alan.
> >> >>> > > > >> > > >>
> >> >>> > > > >> > > >> Xuefu Zhang <xzh...@cloudera.com>
> >> >>> > > > >> > > >> April 23, 2015 at 14:43
> >> >>> > > > >> > > >> Hi all,
> >> >>> > > > >> > > >>
> >> >>> > > > >> > > >> I'd like to revive the discussion about the fate of Hive CLI, as this
> >> >>> > > > >> > > >> topic has haunted us several times including [1][2].
> >> >>> > > > >> > > >> It looks to me that there is a consensus that it's not wise for the
> >> >>> > > > >> > > >> Hive community to keep both Hive CLI as it is as well as Beeline +
> >> >>> > > > >> > > >> HS2. However, I don't believe that no action is the best action for
> >> >>> > > > >> > > >> us. From discussion so far, I see the following proposals:
> >> >>> > > > >> > > >>
> >> >>> > > > >> > > >> 1. Deprecate Hive CLI and advise users to use Beeline.
> >> >>> > > > >> > > >> 2. Make Hive CLI a naming flavor of beeline with embedded mode.
> >> >>> > > > >> > > >>
> >> >>> > > > >> > > >> Frankly, I don't see much difference between the two approaches.
> >> >>> > > > >> > > >> Keeping an alias at script or even code level isn't that much work.
> >> >>> > > > >> > > >> However, shouldn't we pick a direction and start moving to it? If
> >> >>> > > > >> > > >> there are any gaps between beeline embedded and Hive CLI, we should
> >> >>> > > > >> > > >> identify and fill in those.
> >> >>> > > > >> > > >>
> >> >>> > > > >> > > >> I'd love to hear the thoughts from the community and hope this time
> >> >>> > > > >> > > >> we will have concrete action items to work on.
> >> >>> > > > >> > > >>
> >> >>> > > > >> > > >> Thanks,
> >> >>> > > > >> > > >> Xuefu
> >> >>> > > > >> > > >>
> >> >>> > > > >> > > >> [1] http://mail-archives.apache.org/mod_mbox/hive-dev/201412.mbox/%3C5485E1BE.3060709%40hortonworks.com%3E
> >> >>> > > > >> > > >> [2] https://www.mail-archive.com/dev@hive.apache.org/msg112378.html