That sounds fine to me. My main concern was that we should allow users to switch back if they encounter corner-case bugs, for at least a release or two. Yes, we can add that warning as well.
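A minimal sketch of the switch described above, for illustration only. It is written in Python rather than as part of the real "hive" launcher, and it assumes the example variable name CLI_USE_BEELINE mentioned further down in this thread; the function name, the accepted values, and the warning text are hypothetical.

```python
import os

# Hypothetical variable name, borrowed from the CLI_USE_BEELINE example in this thread.
ENV_VAR = "CLI_USE_BEELINE"


def choose_cli_backend(env, default_beeline=True):
    """Pick the code path for the "hive" command.

    Returns a (backend, warning) tuple where backend is "beeline" or
    "legacy" and warning is None or a message to show the user.
    """
    default = "beeline" if default_beeline else "legacy"
    raw = env.get(ENV_VAR)
    if raw is None:
        # Variable not set: use the release default silently.
        return default, None

    warning = ENV_VAR + " may be discontinued in a future release"
    value = raw.strip().lower()
    if value == "true":
        return "beeline", warning
    if value == "false":
        return "legacy", warning
    # Unrecognized value: keep the default, but tell the user.
    return default, warning + " (ignoring unrecognized value)"


if __name__ == "__main__":
    backend, warning = choose_cli_backend(os.environ)
    if warning:
        print("WARNING: " + warning)
    print("hive would run on the " + backend + " code path")
```

With default_beeline=True this models the opt-out release (new path on by default, CLI_USE_BEELINE=false switches back); default_beeline=False models an opt-in release.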
On Thu, Apr 30, 2015 at 6:15 PM, Xuefu Zhang <xzh...@cloudera.com> wrote:
> Okay. That's fine. I think supporting an env variable doesn't take much.
> What about enabling the new code path by default, and allowing users to
> opt out in case of a serious bug? We also give users a warning that the
> env variable may be discontinued in the future.
>
> Thanks,
> Xuefu
>
> On Thu, Apr 30, 2015 at 5:13 PM, Thejas Nair <thejas.n...@gmail.com> wrote:
>
>> In most cases with Hive, when a major implementation change is made,
>> we usually let the user fall back to the older implementation. For
>> example, when CBO was added, it was initially not enabled by default,
>> and there is still the option of using the non-CBO path. When new Hadoop
>> major versions are added, we still give users the option of using older
>> Hadoop versions for some time. Or in the case of JDBC, we allowed users
>> to choose between HiveServer1 and 2 for some time. Even with good effort
>> put into testing, some corner cases sometimes get missed.
>>
>> Along similar lines, it would be good to let users opt in for a release,
>> and then switch the default in the next release. Given that we have been
>> making new releases of Hive every few months, I don't see this as a
>> big issue. I think we should at the minimum allow users to opt out of
>> the new implementation for a release or so (if they encounter bugs).
>>
>> Most of the work is going to be in ensuring compatibility.
>> Supporting a flag to choose the implementation should be relatively
>> simple work. What do you think?
>>
>> On Thu, Apr 30, 2015 at 4:42 PM, Xuefu Zhang <xzh...@cloudera.com> wrote:
>> > Hi Thejas,
>> >
>> > Thanks for your input. I thought about this, but I don't really feel it
>> > necessary to have a "transition" stage. After all, Hive CLI is a command
>> > line tool with well-defined command line options. That's the "interface"
>> > that we need to support. We are just changing the implementation. Through
>> > comprehensive testing, we hope to discover most of the issues.
>> >
>> > On the other hand, if we have such a transition, there might never be a
>> > user bothering to flip the env variable, and the transition doesn't really
>> > build up more confidence.
>> >
>> > In addition, if we provide either a transition or a switch for every
>> > implementation change, wouldn't users be overwhelmed by those transitions
>> > or switches?
>> >
>> > Thoughts?
>> >
>> > Thanks,
>> > Xuefu
>> >
>> > On Thu, Apr 30, 2015 at 3:10 PM, Thejas Nair <thejas.n...@gmail.com> wrote:
>> >
>> >> Hi Xuefu,
>> >> What is the plan you have in mind for a transition to using beeline
>> >> from within hive?
>> >> I assume there is going to be some translation from hive cli options
>> >> and commands to beeline. Is that right?
>> >> Once the translation is in place, how would the switch happen?
>> >>
>> >> I am thinking that once there is a hive-cli compatible beeline mode,
>> >> there can be an option to switch between the beeline and hive cli
>> >> codebases.
>> >> For example,
>> >> In hive version X, when the environment variable CLI_USE_BEELINE=true
>> >> is set, the "hive" command uses beeline underneath
>> >> (the default remains the cli codepath, so that users can start
>> >> experimenting with the "hive" command's beeline mode).
>> >> In hive version Y > X, by default the "hive" command starts using beeline
>> >> underneath.
>> >>
>> >> Is something like this what you have in mind?
>> >>
>> >> Thanks,
>> >> Thejas
>> >>
>> >> On Mon, Apr 27, 2015 at 5:31 PM, Xuefu Zhang <xzh...@cloudera.com> wrote:
>> >> > FYI, I have created an uber JIRA for this:
>> >> > https://issues.apache.org/jira/browse/HIVE-10511.
>> >> >
>> >> > Thanks,
>> >> > Xuefu
>> >> >
>> >> > On Mon, Apr 27, 2015 at 4:54 PM, Xuefu Zhang <xzh...@cloudera.com> wrote:
>> >> >
>> >> >> Yes, Olga. I will create JIRAs to track those.
>> >> >>
>> >> >> Thanks,
>> >> >> Xuefu
>> >> >>
>> >> >> On Mon, Apr 27, 2015 at 4:51 PM, Olga L. Natkovich <ol...@yahoo-inc.com.invalid> wrote:
>> >> >>
>> >> >>> We would need to build a test suite that makes sure that the new
>> >> >>> implementation is compatible with the old one for users to adopt it. We
>> >> >>> would also need some benchmarks to compare performance. Could you please
>> >> >>> include this in the proposal as well?
>> >> >>> Thanks,
>> >> >>> Olga
>> >> >>> From: Xuefu Zhang <xzh...@cloudera.com>
>> >> >>> To: "dev@hive.apache.org" <dev@hive.apache.org>
>> >> >>> Sent: Monday, April 27, 2015 4:46 PM
>> >> >>> Subject: Re: [DISCUSS] Deprecating Hive CLI
>> >> >>>
>> >> >>> The existing implementation of Hive CLI will be replaced, so that the Hive
>> >> >>> community doesn't need to maintain two code paths for the same thing. That's
>> >> >>> basically what option #2 provides.
>> >> >>>
>> >> >>> On Mon, Apr 27, 2015 at 4:01 PM, Alexander Pivovarov <apivova...@gmail.com> wrote:
>> >> >>>
>> >> >>> > Does it mean that existing Hive CLI will be killed?
>> >> >>> >
>> >> >>> > On Mon, Apr 27, 2015 at 3:46 PM, Xuefu Zhang <xzh...@cloudera.com> wrote:
>> >> >>> >
>> >> >>> > > To be precise, the proposal is NOT deprecating, but more of changing the
>> >> >>> > > implementation of the Hive CLI using beeline, which seems to have consensus.
>> >> >>> > >
>> >> >>> > > On Mon, Apr 27, 2015 at 2:47 PM, Alexander Pivovarov <apivova...@gmail.com> wrote:
>> >> >>> > >
>> >> >>> > > > I just started the survey on Deprecating Hive CLI. Please share your
>> >> >>> > > > opinion.
>> >> >>> > > >
>> >> >>> > > > Deprecating Hive CLI:
>> >> >>> > > > https://www.surveymonkey.com/s/XFHLM57
>> >> >>> > > >
>> >> >>> > > > Results:
>> >> >>> > > > https://www.surveymonkey.com/results/SM-JHYY5DR9/
>> >> >>> > > >
>> >> >>> > > > On Mon, Apr 27, 2015 at 2:23 PM, Alexander Pivovarov <apivova...@gmail.com> wrote:
>> >> >>> > > >
>> >> >>> > > > > Xuefu,
>> >> >>> > > > >
>> >> >>> > > > > I'm just saying that most of the shells (e.g. mysql or accumulo) reserve
>> >> >>> > > > > -u for user.
>> >> >>> > > > >
>> >> >>> > > > > I believe lots of stuff in Hive takes MySQL as an example.
>> >> >>> > > > >
>> >> >>> > > > > Alex
>> >> >>> > > > >
>> >> >>> > > > > On Mon, Apr 27, 2015 at 2:14 PM, Xuefu Zhang <xzh...@cloudera.com> wrote:
>> >> >>> > > > >
>> >> >>> > > > >> Alex,
>> >> >>> > > > >>
>> >> >>> > > > >> Just to be sure, we are talking about replacing Hive CLI, not the mysql and
>> >> >>> > > > >> accumulo command line shells. Thus, I'm not sure this is relevant.
>> >> >>> > > > >> Regardless, I think we'd better have some writeup in the proposed uber
>> >> >>> > > > >> JIRA so that everyone knows what we are signing up for.
>> >> >>> > > > >>
>> >> >>> > > > >> Thanks,
>> >> >>> > > > >> Xuefu
>> >> >>> > > > >>
>> >> >>> > > > >> On Mon, Apr 27, 2015 at 12:57 PM, Alexander Pivovarov <apivova...@gmail.com> wrote:
>> >> >>> > > > >>
>> >> >>> > > > >> > Mysql and accumulo command line shells use -u to pass <user>
>> >> >>> > > > >> >
>> >> >>> > > > >> > Can beeline use -u as well? Currently -u is reserved for URL?
>> >> >>> > > > >> > On Apr 27, 2015 12:42 PM, "Xuefu Zhang" <xzh...@cloudera.com> wrote:
>> >> >>> > > > >> >
>> >> >>> > > > >> > > Thanks to all for the input. I assume that we have a consensus that we'd
>> >> >>> > > > >> > > like to keep Hive as an alias to beeline with embedded HS2 and make the user
>> >> >>> > > > >> > > transition as smooth as possible by identifying gaps and fixing issues. I'm
>> >> >>> > > > >> > > going to create an umbrella JIRA and subtasks to track the progress. Please
>> >> >>> > > > >> > > let me know if you have further questions.
>> >> >>> > > > >> > >
>> >> >>> > > > >> > > Thanks,
>> >> >>> > > > >> > > Xuefu
>> >> >>> > > > >> > >
>> >> >>> > > > >> > > On Sat, Apr 25, 2015 at 12:59 AM, Lars Francke <lars.fran...@gmail.com> wrote:
>> >> >>> > > > >> > >
>> >> >>> > > > >> > > > Yes, well put. It is about usability and "least surprise".
>> >> >>> > > > >> > > >
>> >> >>> > > > >> > > > So if people wouldn't have to deal with JDBC syntax by default and could
>> >> >>> > > > >> > > > use "hive" instead of "beeline" to start that'd be good.
>> >> >>> > > > >> > > >
>> >> >>> > > > >> > > > On Sat, Apr 25, 2015 at 12:38 AM, Alan Gates <alanfga...@gmail.com> wrote:
>> >> >>> > > > >> > > >
>> >> >>> > > > >> > > >> If I understand correctly this is an argument about usability, not
>> >> >>> > > > >> > > >> functionality. So if Hive still had the CLI but it happened to use either
>> >> >>> > > > >> > > >> HS2 or embedded HS2 (depending on configuration) underneath, your concerns
>> >> >>> > > > >> > > >> would be addressed. Is that correct?
>> >> >>> > > > >> > > >>
>> >> >>> > > > >> > > >> Alan.
>> >> >>> > > > >> > > >>
>> >> >>> > > > >> > > >> Lars Francke <lars.fran...@gmail.com>
>> >> >>> > > > >> > > >> April 23, 2015 at 15:53
>> >> >>> > > > >> > > >> I've been at about 20 different customers in the years since Beeline has
>> >> >>> > > > >> > > >> been added. I can only think of a single one that has used beeline. The
>> >> >>> > > > >> > > >> instinct is to use "hive", partially because it is easy to remember and
>> >> >>> > > > >> > > >> intuitive and because it is easier to use. I end up googling the stupid
>> >> >>> > > > >> > > >> JDBC syntax every single time.
>> >> >>> > > > >> > > >>
>> >> >>> > > > >> > > >> I know this might be a bit "out there" but I propose something else:
>> >> >>> > > > >> > > >> 1) Rename (or link) "beeline" to "hive"
>> >> >>> > > > >> > > >> 2) Add a "--hiveserver2" (or "--jdbc" or "--beeline") option to the
>> >> >>> > > > >> > > >> "hive" command to get the current "beeline", this'd keep the CLI as
>> >> >>> > > > >> > > >> default, we could also add a "--legacy" or "--cli" option and make
>> >> >>> > > > >> > > >> "hiveserver2/beeline" the default.
>> >> >>> > > > >> > > >> 3) Add a "--embedded-hs2" option to the "hive" command to get an embedded
>> >> >>> > > > >> > > >> HS2 in Beeline
>> >> >>> > > > >> > > >> 4) Add some documentation to beeline reminding people on startup of
>> >> >>> > > > >> > > >> beeline on how to connect and how to use embedded mode
>> >> >>> > > > >> > > >>
>> >> >>> > > > >> > > >> The fact is that the old shell just works for lots of people and there's
>> >> >>> > > > >> > > >> just no need for beeline for these people. Also the name is confusing -
>> >> >>> > > > >> > > >> especially for non-native speakers. It's not a common word so it's not easy
>> >> >>> > > > >> > > >> to remember.
>> >> >>> > > > >> > > >>
>> >> >>> > > > >> > > >> Alan Gates <alanfga...@gmail.com>
>> >> >>> > > > >> > > >> April 23, 2015 at 15:35
>> >> >>> > > > >> > > >> Xuefu, thanks for getting this discussion started. Limiting our code
>> >> >>> > > > >> > > >> paths is definitely a plus. My inclination would be to go towards option
>> >> >>> > > > >> > > >> 2. A few questions:
>> >> >>> > > > >> > > >>
>> >> >>> > > > >> > > >> 1) Is there any functionality in CLI that's not in beeline?
>> >> >>> > > > >> > > >> 2) If I understand correctly option 2 would have an implicit HS2 in
>> >> >>> > > > >> > > >> process when a user runs the CLI. Would this be available in option 1 as
>> >> >>> > > > >> > > >> well?
>> >> >>> > > > >> > > >> 3) Are there any performance implications, since now commands have to hop
>> >> >>> > > > >> > > >> through a thrift/jdbc loop even in the embedded mode?
>> >> >>> > > > >> > > >> 4) If we choose option 2 how backward compatible can we make it? Will
>> >> >>> > > > >> > > >> users need to change any scripts they have that use the CLI? Do we have
>> >> >>> > > > >> > > >> tests that will make sure of this?
>> >> >>> > > > >> > > >>
>> >> >>> > > > >> > > >> Alan.
>> >> >>> > > > >> > > >>
>> >> >>> > > > >> > > >> Xuefu Zhang <xzh...@cloudera.com>
>> >> >>> > > > >> > > >> April 23, 2015 at 14:43
>> >> >>> > > > >> > > >> Hi all,
>> >> >>> > > > >> > > >>
>> >> >>> > > > >> > > >> I'd like to revive the discussion about the fate of Hive CLI, as this
>> >> >>> > > > >> > > >> topic has haunted us several times including [1][2]. It looks to me that
>> >> >>> > > > >> > > >> there is a consensus that it's not wise for the Hive community to keep both
>> >> >>> > > > >> > > >> Hive CLI as it is as well as Beeline + HS2. However, I don't believe that no
>> >> >>> > > > >> > > >> action is the best action for us. From the discussion so far, I see the
>> >> >>> > > > >> > > >> following proposals:
>> >> >>> > > > >> > > >>
>> >> >>> > > > >> > > >> 1. Deprecate Hive CLI and advise users to use Beeline.
>> >> >>> > > > >> > > >> 2. Make Hive CLI a naming flavor of beeline with embedded mode.
>> >> >>> > > > >> > > >>
>> >> >>> > > > >> > > >> Frankly, I don't see much difference between the two approaches. Keeping
>> >> >>> > > > >> > > >> an alias at script or even code level isn't that much work. However,
>> >> >>> > > > >> > > >> shouldn't we pick a direction and start moving to it? If there are any
>> >> >>> > > > >> > > >> gaps between beeline embedded and Hive CLI, we should identify and fill in
>> >> >>> > > > >> > > >> those.
>> >> >>> > > > >> > > >>
>> >> >>> > > > >> > > >> I'd love to hear the thoughts from the community and hope this time we
>> >> >>> > > > >> > > >> will have concrete action items to work on.
>> >> >>> > > > >> > > >>
>> >> >>> > > > >> > > >> Thanks,
>> >> >>> > > > >> > > >> Xuefu
>> >> >>> > > > >> > > >>
>> >> >>> > > > >> > > >> [1] http://mail-archives.apache.org/mod_mbox/hive-dev/201412.mbox/%3C5485E1BE.3060709%40hortonworks.com%3E
>> >> >>> > > > >> > > >> [2] https://www.mail-archive.com/dev@hive.apache.org/msg112378.html