I believe there was an earlier thread on the same topic where we agreed that:

1. Backward compatibility is key.
2. Performance needs to be comparable.

The CLI is very widely used at Yahoo right now, and for us to go this route we need to make sure that we can just alias cli to beeline and the vast majority of users would not see a difference.

Olga

From: Xuefu Zhang <xzh...@cloudera.com>
Date: April 23, 2015 at 16:06
To: "dev@hive.apache.org" <dev@hive.apache.org>
Subject: [DISCUSS] Deprecating Hive CLI

Hi Alan,

Here is my understanding of the questions you asked:

Re 1): There used to be many gaps, but the majority, if not all, of them have been filled. One of the action items out of this discussion is to identify any remaining gaps.

Re 2): If you run "beeline -u jdbc:hive2://", there will be an HS2 embedded in the beeline process. We can change the shell so that the user just needs to type "beeline" for embedded HS2.

Re 3): I don't know whether there will be any perf penalty, or how large it would be, for beeline + embedded HS2. I'm also not certain whether there is such a loop (to be found out). If there is, I don't believe the perf impact would be noticeable.

Re 4): Sort of related to #1. The goal of this discussion is to choose a route and take whatever actions are needed to make it backward compatible at the functional level. Choosing option 1 of course requires users to change their scripts, while option 2 doesn't. Testing is also an action item.

FYI, I had an old blog post [1] that, though outdated, would give some idea of the differences between beeline and Hive CLI at the time of writing.

Thanks,
Xuefu

[1] http://blog.cloudera.com/blog/2014/02/migrating-from-hive-cli-to-beeline-a-primer/

From: Alan Gates <alanfga...@gmail.com>
Date: April 23, 2015 at 15:35
To: dev@hive.apache.org
Subject: [DISCUSS] Deprecating Hive CLI

Xuefu, thanks for getting this discussion started. Limiting our code paths is definitely a plus. My inclination would be to go towards option 2.

A few questions:

1) Is there any functionality in CLI that's not in beeline?
2) If I understand correctly, option 2 would have an implicit HS2 in process when a user runs the CLI. Would this be available in option 1 as well?
3) Are there any performance implications, since commands now have to hop through a thrift/jdbc loop even in the embedded mode?
4) If we choose option 2, how backward compatible can we make it? Will users need to change any scripts they have that use the CLI? Do we have tests that will make sure of this?

Alan.

From: Xuefu Zhang <xzh...@cloudera.com>
Date: April 23, 2015 at 14:43
To: "dev@hive.apache.org" <dev@hive.apache.org>
Subject: [DISCUSS] Deprecating Hive CLI

Hi all,

I'd like to revive the discussion about the fate of Hive CLI, as this topic has haunted us several times, including [1][2]. It looks to me that there is a consensus that it's not wise for the Hive community to keep both Hive CLI as it is and Beeline + HS2. However, I don't believe that no action is the best action for us. From the discussion so far, I see the following proposals:

1. Deprecate Hive CLI and advise users to use Beeline.
2. Make Hive CLI a naming flavor of Beeline with embedded mode.

Frankly, I don't see much difference between the two approaches. Keeping an alias at the script or even the code level isn't that much work. However, shouldn't we pick a direction and start moving toward it? If there are any gaps between Beeline embedded and Hive CLI, we should identify and fill them.

I'd love to hear thoughts from the community, and I hope this time we will have concrete action items to work on.

Thanks,
Xuefu

[1] http://mail-archives.apache.org/mod_mbox/hive-dev/201412.mbox/%3C5485E1BE.3060709%40hortonworks.com%3E
[2] https://www.mail-archive.com/dev@hive.apache.org/msg112378.html