I believe there was an earlier thread on the same topic where we agreed that:
1. backward compatibility is key
2. performance needs to be comparable.
The CLI is very widely used at Yahoo right now, and for us to go this route we 
need to make sure that we can just alias the CLI to beeline and the vast 
majority of users would not see a difference.
Olga
     
  From: Xuefu Zhang <xzh...@cloudera.com>
Date: April 23, 2015 at 16:06
To: "dev@hive.apache.org" <dev@hive.apache.org>
Subject: [DISCUSS] Deprecating Hive CLI

Hi Alan,

Here is my understanding of the questions you asked:

Re 1): There used to be many gaps, but the majority, if not all, of them have 
been filled. One of the action items out of this discussion is to identify any 
remaining gaps.

Re 2): If you run "beeline -u jdbc:hive2://", there will be an HS2 embedded in 
the beeline process. We can change the shell so that users just need to type 
"beeline" for embedded HS2.
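
As a rough sketch of what such a shell change might look like: the function 
name hive_cli and the BEELINE_BIN variable below are assumptions made for 
illustration, not existing Hive code. The only pieces taken from this thread 
are the "-u" option and the "jdbc:hive2://" embedded URL.

```shell
#!/bin/sh
# Hypothetical wrapper sketch; not part of Hive.
# BEELINE_BIN is an assumed override so a different binary can be substituted.
BEELINE_BIN="${BEELINE_BIN:-beeline}"

hive_cli() {
  # If the caller already supplied a connection URL, pass everything through.
  for arg in "$@"; do
    if [ "$arg" = "-u" ]; then
      "$BEELINE_BIN" "$@"
      return
    fi
  done
  # Otherwise default to the embedded-HS2 URL so no URL has to be typed.
  "$BEELINE_BIN" -u "jdbc:hive2://" "$@"
}
```

With something like this in place, "hive_cli -e 'show tables'" would run 
beeline against an embedded HS2, while an explicit "-u" would still reach a 
remote server. Whether the real change belongs in a script or in the beeline 
code itself is left open here.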

Re 3): I don't know if there will be any perf penalty and how much for beeline 
+ embedded HS2. I'm also not certain if there is such a loop (to be found out). 
If so, I don't believe the perf impact would be noticeable.

Re 4): Sort of related to #1. The goal of this discussion is to choose a route 
and take whatever actions are needed to make it backward compatible at the 
functional level. Choosing option 1 of course requires users to change their 
scripts, while option 2 doesn't. Testing is also an action item.

FYI, I have an old blog post [1] which, though outdated, gives some idea of 
the differences between beeline and Hive CLI at the time of writing.

Thanks,
Xuefu

[1] 
http://blog.cloudera.com/blog/2014/02/migrating-from-hive-cli-to-beeline-a-primer/


     From: Alan Gates <alanfga...@gmail.com>
Date: April 23, 2015 at 15:35
To: dev@hive.apache.org
Subject: [DISCUSS] Deprecating Hive CLI
  
Xuefu, thanks for getting this discussion started.  Limiting our code paths is 
definitely a plus.  My inclination would be to go towards option 2.  A few 
questions:

1) Is there any functionality in CLI that's not in beeline?  
2) If I understand correctly option 2 would have an implicit HS2 in process 
when a user runs the CLI.  Would this be available in option 1 as well?
3) Are there any performance implications, since now commands have to hop 
through a thrift/jdbc loop even in the embedded mode?
4) If we choose option 2 how backward compatible can we make it?  Will users 
need to change any scripts they have that use the CLI?  Do we have tests that 
will make sure of this?

Alan.

    From: Xuefu Zhang <xzh...@cloudera.com>
Date: April 23, 2015 at 14:43
To: "dev@hive.apache.org" <dev@hive.apache.org>
Subject: [DISCUSS] Deprecating Hive CLI


  Hi all,

I'd like to revive the discussion about the fate of Hive CLI, as this topic
has haunted us several times, including [1][2]. It looks to me that there is
a consensus that it's not wise for the Hive community to keep both Hive CLI as
it is and Beeline + HS2. However, I don't believe that no action is
the best action for us. From the discussion so far, I see the following
proposals:

1. Deprecate Hive CLI and advise users to use Beeline.
2. Make Hive CLI a naming flavor of beeline in embedded mode.

Frankly, I don't see much difference between the two approaches. Keeping an
alias at script or even code level isn't that much work. However, shouldn't
we pick a direction and start moving toward it? If there are any gaps between
beeline embedded and Hive CLI, we should identify and fill those.

I'd love to hear the thoughts from the community and hope this time we will
have concrete action items to work on.

Thanks,
Xuefu

[1]
http://mail-archives.apache.org/mod_mbox/hive-dev/201412.mbox/%3C5485E1BE.3060709%40hortonworks.com%3E
[2] https://www.mail-archive.com/dev@hive.apache.org/msg112378.html



  
