Multi-user use and tracking in Hive with Cloudera Hadoop

Jay Ramadorai Mon, 14 Feb 2011 08:01:35 -0800

I want different clients on various machines outside the cluster to run 
queries on Hive as different users, and I want to be able to track who is 
running what and use the Fair Scheduler or something similar as a governor to 
throttle usage. Bottom line: I need fine-grained tracking and control at the 
user and user-group level.


Is this possible for:
(a) remote clients connecting to the Derby metastore listener and running Hive 
queries from their Hive clients as different users?
(b) remote clients connecting with JDBC through Thrift to Hive?

I'm running Hive from Apache trunk (0.7.0) on top of Cloudera Hadoop CDH3b3. 
The Hive Thrift server is running as the user hive, and the Hive tables are 
owned by the Linux user hive.

Must I use something like Kerberos to make this work or is there an 
alternative? In fact, will Kerberos even help in achieving the above?
