One way people have worked around the lack of LDAP connectivity in HiveServer2 
(HS2) has been to use Apache Knox.  That project’s goal is to provide a single 
login capability for Hadoop-related projects so that users can tie their LDAP 
or Active Directory servers into Hadoop.

Alan.

> On Mar 8, 2016, at 16:00, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:
> 
> The current scenario resembles a three-tier architecture but without the 
> security of the second tier. In a typical three-tier setup, users connecting 
> to the application server (read HiveServer2) are independently authenticated 
> and, if OK, the second tier creates new .NET-type or JDBC threads to connect 
> to the database, much like multi-threading. The problem, I believe, is that 
> HiveServer2 does not yet have that concept of handling individual logins. 
> HiveServer2 should be able to handle LDAP logins as well. It is a useful 
> layer to have.
> 
> Dr Mich Talebzadeh
>  
> LinkedIn  
> https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>  
> http://talebzadehmich.wordpress.com
>  
> 
> On 8 March 2016 at 23:28, Alex <this.side.of.confus...@gmail.com> wrote:
> Yes, when creating a Hive Context, a Hive Metastore client should be created 
> with the user that the Spark application will use to talk to the *remote* 
> Hive Metastore. We would like to add a custom authorization plugin to our 
> remote Hive Metastore to authorize the query requests that the Spark 
> application submits, which would also add authorization for any other 
> applications hitting the Hive Metastore. Furthermore, we would like to extend 
> this so that we can submit "jobs" to our Spark application that run against 
> the metastore as different users while leveraging the abilities of our Spark 
> cluster. But, as you mentioned, only one login connects to the Hive 
> Metastore, and it is shared among all HiveContext sessions.
> 
> Likely the authentication would have to be completed either through a secured 
> Hive Metastore (Kerberos) or by having the requests go through HiveServer2.
> 
> --Alex
> 
> 
> On 3/8/2016 3:13 PM, Mich Talebzadeh wrote:
>> Hi,
>> 
>> What do you mean by Hive Metastore Client? Are you referring to Hive server 
>> login much like beeline?
>> 
>> Spark uses hive-site.xml to get the details of the Hive metastore and the 
>> login to the metastore database, which could be any database. Mine is Oracle 
>> and, as far as I know, even in Hive 2, hive-site.xml has an entry for 
>> javax.jdo.option.ConnectionUserName that specifies the username to use 
>> against the metastore database. These are all multi-threaded JDBC 
>> connections to the database using the same login, as shown below:
>> 
>> LOGIN    SID/serial# LOGGED IN S HOST       OS PID         Client PID     PROGRAM               MEM/KB      Logical I/O Physical I/O ACT INFO
>> -------- ----------- ----------- ---------- -------------- -------------- --------------- ------------ ---------------- ------------ --- -------
>> HIVEUSER 67,6160     08/03 08:11 rhes564    oracle/20539   hduser/1234    JDBC Thin Clien        1,017               37            0 N
>> HIVEUSER 89,6421     08/03 08:11 rhes564    oracle/20541   hduser/1234    JDBC Thin Clien        1,081              528            0 N
>> HIVEUSER 112,561     08/03 10:45 rhes564    oracle/24624   hduser/1234    JDBC Thin Clien          889               37            0 N
>> HIVEUSER 131,8811    08/03 08:11 rhes564    oracle/20543   hduser/1234    JDBC Thin Clien        1,017               37            0 N
>> HIVEUSER 47,30114    08/03 10:45 rhes564    oracle/24626   hduser/1234    JDBC Thin Clien        1,017               37            0 N
>> HIVEUSER 170,8955    08/03 08:11 rhes564    oracle/20545   hduser/1234    JDBC Thin Clien        1,017              323            0 N
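
For readers who want to check which login their own metastore connections use, here is a minimal sketch that reads javax.jdo.option.ConnectionUserName out of a hive-site.xml document. The XML fragment is hypothetical: the property names are Hive's, but the JDBC URL and the user value are made up for illustration, and `read_property` is a helper name invented here.

```python
import xml.etree.ElementTree as ET

# Hypothetical hive-site.xml fragment. The property names are real Hive
# settings; the URL and user values are invented for this example.
HIVE_SITE = """<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:oracle:thin:@rhes564:1521:mydb</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>hiveuser</value>
  </property>
</configuration>"""


def read_property(xml_text, key):
    """Return the <value> of the named Hadoop-style property, or None."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if prop.findtext("name") == key:
            return prop.findtext("value")
    return None


metastore_user = read_property(HIVE_SITE, "javax.jdo.option.ConnectionUserName")
print(metastore_user)
```

Because every client that loads the same hive-site.xml picks up the same value, all of its JDBC sessions would show up under one database login, which matches the session listing above.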
>> 
>> As I understand it, what you are suggesting is that each Spark user uses a 
>> different login to connect to the Hive metastore. As of now, there is only 
>> one login that connects to the Hive metastore, shared among all:
>> 
>> 2016-03-08T23:08:01,890 INFO  [pool-5-thread-72]: HiveMetaStore.audit 
>> (HiveMetaStore.java:logAuditEvent(280)) - ugi=hduser      ip=50.140.197.217  
>>      cmd=source:50.140.197.217 get_table : db=test tbl=t
>> 2016-03-08T23:18:10,432 INFO  [pool-5-thread-81]: HiveMetaStore.audit 
>> (HiveMetaStore.java:logAuditEvent(280)) - ugi=hduser      ip=50.140.197.216  
>>      cmd=source:50.140.197.216 get_tables: db=asehadoop pat=.*
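
One way to confirm the single shared login from the audit log is to pull out the ugi and ip fields and group them. This is only a sketch: the two entries are the ones quoted above, unwrapped onto single lines, and the regular expression assumes the `ugi=... ip=...` layout seen in those two entries rather than any documented log format. `users_by_ip` is a name invented here.

```python
import re

# The two audit entries quoted above, unwrapped onto single lines.
AUDIT_LOG = """\
2016-03-08T23:08:01,890 INFO  [pool-5-thread-72]: HiveMetaStore.audit (HiveMetaStore.java:logAuditEvent(280)) - ugi=hduser      ip=50.140.197.217       cmd=source:50.140.197.217 get_table : db=test tbl=t
2016-03-08T23:18:10,432 INFO  [pool-5-thread-81]: HiveMetaStore.audit (HiveMetaStore.java:logAuditEvent(280)) - ugi=hduser      ip=50.140.197.216       cmd=source:50.140.197.216 get_tables: db=asehadoop pat=.*
"""

# Assumed layout: "ugi=<user>" followed by whitespace and "ip=<address>".
AUDIT_RE = re.compile(r"ugi=(?P<ugi>\S+)\s+ip=(?P<ip>\S+)")


def users_by_ip(log_text):
    """Map each effective user (ugi) to the set of client IPs it came from."""
    seen = {}
    for match in AUDIT_RE.finditer(log_text):
        seen.setdefault(match["ugi"], set()).add(match["ip"])
    return seen


print(users_by_ip(AUDIT_LOG))
```

A single key in the result, with several client addresses behind it, is what "one login shared among all" looks like from the metastore side.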
>> 
>> And this is an entry in the Hive log when a connection is made through the Zeppelin UI:
>> 
>> 2016-03-08T23:20:13,546 INFO  [pool-5-thread-84]: metastore.HiveMetaStore 
>> (HiveMetaStore.java:newRawStore(499)) - 84: Opening raw store with 
>> implementation class:org.apache.hadoop.hive.metastore.ObjectStore
>> 2016-03-08T23:20:13,547 INFO  [pool-5-thread-84]: metastore.ObjectStore 
>> (ObjectStore.java:initialize(318)) - ObjectStore, initialize called
>> 2016-03-08T23:20:13,550 INFO  [pool-5-thread-84]: 
>> metastore.MetaStoreDirectSql (MetaStoreDirectSql.java:<init>(142)) - Using 
>> direct SQL, underlying DB is ORACLE
>> 2016-03-08T23:20:13,550 INFO  [pool-5-thread-84]: metastore.ObjectStore 
>> (ObjectStore.java:setConf(301)) - Initialized ObjectStore
>> 
>> I am not sure there is currently any plan to allow different logins to the 
>> Hive Metastore. It would add another level of security, though I am not 
>> sure how this would be authenticated.
>> 
>> HTH
>> 
>>  
>> 
>> Dr Mich Talebzadeh
>> 
>> On 8 March 2016 at 22:23, Alex F <this.side.of.confus...@gmail.com> wrote:
>> As of Spark 1.6.0 it is now possible to create new Hive Context sessions 
>> that share various components, but right now the Hive Metastore Client is 
>> also shared among all new Hive Context sessions.
>> 
>> Are there any plans to create individual Metastore Clients for each Hive 
>> Context?
>> 
>> Related to the question above, are there any plans to create an interface for 
>> customizing the username that the Metastore Client uses to connect to the 
>> Hive Metastore? Right now it uses either the user specified in an 
>> environment variable or the application's process owner.
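
The fallback order described above can be sketched in a few lines. Note the assumptions: the message does not name the environment variable, so HADOOP_USER_NAME is a guess at the one meant, and `effective_user` is a hypothetical helper that only illustrates the precedence, not Spark's or Hive's actual code path.

```python
import getpass
import os


def effective_user(env=None):
    """Pick the user name: environment override first, else process owner."""
    env = os.environ if env is None else env
    # Assumption: HADOOP_USER_NAME is the environment variable meant above.
    override = env.get("HADOOP_USER_NAME")
    return override if override else getpass.getuser()


# With the override set, it wins regardless of who owns the process.
print(effective_user({"HADOOP_USER_NAME": "alice"}))  # prints "alice"
```

Calling `effective_user()` with no argument reads the real process environment and falls back to the process owner when the variable is unset. Because the choice is made once per process, every Hive Context session in the same application ends up with the same user.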
>> 
> 
> 
