One way people have gotten around the lack of LDAP connectivity in HS2 has been to use Apache Knox. That project’s goal is to provide a single login capability for Hadoop-related projects so that users can tie their LDAP or Active Directory servers into Hadoop.

Alan.

> On Mar 8, 2016, at 16:00, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:
>
> The current scenario resembles a three tier architecture but without the
> security of second tier. In a typical three-tier you have users connecting to
> the application server (read Hive server2) who are independently authenticated
> and if OK, the second tier creates new .NET type or JDBC threads to connect
> to database much like multi-threading. The problem I believe is that Hive
> server 2 does not have that concept of handling the individual logins yet.
> Hive server 2 should be able to handle LDAP logins as well. It is a useful
> layer to have.
>
> Dr Mich Talebzadeh
>
> LinkedIn
> https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>
> http://talebzadehmich.wordpress.com
>
>
> On 8 March 2016 at 23:28, Alex <this.side.of.confus...@gmail.com> wrote:
> Yes, when creating a Hive Context a Hive Metastore client should be created
> with a user that the Spark application will talk to the *remote* Hive
> Metastore with. We would like to add a custom authorization plugin to our
> remote Hive Metastore to authorize the query requests that the spark
> application is submitting which would also add authorization for any other
> applications hitting the Hive Metastore. Furthermore we would like to extend
> this so that we can submit "jobs" to our Spark application that will allow us
> to run against the metastore as different users while leveraging the
> abilities of our spark cluster. But as you mentioned, only one login that
> connects to the Hive Metastore is shared among all HiveContext sessions.
>
> Likely the authentication would have to be completed either through a secured
> Hive Metastore (Kerberos) or by having the requests go through HiveServer2.
>
> --Alex
>
>
> On 3/8/2016 3:13 PM, Mich Talebzadeh wrote:
>> Hi,
>>
>> What do you mean by Hive Metastore Client? Are you referring to Hive server
>> login much like beeline?
>>
>> Spark uses hive-site.xml to get the details of Hive metastore and the login
>> to the metastore which could be any database. Mine is Oracle and as far as I
>> know even in Hive 2, hive-site.xml has an entry for
>> javax.jdo.option.ConnectionUserName that specifies username to use against
>> metastore database. These are all multi-threaded JDBC connections to the
>> database, the same login as shown below:
>>
>> LOGIN     SID/serial#  LOGGED IN S  HOST     OS PID        Client PID   PROGRAM          MEM/KB  Logical I/O  Physical I/O  ACT  INFO
>> --------  -----------  -----------  -------  ------------  -----------  ---------------  ------  -----------  ------------  ---  ----
>> HIVEUSER  67,6160      08/03 08:11  rhes564  oracle/20539  hduser/1234  JDBC Thin Clien  1,017   37           0             N
>> HIVEUSER  89,6421      08/03 08:11  rhes564  oracle/20541  hduser/1234  JDBC Thin Clien  1,081   528          0             N
>> HIVEUSER  112,561      08/03 10:45  rhes564  oracle/24624  hduser/1234  JDBC Thin Clien  889     37           0             N
>> HIVEUSER  131,8811     08/03 08:11  rhes564  oracle/20543  hduser/1234  JDBC Thin Clien  1,017   37           0             N
>> HIVEUSER  47,30114     08/03 10:45  rhes564  oracle/24626  hduser/1234  JDBC Thin Clien  1,017   37           0             N
>> HIVEUSER  170,8955     08/03 08:11  rhes564  oracle/20545  hduser/1234  JDBC Thin Clien  1,017   323          0             N
>>
>> As I understand what you are suggesting is that each Spark user uses a
>> different login to connect to Hive metastore. As of now there is only one
>> login that connects to Hive metastore shared among all
>>
>> 2016-03-08T23:08:01,890 INFO [pool-5-thread-72]: HiveMetaStore.audit
>> (HiveMetaStore.java:logAuditEvent(280)) - ugi=hduser ip=50.140.197.217
>> cmd=source:50.140.197.217 get_table : db=test tbl=t
>> 2016-03-08T23:18:10,432 INFO [pool-5-thread-81]: HiveMetaStore.audit
>> (HiveMetaStore.java:logAuditEvent(280)) - ugi=hduser ip=50.140.197.216
>> cmd=source:50.140.197.216 get_tables: db=asehadoop pat=.*
>>
>> And this is an entry in Hive log when connection is made through Zeppelin UI
>>
>> 2016-03-08T23:20:13,546 INFO [pool-5-thread-84]: metastore.HiveMetaStore
>> (HiveMetaStore.java:newRawStore(499)) - 84: Opening raw store with
>> implementation class:org.apache.hadoop.hive.metastore.ObjectStore
>> 2016-03-08T23:20:13,547 INFO [pool-5-thread-84]: metastore.ObjectStore
>> (ObjectStore.java:initialize(318)) - ObjectStore, initialize called
>> 2016-03-08T23:20:13,550 INFO [pool-5-thread-84]:
>> metastore.MetaStoreDirectSql (MetaStoreDirectSql.java:<init>(142)) - Using
>> direct SQL, underlying DB is ORACLE
>> 2016-03-08T23:20:13,550 INFO [pool-5-thread-84]: metastore.ObjectStore
>> (ObjectStore.java:setConf(301)) - Initialized ObjectStore
>>
>> I am not sure there is currently any such plan to have different logins
>> allowed to Hive Metastore. But it will add another level of security. Though
>> I am not sure how this would be authenticated.
>>
>> HTH
>>
>> Dr Mich Talebzadeh
>>
>> LinkedIn
>> https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>>
>> http://talebzadehmich.wordpress.com
>>
>>
>> On 8 March 2016 at 22:23, Alex F <this.side.of.confus...@gmail.com> wrote:
>> As of Spark 1.6.0 it is now possible to create new Hive Context sessions
>> sharing various components but right now the Hive Metastore Client is shared
>> amongst each new Hive Context Session.
>>
>> Are there any plans to create individual Metastore Clients for each Hive
>> Context?
>>
>> Related to the question above are there any plans to create an interface for
>> customizing the username that the Metastore Client uses to connect to the
>> Hive Metastore? Right now it either uses the user specified in an
>> environment variable or the application's process owner.
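
For anyone who wants to check the "one shared login" observation on their own metastore, here is a minimal sketch that groups HiveMetaStore.audit entries by the ugi they report. The two sample lines are the ones quoted earlier in this thread, rejoined onto single lines. The regular expression and the helper names (parse_audit, ips_by_user) are hypothetical and assume the audit format matches those two lines; adjust them if your log layout differs.

```python
import re

# Sample audit lines copied from this thread (line wrapping removed).
AUDIT_LOG = """\
2016-03-08T23:08:01,890 INFO [pool-5-thread-72]: HiveMetaStore.audit (HiveMetaStore.java:logAuditEvent(280)) - ugi=hduser ip=50.140.197.217 cmd=source:50.140.197.217 get_table : db=test tbl=t
2016-03-08T23:18:10,432 INFO [pool-5-thread-81]: HiveMetaStore.audit (HiveMetaStore.java:logAuditEvent(280)) - ugi=hduser ip=50.140.197.216 cmd=source:50.140.197.216 get_tables: db=asehadoop pat=.*
"""

# Assumed layout: "... HiveMetaStore.audit (...) - ugi=<user> ip=<addr> cmd=<rest>"
AUDIT_RE = re.compile(
    r"HiveMetaStore\.audit .*? - ugi=(?P<ugi>\S+)\s+ip=(?P<ip>\S+)\s+cmd=(?P<cmd>.*)$"
)


def parse_audit(text):
    """Return one dict (ugi, ip, cmd) per audit line; other lines are skipped."""
    events = []
    for line in text.splitlines():
        match = AUDIT_RE.search(line)
        if match:
            events.append(match.groupdict())
    return events


def ips_by_user(events):
    """Map each reported ugi to the set of client IPs seen for it."""
    seen = {}
    for event in events:
        seen.setdefault(event["ugi"], set()).add(event["ip"])
    return seen


events = parse_audit(AUDIT_LOG)
summary = ips_by_user(events)
for user, ips in sorted(summary.items()):
    print(user, sorted(ips))
```

On the two sample lines this reports a single user, hduser, for requests from two different client addresses, which is the behaviour Mich describes: the metastore sees one login regardless of who submitted the query.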