You might also enable Kerberos debug output in hadoop-env.sh:
# Extra Java runtime options. Empty by default.
export HADOOP_OPTS="$HADOOP_OPTS -Djava.net.preferIPv4Stack=true -Dsun.security.krb5.debug=true"
Then check that the principals are the same on the NameNode and DataNode;
you can confirm this on all nodes in hdfs-site.xml.
You can also make sure that all nodes in the cluster are kerberized in
core-site.xml (no auth by default):
<property>
<name>hadoop.security.authentication</name>
<value>kerberos</value>
<description>Set the authentication for the cluster. Valid values are:
simple or kerberos.
</description>
</property>
https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/SecureMode.html
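
If you want to script that core-site.xml check, here is a rough Scala sketch. It is my own helper, not a Hadoop API: CoreSiteCheck, readProperty and isKerberized are made-up names. It reads hadoop.security.authentication from the text of a single core-site.xml using only the JDK XML parser, so it ignores core-default.xml and any other overrides that Hadoop itself would apply when loading its configuration:

```scala
import java.io.ByteArrayInputStream
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Paths}
import javax.xml.parsers.DocumentBuilderFactory
import org.w3c.dom.Element

// Hypothetical helper (not part of Hadoop): inspects one *-site.xml file.
object CoreSiteCheck {

  // Trimmed text of the first child element with the given tag, if present.
  private def childText(parent: Element, tag: String): Option[String] = {
    val nodes = parent.getElementsByTagName(tag)
    if (nodes.getLength == 0) None else Some(nodes.item(0).getTextContent.trim)
  }

  /** Value of the first <property> whose <name> equals `name`, if any. */
  def readProperty(xml: String, name: String): Option[String] = {
    val builder = DocumentBuilderFactory.newInstance().newDocumentBuilder()
    val doc = builder.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
    val props = doc.getElementsByTagName("property")
    (0 until props.getLength)
      .map(i => props.item(i).asInstanceOf[Element])
      .find(p => childText(p, "name").contains(name))
      .flatMap(p => childText(p, "value"))
  }

  /** True only when this file explicitly sets the value to kerberos. */
  def isKerberized(xml: String): Boolean =
    readProperty(xml, "hadoop.security.authentication").exists(_.equalsIgnoreCase("kerberos"))

  def main(args: Array[String]): Unit = args.headOption match {
    case Some(path) =>
      // path: location of core-site.xml on this node
      val xml = new String(Files.readAllBytes(Paths.get(path)), StandardCharsets.UTF_8)
      println(s"$path kerberized = ${isKerberized(xml)}")
    case None =>
      println("usage: CoreSiteCheck <path-to-core-site.xml>")
  }
}
```

Run it on each node with the path to that node's core-site.xml as the only argument; a node that prints false is either missing the property or still set to simple.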
Best Regards
Frank
> On May 22, 2015, at 4:25 AM, Ted Yu <[email protected]> wrote:
>
> Can you share the exception(s) you encountered ?
>
> Thanks
>
>
>
> On May 22, 2015, at 12:33 AM, donhoff_h <[email protected]> wrote:
>
>> Hi,
>>
>> My modified code is listed below; I just added the SecurityUtil API call. I
>> don't know which property keys I should use, so I made up two of my own to
>> point to the keytab and the principal.
>>
>> object TestHBaseRead2 {
>> def main(args: Array[String]) {
>>
>> val conf = new SparkConf()
>> val sc = new SparkContext(conf)
>> val hbConf = HBaseConfiguration.create()
>> hbConf.set("dhao.keytab.file","//etc//spark//keytab//spark.user.keytab")
>> hbConf.set("dhao.user.principal","[email protected]")
>> SecurityUtil.login(hbConf,"dhao.keytab.file","dhao.user.principal")
>> val conn = ConnectionFactory.createConnection(hbConf)
>> val tbl = conn.getTable(TableName.valueOf("spark_t01"))
>> try {
>> val get = new Get(Bytes.toBytes("row01"))
>> val res = tbl.get(get)
>> println("result:"+res.toString)
>> }
>> finally {
>> tbl.close()
>> conn.close()
>> }
>>
>> val rdd = sc.parallelize(Array(1,2,3,4,5,6,7,8,9,10))
>> val v = rdd.sum()
>> println("Value="+v)
>> sc.stop()
>>
>> }
>> }
>>
>>
>> ------------------ Original Message ------------------
>> From: "yuzhihong";<[email protected]>;
>> Sent: Friday, May 22, 2015, 3:25 PM
>> To: "donhoff_h"<[email protected]>;
>> Cc: "Bill Q"<[email protected]>; "user"<[email protected]>;
>> Subject: Re: Re: How to use spark to access HBase with Security enabled
>>
>> Can you post the morning modified code ?
>>
>> Thanks
>>
>>
>>
>> On May 21, 2015, at 11:11 PM, donhoff_h <[email protected]> wrote:
>>
>>> Hi,
>>>
>>> Thanks very much for the reply. I have tried "SecurityUtil". I can see
>>> from the log that this statement executed successfully, but I still cannot
>>> pass HBase authentication. With more experiments, I found a new
>>> interesting scenario. If I run the program in yarn-client mode, the driver
>>> can pass authentication, but the executors cannot. If I run the
>>> program in yarn-cluster mode, neither the driver nor the executors can
>>> pass authentication. Can anybody give me some clue based on this info?
>>> Many thanks!
>>>
>>>
>>> ------------------ Original Message ------------------
>>> From: "yuzhihong";<[email protected]>;
>>> Sent: Friday, May 22, 2015, 5:29 AM
>>> To: "donhoff_h"<[email protected]>;
>>> Cc: "Bill Q"<[email protected]>; "user"<[email protected]>;
>>> Subject: Re: How to use spark to access HBase with Security enabled
>>>
>>> Are the worker nodes colocated with HBase region servers ?
>>>
>>> Were you running as hbase super user ?
>>>
>>> You may need to login, using code similar to the following:
>>> if (isSecurityEnabled()) {
>>>
>>> SecurityUtil.login(conf, fileConfKey, principalConfKey, localhost);
>>>
>>> }
>>>
>>>
>>> SecurityUtil is a Hadoop class.
>>>
>>>
>>>
>>> Cheers
>>>
>>>
>>> On Thu, May 21, 2015 at 1:58 AM, donhoff_h <[email protected]> wrote:
>>> Hi,
>>>
>>> Many thanks for the help. My Spark version is 1.3.0 too and I run it on
>>> YARN. Following your advice I have changed the configuration. Now my
>>> program can read hbase-site.xml correctly, and it can also authenticate
>>> with ZooKeeper successfully.
>>>
>>> But I have a new problem: my program still cannot pass HBase
>>> authentication. Have you or anybody else ever met this kind of
>>> situation? I used a keytab file to provide the principal. Since it can
>>> pass ZooKeeper authentication, I am sure the keytab file is OK,
>>> but it just cannot pass HBase authentication. The exception is
>>> listed below. Could you or anybody else help me? Many thanks again!
>>>
>>> ****************************Exception***************************
>>> 15/05/21 16:03:18 INFO zookeeper.ZooKeeper: Initiating client connection,
>>> connectString=bgdt02.dev.hrb:2181,bgdt01.dev.hrb:2181,bgdt03.dev.hrb:2181
>>> sessionTimeout=90000 watcher=hconnection-0x4e142a710x0,
>>> quorum=bgdt02.dev.hrb:2181,bgdt01.dev.hrb:2181,bgdt03.dev.hrb:2181,
>>> baseZNode=/hbase
>>> 15/05/21 16:03:18 INFO zookeeper.Login: successfully logged in.
>>> 15/05/21 16:03:18 INFO zookeeper.Login: TGT refresh thread started.
>>> 15/05/21 16:03:18 INFO client.ZooKeeperSaslClient: Client will use GSSAPI
>>> as SASL mechanism.
>>> 15/05/21 16:03:18 INFO zookeeper.ClientCnxn: Opening socket connection to
>>> server bgdt02.dev.hrb/130.1.9.98:2181. Will attempt to SASL-authenticate
>>> using Login Context section 'Client'
>>> 15/05/21 16:03:18 INFO zookeeper.ClientCnxn: Socket connection established
>>> to bgdt02.dev.hrb/130.1.9.98:2181, initiating session
>>> 15/05/21 16:03:18 INFO zookeeper.Login: TGT valid starting at: Thu
>>> May 21 16:03:18 CST 2015
>>> 15/05/21 16:03:18 INFO zookeeper.Login: TGT expires: Fri
>>> May 22 16:03:18 CST 2015
>>> 15/05/21 16:03:18 INFO zookeeper.Login: TGT refresh sleeping until: Fri May
>>> 22 11:43:32 CST 2015
>>> 15/05/21 16:03:18 INFO zookeeper.ClientCnxn: Session establishment complete
>>> on server bgdt02.dev.hrb/130.1.9.98:2181, sessionid = 0x24d46cb0ffd0020,
>>> negotiated timeout = 40000
>>> 15/05/21 16:03:18 WARN mapreduce.TableInputFormatBase: initializeTable
>>> called multiple times. Overwriting connection and table reference;
>>> TableInputFormatBase will not close these old references when done.
>>> 15/05/21 16:03:19 INFO util.RegionSizeCalculator: Calculating region sizes
>>> for table "ns_dev1:hd01".
>>> 15/05/21 16:03:19 WARN ipc.AbstractRpcClient: Exception encountered while
>>> connecting to the server : javax.security.sasl.SaslException: GSS initiate
>>> failed [Caused by GSSException: No valid credentials provided (Mechanism
>>> level: Failed to find any Kerberos tgt)]
>>> 15/05/21 16:03:19 ERROR ipc.AbstractRpcClient: SASL authentication failed.
>>> The most likely cause is missing or invalid credentials. Consider 'kinit'.
>>> javax.security.sasl.SaslException: GSS initiate failed [Caused by
>>> GSSException: No valid credentials provided (Mechanism level: Failed to
>>> find any Kerberos tgt)]
>>> at
>>> com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:212)
>>> at
>>> org.apache.hadoop.hbase.security.HBaseSaslRpcClient.saslConnect(HBaseSaslRpcClient.java:179)
>>> at
>>> org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.setupSaslConnection(RpcClientImpl.java:604)
>>> at
>>> org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.access$600(RpcClientImpl.java:153)
>>> at
>>> org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection$2.run(RpcClientImpl.java:730)
>>> at
>>> org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection$2.run(RpcClientImpl.java:727)
>>> at java.security.AccessController.doPrivileged(Native
>>> Method)
>>> at javax.security.auth.Subject.doAs(Subject.java:415)
>>> at
>>> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
>>> at
>>> org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.setupIOstreams(RpcClientImpl.java:727)
>>> at
>>> org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.writeRequest(RpcClientImpl.java:880)
>>> at
>>> org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.tracedWriteRequest(RpcClientImpl.java:849)
>>> at
>>> org.apache.hadoop.hbase.ipc.RpcClientImpl.call(RpcClientImpl.java:1173)
>>> at
>>> org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:216)
>>> at
>>> org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:300)
>>> at
>>> org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.scan(ClientProtos.java:31751)
>>> at
>>> org.apache.hadoop.hbase.client.ScannerCallable.openScanner(ScannerCallable.java:332)
>>> at
>>> org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:187)
>>> at
>>> org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:62)
>>> at
>>> org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:126)
>>> at
>>> org.apache.hadoop.hbase.client.ScannerCallableWithReplicas$RetryingRPC.call(ScannerCallableWithReplicas.java:294)
>>> at
>>> org.apache.hadoop.hbase.client.ScannerCallableWithReplicas$RetryingRPC.call(ScannerCallableWithReplicas.java:275)
>>> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>>> at
>>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>> at
>>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>> at java.lang.Thread.run(Thread.java:745)
>>>
>>> ***********************I also list my code below in case someone can give me
>>> some advice on it*************************
>>> object TestHBaseRead {
>>> def main(args: Array[String]) {
>>> val conf = new SparkConf()
>>> val sc = new SparkContext(conf)
>>> val hbConf = HBaseConfiguration.create(sc.hadoopConfiguration)
>>> val tbName = if(args.length==1) args(0) else "ns_dev1:hd01"
>>> hbConf.set(TableInputFormat.INPUT_TABLE,tbName)
>>> // I print the content of hbConf to check whether it read the correct
>>> // hbase-site.xml
>>> val it = hbConf.iterator()
>>> while(it.hasNext) {
>>> val e = it.next()
>>> println("Key="+ e.getKey +" Value="+e.getValue)
>>> }
>>>
>>> val rdd =
>>> sc.newAPIHadoopRDD(hbConf,classOf[TableInputFormat],classOf[ImmutableBytesWritable],classOf[Result])
>>> rdd.foreach(x=>{
>>> val key = x._1.toString
>>> val it = x._2.listCells().iterator()
>>> while(it.hasNext) {
>>> val c = it.next()
>>> val family = Bytes.toString(CellUtil.cloneFamily(c))
>>> val qualifier = Bytes.toString(CellUtil.cloneQualifier(c))
>>> val value = Bytes.toString(CellUtil.cloneValue(c))
>>> val tm = c.getTimestamp
>>> println("Key="+key+" Family="+family+" Qualifier="+qualifier+
>>> " Value="+value+" TimeStamp="+tm)
>>> }
>>> })
>>> sc.stop()
>>> }
>>> }
>>>
>>> ***************************I used the following command to run my
>>> program**********************
>>> spark-submit --class dhao.test.read.singleTable.TestHBaseRead \
>>>   --master yarn-cluster \
>>>   --driver-java-options "-Djava.security.auth.login.config=/home/spark/spark-hbase.jaas -Djava.security.krb5.conf=/etc/krb5.conf" \
>>>   --conf spark.executor.extraJavaOptions="-Djava.security.auth.login.config=/home/spark/spark-hbase.jaas -Djava.security.krb5.conf=/etc/krb5.conf" \
>>>   /home/spark/myApps/TestHBase.jar
>>>
>>> ------------------ Original Message ------------------
>>> From: "Bill Q";<[email protected]>;
>>> Sent: Wednesday, May 20, 2015, 10:13 PM
>>> To: "donhoff_h"<[email protected]>;
>>> Cc: "yuzhihong"<[email protected]>; "user"<[email protected]>;
>>> Subject: Re: How to use spark to access HBase with Security enabled
>>>
>>> I have a similar problem: I can no longer pass the HBase configuration file
>>> as an extra classpath to Spark using
>>> spark.executor.extraClassPath=MY_HBASE_CONF_DIR in Spark 1.3. We used
>>> to run this in 1.2 without any problem.
>>>
>>> On Tuesday, May 19, 2015, donhoff_h <[email protected]> wrote:
>>>
>>> Sorry, this reference does not help me. I have already set up the
>>> configuration in hbase-site.xml, but it seems some extra configuration
>>> still has to be set, or some API called, before my Spark program can
>>> authenticate with HBase.
>>>
>>> Does anybody know how to set up authentication to a secured HBase in a Spark
>>> program which uses the API "newAPIHadoopRDD" to get information from HBase?
>>>
>>> Many Thanks!
>>>
>>> ------------------ Original Message ------------------
>>> From: "yuzhihong";<[email protected]>;
>>> Sent: Tuesday, May 19, 2015, 9:54 PM
>>> To: "donhoff_h"<[email protected]>;
>>> Cc: "user"<[email protected]>;
>>> Subject: Re: How to use spark to access HBase with Security enabled
>>>
>>> Please take a look at:
>>> http://hbase.apache.org/book.html#_client_side_configuration_for_secure_operation
>>>
>>> Cheers
>>>
>>> On Tue, May 19, 2015 at 5:23 AM, donhoff_h <[email protected]> wrote:
>>>
>>> The principal is [email protected]. It is the user that I used to run my
>>> spark programs. I am sure I have run the kinit command to make it take
>>> effect. And I also used the HBase Shell to verify that this user has the
>>> right to scan and put the tables in HBase.
>>>
>>> Now I still have no idea how to solve this problem. Can anybody help me to
>>> figure it out? Many Thanks!
>>>
>>> ------------------ Original Message ------------------
>>> From: "yuzhihong";<[email protected]>;
>>> Sent: Tuesday, May 19, 2015, 7:55 PM
>>> To: "donhoff_h"<[email protected]>;
>>> Cc: "user"<[email protected]>;
>>> Subject: Re: How to use spark to access HBase with Security enabled
>>>
>>> Which user did you run your program as ?
>>>
>>> Have you granted proper permission on hbase side ?
>>>
>>> You should also check master log to see if there was some clue.
>>>
>>> Cheers
>>>
>>>
>>>
>>> On May 19, 2015, at 2:41 AM, donhoff_h <[email protected]> wrote:
>>>
>>>> Hi, experts.
>>>>
>>>> I ran the "HBaseTest" program which is an example from the Apache Spark
>>>> source code to learn how to use spark to access HBase. But I met the
>>>> following exception:
>>>> Exception in thread "main"
>>>> org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after
>>>> attempts=36, exceptions:
>>>> Tue May 19 16:59:11 CST 2015, null, java.net.SocketTimeoutException:
>>>> callTimeout=60000, callDuration=68648: row 'spark_t01,,00000000000000' on
>>>> table 'hbase:meta' at region=hbase:meta,,1.1588230740,
>>>> hostname=bgdt01.dev.hrb,16020,1431412877700, seqNum=0
>>>>
>>>> I also checked the RegionServer Log of the host "bgdt01.dev.hrb" listed in
>>>> the above exception. I found a few entries like the following one:
>>>> 2015-05-19 16:59:11,143 DEBUG
>>>> [RpcServer.reader=2,bindAddress=bgdt01.dev.hrb,port=16020] ipc.RpcServer:
>>>> RpcServer.listener,port=16020: Caught exception while
>>>> reading:Authentication is required
>>>>
>>>> The above entry did not point to my program clearly, but the time is very
>>>> close. Since my HBase version is 1.0.0 and I have security enabled, I
>>>> suspect the exception was caused by Kerberos authentication. But I am
>>>> not sure.
>>>>
>>>> Does anybody know if my guess is right? And if I am right, could anybody
>>>> tell me how to set up Kerberos authentication in a Spark program? I don't
>>>> know how to do it. I already checked the API doc, but did not find any
>>>> useful API. Many thanks!
>>>>
>>>> By the way, my spark version is 1.3.0. I also paste the code of
>>>> "HBaseTest" in the following:
>>>> ***************************Source Code******************************
>>>> object HBaseTest {
>>>> def main(args: Array[String]) {
>>>> val sparkConf = new SparkConf().setAppName("HBaseTest")
>>>> val sc = new SparkContext(sparkConf)
>>>> val conf = HBaseConfiguration.create()
>>>> conf.set(TableInputFormat.INPUT_TABLE, args(0))
>>>>
>>>> // Initialize hBase table if necessary
>>>> val admin = new HBaseAdmin(conf)
>>>> if (!admin.isTableAvailable(args(0))) {
>>>> val tableDesc = new HTableDescriptor(args(0))
>>>> admin.createTable(tableDesc)
>>>> }
>>>>
>>>> val hBaseRDD = sc.newAPIHadoopRDD(conf, classOf[TableInputFormat],
>>>> classOf[org.apache.hadoop.hbase.io.ImmutableBytesWritable],
>>>> classOf[org.apache.hadoop.hbase.client.Result])
>>>>
>>>> hBaseRDD.count()
>>>>
>>>> sc.stop()
>>>> }
>>>> }
>>>>
>>>
>>>
>>>
>>> --
>>> Many thanks.
>>>
>>>
>>> Bill
>>>
>>>