You might also enable Kerberos debug output in hadoop-env.sh:

# Extra Java runtime options. Empty by default.
export HADOOP_OPTS="$HADOOP_OPTS -Djava.net.preferIPv4Stack=true -Dsun.security.krb5.debug=true"

and check that the principals are the same on the NameNode and DataNode; you can confirm this on all nodes in hdfs-site.xml. You can also ensure all nodes in the cluster are kerberized in core-site.xml (authentication is "simple", i.e. none, by default):

<property>
  <name>hadoop.security.authentication</name>
  <value>kerberos</value>
  <description>Set the authentication for the cluster.
  Valid values are: simple or kerberos.</description>
</property>

https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/SecureMode.html
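If it helps, here is a minimal diagnostic sketch for printing what a driver JVM actually sees for Hadoop security. It uses only the standard UserGroupInformation API and assumes core-site.xml / hdfs-site.xml are on the classpath; the class name and output labels are just illustrative:

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.security.UserGroupInformation

object CheckHadoopSecurity {
  def main(args: Array[String]): Unit = {
    val conf = new Configuration()               // picks up core-site.xml / hdfs-site.xml from the classpath
    UserGroupInformation.setConfiguration(conf)  // tell UGI which configuration to consult

    // "kerberos" means the cluster expects Kerberos; "simple" means no authentication.
    println("hadoop.security.authentication = " + conf.get("hadoop.security.authentication", "simple"))
    println("security enabled = " + UserGroupInformation.isSecurityEnabled)

    // Shows which user/principal this JVM is currently logged in as.
    val ugi = UserGroupInformation.getCurrentUser
    println("current user = " + ugi.getUserName)
    println("has Kerberos credentials = " + ugi.hasKerberosCredentials)
  }
}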
Best Regards
Frank

> On May 22, 2015, at 4:25 AM, Ted Yu <yuzhih...@gmail.com> wrote:
>
> Can you share the exception(s) you encountered ?
>
> Thanks
>
> On May 22, 2015, at 12:33 AM, donhoff_h <165612...@qq.com> wrote:
>
>> Hi,
>>
>> My modified code is listed below; I just added the SecurityUtil API. I don't know which property keys I should use, so I made up two of my own to point to the keytab and principal.
>>
>> object TestHBaseRead2 {
>>   def main(args: Array[String]) {
>>     val conf = new SparkConf()
>>     val sc = new SparkContext(conf)
>>     val hbConf = HBaseConfiguration.create()
>>     hbConf.set("dhao.keytab.file", "/etc/spark/keytab/spark.user.keytab")
>>     hbConf.set("dhao.user.principal", "sp...@bgdt.dev.hrb")
>>     SecurityUtil.login(hbConf, "dhao.keytab.file", "dhao.user.principal")
>>     val conn = ConnectionFactory.createConnection(hbConf)
>>     val tbl = conn.getTable(TableName.valueOf("spark_t01"))
>>     try {
>>       val get = new Get(Bytes.toBytes("row01"))
>>       val res = tbl.get(get)
>>       println("result:" + res.toString)
>>     } finally {
>>       tbl.close()
>>       conn.close()
>>     }
>>
>>     val rdd = sc.parallelize(Array(1,2,3,4,5,6,7,8,9,10))
>>     val v = rdd.sum()
>>     println("Value=" + v)
>>     sc.stop()
>>   }
>> }
>>
>> ------------------ Original Message ------------------
>> From: "yuzhihong" <yuzhih...@gmail.com>
>> Sent: Friday, May 22, 2015, 3:25 PM
>> To: "donhoff_h" <165612...@qq.com>
>> Cc: "Bill Q" <bill.q....@gmail.com>; "user" <user@spark.apache.org>
>> Subject: Re: Re: How to use spark to access HBase with Security enabled
>>
>> Can you post the modified code ?
>>
>> Thanks
>>
>> On May 21, 2015, at 11:11 PM, donhoff_h <165612...@qq.com> wrote:
>>
>>> Hi,
>>>
>>> Thanks very much for the reply. I have tried the "SecurityUtil". I can see from the log that this statement executed successfully, but I still cannot pass HBase authentication. With more experiments I found an interesting new scenario: if I run the program in yarn-client mode, the driver can pass the authentication but the executors cannot; if I run it in yarn-cluster mode, neither the driver nor the executors can. Can anybody give me a clue based on this? Many thanks!
>>>
>>> ------------------ Original Message ------------------
>>> From: "yuzhihong" <yuzhih...@gmail.com>
>>> Sent: Friday, May 22, 2015, 5:29 AM
>>> To: "donhoff_h" <165612...@qq.com>
>>> Cc: "Bill Q" <bill.q....@gmail.com>; "user" <user@spark.apache.org>
>>> Subject: Re: How to use spark to access HBase with Security enabled
>>>
>>> Are the worker nodes colocated with HBase region servers ?
>>>
>>> Were you running as the hbase super user ?
>>>
>>> You may need to log in, using code similar to the following:
>>>
>>>   if (isSecurityEnabled()) {
>>>     SecurityUtil.login(conf, fileConfKey, principalConfKey, localhost);
>>>   }
>>>
>>> SecurityUtil is a Hadoop class.
>>>
>>> Cheers
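On the yarn-client observation above (driver authenticates, executors do not): each executor JVM needs its own Kerberos login before it touches HBase. Below is a minimal sketch of doing the login inside the executors, assuming the keytab is present on every node's local filesystem (for example shipped with --files or pre-installed at the same path). The keytab path and principal are the ones quoted in this thread and are only illustrative, and the row keys are made up:

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
import org.apache.hadoop.hbase.client.{ConnectionFactory, Get}
import org.apache.hadoop.hbase.util.Bytes
import org.apache.hadoop.security.UserGroupInformation

object ExecutorSideLogin {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("ExecutorSideLogin"))
    val rowKeys = sc.parallelize(Seq("row01", "row02"))  // illustrative row keys

    rowKeys.foreachPartition { iter =>
      // Log this executor JVM in from the keytab before creating the HBase connection.
      // Principal and path are taken from the thread and must match what the cluster uses.
      UserGroupInformation.loginUserFromKeytab(
        "sp...@bgdt.dev.hrb", "/etc/spark/keytab/spark.user.keytab")

      val conn = ConnectionFactory.createConnection(HBaseConfiguration.create())
      val tbl = conn.getTable(TableName.valueOf("spark_t01"))
      try {
        iter.foreach(k => println(tbl.get(new Get(Bytes.toBytes(k)))))
      } finally {
        tbl.close()
        conn.close()
      }
    }
    sc.stop()
  }
}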
>>> On Thu, May 21, 2015 at 1:58 AM, donhoff_h <165612...@qq.com> wrote:
>>>
>>> Hi,
>>>
>>> Many thanks for the help. My Spark version is 1.3.0 too, and I run it on YARN. Following your advice I have changed the configuration. Now my program reads hbase-site.xml correctly, and it also authenticates with ZooKeeper successfully.
>>>
>>> But I have a new problem: my program still cannot pass HBase authentication. Have you or anybody else ever met this kind of situation? I used a keytab file to provide the principal. Since it can pass ZooKeeper authentication, I am sure the keytab file is OK, but it just cannot pass HBase authentication. The exception is listed below; could you or anybody else help me? Still many, many thanks!
>>>
>>> ****************************Exception***************************
>>> 15/05/21 16:03:18 INFO zookeeper.ZooKeeper: Initiating client connection, connectString=bgdt02.dev.hrb:2181,bgdt01.dev.hrb:2181,bgdt03.dev.hrb:2181 sessionTimeout=90000 watcher=hconnection-0x4e142a710x0, quorum=bgdt02.dev.hrb:2181,bgdt01.dev.hrb:2181,bgdt03.dev.hrb:2181, baseZNode=/hbase
>>> 15/05/21 16:03:18 INFO zookeeper.Login: successfully logged in.
>>> 15/05/21 16:03:18 INFO zookeeper.Login: TGT refresh thread started.
>>> 15/05/21 16:03:18 INFO client.ZooKeeperSaslClient: Client will use GSSAPI as SASL mechanism.
>>> 15/05/21 16:03:18 INFO zookeeper.ClientCnxn: Opening socket connection to server bgdt02.dev.hrb/130.1.9.98:2181. Will attempt to SASL-authenticate using Login Context section 'Client'
>>> 15/05/21 16:03:18 INFO zookeeper.ClientCnxn: Socket connection established to bgdt02.dev.hrb/130.1.9.98:2181, initiating session
>>> 15/05/21 16:03:18 INFO zookeeper.Login: TGT valid starting at:      Thu May 21 16:03:18 CST 2015
>>> 15/05/21 16:03:18 INFO zookeeper.Login: TGT expires:                Fri May 22 16:03:18 CST 2015
>>> 15/05/21 16:03:18 INFO zookeeper.Login: TGT refresh sleeping until: Fri May 22 11:43:32 CST 2015
>>> 15/05/21 16:03:18 INFO zookeeper.ClientCnxn: Session establishment complete on server bgdt02.dev.hrb/130.1.9.98:2181, sessionid = 0x24d46cb0ffd0020, negotiated timeout = 40000
>>> 15/05/21 16:03:18 WARN mapreduce.TableInputFormatBase: initializeTable called multiple times. Overwriting connection and table reference; TableInputFormatBase will not close these old references when done.
>>> 15/05/21 16:03:19 INFO util.RegionSizeCalculator: Calculating region sizes for table "ns_dev1:hd01".
>>> 15/05/21 16:03:19 WARN ipc.AbstractRpcClient: Exception encountered while connecting to the server : javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
>>> 15/05/21 16:03:19 ERROR ipc.AbstractRpcClient: SASL authentication failed. The most likely cause is missing or invalid credentials. Consider 'kinit'.
>>> javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
>>>         at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:212)
>>>         at org.apache.hadoop.hbase.security.HBaseSaslRpcClient.saslConnect(HBaseSaslRpcClient.java:179)
>>>         at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.setupSaslConnection(RpcClientImpl.java:604)
>>>         at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.access$600(RpcClientImpl.java:153)
>>>         at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection$2.run(RpcClientImpl.java:730)
>>>         at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection$2.run(RpcClientImpl.java:727)
>>>         at java.security.AccessController.doPrivileged(Native Method)
>>>         at javax.security.auth.Subject.doAs(Subject.java:415)
>>>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
>>>         at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.setupIOstreams(RpcClientImpl.java:727)
>>>         at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.writeRequest(RpcClientImpl.java:880)
>>>         at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.tracedWriteRequest(RpcClientImpl.java:849)
>>>         at org.apache.hadoop.hbase.ipc.RpcClientImpl.call(RpcClientImpl.java:1173)
>>>         at org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:216)
>>>         at org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:300)
>>>         at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.scan(ClientProtos.java:31751)
>>>         at org.apache.hadoop.hbase.client.ScannerCallable.openScanner(ScannerCallable.java:332)
>>>         at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:187)
>>>         at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:62)
>>>         at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:126)
>>>         at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas$RetryingRPC.call(ScannerCallableWithReplicas.java:294)
>>>         at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas$RetryingRPC.call(ScannerCallableWithReplicas.java:275)
>>>         at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>>>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>         at java.lang.Thread.run(Thread.java:745)
>>>
>>> *********************** I also list my code below in case someone can give me some advice on it ***********************
>>> object TestHBaseRead {
>>>   def main(args: Array[String]) {
>>>     val conf = new SparkConf()
>>>     val sc = new SparkContext(conf)
>>>     val hbConf = HBaseConfiguration.create(sc.hadoopConfiguration)
>>>     val tbName = if (args.length == 1) args(0) else "ns_dev1:hd01"
>>>     hbConf.set(TableInputFormat.INPUT_TABLE, tbName)
>>>     // I print the content of hbConf to check if it read the correct hbase-site.xml
>>>     val it = hbConf.iterator()
>>>     while (it.hasNext) {
>>>       val e = it.next()
>>>       println("Key=" + e.getKey + " Value=" + e.getValue)
>>>     }
>>>
>>>     val rdd = sc.newAPIHadoopRDD(hbConf, classOf[TableInputFormat], classOf[ImmutableBytesWritable], classOf[Result])
>>>     rdd.foreach(x => {
>>>       val key = x._1.toString
>>>       val it = x._2.listCells().iterator()
>>>       while (it.hasNext) {
>>>         val c = it.next()
>>>         val family = Bytes.toString(CellUtil.cloneFamily(c))
>>>         val qualifier = Bytes.toString(CellUtil.cloneQualifier(c))
>>>         val value = Bytes.toString(CellUtil.cloneValue(c))
>>>         val tm = c.getTimestamp
>>>         println("Key=" + key + " Family=" + family + " Qualifier=" + qualifier + " Value=" + value + " TimeStamp=" + tm)
>>>       }
>>>     })
>>>     sc.stop()
>>>   }
>>> }
>>>
>>> *************************** I used the following command to run my program **********************
>>> spark-submit --class dhao.test.read.singleTable.TestHBaseRead \
>>>   --master yarn-cluster \
>>>   --driver-java-options "-Djava.security.auth.login.config=/home/spark/spark-hbase.jaas -Djava.security.krb5.conf=/etc/krb5.conf" \
>>>   --conf spark.executor.extraJavaOptions="-Djava.security.auth.login.config=/home/spark/spark-hbase.jaas -Djava.security.krb5.conf=/etc/krb5.conf" \
>>>   /home/spark/myApps/TestHBase.jar
>>>
>>> ------------------ Original Message ------------------
>>> From: "Bill Q" <bill.q....@gmail.com>
>>> Sent: Wednesday, May 20, 2015, 10:13 PM
>>> To: "donhoff_h" <165612...@qq.com>
>>> Cc: "yuzhihong" <yuzhih...@gmail.com>; "user" <user@spark.apache.org>
>>> Subject: Re: How to use spark to access HBase with Security enabled
>>>
>>> I have a similar problem: in Spark 1.3 I can no longer pass the HBase configuration directory to Spark as an extra classpath entry using spark.executor.extraClassPath=MY_HBASE_CONF_DIR. We used to run this in 1.2 without any problem.
>>>
>>> On Tuesday, May 19, 2015, donhoff_h <165612...@qq.com> wrote:
>>>
>>> Sorry, this ref does not help me. I have already set up the configuration in hbase-site.xml, but it seems there are still some extra configurations to set, or APIs to call, to make my Spark program pass authentication with HBase.
>>>
>>> Does anybody know how to authenticate to a secured HBase in a Spark program that uses the "newAPIHadoopRDD" API to read from HBase?
>>>
>>> Many thanks!
>>>
>>> ------------------ Original Message ------------------
>>> From: "yuzhihong" <yuzhih...@gmail.com>
>>> Sent: Tuesday, May 19, 2015, 9:54 PM
>>> To: "donhoff_h" <165612...@qq.com>
>>> Cc: "user" <user@spark.apache.org>
>>> Subject: Re: How to use spark to access HBase with Security enabled
>>>
>>> Please take a look at:
>>> http://hbase.apache.org/book.html#_client_side_configuration_for_secure_operation
>>>
>>> Cheers
>>>
>>> On Tue, May 19, 2015 at 5:23 AM, donhoff_h <165612...@qq.com> wrote:
>>>
>>> The principal is sp...@bgdt.dev.hrb. It is the user I used to run my Spark programs. I am sure I have run the kinit command to make it take effect, and I also used the HBase shell to verify that this user has the right to scan and put the tables in HBase.
>>>
>>> I still have no idea how to solve this problem. Can anybody help me figure it out? Many thanks!
>>>
>>> ------------------ Original Message ------------------
>>> From: "yuzhihong" <yuzhih...@gmail.com>
>>> Sent: Tuesday, May 19, 2015, 7:55 PM
>>> To: "donhoff_h" <165612...@qq.com>
>>> Cc: "user" <user@spark.apache.org>
>>> Subject: Re: How to use spark to access HBase with Security enabled
>>>
>>> Which user did you run your program as ?
>>>
>>> Have you granted the proper permissions on the HBase side ?
>>>
>>> You should also check the master log to see if there is some clue.
>>>
>>> Cheers
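On the client-side configuration covered by the book link above: when hbase-site.xml is not reliably on the driver and executor classpath (as in the extraClassPath issue mentioned earlier), the same settings can be applied programmatically. A minimal sketch follows; the property names are standard HBase/Hadoop keys, the principals are placeholders for whatever the cluster really uses, and the ZooKeeper quorum is taken from the log earlier in this thread:

import org.apache.hadoop.hbase.HBaseConfiguration
import org.apache.spark.{SparkConf, SparkContext}

object SecureHBaseConf {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("SecureHBaseConf"))

    // Start from the Hadoop configuration Spark already loaded, then layer the HBase client settings on top.
    val hbConf = HBaseConfiguration.create(sc.hadoopConfiguration)
    hbConf.set("hbase.security.authentication", "kerberos")
    hbConf.set("hadoop.security.authentication", "kerberos")
    // Placeholder principals; replace REALM with the cluster's actual Kerberos realm.
    hbConf.set("hbase.master.kerberos.principal", "hbase/_HOST@REALM")
    hbConf.set("hbase.regionserver.kerberos.principal", "hbase/_HOST@REALM")
    // Quorum as seen in the ZooKeeper log earlier in this thread.
    hbConf.set("hbase.zookeeper.quorum", "bgdt01.dev.hrb,bgdt02.dev.hrb,bgdt03.dev.hrb")
    hbConf.set("hbase.zookeeper.property.clientPort", "2181")

    // hbConf can then be passed to newAPIHadoopRDD / ConnectionFactory as in the examples above.
    sc.stop()
  }
}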
>>> On May 19, 2015, at 2:41 AM, donhoff_h <165612...@qq.com> wrote:
>>>
>>>> Hi, experts.
>>>>
>>>> I ran the "HBaseTest" program, which is an example from the Apache Spark source code, to learn how to use Spark to access HBase. But I met the following exception:
>>>>
>>>> Exception in thread "main" org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=36, exceptions:
>>>> Tue May 19 16:59:11 CST 2015, null, java.net.SocketTimeoutException: callTimeout=60000, callDuration=68648: row 'spark_t01,,00000000000000' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, hostname=bgdt01.dev.hrb,16020,1431412877700, seqNum=0
>>>>
>>>> I also checked the RegionServer log of the host "bgdt01.dev.hrb" listed in the above exception. I found a few entries like the following one:
>>>>
>>>> 2015-05-19 16:59:11,143 DEBUG [RpcServer.reader=2,bindAddress=bgdt01.dev.hrb,port=16020] ipc.RpcServer: RpcServer.listener,port=16020: Caught exception while reading:Authentication is required
>>>>
>>>> The above entry does not point to my program explicitly, but the timestamp is very close. Since my HBase version is 1.0.0 and security is enabled, I suspect the exception was caused by Kerberos authentication, but I am not sure.
>>>>
>>>> Does anybody know if my guess is right? And if it is, could anybody tell me how to set up Kerberos authentication in a Spark program? I don't know how to do it. I have already checked the API docs but did not find any useful API. Many thanks!
>>>>
>>>> By the way, my Spark version is 1.3.0. I also paste the code of "HBaseTest" below:
>>>>
>>>> *************************** Source Code ******************************
>>>> object HBaseTest {
>>>>   def main(args: Array[String]) {
>>>>     val sparkConf = new SparkConf().setAppName("HBaseTest")
>>>>     val sc = new SparkContext(sparkConf)
>>>>     val conf = HBaseConfiguration.create()
>>>>     conf.set(TableInputFormat.INPUT_TABLE, args(0))
>>>>
>>>>     // Initialize hBase table if necessary
>>>>     val admin = new HBaseAdmin(conf)
>>>>     if (!admin.isTableAvailable(args(0))) {
>>>>       val tableDesc = new HTableDescriptor(args(0))
>>>>       admin.createTable(tableDesc)
>>>>     }
>>>>
>>>>     val hBaseRDD = sc.newAPIHadoopRDD(conf, classOf[TableInputFormat],
>>>>       classOf[org.apache.hadoop.hbase.io.ImmutableBytesWritable],
>>>>       classOf[org.apache.hadoop.hbase.client.Result])
>>>>
>>>>     hBaseRDD.count()
>>>>
>>>>     sc.stop()
>>>>   }
>>>> }
>>>
>>>
>>> --
>>> Many thanks.
>>>
>>> Bill
>>>

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org