2015-11-04 10:03:31,905 ERROR [Delegation Token Refresh Thread-0]
hdfs.KeyProviderCache (KeyProviderCache.java:createKeyProviderURI(87)) -
Could not find uri with key [dfs.encryption.key.provider.uri] to
create a keyProvider !!

Could it be related to HDFS-7931?
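If I read Chen's explanation below correctly, the renewer thread runs as the keytab principal while the staging directory belongs to the proxied user with mode drwx------, so a plain owner/group/other check denies the read. A minimal, hypothetical Python model of that check (the names and the function are illustrative only, not Hadoop's actual code):

```python
# Hypothetical model of a POSIX-style permission check, to illustrate why
# user "principal" cannot READ_EXECUTE a staging dir owned by "proxy-user"
# with mode drwx------ (0o700). Illustrative only, not Hadoop code.
from collections import namedtuple

Inode = namedtuple("Inode", ["owner", "group", "mode"])

READ, WRITE, EXECUTE = 4, 2, 1

def has_access(inode, user, groups, wanted):
    """Return True if `user` (member of `groups`) has the `wanted` perm bits."""
    if user == inode.owner:
        bits = (inode.mode >> 6) & 7   # owner bits
    elif inode.group in groups:
        bits = (inode.mode >> 3) & 7   # group bits
    else:
        bits = inode.mode & 7          # "other" bits
    return bits & wanted == wanted

staging = Inode(owner="proxy-user", group="proxy-user", mode=0o700)

# The owner can list the dir (READ|EXECUTE) ...
print(has_access(staging, "proxy-user", {"proxy-user"}, READ | EXECUTE))  # True
# ... but the keytab principal, being a different user, cannot:
print(has_access(staging, "principal", {"principal"}, READ | EXECUTE))    # False
```

Under this model any mode-0700 directory is unreadable to every user except its owner, which matches the AccessControlException in the quoted log.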

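Also, the repeating frames in the StackOverflowError trace at the bottom of the thread suggest that a failed token fetch retries by calling back into updateCredentialsIfRequired on the same stack rather than re-scheduling a timer task. A simplified, purely illustrative Python sketch of that failure mode (none of these names are Spark's actual code):

```python
import sys

def fetch_tokens():
    # Stand-in for the HDFS call; here it always fails, the way every
    # attempt fails while the client is pinned to a standby namenode.
    raise IOError("Operation category READ is not supported in state standby")

def update_credentials_if_required():
    """Illustrative only: retrying by direct recursion grows the stack."""
    try:
        fetch_tokens()
    except IOError:
        # Bug in this sketch: the "retry later" is a synchronous recursive
        # call instead of a re-scheduled task, so every failed attempt adds
        # stack frames until the stack is exhausted.
        update_credentials_if_required()

sys.setrecursionlimit(200)  # keep the demo fast
try:
    update_credentials_if_required()
except RecursionError:
    print("stack exhausted: Python's analogue of StackOverflowError")
```

That would explain why the same three or four frames repeat thousands of times in the trace before the overflow.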
On Wed, Nov 4, 2015 at 12:30 PM, Chen Song <chen.song...@gmail.com> wrote:

> After a bit more investigation, I found that it could be related to
> impersonation on a kerberized cluster.
>
> Our job is started with the following command.
>
> /usr/lib/spark/bin/spark-submit --master yarn-client --principal [principal] \
>   --keytab [keytab] --proxy-user [proxied_user] ...
>
>
> In application master's log,
>
> At start up,
>
> 2015-11-03 16:03:41,602 INFO  [main] yarn.AMDelegationTokenRenewer 
> (Logging.scala:logInfo(59)) - Scheduling login from keytab in 64789744 millis.
>
> Later on, when the delegation token renewer thread kicks in, it re-logs in
> with the specified principal using new credentials and tries to write those
> credentials into the directory where the current user's credentials are
> stored. However, with impersonation, the current user is a different user
> from the principal, so the write fails with a permission error.
>
> 2015-11-04 10:03:31,366 INFO  [Delegation Token Refresh Thread-0] 
> yarn.AMDelegationTokenRenewer (Logging.scala:logInfo(59)) - Attempting to 
> login to KDC using principal: principal/host@domain
> 2015-11-04 10:03:31,665 INFO  [Delegation Token Refresh Thread-0] 
> yarn.AMDelegationTokenRenewer (Logging.scala:logInfo(59)) - Successfully 
> logged into KDC.
> 2015-11-04 10:03:31,702 INFO  [Delegation Token Refresh Thread-0] 
> yarn.YarnSparkHadoopUtil (Logging.scala:logInfo(59)) - getting token for 
> namenode: 
> hdfs://hadoop_abc/user/proxied_user/.sparkStaging/application_1443481003186_00000
> 2015-11-04 10:03:31,904 INFO  [Delegation Token Refresh Thread-0] 
> hdfs.DFSClient (DFSClient.java:getDelegationToken(1025)) - Created 
> HDFS_DELEGATION_TOKEN token 389283 for principal on ha-hdfs:hadoop_abc
> 2015-11-04 10:03:31,905 ERROR [Delegation Token Refresh Thread-0] 
> hdfs.KeyProviderCache (KeyProviderCache.java:createKeyProviderURI(87)) - 
> Could not find uri with key [dfs.encryption.key.provider.uri] to create a 
> keyProvider !!
> 2015-11-04 10:03:31,944 WARN  [Delegation Token Refresh Thread-0] 
> security.UserGroupInformation (UserGroupInformation.java:doAs(1674)) - 
> PriviledgedActionException as:proxy-user (auth:SIMPLE) 
> cause:org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException):
>  Operation category READ is not supported in state standby
> 2015-11-04 10:03:31,945 WARN  [Delegation Token Refresh Thread-0] ipc.Client 
> (Client.java:run(675)) - Exception encountered while connecting to the server 
> : 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException):
>  Operation category READ is not supported in state standby
> 2015-11-04 10:03:31,945 WARN  [Delegation Token Refresh Thread-0] 
> security.UserGroupInformation (UserGroupInformation.java:doAs(1674)) - 
> PriviledgedActionException as:proxy-user (auth:SIMPLE) 
> cause:org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException):
>  Operation category READ is not supported in state standby
> 2015-11-04 10:03:31,963 WARN  [Delegation Token Refresh Thread-0] 
> yarn.YarnSparkHadoopUtil (Logging.scala:logWarning(92)) - Error while 
> attempting to list files from application staging dir
> org.apache.hadoop.security.AccessControlException: Permission denied: 
> user=principal, access=READ_EXECUTE, 
> inode="/user/proxy-user/.sparkStaging/application_1443481003186_00000":proxy-user:proxy-user:drwx------
>
>
> Can someone confirm that my understanding is right? The relevant class is
> here:
> https://github.com/apache/spark/blob/master/yarn/src/main/scala/org/apache/spark/deploy/yarn/AMDelegationTokenRenewer.scala
>
> Chen
>
> On Tue, Nov 3, 2015 at 11:57 AM, Chen Song <chen.song...@gmail.com> wrote:
>
>> We saw the following error happening in a Spark Streaming job. Our job is
>> running on YARN with kerberos enabled.
>>
>> First, the warnings below were printed out. I only pasted a few, but they
>> were repeated hundreds or thousands of times.
>>
>> 15/11/03 14:43:07 WARN UserGroupInformation: PriviledgedActionException
>> as:[kerberos principal] (auth:KERBEROS)
>> cause:org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException):
>> Operation category READ is not supported in state standby
>> 15/11/03 14:43:07 WARN Client: Exception encountered while connecting to
>> the server :
>> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException):
>> Operation category READ is not supported in state standby
>> 15/11/03 14:43:07 WARN UserGroupInformation: PriviledgedActionException
>> as:[kerberos principal] (auth:KERBEROS)
>> cause:org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException):
>> Operation category READ is not supported in state standby
>> 15/11/03 14:43:07 WARN UserGroupInformation: PriviledgedActionException
>> as:[kerberos principal] (auth:KERBEROS)
>> cause:org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException):
>> Operation category READ is not supported in state standby
>> 15/11/03 14:43:07 WARN Client: Exception encountered while connecting to
>> the server :
>> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException):
>> Operation category READ is not supported in state standby
>>
>>
>> It seems to have something to do with token renewal: the client kept
>> trying to connect to a standby namenode.
>>
>> Then the following error was thrown.
>>
>> 15/11/03 14:43:20 ERROR Utils: Uncaught exception in thread Delegation
>> Token Refresh Thread-0
>> java.lang.StackOverflowError
>> at java.security.AccessController.doPrivileged(Native Method)
>> at java.net.URLClassLoader.findClass(URLClassLoader.java:360)
>> at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>> at java.lang.ClassLoader.loadClass(ClassLoader.java:411)
>> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
>> at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>> at
>> org.apache.spark.deploy.yarn.ExecutorDelegationTokenUpdater.updateCredentialsIfRequired(ExecutorDelegationTokenUpdater.scala:89)
>> at
>> org.apache.spark.deploy.yarn.ExecutorDelegationTokenUpdater$$anon$1$$anonfun$run$1.apply$mcV$sp(ExecutorDelegationTokenUpdater.scala:49)
>> at
>> org.apache.spark.deploy.yarn.ExecutorDelegationTokenUpdater$$anon$1$$anonfun$run$1.apply(ExecutorDelegationTokenUpdater.scala:49)
>> at
>> org.apache.spark.deploy.yarn.ExecutorDelegationTokenUpdater$$anon$1$$anonfun$run$1.apply(ExecutorDelegationTokenUpdater.scala:49)
>> at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1699)
>> at
>> org.apache.spark.deploy.yarn.ExecutorDelegationTokenUpdater$$anon$1.run(ExecutorDelegationTokenUpdater.scala:49)
>> at
>> org.apache.spark.deploy.yarn.ExecutorDelegationTokenUpdater.updateCredentialsIfRequired(ExecutorDelegationTokenUpdater.scala:79)
>> at
>> org.apache.spark.deploy.yarn.ExecutorDelegationTokenUpdater$$anon$1$$anonfun$run$1.apply$mcV$sp(ExecutorDelegationTokenUpdater.scala:49)
>> at
>> org.apache.spark.deploy.yarn.ExecutorDelegationTokenUpdater$$anon$1$$anonfun$run$1.apply(ExecutorDelegationTokenUpdater.scala:49)
>> at
>> org.apache.spark.deploy.yarn.ExecutorDelegationTokenUpdater$$anon$1$$anonfun$run$1.apply(ExecutorDelegationTokenUpdater.scala:49)
>> at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1699)
>> at
>> org.apache.spark.deploy.yarn.ExecutorDelegationTokenUpdater$$anon$1.run(ExecutorDelegationTokenUpdater.scala:49)
>> at
>> org.apache.spark.deploy.yarn.ExecutorDelegationTokenUpdater.updateCredentialsIfRequired(ExecutorDelegationTokenUpdater.scala:79)
>> at
>> org.apache.spark.deploy.yarn.ExecutorDelegationTokenUpdater$$anon$1$$anonfun$run$1.apply$mcV$sp(ExecutorDelegationTokenUpdater.scala:49)
>> at
>> org.apache.spark.deploy.yarn.ExecutorDelegationTokenUpdater$$anon$1$$anonfun$run$1.apply(ExecutorDelegationTokenUpdater.scala:49)
>> at
>> org.apache.spark.deploy.yarn.ExecutorDelegationTokenUpdater$$anon$1$$anonfun$run$1.apply(ExecutorDelegationTokenUpdater.scala:49)
>> at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1699)
>>
>>
>> Again, the above stack trace was repeated hundreds or thousands of times,
>> which explains why a StackOverflowError was produced.
>>
>> My question is:
>>
>> * If the HDFS active namenode failed over during the job, does the client
>> always need to connect to the same namenode that issued the token the next
>> time renewal is needed? Is that true and expected? If so, how should
>> namenode failover be handled for a Spark Streaming job?
>>
>> Thanks for your feedback in advance.
>>
>> --
>> Chen Song
>>
>>
>
>
> --
> Chen Song
>
>
