Instead of specifying the --principal [principal] --keytab [keytab]
--proxy-user [proxied_user] ... arguments, you could probably just
create/renew a Kerberos ticket before submitting the job:
$ kinit principal.name -kt keytab.file
$ spark-submit ...
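
A quick klist between the two commands confirms the ticket was created
and shows its expiry / renew-until times:

$ klist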

Do you need impersonation / proxy user at all? I thought its primary use
is for Hue and similar services, which use impersonation quite heavily in
kerberized clusters.
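
For context, --proxy-user boils down to Hadoop's UserGroupInformation
proxy-user pattern. A minimal sketch of that pattern, assuming the
principal/keytab/user names are the placeholders from your command, not
real values:

import java.security.PrivilegedExceptionAction
import org.apache.hadoop.security.UserGroupInformation

// Log in as the real (keytab) user, then impersonate the proxied user.
val realUgi = UserGroupInformation.loginUserFromKeytabAndReturnUGI(
  "principal/host@domain", "keytab.file")
val proxyUgi = UserGroupInformation.createProxyUser("proxied_user", realUgi)
proxyUgi.doAs(new PrivilegedExceptionAction[Unit] {
  override def run(): Unit = {
    // Any HDFS/YARN calls made here execute as proxied_user, subject to
    // the cluster's hadoop.proxyuser.* rules.
  }
})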



-- 
Ruslan Dautkhanov

On Wed, Nov 4, 2015 at 1:40 PM, Ted Yu <yuzhih...@gmail.com> wrote:

> 2015-11-04 10:03:31,905 ERROR [Delegation Token Refresh Thread-0]
> hdfs.KeyProviderCache (KeyProviderCache.java:createKeyProviderURI(87)) -
> Could not find uri with key [dfs.encryption.key.provider.uri] to create a
> keyProvider !!
>
> Could it be related to HDFS-7931 ?
>
> On Wed, Nov 4, 2015 at 12:30 PM, Chen Song <chen.song...@gmail.com> wrote:
>
>> After a bit more investigation, I found that it could be related to
>> impersonation on a kerberized cluster.
>>
>> Our job is started with the following command.
>>
>> /usr/lib/spark/bin/spark-submit --master yarn-client --principal [principal] 
>> --keytab [keytab] --proxy-user [proxied_user] ...
>>
>>
>> In the application master's log:
>>
>> At startup:
>>
>> 2015-11-03 16:03:41,602 INFO  [main] yarn.AMDelegationTokenRenewer 
>> (Logging.scala:logInfo(59)) - Scheduling login from keytab in 64789744 
>> millis.
>>
>> Later on, when the delegation token renewer thread kicks in, it tries to
>> re-login as the specified principal with new credentials, and then tries
>> to write the new credentials over to the directory where the current
>> user's credentials are stored. However, with impersonation, because the
>> current user is a different user from the principal user, it fails with a
>> permission error.
>>
>> 2015-11-04 10:03:31,366 INFO  [Delegation Token Refresh Thread-0] 
>> yarn.AMDelegationTokenRenewer (Logging.scala:logInfo(59)) - Attempting to 
>> login to KDC using principal: principal/host@domain
>> 2015-11-04 10:03:31,665 INFO  [Delegation Token Refresh Thread-0] 
>> yarn.AMDelegationTokenRenewer (Logging.scala:logInfo(59)) - Successfully 
>> logged into KDC.
>> 2015-11-04 10:03:31,702 INFO  [Delegation Token Refresh Thread-0] 
>> yarn.YarnSparkHadoopUtil (Logging.scala:logInfo(59)) - getting token for 
>> namenode: 
>> hdfs://hadoop_abc/user/proxied_user/.sparkStaging/application_1443481003186_00000
>> 2015-11-04 10:03:31,904 INFO  [Delegation Token Refresh Thread-0] 
>> hdfs.DFSClient (DFSClient.java:getDelegationToken(1025)) - Created 
>> HDFS_DELEGATION_TOKEN token 389283 for principal on ha-hdfs:hadoop_abc
>> 2015-11-04 10:03:31,905 ERROR [Delegation Token Refresh Thread-0] 
>> hdfs.KeyProviderCache (KeyProviderCache.java:createKeyProviderURI(87)) - 
>> Could not find uri with key [dfs.encryption.key.provider.uri] to create a 
>> keyProvider !!
>> 2015-11-04 10:03:31,944 WARN  [Delegation Token Refresh Thread-0] 
>> security.UserGroupInformation (UserGroupInformation.java:doAs(1674)) - 
>> PriviledgedActionException as:proxy-user (auth:SIMPLE) 
>> cause:org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException):
>>  Operation category READ is not supported in state standby
>> 2015-11-04 10:03:31,945 WARN  [Delegation Token Refresh Thread-0] ipc.Client 
>> (Client.java:run(675)) - Exception encountered while connecting to the 
>> server : 
>> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException):
>>  Operation category READ is not supported in state standby
>> 2015-11-04 10:03:31,945 WARN  [Delegation Token Refresh Thread-0] 
>> security.UserGroupInformation (UserGroupInformation.java:doAs(1674)) - 
>> PriviledgedActionException as:proxy-user (auth:SIMPLE) 
>> cause:org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException):
>>  Operation category READ is not supported in state standby
>> 2015-11-04 10:03:31,963 WARN  [Delegation Token Refresh Thread-0] 
>> yarn.YarnSparkHadoopUtil (Logging.scala:logWarning(92)) - Error while 
>> attempting to list files from application staging dir
>> org.apache.hadoop.security.AccessControlException: Permission denied: 
>> user=principal, access=READ_EXECUTE, 
>> inode="/user/proxy-user/.sparkStaging/application_1443481003186_00000":proxy-user:proxy-user:drwx------
>>
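>> In code terms, my understanding of the failing pattern is roughly the
>> following sketch (not the literal AMDelegationTokenRenewer code; the
>> names and paths are the placeholders from the log above):
>>
>> import java.security.PrivilegedExceptionAction
>> import org.apache.hadoop.conf.Configuration
>> import org.apache.hadoop.fs.{FileSystem, Path}
>> import org.apache.hadoop.security.UserGroupInformation
>>
>> // The renewer re-logs in from the keytab as the *principal*...
>> val keytabUgi = UserGroupInformation.loginUserFromKeytabAndReturnUGI(
>>   "principal/host@domain", "keytab.file")
>> keytabUgi.doAs(new PrivilegedExceptionAction[Unit] {
>>   override def run(): Unit = {
>>     val fs = FileSystem.get(new Configuration())
>>     // ...but the staging dir is proxy-user:proxy-user drwx------, so
>>     // listing it as "principal" fails with AccessControlException
>>     // (READ_EXECUTE), exactly as in the log.
>>     fs.listStatus(new Path(
>>       "/user/proxy-user/.sparkStaging/application_1443481003186_00000"))
>>   }
>> })
>>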
>>
>> Can someone confirm my understanding is right? The relevant class is
>> here:
>> https://github.com/apache/spark/blob/master/yarn/src/main/scala/org/apache/spark/deploy/yarn/AMDelegationTokenRenewer.scala
>>
>> Chen
>>
>> On Tue, Nov 3, 2015 at 11:57 AM, Chen Song <chen.song...@gmail.com>
>> wrote:
>>
>>> We saw the following error happening in a Spark Streaming job. Our job
>>> is running on YARN with Kerberos enabled.
>>>
>>> First, the warnings below were printed out. I only pasted a few, but
>>> they were repeated hundreds/thousands of times.
>>>
>>> 15/11/03 14:43:07 WARN UserGroupInformation: PriviledgedActionException
>>> as:[kerberos principal] (auth:KERBEROS)
>>> cause:org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException):
>>> Operation category READ is not supported in state standby
>>> 15/11/03 14:43:07 WARN Client: Exception encountered while connecting to
>>> the server :
>>> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException):
>>> Operation category READ is not supported in state standby
>>> 15/11/03 14:43:07 WARN UserGroupInformation: PriviledgedActionException
>>> as:[kerberos principal] (auth:KERBEROS)
>>> cause:org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException):
>>> Operation category READ is not supported in state standby
>>> 15/11/03 14:43:07 WARN UserGroupInformation: PriviledgedActionException
>>> as:[kerberos principal] (auth:KERBEROS)
>>> cause:org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException):
>>> Operation category READ is not supported in state standby
>>> 15/11/03 14:43:07 WARN Client: Exception encountered while connecting to
>>> the server :
>>> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException):
>>> Operation category READ is not supported in state standby
>>>
>>>
>>> It seems to have something to do with renewal of the token, and it
>>> tried to connect to a standby namenode.
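>>>
>>> For reference, my understanding is that an HA client should address the
>>> logical nameservice and fail over automatically. A hedged sketch of the
>>> usual client-side settings ("hadoop_abc" is the nameservice that appears
>>> in the logs in this thread):
>>>
>>> import org.apache.hadoop.conf.Configuration
>>>
>>> val conf = new Configuration()
>>> // Address the logical nameservice, never a concrete namenode host.
>>> conf.set("fs.defaultFS", "hdfs://hadoop_abc")
>>> conf.set("dfs.nameservices", "hadoop_abc")
>>> // The failover proxy provider retries RPCs against whichever NN is
>>> // active. (dfs.ha.namenodes.hadoop_abc and the per-NN
>>> // dfs.namenode.rpc-address.* keys are assumed to be set as usual.)
>>> conf.set("dfs.client.failover.proxy.provider.hadoop_abc",
>>>   "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider")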
>>>
>>> Then the following error was thrown.
>>>
>>> 15/11/03 14:43:20 ERROR Utils: Uncaught exception in thread Delegation
>>> Token Refresh Thread-0
>>> java.lang.StackOverflowError
>>> at java.security.AccessController.doPrivileged(Native Method)
>>> at java.net.URLClassLoader.findClass(URLClassLoader.java:360)
>>> at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>>> at java.lang.ClassLoader.loadClass(ClassLoader.java:411)
>>> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
>>> at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>>> at
>>> org.apache.spark.deploy.yarn.ExecutorDelegationTokenUpdater.updateCredentialsIfRequired(ExecutorDelegationTokenUpdater.scala:89)
>>> at
>>> org.apache.spark.deploy.yarn.ExecutorDelegationTokenUpdater$$anon$1$$anonfun$run$1.apply$mcV$sp(ExecutorDelegationTokenUpdater.scala:49)
>>> at
>>> org.apache.spark.deploy.yarn.ExecutorDelegationTokenUpdater$$anon$1$$anonfun$run$1.apply(ExecutorDelegationTokenUpdater.scala:49)
>>> at
>>> org.apache.spark.deploy.yarn.ExecutorDelegationTokenUpdater$$anon$1$$anonfun$run$1.apply(ExecutorDelegationTokenUpdater.scala:49)
>>> at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1699)
>>> at
>>> org.apache.spark.deploy.yarn.ExecutorDelegationTokenUpdater$$anon$1.run(ExecutorDelegationTokenUpdater.scala:49)
>>> at
>>> org.apache.spark.deploy.yarn.ExecutorDelegationTokenUpdater.updateCredentialsIfRequired(ExecutorDelegationTokenUpdater.scala:79)
>>> at
>>> org.apache.spark.deploy.yarn.ExecutorDelegationTokenUpdater$$anon$1$$anonfun$run$1.apply$mcV$sp(ExecutorDelegationTokenUpdater.scala:49)
>>> at
>>> org.apache.spark.deploy.yarn.ExecutorDelegationTokenUpdater$$anon$1$$anonfun$run$1.apply(ExecutorDelegationTokenUpdater.scala:49)
>>> at
>>> org.apache.spark.deploy.yarn.ExecutorDelegationTokenUpdater$$anon$1$$anonfun$run$1.apply(ExecutorDelegationTokenUpdater.scala:49)
>>> at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1699)
>>> at
>>> org.apache.spark.deploy.yarn.ExecutorDelegationTokenUpdater$$anon$1.run(ExecutorDelegationTokenUpdater.scala:49)
>>> at
>>> org.apache.spark.deploy.yarn.ExecutorDelegationTokenUpdater.updateCredentialsIfRequired(ExecutorDelegationTokenUpdater.scala:79)
>>> at
>>> org.apache.spark.deploy.yarn.ExecutorDelegationTokenUpdater$$anon$1$$anonfun$run$1.apply$mcV$sp(ExecutorDelegationTokenUpdater.scala:49)
>>> at
>>> org.apache.spark.deploy.yarn.ExecutorDelegationTokenUpdater$$anon$1$$anonfun$run$1.apply(ExecutorDelegationTokenUpdater.scala:49)
>>> at
>>> org.apache.spark.deploy.yarn.ExecutorDelegationTokenUpdater$$anon$1$$anonfun$run$1.apply(ExecutorDelegationTokenUpdater.scala:49)
>>> at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1699)
>>>
>>>
>>> Again, the above stack trace was repeated hundreds/thousands of times,
>>> which explains why a StackOverflowError was produced.
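>>>
>>> The repeated frames suggest the retry happens by direct recursion. A
>>> hedged sketch of that pattern (not the literal
>>> ExecutorDelegationTokenUpdater code):
>>>
>>> def updateCredentialsIfRequired(): Unit = {
>>>   try {
>>>     // Stand-in for reading refreshed tokens from HDFS; it keeps
>>>     // failing while the client talks to a standby namenode.
>>>     throw new java.io.IOException(
>>>       "Operation category READ is not supported in state standby")
>>>   } catch {
>>>     case _: Exception =>
>>>       // Retrying via direct recursion, instead of rescheduling on a
>>>       // timer, adds a stack frame per failure until StackOverflowError.
>>>       updateCredentialsIfRequired()
>>>   }
>>> }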
>>>
>>> My question is:
>>>
>>> * If the HDFS active namenode failed over during the job, then the next
>>> time token renewal is needed, would the client always try to connect to
>>> the same namenode that created the token? Is that true and expected? If
>>> so, how should namenode failover be handled for a Spark Streaming job?
>>>
>>> Thanks for your feedback in advance.
>>>
>>> --
>>> Chen Song
>>>
>>>
>>
>>
>> --
>> Chen Song
>>
>>
>
