You could probably, instead of specifying the --principal [principal] --keytab [keytab] --proxy-user [proxied_user] ... arguments, just create/renew a Kerberos ticket before submitting the job:

$ kinit principal.name -kt keytab.file
$ spark-submit ...
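The suggestion above can be sketched as a small wrapper that builds the two commands in order. All names and paths here are hypothetical placeholders, and the commands are only assembled and printed, not executed:

```python
def build_submit_plan(principal, keytab, spark_args):
    """Return the commands to run, in order: obtain/renew a TGT from the
    keytab, then spark-submit *without* --principal/--keytab/--proxy-user,
    so the job runs as whoever holds the ticket cache."""
    return [
        ["kinit", "-kt", keytab, principal],
        ["spark-submit"] + spark_args,
    ]

# Hypothetical values for illustration only.
plan = build_submit_plan("etl_user@EXAMPLE.COM",
                         "/etc/security/keytabs/etl_user.keytab",
                         ["--master", "yarn-client", "app.jar"])
for cmd in plan:
    print(" ".join(cmd))
```

In a real submission script the two commands would be run sequentially (e.g. via `subprocess.run(cmd, check=True)`), with the kinit step repeated by cron or a similar scheduler before the ticket expires.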
Do you need impersonation / proxy user at all? I thought its primary use is for Hue and similar services, which use impersonation quite heavily in a kerberized cluster.

--
Ruslan Dautkhanov

On Wed, Nov 4, 2015 at 1:40 PM, Ted Yu <yuzhih...@gmail.com> wrote:

> 2015-11-04 10:03:31,905 ERROR [Delegation Token Refresh Thread-0]
> hdfs.KeyProviderCache (KeyProviderCache.java:createKeyProviderURI(87)) -
> Could not find uri with key [dfs.encryption.key.provider.uri] to
> create a keyProvider !!
>
> Could it be related to HDFS-7931 ?
>
> On Wed, Nov 4, 2015 at 12:30 PM, Chen Song <chen.song...@gmail.com> wrote:
>
>> After a bit more investigation, I found that it could be related to
>> impersonation on a kerberized cluster.
>>
>> Our job is started with the following command:
>>
>> /usr/lib/spark/bin/spark-submit --master yarn-client --principal [principal]
>> --keytab [keytab] --proxy-user [proxied_user] ...
>>
>> In the application master's log, at startup:
>>
>> 2015-11-03 16:03:41,602 INFO [main] yarn.AMDelegationTokenRenewer
>> (Logging.scala:logInfo(59)) - Scheduling login from keytab in 64789744
>> millis.
>>
>> Later on, when the delegation token renewer thread kicks in, it tries to
>> re-login with the specified principal with new credentials and to write
>> the new credentials over to the directory where the current user's
>> credentials are stored. However, with impersonation, because the current
>> user is a different user from the principal user, it fails with a
>> permission error.
>>
>> 2015-11-04 10:03:31,366 INFO [Delegation Token Refresh Thread-0]
>> yarn.AMDelegationTokenRenewer (Logging.scala:logInfo(59)) - Attempting to
>> login to KDC using principal: principal/host@domain
>> 2015-11-04 10:03:31,665 INFO [Delegation Token Refresh Thread-0]
>> yarn.AMDelegationTokenRenewer (Logging.scala:logInfo(59)) - Successfully
>> logged into KDC.
>> 2015-11-04 10:03:31,702 INFO [Delegation Token Refresh Thread-0]
>> yarn.YarnSparkHadoopUtil (Logging.scala:logInfo(59)) - getting token for
>> namenode:
>> hdfs://hadoop_abc/user/proxied_user/.sparkStaging/application_1443481003186_00000
>> 2015-11-04 10:03:31,904 INFO [Delegation Token Refresh Thread-0]
>> hdfs.DFSClient (DFSClient.java:getDelegationToken(1025)) - Created
>> HDFS_DELEGATION_TOKEN token 389283 for principal on ha-hdfs:hadoop_abc
>> 2015-11-04 10:03:31,905 ERROR [Delegation Token Refresh Thread-0]
>> hdfs.KeyProviderCache (KeyProviderCache.java:createKeyProviderURI(87)) -
>> Could not find uri with key [dfs.encryption.key.provider.uri] to create a
>> keyProvider !!
>> 2015-11-04 10:03:31,944 WARN [Delegation Token Refresh Thread-0]
>> security.UserGroupInformation (UserGroupInformation.java:doAs(1674)) -
>> PriviledgedActionException as:proxy-user (auth:SIMPLE)
>> cause:org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException):
>> Operation category READ is not supported in state standby
>> 2015-11-04 10:03:31,945 WARN [Delegation Token Refresh Thread-0] ipc.Client
>> (Client.java:run(675)) - Exception encountered while connecting to the
>> server :
>> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException):
>> Operation category READ is not supported in state standby
>> 2015-11-04 10:03:31,945 WARN [Delegation Token Refresh Thread-0]
>> security.UserGroupInformation (UserGroupInformation.java:doAs(1674)) -
>> PriviledgedActionException as:proxy-user (auth:SIMPLE)
>> cause:org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException):
>> Operation category READ is not supported in state standby
>> 2015-11-04 10:03:31,963 WARN [Delegation Token Refresh Thread-0]
>> yarn.YarnSparkHadoopUtil (Logging.scala:logWarning(92)) - Error while
>> attempting to list files from application staging dir
>> org.apache.hadoop.security.AccessControlException: Permission denied:
>> user=principal,
>> access=READ_EXECUTE,
>> inode="/user/proxy-user/.sparkStaging/application_1443481003186_00000":proxy-user:proxy-user:drwx------
>>
>> Can someone confirm my understanding is right? The relevant class is
>> below:
>> https://github.com/apache/spark/blob/master/yarn/src/main/scala/org/apache/spark/deploy/yarn/AMDelegationTokenRenewer.scala
>>
>> Chen
>>
>> On Tue, Nov 3, 2015 at 11:57 AM, Chen Song <chen.song...@gmail.com>
>> wrote:
>>
>>> We saw the following error happening in a Spark Streaming job. Our job is
>>> running on YARN with Kerberos enabled.
>>>
>>> First, the warnings below were printed out. I only pasted a few, but the
>>> following was repeated hundreds/thousands of times.
>>>
>>> 15/11/03 14:43:07 WARN UserGroupInformation: PriviledgedActionException
>>> as:[kerberos principal] (auth:KERBEROS)
>>> cause:org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException):
>>> Operation category READ is not supported in state standby
>>> 15/11/03 14:43:07 WARN Client: Exception encountered while connecting to
>>> the server :
>>> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException):
>>> Operation category READ is not supported in state standby
>>> 15/11/03 14:43:07 WARN UserGroupInformation: PriviledgedActionException
>>> as:[kerberos principal] (auth:KERBEROS)
>>> cause:org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException):
>>> Operation category READ is not supported in state standby
>>> 15/11/03 14:43:07 WARN UserGroupInformation: PriviledgedActionException
>>> as:[kerberos principal] (auth:KERBEROS)
>>> cause:org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException):
>>> Operation category READ is not supported in state standby
>>> 15/11/03 14:43:07 WARN Client: Exception encountered while connecting to
>>> the server :
>>> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException):
>>> Operation category READ is not supported in state standby
>>>
>>> It
seems to have something to do with renewal of the token, and it tried to
>>> connect to a standby namenode.
>>>
>>> Then the following error was thrown:
>>>
>>> 15/11/03 14:43:20 ERROR Utils: Uncaught exception in thread Delegation
>>> Token Refresh Thread-0
>>> java.lang.StackOverflowError
>>> at java.security.AccessController.doPrivileged(Native Method)
>>> at java.net.URLClassLoader.findClass(URLClassLoader.java:360)
>>> at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>>> at java.lang.ClassLoader.loadClass(ClassLoader.java:411)
>>> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
>>> at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>>> at org.apache.spark.deploy.yarn.ExecutorDelegationTokenUpdater.updateCredentialsIfRequired(ExecutorDelegationTokenUpdater.scala:89)
>>> at org.apache.spark.deploy.yarn.ExecutorDelegationTokenUpdater$$anon$1$$anonfun$run$1.apply$mcV$sp(ExecutorDelegationTokenUpdater.scala:49)
>>> at org.apache.spark.deploy.yarn.ExecutorDelegationTokenUpdater$$anon$1$$anonfun$run$1.apply(ExecutorDelegationTokenUpdater.scala:49)
>>> at org.apache.spark.deploy.yarn.ExecutorDelegationTokenUpdater$$anon$1$$anonfun$run$1.apply(ExecutorDelegationTokenUpdater.scala:49)
>>> at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1699)
>>> at org.apache.spark.deploy.yarn.ExecutorDelegationTokenUpdater$$anon$1.run(ExecutorDelegationTokenUpdater.scala:49)
>>> at org.apache.spark.deploy.yarn.ExecutorDelegationTokenUpdater.updateCredentialsIfRequired(ExecutorDelegationTokenUpdater.scala:79)
>>> at org.apache.spark.deploy.yarn.ExecutorDelegationTokenUpdater$$anon$1$$anonfun$run$1.apply$mcV$sp(ExecutorDelegationTokenUpdater.scala:49)
>>> at org.apache.spark.deploy.yarn.ExecutorDelegationTokenUpdater$$anon$1$$anonfun$run$1.apply(ExecutorDelegationTokenUpdater.scala:49)
>>> at
org.apache.spark.deploy.yarn.ExecutorDelegationTokenUpdater$$anon$1$$anonfun$run$1.apply(ExecutorDelegationTokenUpdater.scala:49)
>>> at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1699)
>>> at org.apache.spark.deploy.yarn.ExecutorDelegationTokenUpdater$$anon$1.run(ExecutorDelegationTokenUpdater.scala:49)
>>> at org.apache.spark.deploy.yarn.ExecutorDelegationTokenUpdater.updateCredentialsIfRequired(ExecutorDelegationTokenUpdater.scala:79)
>>> at org.apache.spark.deploy.yarn.ExecutorDelegationTokenUpdater$$anon$1$$anonfun$run$1.apply$mcV$sp(ExecutorDelegationTokenUpdater.scala:49)
>>> at org.apache.spark.deploy.yarn.ExecutorDelegationTokenUpdater$$anon$1$$anonfun$run$1.apply(ExecutorDelegationTokenUpdater.scala:49)
>>> at org.apache.spark.deploy.yarn.ExecutorDelegationTokenUpdater$$anon$1$$anonfun$run$1.apply(ExecutorDelegationTokenUpdater.scala:49)
>>> at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1699)
>>>
>>> Again, the above stack trace was repeated hundreds/thousands of times,
>>> which explains why a StackOverflowError was produced.
>>>
>>> My question is:
>>>
>>> * If the HDFS active namenode failed over during the job, then the next
>>> time token renewal is needed, would the client always need to connect to
>>> the same namenode that was active when the token was created? Is that
>>> true and expected? If so, how should namenode failover be handled for a
>>> streaming job in Spark?
>>>
>>> Thanks for your feedback in advance.
>>>
>>> --
>>> Chen Song
>>
>> --
>> Chen Song
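Regarding the StackOverflowError: the repeating frames in the trace (updateCredentialsIfRequired -> run -> logUncaughtExceptions -> updateCredentialsIfRequired -> ...) suggest the retry re-enters the update call on the same stack instead of rescheduling it, so a namenode that keeps answering "standby" grows the stack without bound. A minimal Python sketch of that failure mode and the safe shape (function names and the "standby" error are stand-ins, not Spark's actual code; RecursionError is Python's analog of java.lang.StackOverflowError):

```python
import sys

def update_credentials_recursive():
    """Retry-on-failure implemented as direct recursion: each failed
    attempt (here, a permanently 'standby' namenode) adds stack frames
    until the stack overflows."""
    try:
        raise IOError("Operation category READ is not supported in state standby")
    except IOError:
        update_credentials_recursive()  # retry on the same stack -> grows forever

def update_credentials_iterative(max_attempts):
    """The safe shape: retry in a loop (or reschedule on a timer thread),
    keeping stack depth constant and bounding the number of attempts."""
    for _ in range(max_attempts):
        try:
            raise IOError("still standby")
        except IOError:
            continue
    return "gave up after %d attempts" % max_attempts

sys.setrecursionlimit(500)  # keep the demo fast
try:
    update_credentials_recursive()
except RecursionError:  # Python's java.lang.StackOverflowError
    print("stack overflow")
print(update_credentials_iterative(100))
```

The iterative/rescheduling version also gives a natural place to refresh the namenode list on each attempt, which is what a client needs when an HA failover has made the previously active namenode standby.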