Hello, I am trying to understand Spark's support for accessing a secure HDFS cluster.
My plan is to deploy "Spark on Mesos" which will access a secure HDFS cluster running elsewhere in the network, and I am trying to understand how much support currently exists for this. My understanding is that Spark currently supports accessing a secured Hadoop cluster only through the "Spark on YARN" deployment option, where the principal and keytab can be passed as spark-submit options. However, Spark's "local" mode also accepts a keytab and principal, to support applications such as Spark SQL.

I have installed Spark (1.6.1) on a machine where the Hadoop client is not installed; I copied core-site.xml and hdfs-site.xml to /etc/hadoop/conf and set HADOOP_CONF_DIR="/etc/hadoop/conf/" in spark-env.sh (a sketch of this configuration is shown below). I have tested and confirmed that the "org.apache.spark.examples.HdfsTest" class can access an insecure Hadoop cluster. When I test the same code/configuration against a secure Hadoop cluster, I get a "Can't get Master Kerberos principal for use as renewer" error. I have pasted the complete debug log output below. Please let me know if I am missing any configuration that could be causing this issue.
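For reference, the Spark-side Hadoop configuration I described above amounts to the sketch below (paths are the ones from my setup; nothing else Hadoop-related is installed or configured on this machine):

  # conf/spark-env.sh (sketch of my setup)
  # /etc/hadoop/conf holds only core-site.xml and hdfs-site.xml copied from the secure cluster;
  # the Hadoop client itself is not installed on this machine.
  export HADOOP_CONF_DIR="/etc/hadoop/conf/"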
Regards
Vijay

/bin/spark-submit --deploy-mode client --master local --driver-memory=512M --driver-cores=0.5 --executor-memory 512M --total-executor-cores=1 --principal hdf...@foo.com --keytab /downloads/spark-1.6.1-bin-hadoop2.6/hdfs.keytab -v --class org.apache.spark.examples.HdfsTest lib/spark-examples-1.6.1-hadoop2.6.0.jar /sink/names

Using properties file: /downloads/spark-1.6.1-bin-hadoop2.6/conf/spark-defaults.conf
Adding default property: spark.executor.uri=http://shinyfeather.com/spark/spark-1.6.1/spark-1.6.1-bin-hadoop2.6.tgz
Parsed arguments:
  master                  local
  deployMode              client
  executorMemory          512M
  executorCores           null
  totalExecutorCores      1
  propertiesFile          /downloads/spark-1.6.1-bin-hadoop2.6/conf/spark-defaults.conf
  driverMemory            512M
  driverCores             0.5
  driverExtraClassPath    null
  driverExtraLibraryPath  null
  driverExtraJavaOptions  null
  supervise               false
  queue                   null
  numExecutors            null
  files                   null
  pyFiles                 null
  archives                null
  mainClass               org.apache.spark.examples.HdfsTest
  primaryResource         file:/downloads/spark-1.6.1-bin-hadoop2.6/lib/spark-examples-1.6.1-hadoop2.6.0.jar
  name                    org.apache.spark.examples.HdfsTest
  childArgs               [/sink/names]
  jars                    null
  packages                null
  packagesExclusions      null
  repositories            null
  verbose                 true

Spark properties used, including those specified through --conf and those from the properties file /downloads/spark-1.6.1-bin-hadoop2.6/conf/spark-defaults.conf:
  spark.driver.memory -> 512M
  spark.executor.uri -> http://shinyfeather.com/spark/spark-1.6.1/spark-1.6.1-bin-hadoop2.6.tgz

16/04/12 22:54:06 DEBUG MutableMetricsFactory: field org.apache.hadoop.metrics2.lib.MutableRate org.apache.hadoop.security.UserGroupInformation$UgiMetrics.loginSuccess with annotation @org.apache.hadoop.metrics2.annotation.Metric(value=[Rate of successful kerberos logins and latency (milliseconds)], about=, valueName=Time, type=DEFAULT, always=false, sampleName=Ops)
16/04/12 22:54:07 DEBUG MutableMetricsFactory: field org.apache.hadoop.metrics2.lib.MutableRate org.apache.hadoop.security.UserGroupInformation$UgiMetrics.loginFailure with annotation @org.apache.hadoop.metrics2.annotation.Metric(value=[Rate of failed kerberos logins and latency (milliseconds)], about=, valueName=Time, type=DEFAULT, always=false, sampleName=Ops)
16/04/12 22:54:07 DEBUG MutableMetricsFactory: field org.apache.hadoop.metrics2.lib.MutableRate org.apache.hadoop.security.UserGroupInformation$UgiMetrics.getGroups with annotation @org.apache.hadoop.metrics2.annotation.Metric(value=[GetGroups], about=, valueName=Time, type=DEFAULT, always=false, sampleName=Ops)
16/04/12 22:54:07 DEBUG MetricsSystemImpl: UgiMetrics, User and group related metrics
16/04/12 22:54:07 DEBUG Groups: Creating new Groups object
16/04/12 22:54:07 DEBUG NativeCodeLoader: Trying to load the custom-built native-hadoop library...
16/04/12 22:54:07 DEBUG NativeCodeLoader: Failed to load native-hadoop with error: java.lang.UnsatisfiedLinkError: no hadoop in java.library.path
16/04/12 22:54:07 DEBUG NativeCodeLoader: java.library.path=/usr/java/packages/lib/amd64:/usr/lib/x86_64-linux-gnu/jni:/lib/x86_64-linux-gnu:/usr/lib/x86_64-linux-gnu:/usr/lib/jni:/lib:/usr/lib
16/04/12 22:54:07 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/04/12 22:54:07 DEBUG PerformanceAdvisory: Falling back to shell based
16/04/12 22:54:07 DEBUG JniBasedUnixGroupsMappingWithFallback: Group mapping impl=org.apache.hadoop.security.ShellBasedUnixGroupsMapping
16/04/12 22:54:07 DEBUG Shell: Failed to detect a valid hadoop home directory
java.io.IOException: HADOOP_HOME or hadoop.home.dir are not set.
        at org.apache.hadoop.util.Shell.checkHadoopHome(Shell.java:302)
        at org.apache.hadoop.util.Shell.<clinit>(Shell.java:327)
        at org.apache.hadoop.util.StringUtils.<clinit>(StringUtils.java:79)
        at org.apache.hadoop.security.Groups.parseStaticMapping(Groups.java:104)
        at org.apache.hadoop.security.Groups.<init>(Groups.java:86)
        at org.apache.hadoop.security.Groups.<init>(Groups.java:66)
        at org.apache.hadoop.security.Groups.getUserToGroupsMappingService(Groups.java:280)
        at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:271)
        at org.apache.hadoop.security.UserGroupInformation.ensureInitialized(UserGroupInformation.java:248)
        at org.apache.hadoop.security.UserGroupInformation.isAuthenticationMethodEnabled(UserGroupInformation.java:325)
        at org.apache.hadoop.security.UserGroupInformation.isSecurityEnabled(UserGroupInformation.java:319)
        at org.apache.hadoop.security.UserGroupInformation.loginUserFromKeytab(UserGroupInformation.java:914)
        at org.apache.spark.deploy.SparkSubmit$.prepareSubmitEnvironment(SparkSubmit.scala:564)
        at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:154)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
16/04/12 22:54:07 DEBUG Shell: setsid exited with exit code 0
16/04/12 22:54:07 DEBUG Groups: Group mapping impl=org.apache.hadoop.security.JniBasedUnixGroupsMappingWithFallback; cacheTimeout=300000; warningDeltaMs=5000
16/04/12 22:54:07 DEBUG UserGroupInformation: hadoop login
16/04/12 22:54:07 DEBUG UserGroupInformation: hadoop login commit
16/04/12 22:54:07 DEBUG UserGroupInformation: using kerberos user:hdf...@foo.com
16/04/12 22:54:07 DEBUG UserGroupInformation: Using user: "hdf...@foo.com" with name hdf...@foo.com
16/04/12 22:54:07 DEBUG UserGroupInformation: User entry: "hdf...@foo.com"
16/04/12 22:54:07 INFO UserGroupInformation: Login successful for user hdf...@foo.com using keytab file /downloads/spark-1.6.1-bin-hadoop2.6/hdfs.keytab
Main class:
org.apache.spark.examples.HdfsTest
Arguments:
/sink/names
System properties:
spark.driver.memory -> 512M
SPARK_SUBMIT -> true
spark.yarn.keytab -> /downloads/spark-1.6.1-bin-hadoop2.6/hdfs.keytab
spark.yarn.principal -> hdf...@foo.com
spark.app.name -> org.apache.spark.examples.HdfsTest
spark.executor.uri -> http://shinyfeather.com/spark/spark-1.6.1/spark-1.6.1-bin-hadoop2.6.tgz
spark.jars -> file:/downloads/spark-1.6.1-bin-hadoop2.6/lib/spark-examples-1.6.1-hadoop2.6.0.jar
spark.submit.deployMode -> client
spark.master -> local
Classpath elements:
file:/downloads/spark-1.6.1-bin-hadoop2.6/lib/spark-examples-1.6.1-hadoop2.6.0.jar

16/04/12 22:54:07 INFO SparkContext: Running Spark version 1.6.1
16/04/12 22:54:07 INFO SecurityManager: Changing view acls to: root,hdfs
16/04/12 22:54:07 INFO SecurityManager: Changing modify acls to: root,hdfs
16/04/12 22:54:07 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root, hdfs); users with modify permissions: Set(root, hdfs)
16/04/12 22:54:07 DEBUG SSLOptions: No SSL protocol specified
16/04/12 22:54:07 DEBUG SSLOptions: No SSL protocol specified
16/04/12 22:54:07 DEBUG SSLOptions: No SSL protocol specified
16/04/12 22:54:07 DEBUG SecurityManager: SSLConfiguration for file server: SSLOptions{enabled=false, keyStore=None, keyStorePassword=None, trustStore=None, trustStorePassword=None, protocol=None, enabledAlgorithms=Set()}
16/04/12 22:54:07 DEBUG SecurityManager: SSLConfiguration for Akka: SSLOptions{enabled=false, keyStore=None, keyStorePassword=None, trustStore=None, trustStorePassword=None, protocol=None, enabledAlgorithms=Set()}
16/04/12 22:54:08 DEBUG InternalLoggerFactory: Using SLF4J as the default logging framework
16/04/12 22:54:08 DEBUG PlatformDependent0: java.nio.Buffer.address: available
16/04/12 22:54:08 DEBUG PlatformDependent0: sun.misc.Unsafe.theUnsafe: available
16/04/12 22:54:08 DEBUG PlatformDependent0: sun.misc.Unsafe.copyMemory: available
16/04/12 22:54:08 DEBUG PlatformDependent0: java.nio.Bits.unaligned: true
16/04/12 22:54:08 DEBUG PlatformDependent: Java version: 7
16/04/12 22:54:08 DEBUG PlatformDependent: -Dio.netty.noUnsafe: false
16/04/12 22:54:08 DEBUG PlatformDependent: sun.misc.Unsafe: available
16/04/12 22:54:08 DEBUG PlatformDependent: -Dio.netty.noJavassist: false
16/04/12 22:54:08 DEBUG PlatformDependent: Javassist: unavailable
16/04/12 22:54:08 DEBUG PlatformDependent: You don't have Javassist in your class path or you don't have enough permission to load dynamically generated classes. Please check the configuration for better performance.
16/04/12 22:54:08 DEBUG PlatformDependent: -Dio.netty.tmpdir: /tmp (java.io.tmpdir)
16/04/12 22:54:08 DEBUG PlatformDependent: -Dio.netty.bitMode: 64 (sun.arch.data.model)
16/04/12 22:54:08 DEBUG PlatformDependent: -Dio.netty.noPreferDirect: false
16/04/12 22:54:08 DEBUG MultithreadEventLoopGroup: -Dio.netty.eventLoopThreads: 4
16/04/12 22:54:08 DEBUG NioEventLoop: -Dio.netty.noKeySetOptimization: false
16/04/12 22:54:08 DEBUG NioEventLoop: -Dio.netty.selectorAutoRebuildThreshold: 512
16/04/12 22:54:08 DEBUG PooledByteBufAllocator: -Dio.netty.allocator.numHeapArenas: 4
16/04/12 22:54:08 DEBUG PooledByteBufAllocator: -Dio.netty.allocator.numDirectArenas: 4
16/04/12 22:54:08 DEBUG PooledByteBufAllocator: -Dio.netty.allocator.pageSize: 8192
16/04/12 22:54:08 DEBUG PooledByteBufAllocator: -Dio.netty.allocator.maxOrder: 11
16/04/12 22:54:08 DEBUG PooledByteBufAllocator: -Dio.netty.allocator.chunkSize: 16777216
16/04/12 22:54:08 DEBUG PooledByteBufAllocator: -Dio.netty.allocator.tinyCacheSize: 512
16/04/12 22:54:08 DEBUG PooledByteBufAllocator: -Dio.netty.allocator.smallCacheSize: 256
16/04/12 22:54:08 DEBUG PooledByteBufAllocator: -Dio.netty.allocator.normalCacheSize: 64
16/04/12 22:54:08 DEBUG PooledByteBufAllocator: -Dio.netty.allocator.maxCachedBufferCapacity: 32768
16/04/12 22:54:08 DEBUG PooledByteBufAllocator: -Dio.netty.allocator.cacheTrimInterval: 8192
16/04/12 22:54:08 DEBUG ThreadLocalRandom: -Dio.netty.initialSeedUniquifier: 0x724fd652e2864d0d (took 1 ms)
16/04/12 22:54:08 DEBUG ByteBufUtil: -Dio.netty.allocator.type: unpooled
16/04/12 22:54:08 DEBUG ByteBufUtil: -Dio.netty.threadLocalDirectBufferSize: 65536
16/04/12 22:54:08 DEBUG NetUtil: Loopback interface: lo (lo, 0:0:0:0:0:0:0:1%1)
16/04/12 22:54:08 DEBUG NetUtil: /proc/sys/net/core/somaxconn: 128
16/04/12 22:54:08 DEBUG TransportServer: Shuffle server started on port :34022
16/04/12 22:54:08 INFO Utils: Successfully started service 'sparkDriver' on port 34022.
16/04/12 22:54:08 DEBUG AkkaUtils: In createActorSystem, requireCookie is: off
16/04/12 22:54:08 INFO Slf4jLogger: Slf4jLogger started
16/04/12 22:54:08 INFO Remoting: Starting remoting
16/04/12 22:54:09 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriverActorSystem@10.24.100.194:49829]
16/04/12 22:54:09 INFO Utils: Successfully started service 'sparkDriverActorSystem' on port 49829.
16/04/12 22:54:09 DEBUG SparkEnv: Using serializer: class org.apache.spark.serializer.JavaSerializer
16/04/12 22:54:09 INFO SparkEnv: Registering MapOutputTracker
16/04/12 22:54:09 INFO SparkEnv: Registering BlockManagerMaster
16/04/12 22:54:09 INFO DiskBlockManager: Created local directory at /tmp/blockmgr-9421ede1-5bc9-4170-9030-359d7e6a3686
16/04/12 22:54:09 INFO MemoryStore: MemoryStore started with capacity 143.6 MB
16/04/12 22:54:09 INFO SparkEnv: Registering OutputCommitCoordinator
16/04/12 22:54:09 INFO Utils: Successfully started service 'SparkUI' on port 4040.
16/04/12 22:54:09 INFO SparkUI: Started SparkUI at http://10.24.100.194:4040
16/04/12 22:54:09 INFO HttpFileServer: HTTP File server directory is /tmp/spark-cada1114-4508-4770-834a-50a40d00bdbf/httpd-d86e703b-a499-49a6-a2df-0cca5628bdf6
16/04/12 22:54:09 INFO HttpServer: Starting HTTP Server
16/04/12 22:54:09 DEBUG HttpServer: HttpServer is not using security
16/04/12 22:54:09 INFO Utils: Successfully started service 'HTTP file server' on port 35075.
16/04/12 22:54:09 DEBUG HttpFileServer: HTTP file server started at: http://10.24.100.194:35075
16/04/12 22:54:10 INFO SparkContext: Added JAR file:/downloads/spark-1.6.1-bin-hadoop2.6/lib/spark-examples-1.6.1-hadoop2.6.0.jar at http://10.247.10.194:35075/jars/spark-examples-1.6.1-hadoop2.6.0.jar with timestamp 1460501650090
16/04/12 22:54:10 INFO Executor: Starting executor ID driver on host localhost
16/04/12 22:54:10 DEBUG TransportServer: Shuffle server started on port :42782
16/04/12 22:54:10 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 42782.
16/04/12 22:54:10 INFO NettyBlockTransferService: Server created on 42782
16/04/12 22:54:10 INFO BlockManagerMaster: Trying to register BlockManager
16/04/12 22:54:10 INFO BlockManagerMasterEndpoint: Registering block manager localhost:42782 with 143.6 MB RAM, BlockManagerId(driver, localhost, 42782)
16/04/12 22:54:10 INFO BlockManagerMaster: Registered BlockManager
16/04/12 22:54:10 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 87.3 KB, free 87.3 KB)
16/04/12 22:54:10 DEBUG BlockManager: Put block broadcast_0 locally took 113 ms
16/04/12 22:54:10 DEBUG BlockManager: Putting block broadcast_0 without replication took 114 ms
16/04/12 22:54:10 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 14.4 KB, free 101.8 KB)
16/04/12 22:54:10 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on localhost:42782 (size: 14.4 KB, free: 143.6 MB)
16/04/12 22:54:10 DEBUG BlockManagerMaster: Updated info of block broadcast_0_piece0
16/04/12 22:54:10 DEBUG BlockManager: Told master about block broadcast_0_piece0
16/04/12 22:54:10 DEBUG BlockManager: Put block broadcast_0_piece0 locally took 5 ms
16/04/12 22:54:10 DEBUG BlockManager: Putting block broadcast_0_piece0 without replication took 6 ms
16/04/12 22:54:10 INFO SparkContext: Created broadcast 0 from textFile at HdfsTest.scala:34
16/04/12 22:54:10 DEBUG ClosureCleaner: +++ Cleaning closure <function1> (org.apache.spark.SparkContext$$anonfun$hadoopFile$1$$anonfun$33) +++
16/04/12 22:54:10 DEBUG ClosureCleaner: + declared fields: 2
16/04/12 22:54:10 DEBUG ClosureCleaner: public static final long org.apache.spark.SparkContext$$anonfun$hadoopFile$1$$anonfun$33.serialVersionUID
16/04/12 22:54:10 DEBUG ClosureCleaner: private final org.apache.spark.SparkContext$$anonfun$hadoopFile$1 org.apache.spark.SparkContext$$anonfun$hadoopFile$1$$anonfun$33.$outer
16/04/12 22:54:10 DEBUG ClosureCleaner: + declared methods: 2
16/04/12 22:54:10 DEBUG ClosureCleaner: public final java.lang.Object org.apache.spark.SparkContext$$anonfun$hadoopFile$1$$anonfun$33.apply(java.lang.Object)
16/04/12 22:54:10 DEBUG ClosureCleaner: public final void org.apache.spark.SparkContext$$anonfun$hadoopFile$1$$anonfun$33.apply(org.apache.hadoop.mapred.JobConf)
16/04/12 22:54:10 DEBUG ClosureCleaner: + inner classes: 0
16/04/12 22:54:10 DEBUG ClosureCleaner: + outer classes: 2
16/04/12 22:54:10 DEBUG ClosureCleaner: org.apache.spark.SparkContext$$anonfun$hadoopFile$1
16/04/12 22:54:10 DEBUG ClosureCleaner: org.apache.spark.SparkContext
16/04/12 22:54:10 DEBUG ClosureCleaner: + outer objects: 2
16/04/12 22:54:10 DEBUG ClosureCleaner: <function0>
16/04/12 22:54:10 DEBUG ClosureCleaner: org.apache.spark.SparkContext@b6b4262
16/04/12 22:54:10 DEBUG ClosureCleaner: + populating accessed fields because this is the starting closure
16/04/12 22:54:10 DEBUG ClosureCleaner: + fields accessed by starting closure: 2
16/04/12 22:54:10 DEBUG ClosureCleaner: (class org.apache.spark.SparkContext$$anonfun$hadoopFile$1,Set(path$6))
16/04/12 22:54:10 DEBUG ClosureCleaner: (class org.apache.spark.SparkContext,Set())
16/04/12 22:54:10 DEBUG ClosureCleaner: + outermost object is not a closure, so do not clone it: (class org.apache.spark.SparkContext,org.apache.spark.SparkContext@b6b4262)
16/04/12 22:54:10 DEBUG ClosureCleaner: + cloning the object <function0> of class org.apache.spark.SparkContext$$anonfun$hadoopFile$1
16/04/12 22:54:10 DEBUG ClosureCleaner: + cleaning cloned closure <function0> recursively (org.apache.spark.SparkContext$$anonfun$hadoopFile$1)
16/04/12 22:54:10 DEBUG ClosureCleaner: +++ Cleaning closure <function0> (org.apache.spark.SparkContext$$anonfun$hadoopFile$1) +++
16/04/12 22:54:10 DEBUG ClosureCleaner: + declared fields: 7
16/04/12 22:54:10 DEBUG ClosureCleaner: public static final long org.apache.spark.SparkContext$$anonfun$hadoopFile$1.serialVersionUID
16/04/12 22:54:10 DEBUG ClosureCleaner: private final org.apache.spark.SparkContext org.apache.spark.SparkContext$$anonfun$hadoopFile$1.$outer
16/04/12 22:54:10 DEBUG ClosureCleaner: public final java.lang.String org.apache.spark.SparkContext$$anonfun$hadoopFile$1.path$6
16/04/12 22:54:10 DEBUG ClosureCleaner: private final java.lang.Class org.apache.spark.SparkContext$$anonfun$hadoopFile$1.inputFormatClass$1
16/04/12 22:54:10 DEBUG ClosureCleaner: private final java.lang.Class org.apache.spark.SparkContext$$anonfun$hadoopFile$1.keyClass$1
16/04/12 22:54:10 DEBUG ClosureCleaner: private final java.lang.Class org.apache.spark.SparkContext$$anonfun$hadoopFile$1.valueClass$1
16/04/12 22:54:10 DEBUG ClosureCleaner: private final int org.apache.spark.SparkContext$$anonfun$hadoopFile$1.minPartitions$3
16/04/12 22:54:10 DEBUG ClosureCleaner: + declared methods: 2
16/04/12 22:54:10 DEBUG ClosureCleaner: public final java.lang.Object org.apache.spark.SparkContext$$anonfun$hadoopFile$1.apply()
16/04/12 22:54:10 DEBUG ClosureCleaner: public final org.apache.spark.rdd.HadoopRDD org.apache.spark.SparkContext$$anonfun$hadoopFile$1.apply()
16/04/12 22:54:10 DEBUG ClosureCleaner: + inner classes: 1
16/04/12 22:54:10 DEBUG ClosureCleaner: org.apache.spark.SparkContext$$anonfun$hadoopFile$1$$anonfun$33
16/04/12 22:54:10 DEBUG ClosureCleaner: + outer classes: 1
16/04/12 22:54:10 DEBUG ClosureCleaner: org.apache.spark.SparkContext
16/04/12 22:54:10 DEBUG ClosureCleaner: + outer objects: 1
16/04/12 22:54:10 DEBUG ClosureCleaner: org.apache.spark.SparkContext@b6b4262
16/04/12 22:54:10 DEBUG ClosureCleaner: + fields accessed by starting closure: 2
16/04/12 22:54:10 DEBUG ClosureCleaner: (class org.apache.spark.SparkContext$$anonfun$hadoopFile$1,Set(path$6))
16/04/12 22:54:10 DEBUG ClosureCleaner: (class org.apache.spark.SparkContext,Set())
16/04/12 22:54:10 DEBUG ClosureCleaner: + outermost object is not a closure, so do not clone it: (class org.apache.spark.SparkContext,org.apache.spark.SparkContext@b6b4262)
16/04/12 22:54:10 DEBUG ClosureCleaner: + the starting closure doesn't actually need org.apache.spark.SparkContext@b6b4262, so we null it out
16/04/12 22:54:10 DEBUG ClosureCleaner: +++ closure <function0> (org.apache.spark.SparkContext$$anonfun$hadoopFile$1) is now cleaned +++
16/04/12 22:54:10 DEBUG ClosureCleaner: +++ closure <function1> (org.apache.spark.SparkContext$$anonfun$hadoopFile$1$$anonfun$33) is now cleaned +++
16/04/12 22:54:10 DEBUG ClosureCleaner: +++ Cleaning closure <function1> (org.apache.spark.SparkContext$$anonfun$textFile$1$$anonfun$apply$9) +++
16/04/12 22:54:10 DEBUG ClosureCleaner: + declared fields: 1
16/04/12 22:54:10 DEBUG ClosureCleaner: public static final long org.apache.spark.SparkContext$$anonfun$textFile$1$$anonfun$apply$9.serialVersionUID
16/04/12 22:54:10 DEBUG ClosureCleaner: + declared methods: 2
16/04/12 22:54:10 DEBUG ClosureCleaner: public final java.lang.Object org.apache.spark.SparkContext$$anonfun$textFile$1$$anonfun$apply$9.apply(java.lang.Object)
16/04/12 22:54:10 DEBUG ClosureCleaner: public final java.lang.String org.apache.spark.SparkContext$$anonfun$textFile$1$$anonfun$apply$9.apply(scala.Tuple2)
16/04/12 22:54:10 DEBUG ClosureCleaner: + inner classes: 0
16/04/12 22:54:10 DEBUG ClosureCleaner: + outer classes: 0
16/04/12 22:54:10 DEBUG ClosureCleaner: + outer objects: 0
16/04/12 22:54:10 DEBUG ClosureCleaner: + populating accessed fields because this is the starting closure
16/04/12 22:54:10 DEBUG ClosureCleaner: + fields accessed by starting closure: 0
16/04/12 22:54:10 DEBUG ClosureCleaner: + there are no enclosing objects!
16/04/12 22:54:10 DEBUG ClosureCleaner: +++ closure <function1> (org.apache.spark.SparkContext$$anonfun$textFile$1$$anonfun$apply$9) is now cleaned +++
16/04/12 22:54:10 DEBUG ClosureCleaner: +++ Cleaning closure <function1> (org.apache.spark.examples.HdfsTest$$anonfun$1) +++
16/04/12 22:54:10 DEBUG ClosureCleaner: + declared fields: 1
16/04/12 22:54:10 DEBUG ClosureCleaner: public static final long org.apache.spark.examples.HdfsTest$$anonfun$1.serialVersionUID
16/04/12 22:54:10 DEBUG ClosureCleaner: + declared methods: 2
16/04/12 22:54:10 DEBUG ClosureCleaner: public final java.lang.Object org.apache.spark.examples.HdfsTest$$anonfun$1.apply(java.lang.Object)
16/04/12 22:54:10 DEBUG ClosureCleaner: public final int org.apache.spark.examples.HdfsTest$$anonfun$1.apply(java.lang.String)
16/04/12 22:54:10 DEBUG ClosureCleaner: + inner classes: 0
16/04/12 22:54:10 DEBUG ClosureCleaner: + outer classes: 0
16/04/12 22:54:10 DEBUG ClosureCleaner: + outer objects: 0
16/04/12 22:54:10 DEBUG ClosureCleaner: + populating accessed fields because this is the starting closure
16/04/12 22:54:10 DEBUG ClosureCleaner: + fields accessed by starting closure: 0
16/04/12 22:54:10 DEBUG ClosureCleaner: + there are no enclosing objects!
16/04/12 22:54:10 DEBUG ClosureCleaner: +++ closure <function1> (org.apache.spark.examples.HdfsTest$$anonfun$1) is now cleaned +++
16/04/12 22:54:10 DEBUG ClosureCleaner: +++ Cleaning closure <function1> (org.apache.spark.examples.HdfsTest$$anonfun$main$1$$anonfun$apply$mcVI$sp$1) +++
16/04/12 22:54:10 DEBUG ClosureCleaner: + declared fields: 1
16/04/12 22:54:10 DEBUG ClosureCleaner: public static final long org.apache.spark.examples.HdfsTest$$anonfun$main$1$$anonfun$apply$mcVI$sp$1.serialVersionUID
16/04/12 22:54:10 DEBUG ClosureCleaner: + declared methods: 3
16/04/12 22:54:10 DEBUG ClosureCleaner: public void org.apache.spark.examples.HdfsTest$$anonfun$main$1$$anonfun$apply$mcVI$sp$1.apply$mcVI$sp(int)
16/04/12 22:54:10 DEBUG ClosureCleaner: public final void org.apache.spark.examples.HdfsTest$$anonfun$main$1$$anonfun$apply$mcVI$sp$1.apply(int)
16/04/12 22:54:10 DEBUG ClosureCleaner: public final java.lang.Object org.apache.spark.examples.HdfsTest$$anonfun$main$1$$anonfun$apply$mcVI$sp$1.apply(java.lang.Object)
16/04/12 22:54:10 DEBUG ClosureCleaner: + inner classes: 0
16/04/12 22:54:10 DEBUG ClosureCleaner: + outer classes: 0
16/04/12 22:54:10 DEBUG ClosureCleaner: + outer objects: 0
16/04/12 22:54:10 DEBUG ClosureCleaner: + populating accessed fields because this is the starting closure
16/04/12 22:54:10 DEBUG ClosureCleaner: + fields accessed by starting closure: 0
16/04/12 22:54:10 DEBUG ClosureCleaner: + there are no enclosing objects!
16/04/12 22:54:10 DEBUG ClosureCleaner: +++ closure <function1> (org.apache.spark.examples.HdfsTest$$anonfun$main$1$$anonfun$apply$mcVI$sp$1) is now cleaned +++
16/04/12 22:54:10 DEBUG BlockManager: Getting local block broadcast_0
16/04/12 22:54:10 DEBUG BlockManager: Level for block broadcast_0 is StorageLevel(true, true, false, true, 1)
16/04/12 22:54:10 DEBUG BlockManager: Getting block broadcast_0 from memory
16/04/12 22:54:10 DEBUG HadoopRDD: Creating new JobConf and caching it for later re-use
16/04/12 22:54:11 DEBUG : address: lglop194/10.24.100.194 isLoopbackAddress: false, with host 10.24.100.194 lglop194
16/04/12 22:54:11 DEBUG NativeLibraryLoader: -Dio.netty.tmpdir: /tmp (java.io.tmpdir)
16/04/12 22:54:11 DEBUG NativeLibraryLoader: -Dio.netty.netty.workdir: /tmp (io.netty.tmpdir)
16/04/12 22:54:11 DEBUG BlockReaderLocal: dfs.client.use.legacy.blockreader.local = false
16/04/12 22:54:11 DEBUG BlockReaderLocal: dfs.client.read.shortcircuit = true
16/04/12 22:54:11 DEBUG BlockReaderLocal: dfs.client.domain.socket.data.traffic = false
16/04/12 22:54:11 DEBUG BlockReaderLocal: dfs.domain.socket.path = /var/lib/hadoop-hdfs/dn_socket
16/04/12 22:54:11 DEBUG DFSClient: No KeyProvider found.
16/04/12 22:54:11 DEBUG RetryUtils: multipleLinearRandomRetry = null
16/04/12 22:54:11 DEBUG Server: rpcKind=RPC_PROTOCOL_BUFFER, rpcRequestWrapperClass=class org.apache.hadoop.ipc.ProtobufRpcEngine$RpcRequestWrapper, rpcInvoker=org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker@8001a94
16/04/12 22:54:11 DEBUG Client: getting client out of cache: org.apache.hadoop.ipc.Client@12ffa30e
16/04/12 22:54:11 WARN DomainSocketFactory: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
16/04/12 22:54:11 DEBUG DataTransferSaslUtil: DataTransferProtocol not using SaslPropertiesResolver, no QOP found in configuration for dfs.data.transfer.protection
Exception in thread "main" java.io.IOException: Can't get Master Kerberos principal for use as renewer
        at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:116)
        at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:100)
        at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodes(TokenCache.java:80)
        at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:205)
        at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:313)
        at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:199)
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
        at scala.Option.getOrElse(Option.scala:120)
        at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
        at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
        at scala.Option.getOrElse(Option.scala:120)
        at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
        at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
        at scala.Option.getOrElse(Option.scala:120)
        at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
        at org.apache.spark.SparkContext.runJob(SparkContext.scala:1929)
        at org.apache.spark.rdd.RDD$$anonfun$foreach$1.apply(RDD.scala:912)
        at org.apache.spark.rdd.RDD$$anonfun$foreach$1.apply(RDD.scala:910)
        at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:150)
        at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:111)
        at org.apache.spark.rdd.RDD.withScope(RDD.scala:316)
        at org.apache.spark.rdd.RDD.foreach(RDD.scala:910)
        at org.apache.spark.examples.HdfsTest$$anonfun$main$1.apply$mcVI$sp(HdfsTest.scala:38)
        at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:141)
        at org.apache.spark.examples.HdfsTest$.main(HdfsTest.scala:36)
        at org.apache.spark.examples.HdfsTest.main(HdfsTest.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
        at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
        at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
16/04/12 22:54:11 INFO SparkContext: Invoking stop() from shutdown hook
16/04/12 22:54:11 INFO SparkUI: Stopped Spark web UI at http://10.24.100.194:4040
16/04/12 22:54:11 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
16/04/12 22:54:11 INFO MemoryStore: MemoryStore cleared
16/04/12 22:54:11 INFO BlockManager: BlockManager stopped
16/04/12 22:54:11 INFO BlockManagerMaster: BlockManagerMaster stopped
16/04/12 22:54:11 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
16/04/12 22:54:11 INFO SparkContext: Successfully stopped SparkContext
16/04/12 22:54:11 INFO ShutdownHookManager: Shutdown hook called
16/04/12 22:54:11 INFO ShutdownHookManager: Deleting directory /tmp/spark-cada1114-4508-4770-834a-50a40d00bdbf
16/04/12 22:54:11 INFO RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
16/04/12 22:54:11 INFO RemoteActorRefProvider$RemotingTerminator: Remote daemon shut down; proceeding with flushing remote transports.
16/04/12 22:54:11 INFO ShutdownHookManager: Deleting directory /tmp/spark-cada1114-4508-4770-834a-50a40d00bdbf/httpd-d86e703b-a499-49a6-a2df-0cca5628bdf6
16/04/12 22:54:11 DEBUG Client: stopping client from cache: org.apache.hadoop.ipc.Client@12ffa30e
16/04/12 22:54:11 DEBUG Client: removing client from cache: org.apache.hadoop.ipc.Client@12ffa30e
16/04/12 22:54:11 DEBUG Client: stopping actual client because no more references remain: org.apache.hadoop.ipc.Client@12ffa30e
16/04/12 22:54:11 DEBUG Client: Stopping client