Hello, I am trying to understand Spark's support for accessing a secure HDFS cluster. My plan is to deploy Spark on Mesos, which will access a secure HDFS cluster running elsewhere on the network, and I am trying to understand how much support exists as of now. My understanding is that Spark currently supports accessing a secured Hadoop cluster only through the "Spark on YARN" deployment option, where the principal and keytab can be passed through spark-submit options; however, Spark's "local" mode also accepts a keytab and principal, to support applications like Spark SQL.

I have installed Spark (1.6.1) on a machine where the Hadoop client is not installed, but I copied core-site.xml and hdfs-site.xml to /etc/hadoop/conf and set HADOOP_CONF_DIR="/etc/hadoop/conf/" in spark-env. I have tested and confirmed that the "org.apache.spark.examples.HdfsTest" class can access an insecure Hadoop cluster. When I test the same code/configuration against a secure Hadoop cluster, I get a "Can't get Master Kerberos principal for use as renewer" error. I have pasted the complete debug log output below (a short sketch of the node configuration follows the log). Please let me know if I am missing any configuration that is causing this issue.

Regards,
Vijay

/bin/spark-submit --deploy-mode client --master local --driver-memory=512M --driver-cores=0.5 --executor-memory 512M --total-executor-cores=1 --principal hdf...@foo.com --keytab /downloads/spark-1.6.1-bin-hadoop2.6/hdfs.keytab -v --class org.apache.spark.examples.HdfsTest lib/spark-examples-1.6.1-hadoop2.6.0.jar /sink/names

Using properties file: /downloads/spark-1.6.1-bin-hadoop2.6/conf/spark-defaults.conf
Adding default property: spark.executor.uri=http://shinyfeather.com/spark/spark-1.6.1/spark-1.6.1-bin-hadoop2.6.tgz
Parsed arguments:
  master                  local
  deployMode              client
  executorMemory          512M
  executorCores           null
  totalExecutorCores      1
  propertiesFile          /downloads/spark-1.6.1-bin-hadoop2.6/conf/spark-defaults.conf
  driverMemory            512M
  driverCores             0.5
  driverExtraClassPath    null
  driverExtraLibraryPath  null
  driverExtraJavaOptions  null
  supervise               false
  queue                   null
  numExecutors            null
  files                   null
  pyFiles                 null
  archives                null
  mainClass               org.apache.spark.examples.HdfsTest
  primaryResource         file:/downloads/spark-1.6.1-bin-hadoop2.6/lib/spark-examples-1.6.1-hadoop2.6.0.jar
  name                    org.apache.spark.examples.HdfsTest
  childArgs               [/sink/names]
  jars                    null
  packages                null
  packagesExclusions      null
  repositories            null
  verbose                 true
Spark properties used, including those specified through --conf and those from the properties file /downloads/spark-1.6.1-bin-hadoop2.6/conf/spark-defaults.conf:
  spark.driver.memory -> 512M
  spark.executor.uri -> http://shinyfeather.com/spark/spark-1.6.1/spark-1.6.1-bin-hadoop2.6.tgz
16/04/12 22:54:06 DEBUG MutableMetricsFactory: field org.apache.hadoop.metrics2.lib.MutableRate org.apache.hadoop.security.UserGroupInformation$UgiMetrics.loginSuccess with annotation @org.apache.hadoop.metrics2.annotation.Metric(value=[Rate of successful kerberos logins and latency (milliseconds)], about=, valueName=Time, type=DEFAULT, always=false, sampleName=Ops)
16/04/12 22:54:07 DEBUG MutableMetricsFactory: field org.apache.hadoop.metrics2.lib.MutableRate org.apache.hadoop.security.UserGroupInformation$UgiMetrics.loginFailure with annotation @org.apache.hadoop.metrics2.annotation.Metric(value=[Rate of failed kerberos logins and latency (milliseconds)], about=, valueName=Time, type=DEFAULT, always=false, sampleName=Ops)
16/04/12 22:54:07 DEBUG MutableMetricsFactory: field org.apache.hadoop.metrics2.lib.MutableRate org.apache.hadoop.security.UserGroupInformation$UgiMetrics.getGroups with annotation @org.apache.hadoop.metrics2.annotation.Metric(value=[GetGroups], about=, valueName=Time, type=DEFAULT, always=false, sampleName=Ops)
16/04/12 22:54:07 DEBUG MetricsSystemImpl: UgiMetrics, User and group related metrics
16/04/12 22:54:07 DEBUG Groups: Creating new Groups object
16/04/12 22:54:07 DEBUG NativeCodeLoader: Trying to load the custom-built native-hadoop library...
16/04/12 22:54:07 DEBUG NativeCodeLoader: Failed to load native-hadoop with error: java.lang.UnsatisfiedLinkError: no hadoop in java.library.path
16/04/12 22:54:07 DEBUG NativeCodeLoader: java.library.path=/usr/java/packages/lib/amd64:/usr/lib/x86_64-linux-gnu/jni:/lib/x86_64-linux-gnu:/usr/lib/x86_64-linux-gnu:/usr/lib/jni:/lib:/usr/lib
16/04/12 22:54:07 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/04/12 22:54:07 DEBUG PerformanceAdvisory: Falling back to shell based
16/04/12 22:54:07 DEBUG JniBasedUnixGroupsMappingWithFallback: Group mapping impl=org.apache.hadoop.security.ShellBasedUnixGroupsMapping
16/04/12 22:54:07 DEBUG Shell: Failed to detect a valid hadoop home directory
java.io.IOException: HADOOP_HOME or hadoop.home.dir are not set.
    at org.apache.hadoop.util.Shell.checkHadoopHome(Shell.java:302)
    at org.apache.hadoop.util.Shell.<clinit>(Shell.java:327)
    at org.apache.hadoop.util.StringUtils.<clinit>(StringUtils.java:79)
    at org.apache.hadoop.security.Groups.parseStaticMapping(Groups.java:104)
    at org.apache.hadoop.security.Groups.<init>(Groups.java:86)
    at org.apache.hadoop.security.Groups.<init>(Groups.java:66)
    at org.apache.hadoop.security.Groups.getUserToGroupsMappingService(Groups.java:280)
    at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:271)
    at org.apache.hadoop.security.UserGroupInformation.ensureInitialized(UserGroupInformation.java:248)
    at org.apache.hadoop.security.UserGroupInformation.isAuthenticationMethodEnabled(UserGroupInformation.java:325)
    at org.apache.hadoop.security.UserGroupInformation.isSecurityEnabled(UserGroupInformation.java:319)
    at org.apache.hadoop.security.UserGroupInformation.loginUserFromKeytab(UserGroupInformation.java:914)
    at org.apache.spark.deploy.SparkSubmit$.prepareSubmitEnvironment(SparkSubmit.scala:564)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:154)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
16/04/12 22:54:07 DEBUG Shell: setsid exited with exit code 0
16/04/12 22:54:07 DEBUG Groups: Group mapping impl=org.apache.hadoop.security.JniBasedUnixGroupsMappingWithFallback; cacheTimeout=300000; warningDeltaMs=5000
16/04/12 22:54:07 DEBUG UserGroupInformation: hadoop login
16/04/12 22:54:07 DEBUG UserGroupInformation: hadoop login commit
16/04/12 22:54:07 DEBUG UserGroupInformation: using kerberos user:hdfs-c@FOO.COM
16/04/12 22:54:07 DEBUG UserGroupInformation: Using user: "hdf...@foo.com" with name hdfs-c@FOO.COM
16/04/12 22:54:07 DEBUG UserGroupInformation: User entry: "hdf...@foo.com"
16/04/12 22:54:07 INFO UserGroupInformation: Login successful for user hdf...@foo.com using keytab file /downloads/spark-1.6.1-bin-hadoop2.6/hdfs.keytab
Main class:
org.apache.spark.examples.HdfsTest
Arguments:
/sink/names
System properties:
spark.driver.memory -> 512M
SPARK_SUBMIT -> true
spark.yarn.keytab -> /downloads/spark-1.6.1-bin-hadoop2.6/hdfs.keytab
spark.yarn.principal -> hdf...@foo.com
spark.app.name -> org.apache.spark.examples.HdfsTest
spark.executor.uri -> http://shinyfeather.com/spark/spark-1.6.1/spark-1.6.1-bin-hadoop2.6.tgz
spark.jars -> file:/downloads/spark-1.6.1-bin-hadoop2.6/lib/spark-examples-1.6.1-hadoop2.6.0.jar
spark.submit.deployMode -> client
spark.master -> local
Classpath elements:
file:/downloads/spark-1.6.1-bin-hadoop2.6/lib/spark-examples-1.6.1-hadoop2.6.0.jar

16/04/12 22:54:07 INFO SparkContext: Running Spark version 1.6.1
16/04/12 22:54:07 INFO SecurityManager: Changing view acls to: root,hdfs
16/04/12 22:54:07 INFO SecurityManager: Changing modify acls to: root,hdfs
16/04/12 22:54:07 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root, hdfs); users with modify permissions: Set(root, hdfs)
16/04/12 22:54:07 DEBUG SSLOptions: No SSL protocol specified
16/04/12 22:54:07 DEBUG SSLOptions: No SSL protocol specified
16/04/12 22:54:07 DEBUG SSLOptions: No SSL protocol specified
16/04/12 22:54:07 DEBUG SecurityManager: SSLConfiguration for file server: SSLOptions{enabled=false, keyStore=None, keyStorePassword=None, trustStore=None, trustStorePassword=None, protocol=None, enabledAlgorithms=Set()}
16/04/12 22:54:07 DEBUG SecurityManager: SSLConfiguration for Akka: SSLOptions{enabled=false, keyStore=None, keyStorePassword=None, trustStore=None, trustStorePassword=None, protocol=None, enabledAlgorithms=Set()}
16/04/12 22:54:08 DEBUG InternalLoggerFactory: Using SLF4J as the default logging framework
16/04/12 22:54:08 DEBUG PlatformDependent0: java.nio.Buffer.address: available
16/04/12 22:54:08 DEBUG PlatformDependent0: sun.misc.Unsafe.theUnsafe: available
16/04/12 22:54:08 DEBUG PlatformDependent0: sun.misc.Unsafe.copyMemory: available
16/04/12 22:54:08 DEBUG PlatformDependent0: java.nio.Bits.unaligned: true
16/04/12 22:54:08 DEBUG PlatformDependent: Java version: 7
16/04/12 22:54:08 DEBUG PlatformDependent: -Dio.netty.noUnsafe: false
16/04/12 22:54:08 DEBUG PlatformDependent: sun.misc.Unsafe: available
16/04/12 22:54:08 DEBUG PlatformDependent: -Dio.netty.noJavassist: false
16/04/12 22:54:08 DEBUG PlatformDependent: Javassist: unavailable
16/04/12 22:54:08 DEBUG PlatformDependent: You don't have Javassist in your class path or you don't have enough permission to load dynamically generated classes. Please check the configuration for better performance.
16/04/12 22:54:08 DEBUG PlatformDependent: -Dio.netty.tmpdir: /tmp (java.io.tmpdir)
16/04/12 22:54:08 DEBUG PlatformDependent: -Dio.netty.bitMode: 64 (sun.arch.data.model)
16/04/12 22:54:08 DEBUG PlatformDependent: -Dio.netty.noPreferDirect: false
16/04/12 22:54:08 DEBUG MultithreadEventLoopGroup: -Dio.netty.eventLoopThreads: 4
16/04/12 22:54:08 DEBUG NioEventLoop: -Dio.netty.noKeySetOptimization: false
16/04/12 22:54:08 DEBUG NioEventLoop: -Dio.netty.selectorAutoRebuildThreshold: 512
16/04/12 22:54:08 DEBUG PooledByteBufAllocator: -Dio.netty.allocator.numHeapArenas: 4
16/04/12 22:54:08 DEBUG PooledByteBufAllocator: -Dio.netty.allocator.numDirectArenas: 4
16/04/12 22:54:08 DEBUG PooledByteBufAllocator: -Dio.netty.allocator.pageSize: 8192
16/04/12 22:54:08 DEBUG PooledByteBufAllocator: -Dio.netty.allocator.maxOrder: 11
16/04/12 22:54:08 DEBUG PooledByteBufAllocator: -Dio.netty.allocator.chunkSize: 16777216
16/04/12 22:54:08 DEBUG PooledByteBufAllocator: -Dio.netty.allocator.tinyCacheSize: 512
16/04/12 22:54:08 DEBUG PooledByteBufAllocator: -Dio.netty.allocator.smallCacheSize: 256
16/04/12 22:54:08 DEBUG PooledByteBufAllocator: -Dio.netty.allocator.normalCacheSize: 64
16/04/12 22:54:08 DEBUG PooledByteBufAllocator: -Dio.netty.allocator.maxCachedBufferCapacity: 32768
16/04/12 22:54:08 DEBUG PooledByteBufAllocator: -Dio.netty.allocator.cacheTrimInterval: 8192
16/04/12 22:54:08 DEBUG ThreadLocalRandom: -Dio.netty.initialSeedUniquifier: 0x724fd652e2864d0d (took 1 ms)
16/04/12 22:54:08 DEBUG ByteBufUtil: -Dio.netty.allocator.type: unpooled
16/04/12 22:54:08 DEBUG ByteBufUtil: -Dio.netty.threadLocalDirectBufferSize: 65536
16/04/12 22:54:08 DEBUG NetUtil: Loopback interface: lo (lo, 0:0:0:0:0:0:0:1%1)
16/04/12 22:54:08 DEBUG NetUtil: /proc/sys/net/core/somaxconn: 128
16/04/12 22:54:08 DEBUG TransportServer: Shuffle server started on port :34022
16/04/12 22:54:08 INFO Utils: Successfully started service 'sparkDriver' on port 34022.
16/04/12 22:54:08 DEBUG AkkaUtils: In createActorSystem, requireCookie is: off
16/04/12 22:54:08 INFO Slf4jLogger: Slf4jLogger started
16/04/12 22:54:08 INFO Remoting: Starting remoting
16/04/12 22:54:09 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriverActorSystem@10.24.100.194:49829]
16/04/12 22:54:09 INFO Utils: Successfully started service 'sparkDriverActorSystem' on port 49829.
16/04/12 22:54:09 DEBUG SparkEnv: Using serializer: class org.apache.spark.serializer.JavaSerializer
16/04/12 22:54:09 INFO SparkEnv: Registering MapOutputTracker
16/04/12 22:54:09 INFO SparkEnv: Registering BlockManagerMaster
16/04/12 22:54:09 INFO DiskBlockManager: Created local directory at /tmp/blockmgr-9421ede1-5bc9-4170-9030-359d7e6a3686
16/04/12 22:54:09 INFO MemoryStore: MemoryStore started with capacity 143.6 MB
16/04/12 22:54:09 INFO SparkEnv: Registering OutputCommitCoordinator
16/04/12 22:54:09 INFO Utils: Successfully started service 'SparkUI' on port 4040.
16/04/12 22:54:09 INFO SparkUI: Started SparkUI at http://10.24.100.194:4040
16/04/12 22:54:09 INFO HttpFileServer: HTTP File server directory is /tmp/spark-cada1114-4508-4770-834a-50a40d00bdbf/httpd-d86e703b-a499-49a6-a2df-0cca5628bdf6
16/04/12 22:54:09 INFO HttpServer: Starting HTTP Server
16/04/12 22:54:09 DEBUG HttpServer: HttpServer is not using security
16/04/12 22:54:09 INFO Utils: Successfully started service 'HTTP file server' on port 35075.
16/04/12 22:54:09 DEBUG HttpFileServer: HTTP file server started at: http://10.24.100.194:35075
16/04/12 22:54:10 INFO SparkContext: Added JAR file:/downloads/spark-1.6.1-bin-hadoop2.6/lib/spark-examples-1.6.1-hadoop2.6.0.jar at http://10.247.10.194:35075/jars/spark-examples-1.6.1-hadoop2.6.0.jar with timestamp 1460501650090
16/04/12 22:54:10 INFO Executor: Starting executor ID driver on host localhost
16/04/12 22:54:10 DEBUG TransportServer: Shuffle server started on port :42782
16/04/12 22:54:10 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 42782.
16/04/12 22:54:10 INFO NettyBlockTransferService: Server created on 42782
16/04/12 22:54:10 INFO BlockManagerMaster: Trying to register BlockManager
16/04/12 22:54:10 INFO BlockManagerMasterEndpoint: Registering block manager localhost:42782 with 143.6 MB RAM, BlockManagerId(driver, localhost, 42782)
16/04/12 22:54:10 INFO BlockManagerMaster: Registered BlockManager
16/04/12 22:54:10 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 87.3 KB, free 87.3 KB)
16/04/12 22:54:10 DEBUG BlockManager: Put block broadcast_0 locally took 113 ms
16/04/12 22:54:10 DEBUG BlockManager: Putting block broadcast_0 without replication took 114 ms
16/04/12 22:54:10 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 14.4 KB, free 101.8 KB)
16/04/12 22:54:10 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on localhost:42782 (size: 14.4 KB, free: 143.6 MB)
16/04/12 22:54:10 DEBUG BlockManagerMaster: Updated info of block broadcast_0_piece0
16/04/12 22:54:10 DEBUG BlockManager: Told master about block broadcast_0_piece0
16/04/12 22:54:10 DEBUG BlockManager: Put block broadcast_0_piece0 locally took 5 ms
16/04/12 22:54:10 DEBUG BlockManager: Putting block broadcast_0_piece0 without replication took 6 ms
16/04/12 22:54:10 INFO SparkContext: Created broadcast 0 from textFile at HdfsTest.scala:34
16/04/12 22:54:10 DEBUG ClosureCleaner: +++ Cleaning closure <function1> (org.apache.spark.SparkContext$$anonfun$hadoopFile$1$$anonfun$33) +++
16/04/12 22:54:10 DEBUG ClosureCleaner: + declared fields: 2
16/04/12 22:54:10 DEBUG ClosureCleaner: public static final long org.apache.spark.SparkContext$$anonfun$hadoopFile$1$$anonfun$33.serialVersionUID
16/04/12 22:54:10 DEBUG ClosureCleaner: private final org.apache.spark.SparkContext$$anonfun$hadoopFile$1 org.apache.spark.SparkContext$$anonfun$hadoopFile$1$$anonfun$33.$outer
16/04/12 22:54:10 DEBUG ClosureCleaner: + declared methods: 2
16/04/12 22:54:10 DEBUG ClosureCleaner: public final java.lang.Object org.apache.spark.SparkContext$$anonfun$hadoopFile$1$$anonfun$33.apply(java.lang.Object)
16/04/12 22:54:10 DEBUG ClosureCleaner: public final void org.apache.spark.SparkContext$$anonfun$hadoopFile$1$$anonfun$33.apply(org.apache.hadoop.mapred.JobConf)
16/04/12 22:54:10 DEBUG ClosureCleaner: + inner classes: 0
16/04/12 22:54:10 DEBUG ClosureCleaner: + outer classes: 2
16/04/12 22:54:10 DEBUG ClosureCleaner: org.apache.spark.SparkContext$$anonfun$hadoopFile$1
16/04/12 22:54:10 DEBUG ClosureCleaner: org.apache.spark.SparkContext
16/04/12 22:54:10 DEBUG ClosureCleaner: + outer objects: 2
16/04/12 22:54:10 DEBUG ClosureCleaner: <function0>
16/04/12 22:54:10 DEBUG ClosureCleaner: org.apache.spark.SparkContext@b6b4262
16/04/12 22:54:10 DEBUG ClosureCleaner: + populating accessed fields because this is the starting closure
16/04/12 22:54:10 DEBUG ClosureCleaner: + fields accessed by starting closure: 2
16/04/12 22:54:10 DEBUG ClosureCleaner: (class org.apache.spark.SparkContext$$anonfun$hadoopFile$1,Set(path$6))
16/04/12 22:54:10 DEBUG ClosureCleaner: (class org.apache.spark.SparkContext,Set())
16/04/12 22:54:10 DEBUG ClosureCleaner: + outermost object is not a closure, so do not clone it: (class org.apache.spark.SparkContext,org.apache.spark.SparkContext@b6b4262)
16/04/12 22:54:10 DEBUG ClosureCleaner: + cloning the object <function0> of class org.apache.spark.SparkContext$$anonfun$hadoopFile$1
16/04/12 22:54:10 DEBUG ClosureCleaner: + cleaning cloned closure <function0> recursively (org.apache.spark.SparkContext$$anonfun$hadoopFile$1)
16/04/12 22:54:10 DEBUG ClosureCleaner: +++ Cleaning closure <function0> (org.apache.spark.SparkContext$$anonfun$hadoopFile$1) +++
16/04/12 22:54:10 DEBUG ClosureCleaner: + declared fields: 7
16/04/12 22:54:10 DEBUG ClosureCleaner: public static final long org.apache.spark.SparkContext$$anonfun$hadoopFile$1.serialVersionUID
16/04/12 22:54:10 DEBUG ClosureCleaner: private final org.apache.spark.SparkContext org.apache.spark.SparkContext$$anonfun$hadoopFile$1.$outer
16/04/12 22:54:10 DEBUG ClosureCleaner: public final java.lang.String org.apache.spark.SparkContext$$anonfun$hadoopFile$1.path$6
16/04/12 22:54:10 DEBUG ClosureCleaner: private final java.lang.Class org.apache.spark.SparkContext$$anonfun$hadoopFile$1.inputFormatClass$1
16/04/12 22:54:10 DEBUG ClosureCleaner: private final java.lang.Class org.apache.spark.SparkContext$$anonfun$hadoopFile$1.keyClass$1
16/04/12 22:54:10 DEBUG ClosureCleaner: private final java.lang.Class org.apache.spark.SparkContext$$anonfun$hadoopFile$1.valueClass$1
16/04/12 22:54:10 DEBUG ClosureCleaner: private final int org.apache.spark.SparkContext$$anonfun$hadoopFile$1.minPartitions$3
16/04/12 22:54:10 DEBUG ClosureCleaner: + declared methods: 2
16/04/12 22:54:10 DEBUG ClosureCleaner: public final java.lang.Object org.apache.spark.SparkContext$$anonfun$hadoopFile$1.apply()
16/04/12 22:54:10 DEBUG ClosureCleaner: public final org.apache.spark.rdd.HadoopRDD org.apache.spark.SparkContext$$anonfun$hadoopFile$1.apply()
16/04/12 22:54:10 DEBUG ClosureCleaner: + inner classes: 1
16/04/12 22:54:10 DEBUG ClosureCleaner: org.apache.spark.SparkContext$$anonfun$hadoopFile$1$$anonfun$33
16/04/12 22:54:10 DEBUG ClosureCleaner: + outer classes: 1
16/04/12 22:54:10 DEBUG ClosureCleaner: org.apache.spark.SparkContext
16/04/12 22:54:10 DEBUG ClosureCleaner: + outer objects: 1
16/04/12 22:54:10 DEBUG ClosureCleaner: org.apache.spark.SparkContext@b6b4262
16/04/12 22:54:10 DEBUG ClosureCleaner: + fields accessed by starting closure: 2
16/04/12 22:54:10 DEBUG ClosureCleaner: (class org.apache.spark.SparkContext$$anonfun$hadoopFile$1,Set(path$6))
16/04/12 22:54:10 DEBUG ClosureCleaner: (class org.apache.spark.SparkContext,Set())
16/04/12 22:54:10 DEBUG ClosureCleaner: + outermost object is not a closure, so do not clone it: (class org.apache.spark.SparkContext,org.apache.spark.SparkContext@b6b4262)
16/04/12 22:54:10 DEBUG ClosureCleaner: + the starting closure doesn't actually need org.apache.spark.SparkContext@b6b4262, so we null it out
16/04/12 22:54:10 DEBUG ClosureCleaner: +++ closure <function0> (org.apache.spark.SparkContext$$anonfun$hadoopFile$1) is now cleaned +++
16/04/12 22:54:10 DEBUG ClosureCleaner: +++ closure <function1> (org.apache.spark.SparkContext$$anonfun$hadoopFile$1$$anonfun$33) is now cleaned +++
16/04/12 22:54:10 DEBUG ClosureCleaner: +++ Cleaning closure <function1> (org.apache.spark.SparkContext$$anonfun$textFile$1$$anonfun$apply$9) +++
16/04/12 22:54:10 DEBUG ClosureCleaner: + declared fields: 1
16/04/12 22:54:10 DEBUG ClosureCleaner: public static final long org.apache.spark.SparkContext$$anonfun$textFile$1$$anonfun$apply$9.serialVersionUID
16/04/12 22:54:10 DEBUG ClosureCleaner: + declared methods: 2
16/04/12 22:54:10 DEBUG ClosureCleaner: public final java.lang.Object org.apache.spark.SparkContext$$anonfun$textFile$1$$anonfun$apply$9.apply(java.lang.Object)
16/04/12 22:54:10 DEBUG ClosureCleaner: public final java.lang.String org.apache.spark.SparkContext$$anonfun$textFile$1$$anonfun$apply$9.apply(scala.Tuple2)
16/04/12 22:54:10 DEBUG ClosureCleaner: + inner classes: 0
16/04/12 22:54:10 DEBUG ClosureCleaner: + outer classes: 0
16/04/12 22:54:10 DEBUG ClosureCleaner: + outer objects: 0
16/04/12 22:54:10 DEBUG ClosureCleaner: + populating accessed fields because this is the starting closure
16/04/12 22:54:10 DEBUG ClosureCleaner: + fields accessed by starting closure: 0
16/04/12 22:54:10 DEBUG ClosureCleaner: + there are no enclosing objects!
16/04/12 22:54:10 DEBUG ClosureCleaner: +++ closure <function1> (org.apache.spark.SparkContext$$anonfun$textFile$1$$anonfun$apply$9) is now cleaned +++
16/04/12 22:54:10 DEBUG ClosureCleaner: +++ Cleaning closure <function1> (org.apache.spark.examples.HdfsTest$$anonfun$1) +++
16/04/12 22:54:10 DEBUG ClosureCleaner: + declared fields: 1
16/04/12 22:54:10 DEBUG ClosureCleaner: public static final long org.apache.spark.examples.HdfsTest$$anonfun$1.serialVersionUID
16/04/12 22:54:10 DEBUG ClosureCleaner: + declared methods: 2
16/04/12 22:54:10 DEBUG ClosureCleaner: public final java.lang.Object org.apache.spark.examples.HdfsTest$$anonfun$1.apply(java.lang.Object)
16/04/12 22:54:10 DEBUG ClosureCleaner: public final int org.apache.spark.examples.HdfsTest$$anonfun$1.apply(java.lang.String)
16/04/12 22:54:10 DEBUG ClosureCleaner: + inner classes: 0
16/04/12 22:54:10 DEBUG ClosureCleaner: + outer classes: 0
16/04/12 22:54:10 DEBUG ClosureCleaner: + outer objects: 0
16/04/12 22:54:10 DEBUG ClosureCleaner: + populating accessed fields because this is the starting closure
16/04/12 22:54:10 DEBUG ClosureCleaner: + fields accessed by starting closure: 0
16/04/12 22:54:10 DEBUG ClosureCleaner: + there are no enclosing objects!
16/04/12 22:54:10 DEBUG ClosureCleaner: +++ closure <function1> (org.apache.spark.examples.HdfsTest$$anonfun$1) is now cleaned +++
16/04/12 22:54:10 DEBUG ClosureCleaner: +++ Cleaning closure <function1> (org.apache.spark.examples.HdfsTest$$anonfun$main$1$$anonfun$apply$mcVI$sp$1) +++
16/04/12 22:54:10 DEBUG ClosureCleaner: + declared fields: 1
16/04/12 22:54:10 DEBUG ClosureCleaner: public static final long org.apache.spark.examples.HdfsTest$$anonfun$main$1$$anonfun$apply$mcVI$sp$1.serialVersionUID
16/04/12 22:54:10 DEBUG ClosureCleaner: + declared methods: 3
16/04/12 22:54:10 DEBUG ClosureCleaner: public void org.apache.spark.examples.HdfsTest$$anonfun$main$1$$anonfun$apply$mcVI$sp$1.apply$mcVI$sp(int)
16/04/12 22:54:10 DEBUG ClosureCleaner: public final void org.apache.spark.examples.HdfsTest$$anonfun$main$1$$anonfun$apply$mcVI$sp$1.apply(int)
16/04/12 22:54:10 DEBUG ClosureCleaner: public final java.lang.Object org.apache.spark.examples.HdfsTest$$anonfun$main$1$$anonfun$apply$mcVI$sp$1.apply(java.lang.Object)
16/04/12 22:54:10 DEBUG ClosureCleaner: + inner classes: 0
16/04/12 22:54:10 DEBUG ClosureCleaner: + outer classes: 0
16/04/12 22:54:10 DEBUG ClosureCleaner: + outer objects: 0
16/04/12 22:54:10 DEBUG ClosureCleaner: + populating accessed fields because this is the starting closure
16/04/12 22:54:10 DEBUG ClosureCleaner: + fields accessed by starting closure: 0
16/04/12 22:54:10 DEBUG ClosureCleaner: + there are no enclosing objects!
16/04/12 22:54:10 DEBUG ClosureCleaner: +++ closure <function1> (org.apache.spark.examples.HdfsTest$$anonfun$main$1$$anonfun$apply$mcVI$sp$1) is now cleaned +++
16/04/12 22:54:10 DEBUG BlockManager: Getting local block broadcast_0
16/04/12 22:54:10 DEBUG BlockManager: Level for block broadcast_0 is StorageLevel(true, true, false, true, 1)
16/04/12 22:54:10 DEBUG BlockManager: Getting block broadcast_0 from memory
16/04/12 22:54:10 DEBUG HadoopRDD: Creating new JobConf and caching it for later re-use
16/04/12 22:54:11 DEBUG : address: lglop194/10.24.100.194 isLoopbackAddress: false, with host 10.24.100.194 lglop194
16/04/12 22:54:11 DEBUG NativeLibraryLoader: -Dio.netty.tmpdir: /tmp (java.io.tmpdir)
16/04/12 22:54:11 DEBUG NativeLibraryLoader: -Dio.netty.netty.workdir: /tmp (io.netty.tmpdir)
16/04/12 22:54:11 DEBUG BlockReaderLocal: dfs.client.use.legacy.blockreader.local = false
16/04/12 22:54:11 DEBUG BlockReaderLocal: dfs.client.read.shortcircuit = true
16/04/12 22:54:11 DEBUG BlockReaderLocal: dfs.client.domain.socket.data.traffic = false
16/04/12 22:54:11 DEBUG BlockReaderLocal: dfs.domain.socket.path = /var/lib/hadoop-hdfs/dn_socket
16/04/12 22:54:11 DEBUG DFSClient: No KeyProvider found.
16/04/12 22:54:11 DEBUG RetryUtils: multipleLinearRandomRetry = null
16/04/12 22:54:11 DEBUG Server: rpcKind=RPC_PROTOCOL_BUFFER, rpcRequestWrapperClass=class org.apache.hadoop.ipc.ProtobufRpcEngine$RpcRequestWrapper, rpcInvoker=org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker@8001a94
16/04/12 22:54:11 DEBUG Client: getting client out of cache: org.apache.hadoop.ipc.Client@12ffa30e
16/04/12 22:54:11 WARN DomainSocketFactory: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
16/04/12 22:54:11 DEBUG DataTransferSaslUtil: DataTransferProtocol not using SaslPropertiesResolver, no QOP found in configuration for dfs.data.transfer.protection
Exception in thread "main" java.io.IOException: Can't get Master Kerberos principal for use as renewer
    at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:116)
    at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:100)
    at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodes(TokenCache.java:80)
    at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:205)
    at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:313)
    at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:199)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
    at scala.Option.getOrElse(Option.scala:120)
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
    at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
    at scala.Option.getOrElse(Option.scala:120)
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
    at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
    at scala.Option.getOrElse(Option.scala:120)
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:1929)
    at org.apache.spark.rdd.RDD$$anonfun$foreach$1.apply(RDD.scala:912)
    at org.apache.spark.rdd.RDD$$anonfun$foreach$1.apply(RDD.scala:910)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:150)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:111)
    at org.apache.spark.rdd.RDD.withScope(RDD.scala:316)
    at org.apache.spark.rdd.RDD.foreach(RDD.scala:910)
    at org.apache.spark.examples.HdfsTest$$anonfun$main$1.apply$mcVI$sp(HdfsTest.scala:38)
    at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:141)
    at org.apache.spark.examples.HdfsTest$.main(HdfsTest.scala:36)
    at org.apache.spark.examples.HdfsTest.main(HdfsTest.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
16/04/12 22:54:11 INFO SparkContext: Invoking stop() from shutdown hook
16/04/12 22:54:11 INFO SparkUI: Stopped Spark web UI at http://10.24.100.194:4040
16/04/12 22:54:11 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
16/04/12 22:54:11 INFO MemoryStore: MemoryStore cleared
16/04/12 22:54:11 INFO BlockManager: BlockManager stopped
16/04/12 22:54:11 INFO BlockManagerMaster: BlockManagerMaster stopped
16/04/12 22:54:11 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
16/04/12 22:54:11 INFO SparkContext: Successfully stopped SparkContext
16/04/12 22:54:11 INFO ShutdownHookManager: Shutdown hook called
16/04/12 22:54:11 INFO ShutdownHookManager: Deleting directory /tmp/spark-cada1114-4508-4770-834a-50a40d00bdbf
16/04/12 22:54:11 INFO RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
16/04/12 22:54:11 INFO RemoteActorRefProvider$RemotingTerminator: Remote daemon shut down; proceeding with flushing remote transports.
16/04/12 22:54:11 INFO ShutdownHookManager: Deleting directory /tmp/spark-cada1114-4508-4770-834a-50a40d00bdbf/httpd-d86e703b-a499-49a6-a2df-0cca5628bdf6
16/04/12 22:54:11 DEBUG Client: stopping client from cache: org.apache.hadoop.ipc.Client@12ffa30e
16/04/12 22:54:11 DEBUG Client: removing client from cache: org.apache.hadoop.ipc.Client@12ffa30e
16/04/12 22:54:11 DEBUG Client: stopping actual client because no more references remain: org.apache.hadoop.ipc.Client@12ffa30e
16/04/12 22:54:11 DEBUG Client: Stopping client
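
For reference, here is the sketch of the node configuration mentioned above. It only restates what is already in this message (the copied config files, HADOOP_CONF_DIR, keytab and principal); the spark-env.sh location under conf/ is the standard Spark layout, and the spark-submit invocation is abridged from the full command at the top.

# Client-only node: no Hadoop installation, just the two config files copied
# from the secure cluster into /etc/hadoop/conf
ls /etc/hadoop/conf
# core-site.xml  hdfs-site.xml

# conf/spark-env.sh points Spark at that directory
export HADOOP_CONF_DIR="/etc/hadoop/conf/"

# Keytab and principal are passed on the command line (abridged from the full
# spark-submit command shown above)
/bin/spark-submit \
  --deploy-mode client \
  --master local \
  --principal hdf...@foo.com \
  --keytab /downloads/spark-1.6.1-bin-hadoop2.6/hdfs.keytab \
  --class org.apache.spark.examples.HdfsTest \
  lib/spark-examples-1.6.1-hadoop2.6.0.jar /sink/names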