I am not sure which other log file to look into. I have one master and one worker, and in my previous mail I showed the Hive log and the Spark worker’s log. The master log contains something like the following (extracted from an execution I just did):

15/12/24 18:19:01 INFO master.Master: Launching executor app-20151224181901-0000/0 on worker worker-20151224181835-192.168.1.64-35198
15/12/24 18:19:06 INFO master.Master: Received unregister request from application app-20151224181901-0000
15/12/24 18:19:06 INFO master.Master: Removing app app-20151224181901-0000
15/12/24 18:19:06 WARN master.Master: Got status update for unknown executor app-20151224181901-0000/0

I run Hive with the Spark execution engine and I have definitely set the correct master in hive-site.xml (the relevant entries, roughly as I have them, are shown below). A normal Spark job seems to run fine: I just ran the WordCount example and it terminated without problems. Also, the table was created with something like 'CREATE TABLE userstweetsdailystatistics(foo INT, bar STRING);'. As for the stderr log file that cannot be written ("Stream closed"), I think that happens because the executor has already been killed by the time the worker tries to write to it.
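To be concrete about that configuration, the Hive-on-Spark entries in my hive-site.xml look roughly like the snippet below. The host and port in the master URL are placeholders standing in for my actual standalone master address, so please read this as a sketch of the shape of the settings rather than an exact copy of the file:

  <property>
    <name>hive.execution.engine</name>
    <value>spark</value>
  </property>
  <property>
    <!-- placeholder host:port standing in for my real standalone master URL -->
    <name>spark.master</name>
    <value>spark://hadoop-master:7077</value>
  </property>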
> On 24 Dec 2015, at 17:52, Jörn Franke <jornfra...@gmail.com> wrote:
> 
> Have you checked what the issue is with the log file causing troubles? Enough space available? Access rights (what is the user of the spark worker?)? Does the directory exist?
> 
> Can you provide more details on how the table is created?
> 
> Does the query work with mr or tez as the execution engine?
> 
> Does a normal Spark job without Hive work?
> 
> On 24 Dec 2015, at 17:25, Sofia <sofia.panagiot...@taiger.com> wrote:
> 
>> Hello and happy holiday to those who are already enjoying it!
>> 
>> I am still having trouble running Hive with Spark. I downloaded Spark 1.5.2 and built it like this (my Hadoop is version 2.7.1):
>> 
>> ./make-distribution.sh --name "hadoop2-without-hive" --tgz "-Pyarn,hadoop-provided,hadoop-2.4,parquet-provided"
>> 
>> When trying to run it with Hive 1.2.1 (a simple command that creates a Spark job, like 'Select count(*) from userstweetsdailystatistics;'), I get the following error:
>> 
>> 15/12/24 17:12:54 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:54 INFO log.PerfLogger: <PERFLOG method=SparkBuildPlan from=org.apache.hadoop.hive.ql.exec.spark.SparkPlanGenerator>
>> 15/12/24 17:12:54 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:54 INFO log.PerfLogger: <PERFLOG method=SparkCreateTran.Map 1 from=org.apache.hadoop.hive.ql.exec.spark.SparkPlanGenerator>
>> 15/12/24 17:12:54 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:54 INFO Configuration.deprecation: mapred.task.is.map is deprecated. Instead, use mapreduce.task.ismap
>> 15/12/24 17:12:54 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:54 INFO exec.Utilities: Processing alias userstweetsdailystatistics
>> 15/12/24 17:12:54 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:54 INFO exec.Utilities: Adding input file hdfs://hadoop-master:8020/user/ubuntu/hive/warehouse/userstweetsdailystatistics
>> 15/12/24 17:12:55 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:55 INFO log.PerfLogger: <PERFLOG method=serializePlan from=org.apache.hadoop.hive.ql.exec.Utilities>
>> 15/12/24 17:12:55 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:55 INFO exec.Utilities: Serializing MapWork via kryo
>> 15/12/24 17:12:56 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:56 INFO log.PerfLogger: </PERFLOG method=serializePlan start=1450973575887 end=1450973576279 duration=392 from=org.apache.hadoop.hive.ql.exec.Utilities>
>> 15/12/24 17:12:57 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:57 INFO storage.MemoryStore: ensureFreeSpace(572800) called with curMem=0, maxMem=556038881
>> 15/12/24 17:12:57 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:57 INFO storage.MemoryStore: Block broadcast_0 stored as values in memory (estimated size 559.4 KB, free 529.7 MB)
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58 INFO storage.MemoryStore: ensureFreeSpace(43075) called with curMem=572800, maxMem=556038881
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58 INFO storage.MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 42.1 KB, free 529.7 MB)
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58 INFO storage.BlockManagerInfo: Added broadcast_0_piece0 in memory on 192.168.1.64:49690 (size: 42.1 KB, free: 530.2 MB)
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58 ERROR util.Utils: uncaught error in thread SparkListenerBus, stopping SparkContext
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: java.lang.AbstractMethodError
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: at org.apache.spark.scheduler.SparkListenerBus$class.onPostEvent(SparkListenerBus.scala:62)
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: at org.apache.spark.scheduler.LiveListenerBus.onPostEvent(LiveListenerBus.scala:31)
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: at org.apache.spark.scheduler.LiveListenerBus.onPostEvent(LiveListenerBus.scala:31)
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: at org.apache.spark.util.ListenerBus$class.postToAll(ListenerBus.scala:56)
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: at org.apache.spark.util.AsynchronousListenerBus.postToAll(AsynchronousListenerBus.scala:37)
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1.apply$mcV$sp(AsynchronousListenerBus.scala:79)
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: at org.apache.spark.util.Utils$.tryOrStopSparkContext(Utils.scala:1136)
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: at org.apache.spark.util.AsynchronousListenerBus$$anon$1.run(AsynchronousListenerBus.scala:63)
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/metrics/json,null}
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/stage/kill,null}
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/api,null}
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/,null}
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/static,null}
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/executors/threadDump/json,null}
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/executors/threadDump,null}
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/executors/json,null}
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/executors,null}
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/environment/json,null}
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/environment,null}
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/storage/rdd/json,null}
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/storage/rdd,null}
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/storage/json,null}
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/storage,null}
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/pool/json,null}
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/pool,null}
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/stage/json,null}
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/stage,null}
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/json,null}
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages,null}
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/jobs/job/json,null}
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/jobs/job,null}
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/jobs/json,null}
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/jobs,null}
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58 INFO spark.SparkContext: Created broadcast 0 from hadoopRDD at SparkPlanGenerator.java:188
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58 INFO ui.SparkUI: Stopped Spark web UI at http://192.168.1.64:4040
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58 INFO scheduler.DAGScheduler: Stopping DAGScheduler
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58 INFO cluster.SparkDeploySchedulerBackend: Shutting down all executors
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58 INFO cluster.SparkDeploySchedulerBackend: Asking each executor to shut down
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58 INFO log.PerfLogger: </PERFLOG method=SparkCreateTran.Map 1 start=1450973574712 end=1450973578874 duration=4162 from=org.apache.hadoop.hive.ql.exec.spark.SparkPlanGenerator>
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58 INFO log.PerfLogger: <PERFLOG method=SparkCreateTran.Reducer 2 from=org.apache.hadoop.hive.ql.exec.spark.SparkPlanGenerator>
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58 INFO log.PerfLogger: <PERFLOG method=serializePlan from=org.apache.hadoop.hive.ql.exec.Utilities>
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58 INFO exec.Utilities: Serializing ReduceWork via kryo
>> 15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:59 INFO log.PerfLogger: </PERFLOG method=serializePlan start=1450973578926 end=1450973579000 duration=74 from=org.apache.hadoop.hive.ql.exec.Utilities>
>> 15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:59 INFO log.PerfLogger: </PERFLOG method=SparkCreateTran.Reducer 2 start=1450973578874 end=1450973579073 duration=199 from=org.apache.hadoop.hive.ql.exec.spark.SparkPlanGenerator>
>> 15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:59 INFO log.PerfLogger: </PERFLOG method=SparkBuildPlan start=1450973574707 end=1450973579074 duration=4367 from=org.apache.hadoop.hive.ql.exec.spark.SparkPlanGenerator>
>> 15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:59 INFO log.PerfLogger: <PERFLOG method=SparkBuildRDDGraph from=org.apache.hadoop.hive.ql.exec.spark.SparkPlan>
>> 15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:59 WARN remote.ReliableDeliverySupervisor: Association with remote system [akka.tcp://sparkExecutor@192.168.1.64:35089] has failed, address is now gated for [5000] ms. Reason: [Disassociated]
>> 15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:59 INFO log.PerfLogger: </PERFLOG method=SparkBuildRDDGraph start=1450973579074 end=1450973579273 duration=199 from=org.apache.hadoop.hive.ql.exec.spark.SparkPlan>
>> 15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:59 INFO client.RemoteDriver: Failed to run job d3746d11-eac8-4bf9-9897-bef27fd0423e
>> 15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl: java.lang.IllegalStateException: Cannot call methods on a stopped SparkContext
>> 15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl: at org.apache.spark.SparkContext.org$apache$spark$SparkContext$$assertNotStopped(SparkContext.scala:104)
>> 15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl: at org.apache.spark.SparkContext.submitJob(SparkContext.scala:1981)
>> 15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl: at org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$1.apply(AsyncRDDActions.scala:118)
>> 15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl: at org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$1.apply(AsyncRDDActions.scala:116)
>> 15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl: at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:147)
>> 15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl: at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:108)
>> 15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl: at org.apache.spark.rdd.RDD.withScope(RDD.scala:310)
>> 15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl: at org.apache.spark.rdd.AsyncRDDActions.foreachAsync(AsyncRDDActions.scala:116)
>> 15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl: at org.apache.spark.api.java.JavaRDDLike$class.foreachAsync(JavaRDDLike.scala:690)
>> 15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl: at org.apache.spark.api.java.AbstractJavaRDDLike.foreachAsync(JavaRDDLike.scala:47)
>> 15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl: at org.apache.hadoop.hive.ql.exec.spark.RemoteHiveSparkClient$JobStatusJob.call(RemoteHiveSparkClient.java:257)
>> 15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl: at org.apache.hive.spark.client.RemoteDriver$JobWrapper.call(RemoteDriver.java:366)
>> 15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl: at org.apache.hive.spark.client.RemoteDriver$JobWrapper.call(RemoteDriver.java:335)
>> 15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl: at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>> 15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl: at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>> 15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl: at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>> 15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl: at java.lang.Thread.run(Thread.java:745)
>> 15/12/24 17:12:59 [RPC-Handler-3]: INFO client.SparkClientImpl: Received result for d3746d11-eac8-4bf9-9897-bef27fd0423e
>> Status: Failed
>> 15/12/24 17:12:59 [Thread-8]: ERROR status.SparkJobMonitor: Status: Failed
>> 15/12/24 17:12:59 [Thread-8]: INFO log.PerfLogger: </PERFLOG method=SparkRunJob start=1450973569576 end=1450973579584 duration=10008 from=org.apache.hadoop.hive.ql.exec.spark.status.SparkJobMonitor>
>> FAILED: Execution Error, return code 3 from org.apache.hadoop.hive.ql.exec.spark.SparkTask
>> 15/12/24 17:13:01 [main]: ERROR ql.Driver: FAILED: Execution Error, return code 3 from org.apache.hadoop.hive.ql.exec.spark.SparkTask
>> 15/12/24 17:13:01 [main]: INFO log.PerfLogger: </PERFLOG method=Driver.execute start=1450973565261 end=1450973581307 duration=16046 from=org.apache.hadoop.hive.ql.Driver>
>> 15/12/24 17:13:01 [main]: INFO log.PerfLogger: <PERFLOG method=releaseLocks from=org.apache.hadoop.hive.ql.Driver>
>> 15/12/24 17:13:01 [main]: INFO log.PerfLogger: </PERFLOG method=releaseLocks start=1450973581308 end=1450973581308 duration=0 from=org.apache.hadoop.hive.ql.Driver>
>> 15/12/24 17:13:01 [main]: INFO exec.ListSinkOperator: 7 finished. closing...
>> 15/12/24 17:13:01 [main]: INFO exec.ListSinkOperator: 7 Close done
>> 15/12/24 17:13:01 [main]: INFO log.PerfLogger: <PERFLOG method=releaseLocks from=org.apache.hadoop.hive.ql.Driver>
>> 15/12/24 17:13:01 [main]: INFO log.PerfLogger: </PERFLOG method=releaseLocks start=1450973581362 end=1450973581362 duration=0 from=org.apache.hadoop.hive.ql.Driver>
>> 
>> The only useful thing I can find at the Spark side is in the worker log:
>> 
>> 15/12/24 17:12:53 INFO worker.Worker: Asked to launch executor app-20151224171253-0000/0 for Hive on Spark
>> 15/12/24 17:12:53 INFO spark.SecurityManager: Changing view acls to: ubuntu
>> 15/12/24 17:12:53 INFO spark.SecurityManager: Changing modify acls to: ubuntu
>> 15/12/24 17:12:53 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(ubuntu); users with modify permissions: Set(ubuntu)
"/usr/local/hadoop/etc/hadoop/:/usr/local/hadoop/share/hadoop/common/lib/*:/usr/local/hadoop/share/hadoop/common/*:/usr/local/hadoop/share/hadoop/hdfs/:/usr/local/hadoop/share/hadoop/hdfs/lib/*:/usr/local/hadoop/share/hadoop/hdfs/*:/usr/local/hadoop/share/hadoop/yarn/lib/*:/usr/local/hadoop/share/hadoop/yarn/*:/usr/local/hadoop/share/hadoop/mapreduce/lib/*:/usr/local/hadoop/share/hadoop/mapreduce/*:/usr/local/hadoop/contrib/capacity-scheduler/*.jar:/home/ubuntu/Downloads/spark-1.5.2-bin-hadoop2-without-hive/lib/spark-assembly-1.5.2-hadoop2.4.0.jar:/home/ubuntu/Downloads/spark-1.5.2-bin-hadoop2-without-hive/lib/spark-assembly-1.5.2-hadoop2.4.0.jar:/usr/local/hadoop/etc/hadoop/:/usr/local/hadoop/share/hadoop/common/lib/*:/usr/local/hadoop/share/hadoop/common/*:/usr/local/hadoop/share/hadoop/hdfs/:/usr/local/hadoop/share/hadoop/hdfs/lib/*:/usr/local/hadoop/share/hadoop/hdfs/*:/usr/local/hadoop/share/hadoop/yarn/lib/*:/usr/local/hadoop/share/hadoop/yarn/*:/usr/local/hadoop/share/hadoop/mapreduce/lib/*:/usr/local/hadoop/share/hadoop/mapreduce/*:/usr/local/hadoop/contrib/capacity-scheduler/*.jar:/home/ubuntu/Downloads/spark-1.5.2-bin-hadoop2-without-hive/lib/spark-assembly-1.5.2-hadoop2.4.0.jar:/home/ubuntu/Downloads/spark-1.5.2-bin-hadoop2-without-hive/lib/spark-assembly-1.5.2-hadoop2.4.0.jar:/home/ubuntu/Downloads/spark-1.5.2-bin-hadoop2-without-hive/sbin/../conf/:/home/ubuntu/Downloads/spark-1.5.2-bin-hadoop2-without-hive/lib/spark-assembly-1.5.2-hadoop2.4.0.jar:/usr/local/hadoop/etc/hadoop/:/usr/local/hadoop/etc/hadoop/:/usr/local/hadoop/share/hadoop/common/lib/*:/usr/local/hadoop/share/hadoop/common/*:/usr/local/hadoop/share/hadoop/hdfs/:/usr/local/hadoop/share/hadoop/hdfs/lib/*:/usr/local/hadoop/share/hadoop/hdfs/*:/usr/local/hadoop/share/hadoop/yarn/lib/*:/usr/local/hadoop/share/hadoop/yarn/*:/usr/local/hadoop/share/hadoop/mapreduce/lib/*:/usr/local/hadoop/share/hadoop/mapreduce/*:/usr/local/hadoop/contrib/capacity-scheduler/*.jar" >> "-Xms1024M" "-Xmx1024M" "-Dspark.driver.port=44858" >> "-Dhive.spark.log.dir=/home/ubuntu/Downloads/spark-1.5.2-bin-hadoop2-without-hive/logs/" >> "-XX:MaxPermSize=256m" >> "org.apache.spark.executor.CoarseGrainedExecutorBackend" "--driver-url" >> "akka.tcp://sparkDriver@192.168.1.64:44858/user/CoarseGrainedScheduler >> <akka.tcp://sparkDriver@192.168.1.64:44858/user/CoarseGrainedScheduler>" >> "--executor-id" "0" "--hostname" "192.168.1.64" "--cores" "3" "--app-id" >> "app-20151224171253-0000" "--worker-url" >> "akka.tcp://sparkWorker@192.168.1.64:54209/user/Worker >> <akka.tcp://sparkWorker@192.168.1.64:54209/user/Worker>" >> 15/12/24 17:12:58 INFO worker.Worker: Asked to kill executor >> app-20151224171253-0000/0 >> 15/12/24 17:12:58 INFO worker.ExecutorRunner: Runner thread for executor >> app-20151224171253-0000/0 interrupted >> 15/12/24 17:12:58 INFO worker.ExecutorRunner: Killing process! 
>> 15/12/24 17:12:58 ERROR logging.FileAppender: Error writing stream to file /home/ubuntu/Downloads/spark-1.5.2-bin-hadoop2-without-hive/work/app-20151224171253-0000/0/stderr
>> java.io.IOException: Stream closed
>>     at java.io.BufferedInputStream.getBufIfOpen(BufferedInputStream.java:162)
>>     at java.io.BufferedInputStream.read1(BufferedInputStream.java:272)
>>     at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
>>     at java.io.FilterInputStream.read(FilterInputStream.java:107)
>>     at org.apache.spark.util.logging.FileAppender.appendStreamToFile(FileAppender.scala:70)
>>     at org.apache.spark.util.logging.FileAppender$$anon$1$$anonfun$run$1.apply$mcV$sp(FileAppender.scala:39)
>>     at org.apache.spark.util.logging.FileAppender$$anon$1$$anonfun$run$1.apply(FileAppender.scala:39)
>>     at org.apache.spark.util.logging.FileAppender$$anon$1$$anonfun$run$1.apply(FileAppender.scala:39)
>>     at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1699)
>>     at org.apache.spark.util.logging.FileAppender$$anon$1.run(FileAppender.scala:38)
>> 15/12/24 17:12:59 INFO worker.Worker: Executor app-20151224171253-0000/0 finished with state KILLED exitStatus 143
>> 15/12/24 17:12:59 INFO worker.Worker: Cleaning up local directories for application app-20151224171253-0000
>> 15/12/24 17:12:59 INFO shuffle.ExternalShuffleBlockResolver: Application app-20151224171253-0000 removed, cleanupLocalDirs = true
>> 
>> Here is my Spark configuration:
>> 
>> export HADOOP_HOME=/usr/local/hadoop
>> export PATH=$PATH:$HADOOP_HOME/bin
>> export SPARK_DIST_CLASSPATH=`hadoop classpath`
>> 
>> Any hints as to what could be going wrong? Why is the executor getting killed? Have I built Spark incorrectly? I have tried building it in several different ways and I keep failing.
>> I must admit I am confused by the information I find online about how to build and use Spark with Hive, and which version goes with which.
>> Can I download a pre-built version of Spark that would be suitable for my existing Hadoop 2.7.1 and Hive 1.2.1?
>> This error has been baffling me for weeks...
>> 
>> More than grateful for any help!
>> Sofia
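One note on the build question at the end of the quoted mail above: if I read the Spark 1.5 build documentation correctly, a distribution matching Hadoop 2.7.1 would be built with the hadoop-2.6 profile plus an explicit hadoop.version rather than hadoop-2.4, so roughly the following invocation. I have not verified this yet, so treat it as a sketch of what I intend to try, not a confirmed fix:

./make-distribution.sh --name "hadoop2-without-hive" --tgz "-Pyarn,hadoop-provided,hadoop-2.6,parquet-provided" -Dhadoop.version=2.7.1

with the Hadoop jars still supplied to Spark through the environment, as before:

export SPARK_DIST_CLASSPATH=`hadoop classpath`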