I am not sure which other log file to look into. I have one master and one worker, and in my previous mail I showed the Hive log and the Spark worker’s log. The master log contains something like the following (extracted from an execution I just did):

15/12/24 18:19:01 INFO master.Master: Launching executor app-20151224181901-0000/0 on worker worker-20151224181835-192.168.1.64-35198
15/12/24 18:19:06 INFO master.Master: Received unregister request from application app-20151224181901-0000
15/12/24 18:19:06 INFO master.Master: Removing app app-20151224181901-0000
15/12/24 18:19:06 WARN master.Master: Got status update for unknown executor app-20151224181901-0000/0

I run Hive with the Spark execution engine and I have definitely set the correct master in hive-site.xml (the relevant entries, roughly as I have them, are shown below). A normal Spark job seems to run fine: I just ran the WordCount example and it terminated without problems. Also, the table was created with something like 'CREATE TABLE userstweetsdailystatistics(foo INT, bar STRING);'. As for the stderr log file that cannot be written ("Stream closed"), I think that happens because the executor has already been killed by the time the worker tries to write to it.
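To be concrete about that configuration, the Hive-on-Spark entries in my hive-site.xml look roughly like the snippet below. The host and port in the master URL are placeholders standing in for my actual standalone master address, so please read this as a sketch of the shape of the settings rather than an exact copy of the file:

  <property>
    <name>hive.execution.engine</name>
    <value>spark</value>
  </property>
  <property>
    <!-- placeholder host:port standing in for my real standalone master URL -->
    <name>spark.master</name>
    <value>spark://hadoop-master:7077</value>
  </property>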
> On 24 Dec 2015, at 17:52, Jörn Franke <jornfra...@gmail.com> wrote:
> 
> Have you checked what the issue is with the log file causing troubles? Enough space available? Access rights (what is the user of the spark worker?)? Does the directory exist?
> 
> Can you provide more details on how the table is created?
> 
> Does the query work with mr or tez as the execution engine?
> 
> Does a normal Spark job without Hive work?
> 
> On 24 Dec 2015, at 17:25, Sofia <sofia.panagiot...@taiger.com> wrote:
> 
>> Hello and happy holiday to those who are already enjoying it!
>> 
>> I am still having trouble running Hive with Spark. I downloaded Spark 1.5.2 and built it like this (my Hadoop is version 2.7.1):
>> 
>> ./make-distribution.sh --name "hadoop2-without-hive" --tgz "-Pyarn,hadoop-provided,hadoop-2.4,parquet-provided"
>> 
>> When trying to run it with Hive 1.2.1 (a simple command that creates a Spark job, like 'Select count(*) from userstweetsdailystatistics;'), I get the following error:
>> 
>> 15/12/24 17:12:54 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:54 INFO log.PerfLogger: <PERFLOG method=SparkBuildPlan from=org.apache.hadoop.hive.ql.exec.spark.SparkPlanGenerator>
>> 15/12/24 17:12:54 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:54 INFO log.PerfLogger: <PERFLOG method=SparkCreateTran.Map 1 from=org.apache.hadoop.hive.ql.exec.spark.SparkPlanGenerator>
>> 15/12/24 17:12:54 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:54 INFO Configuration.deprecation: mapred.task.is.map is deprecated. Instead, use mapreduce.task.ismap
>> 15/12/24 17:12:54 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:54 INFO exec.Utilities: Processing alias userstweetsdailystatistics
>> 15/12/24 17:12:54 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:54 INFO exec.Utilities: Adding input file hdfs://hadoop-master:8020/user/ubuntu/hive/warehouse/userstweetsdailystatistics
>> 15/12/24 17:12:55 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:55 INFO log.PerfLogger: <PERFLOG method=serializePlan from=org.apache.hadoop.hive.ql.exec.Utilities>
>> 15/12/24 17:12:55 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:55 INFO exec.Utilities: Serializing MapWork via kryo
>> 15/12/24 17:12:56 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:56 INFO log.PerfLogger: </PERFLOG method=serializePlan start=1450973575887 end=1450973576279 duration=392 from=org.apache.hadoop.hive.ql.exec.Utilities>
>> 15/12/24 17:12:57 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:57 INFO storage.MemoryStore: ensureFreeSpace(572800) called with curMem=0, maxMem=556038881
>> 15/12/24 17:12:57 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:57 INFO storage.MemoryStore: Block broadcast_0 stored as values in memory (estimated size 559.4 KB, free 529.7 MB)
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58 INFO storage.MemoryStore: ensureFreeSpace(43075) called with curMem=572800, maxMem=556038881
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58 INFO storage.MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 42.1 KB, free 529.7 MB)
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58 INFO storage.BlockManagerInfo: Added broadcast_0_piece0 in memory on 192.168.1.64:49690 (size: 42.1 KB, free: 530.2 MB)
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58 ERROR util.Utils: uncaught error in thread SparkListenerBus, stopping SparkContext
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: java.lang.AbstractMethodError
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: at org.apache.spark.scheduler.SparkListenerBus$class.onPostEvent(SparkListenerBus.scala:62)
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: at org.apache.spark.scheduler.LiveListenerBus.onPostEvent(LiveListenerBus.scala:31)
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: at org.apache.spark.scheduler.LiveListenerBus.onPostEvent(LiveListenerBus.scala:31)
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: at org.apache.spark.util.ListenerBus$class.postToAll(ListenerBus.scala:56)
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: at org.apache.spark.util.AsynchronousListenerBus.postToAll(AsynchronousListenerBus.scala:37)
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1.apply$mcV$sp(AsynchronousListenerBus.scala:79)
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: at org.apache.spark.util.Utils$.tryOrStopSparkContext(Utils.scala:1136)
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: at org.apache.spark.util.AsynchronousListenerBus$$anon$1.run(AsynchronousListenerBus.scala:63)
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/metrics/json,null}
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/stage/kill,null}
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/api,null}
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/,null}
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/static,null}
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/executors/threadDump/json,null}
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/executors/threadDump,null}
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/executors/json,null}
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/executors,null}
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/environment/json,null}
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/environment,null}
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/storage/rdd/json,null}
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/storage/rdd,null}
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/storage/json,null}
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/storage,null}
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/pool/json,null}
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/pool,null}
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/stage/json,null}
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/stage,null}
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/json,null}
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages,null}
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/jobs/job/json,null}
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/jobs/job,null}
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/jobs/json,null}
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/jobs,null}
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58 INFO spark.SparkContext: Created broadcast 0 from hadoopRDD at SparkPlanGenerator.java:188
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58 INFO ui.SparkUI: Stopped Spark web UI at http://192.168.1.64:4040
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58 INFO scheduler.DAGScheduler: Stopping DAGScheduler
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58 INFO cluster.SparkDeploySchedulerBackend: Shutting down all executors
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58 INFO cluster.SparkDeploySchedulerBackend: Asking each executor to shut down
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58 INFO log.PerfLogger: </PERFLOG method=SparkCreateTran.Map 1 start=1450973574712 end=1450973578874 duration=4162 from=org.apache.hadoop.hive.ql.exec.spark.SparkPlanGenerator>
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58 INFO log.PerfLogger: <PERFLOG method=SparkCreateTran.Reducer 2 from=org.apache.hadoop.hive.ql.exec.spark.SparkPlanGenerator>
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58 INFO log.PerfLogger: <PERFLOG method=serializePlan from=org.apache.hadoop.hive.ql.exec.Utilities>
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58 INFO exec.Utilities: Serializing ReduceWork via kryo
>> 15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:59 INFO log.PerfLogger: </PERFLOG method=serializePlan start=1450973578926 end=1450973579000 duration=74 from=org.apache.hadoop.hive.ql.exec.Utilities>
>> 15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:59 INFO log.PerfLogger: </PERFLOG method=SparkCreateTran.Reducer 2 start=1450973578874 end=1450973579073 duration=199 from=org.apache.hadoop.hive.ql.exec.spark.SparkPlanGenerator>
>> 15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:59 INFO log.PerfLogger: </PERFLOG method=SparkBuildPlan start=1450973574707 end=1450973579074 duration=4367 from=org.apache.hadoop.hive.ql.exec.spark.SparkPlanGenerator>
>> 15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:59 INFO log.PerfLogger: <PERFLOG method=SparkBuildRDDGraph from=org.apache.hadoop.hive.ql.exec.spark.SparkPlan>
>> 15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:59 WARN remote.ReliableDeliverySupervisor: Association with remote system [akka.tcp://sparkExecutor@192.168.1.64:35089] has failed, address is now gated for [5000] ms. Reason: [Disassociated]
>> 15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:59 INFO log.PerfLogger: </PERFLOG method=SparkBuildRDDGraph start=1450973579074 end=1450973579273 duration=199 from=org.apache.hadoop.hive.ql.exec.spark.SparkPlan>
>> 15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:59 INFO client.RemoteDriver: Failed to run job d3746d11-eac8-4bf9-9897-bef27fd0423e
>> 15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl: java.lang.IllegalStateException: Cannot call methods on a stopped SparkContext
>> 15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl: at org.apache.spark.SparkContext.org$apache$spark$SparkContext$$assertNotStopped(SparkContext.scala:104)
>> 15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl: at org.apache.spark.SparkContext.submitJob(SparkContext.scala:1981)
>> 15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl: at org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$1.apply(AsyncRDDActions.scala:118)
>> 15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl: at org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$1.apply(AsyncRDDActions.scala:116)
>> 15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl: at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:147)
>> 15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl: at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:108)
>> 15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl: at org.apache.spark.rdd.RDD.withScope(RDD.scala:310)
>> 15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl: at org.apache.spark.rdd.AsyncRDDActions.foreachAsync(AsyncRDDActions.scala:116)
>> 15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl: at org.apache.spark.api.java.JavaRDDLike$class.foreachAsync(JavaRDDLike.scala:690)
>> 15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl: at org.apache.spark.api.java.AbstractJavaRDDLike.foreachAsync(JavaRDDLike.scala:47)
>> 15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl: at org.apache.hadoop.hive.ql.exec.spark.RemoteHiveSparkClient$JobStatusJob.call(RemoteHiveSparkClient.java:257)
>> 15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl: at org.apache.hive.spark.client.RemoteDriver$JobWrapper.call(RemoteDriver.java:366)
>> 15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl: at org.apache.hive.spark.client.RemoteDriver$JobWrapper.call(RemoteDriver.java:335)
>> 15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl: at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>> 15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl: at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>> 15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl: at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>> 15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl: at java.lang.Thread.run(Thread.java:745)
>> 15/12/24 17:12:59 [RPC-Handler-3]: INFO client.SparkClientImpl: Received result for d3746d11-eac8-4bf9-9897-bef27fd0423e
>> Status: Failed
>> 15/12/24 17:12:59 [Thread-8]: ERROR status.SparkJobMonitor: Status: Failed
>> 15/12/24 17:12:59 [Thread-8]: INFO log.PerfLogger: </PERFLOG method=SparkRunJob start=1450973569576 end=1450973579584 duration=10008 from=org.apache.hadoop.hive.ql.exec.spark.status.SparkJobMonitor>
>> FAILED: Execution Error, return code 3 from org.apache.hadoop.hive.ql.exec.spark.SparkTask
>> 15/12/24 17:13:01 [main]: ERROR ql.Driver: FAILED: Execution Error, return code 3 from org.apache.hadoop.hive.ql.exec.spark.SparkTask
>> 15/12/24 17:13:01 [main]: INFO log.PerfLogger: </PERFLOG method=Driver.execute start=1450973565261 end=1450973581307 duration=16046 from=org.apache.hadoop.hive.ql.Driver>
>> 15/12/24 17:13:01 [main]: INFO log.PerfLogger: <PERFLOG method=releaseLocks from=org.apache.hadoop.hive.ql.Driver>
>> 15/12/24 17:13:01 [main]: INFO log.PerfLogger: </PERFLOG method=releaseLocks start=1450973581308 end=1450973581308 duration=0 from=org.apache.hadoop.hive.ql.Driver>
>> 15/12/24 17:13:01 [main]: INFO exec.ListSinkOperator: 7 finished. closing...
>> 15/12/24 17:13:01 [main]: INFO exec.ListSinkOperator: 7 Close done
>> 15/12/24 17:13:01 [main]: INFO log.PerfLogger: <PERFLOG method=releaseLocks from=org.apache.hadoop.hive.ql.Driver>
>> 15/12/24 17:13:01 [main]: INFO log.PerfLogger: </PERFLOG method=releaseLocks start=1450973581362 end=1450973581362 duration=0 from=org.apache.hadoop.hive.ql.Driver>
>> 
>> The only useful thing I can find at the Spark side is in the worker log:
>> 
>> 15/12/24 17:12:53 INFO worker.Worker: Asked to launch executor app-20151224171253-0000/0 for Hive on Spark
>> 15/12/24 17:12:53 INFO spark.SecurityManager: Changing view acls to: ubuntu
>> 15/12/24 17:12:53 INFO spark.SecurityManager: Changing modify acls to: ubuntu
>> 15/12/24 17:12:53 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(ubuntu); users with modify permissions: Set(ubuntu)
"/usr/local/hadoop/etc/hadoop/:/usr/local/hadoop/share/hadoop/common/lib/*:/usr/local/hadoop/share/hadoop/common/*:/usr/local/hadoop/share/hadoop/hdfs/:/usr/local/hadoop/share/hadoop/hdfs/lib/*:/usr/local/hadoop/share/hadoop/hdfs/*:/usr/local/hadoop/share/hadoop/yarn/lib/*:/usr/local/hadoop/share/hadoop/yarn/*:/usr/local/hadoop/share/hadoop/mapreduce/lib/*:/usr/local/hadoop/share/hadoop/mapreduce/*:/usr/local/hadoop/contrib/capacity-scheduler/*.jar:/home/ubuntu/Downloads/spark-1.5.2-bin-hadoop2-without-hive/lib/spark-assembly-1.5.2-hadoop2.4.0.jar:/home/ubuntu/Downloads/spark-1.5.2-bin-hadoop2-without-hive/lib/spark-assembly-1.5.2-hadoop2.4.0.jar:/usr/local/hadoop/etc/hadoop/:/usr/local/hadoop/share/hadoop/common/lib/*:/usr/local/hadoop/share/hadoop/common/*:/usr/local/hadoop/share/hadoop/hdfs/:/usr/local/hadoop/share/hadoop/hdfs/lib/*:/usr/local/hadoop/share/hadoop/hdfs/*:/usr/local/hadoop/share/hadoop/yarn/lib/*:/usr/local/hadoop/share/hadoop/yarn/*:/usr/local/hadoop/share/hadoop/mapreduce/lib/*:/usr/local/hadoop/share/hadoop/mapreduce/*:/usr/local/hadoop/contrib/capacity-scheduler/*.jar:/home/ubuntu/Downloads/spark-1.5.2-bin-hadoop2-without-hive/lib/spark-assembly-1.5.2-hadoop2.4.0.jar:/home/ubuntu/Downloads/spark-1.5.2-bin-hadoop2-without-hive/lib/spark-assembly-1.5.2-hadoop2.4.0.jar:/home/ubuntu/Downloads/spark-1.5.2-bin-hadoop2-without-hive/sbin/../conf/:/home/ubuntu/Downloads/spark-1.5.2-bin-hadoop2-without-hive/lib/spark-assembly-1.5.2-hadoop2.4.0.jar:/usr/local/hadoop/etc/hadoop/:/usr/local/hadoop/etc/hadoop/:/usr/local/hadoop/share/hadoop/common/lib/*:/usr/local/hadoop/share/hadoop/common/*:/usr/local/hadoop/share/hadoop/hdfs/:/usr/local/hadoop/share/hadoop/hdfs/lib/*:/usr/local/hadoop/share/hadoop/hdfs/*:/usr/local/hadoop/share/hadoop/yarn/lib/*:/usr/local/hadoop/share/hadoop/yarn/*:/usr/local/hadoop/share/hadoop/mapreduce/lib/*:/usr/local/hadoop/share/hadoop/mapreduce/*:/usr/local/hadoop/contrib/capacity-scheduler/*.jar" >> "-Xms1024M" "-Xmx1024M" "-Dspark.driver.port=44858" >> "-Dhive.spark.log.dir=/home/ubuntu/Downloads/spark-1.5.2-bin-hadoop2-without-hive/logs/" >> "-XX:MaxPermSize=256m" >> "org.apache.spark.executor.CoarseGrainedExecutorBackend" "--driver-url" >> "akka.tcp://sparkDriver@192.168.1.64:44858/user/CoarseGrainedScheduler >> <akka.tcp://sparkDriver@192.168.1.64:44858/user/CoarseGrainedScheduler>" >> "--executor-id" "0" "--hostname" "192.168.1.64" "--cores" "3" "--app-id" >> "app-20151224171253-0000" "--worker-url" >> "akka.tcp://sparkWorker@192.168.1.64:54209/user/Worker >> <akka.tcp://sparkWorker@192.168.1.64:54209/user/Worker>" >> 15/12/24 17:12:58 INFO worker.Worker: Asked to kill executor >> app-20151224171253-0000/0 >> 15/12/24 17:12:58 INFO worker.ExecutorRunner: Runner thread for executor >> app-20151224171253-0000/0 interrupted >> 15/12/24 17:12:58 INFO worker.ExecutorRunner: Killing process! 
>> 15/12/24 17:12:58 ERROR logging.FileAppender: Error writing stream to file /home/ubuntu/Downloads/spark-1.5.2-bin-hadoop2-without-hive/work/app-20151224171253-0000/0/stderr
>> java.io.IOException: Stream closed
>>     at java.io.BufferedInputStream.getBufIfOpen(BufferedInputStream.java:162)
>>     at java.io.BufferedInputStream.read1(BufferedInputStream.java:272)
>>     at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
>>     at java.io.FilterInputStream.read(FilterInputStream.java:107)
>>     at org.apache.spark.util.logging.FileAppender.appendStreamToFile(FileAppender.scala:70)
>>     at org.apache.spark.util.logging.FileAppender$$anon$1$$anonfun$run$1.apply$mcV$sp(FileAppender.scala:39)
>>     at org.apache.spark.util.logging.FileAppender$$anon$1$$anonfun$run$1.apply(FileAppender.scala:39)
>>     at org.apache.spark.util.logging.FileAppender$$anon$1$$anonfun$run$1.apply(FileAppender.scala:39)
>>     at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1699)
>>     at org.apache.spark.util.logging.FileAppender$$anon$1.run(FileAppender.scala:38)
>> 15/12/24 17:12:59 INFO worker.Worker: Executor app-20151224171253-0000/0 finished with state KILLED exitStatus 143
>> 15/12/24 17:12:59 INFO worker.Worker: Cleaning up local directories for application app-20151224171253-0000
>> 15/12/24 17:12:59 INFO shuffle.ExternalShuffleBlockResolver: Application app-20151224171253-0000 removed, cleanupLocalDirs = true
>> 
>> Here is my Spark configuration:
>> 
>> export HADOOP_HOME=/usr/local/hadoop
>> export PATH=$PATH:$HADOOP_HOME/bin
>> export SPARK_DIST_CLASSPATH=`hadoop classpath`
>> 
>> Any hints as to what could be going wrong? Why is the executor getting killed? Have I built Spark incorrectly? I have tried building it in several different ways and I keep failing.
>> I must admit I am confused by the information I find online about how to build and use Spark with Hive, and which version goes with which.
>> Can I download a pre-built version of Spark that would be suitable for my existing Hadoop 2.7.1 and Hive 1.2.1?
>> This error has been baffling me for weeks...
>> 
>> More than grateful for any help!
>> Sofia
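One note on the build question at the end of the quoted mail above: if I read the Spark 1.5 build documentation correctly, a distribution matching Hadoop 2.7.1 would be built with the hadoop-2.6 profile plus an explicit hadoop.version rather than hadoop-2.4, so roughly the following invocation. I have not verified this yet, so treat it as a sketch of what I intend to try, not a confirmed fix:

./make-distribution.sh --name "hadoop2-without-hive" --tgz "-Pyarn,hadoop-provided,hadoop-2.6,parquet-provided" -Dhadoop.version=2.7.1

with the Hadoop jars still supplied to Spark through the environment, as before:

export SPARK_DIST_CLASSPATH=`hadoop classpath`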