Hello,

I have a Spark Streaming job that uses HDFS and checkpointing. It runs well on a multi-node standalone Spark cluster, in both client and cluster deploy modes. I would like to switch to the Mesos cluster manager and submit the job in cluster deploy mode.
The first launch of the application works well, whereas the second launch (after a kill), which involves checkpoint recovery, fails with:

_______
java.lang.RuntimeException: Stream '/jars/application.jar' was not found.
    at org.apache.spark.network.client.TransportResponseHandler.handle(TransportResponseHandler.java:222)
    ...
_______

This error occurs because the driver, which is in charge of exposing the application jar to the executors, tries to expose it from the jar path stored in the checkpoint (the jar is loaded from HDFS and stored in the Mesos workdir path, i.e. the sandbox), and that path does not exist on the current node.

I'm confused by the dispatcher behaviour. It seems that there are functional gaps between checkpoint recovery in Spark Streaming and the sandbox machinery used by the Mesos cluster.

1. Why does Spark use an RPC interface to expose the application jar to the executors when using HDFS, instead of having the executors load it directly from the source?
2. How can this issue be fixed (if it is possible)?

Versions:
Mesos 1.2.0
Spark 2.0.1
HDFS 2.7

For more information, see the Stack Overflow question here <https://stackoverflow.com/questions/44703631/spark-streaming-cluster-mode-in-mesos-java-lang-runtimeexception-stream-jar-n>.

Thanks,
RCinna
