Re: Adding / Removing worker nodes for Spark Streaming

2015-09-29 Thread Adrian Tanase
part of the checkpointed metadata in the Spark context. -adrian From: Cody Koeninger Date: Tuesday, September 29, 2015 at 12:49 AM To: Sourabh Chandak Cc: Augustus Hong, user@spark.apache.org Subject: Re: Adding / Removing worker nodes for Spark

Re: Adding / Removing worker nodes for Spark Streaming

2015-09-28 Thread Cody Koeninger
If a node fails, the partition / offset range that it was working on will be scheduled to run on another node. This is generally true of Spark, regardless of checkpointing. The offset ranges for a given batch are stored in the checkpoint for that batch. That's relevant if your entire job fails (
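
[Editorial note: to make the per-batch offset-range bookkeeping concrete, here is a minimal sketch of a direct Kafka stream on the Spark 1.x API that prints the ranges each batch was built from. The broker list, topic name, and checkpoint path are placeholders for illustration, not values from the thread.]

import kafka.serializer.StringDecoder
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.{HasOffsetRanges, KafkaUtils}

object DirectStreamOffsets {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("direct-stream-offsets")
    val ssc = new StreamingContext(conf, Seconds(10))
    // Metadata checkpointing; per-batch offset ranges are written here.
    ssc.checkpoint("hdfs:///checkpoints/direct-stream-offsets")

    // Placeholder broker list and topic.
    val kafkaParams = Map("metadata.broker.list" -> "broker1:9092,broker2:9092")
    val topics = Set("events")

    val stream = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](
      ssc, kafkaParams, topics)

    stream.foreachRDD { rdd =>
      // Each RDD from the direct stream carries the (topic, partition, fromOffset, untilOffset)
      // ranges it covers; if a task dies, the same range is rescheduled on another executor.
      val ranges = rdd.asInstanceOf[HasOffsetRanges].offsetRanges
      ranges.foreach(r => println(s"${r.topic}-${r.partition}: ${r.fromOffset} -> ${r.untilOffset}"))
      // ... normal processing of rdd goes here ...
    }

    ssc.start()
    ssc.awaitTermination()
  }
}
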

Re: Adding / Removing worker nodes for Spark Streaming

2015-09-28 Thread Sourabh Chandak
I also have the same use case as Augustus, and have some basic questions about recovery from checkpoint. I have a 10-node Kafka cluster and a 30-node Spark cluster running a streaming job; how is the (topic, partition) data handled in checkpointing? The scenario I want to understand is, in case of no

Re: Adding / Removing worker nodes for Spark Streaming

2015-09-28 Thread Augustus Hong
Got it, thank you! On Mon, Sep 28, 2015 at 11:37 AM, Cody Koeninger wrote: > Losing worker nodes without stopping is definitely possible. I haven't > had much success adding workers to a running job, but I also haven't spent > much time on it. > > If you're restarting with the same jar, you sh

Re: Adding / Removing worker nodes for Spark Streaming

2015-09-28 Thread Cody Koeninger
Losing worker nodes without stopping is definitely possible. I haven't had much success adding workers to a running job, but I also haven't spent much time on it. If you're restarting with the same jar, you should be able to recover from checkpoint without losing data (usual caveats apply, e.g. y
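
[Editorial note: the restart-with-the-same-jar recovery Cody describes is typically done with StreamingContext.getOrCreate, sketched below under assumed placeholder names and a placeholder checkpoint path. The key point is that the whole DStream graph is built inside the factory function, so on restart it can be reconstructed from the checkpoint rather than redefined.]

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object RecoverableJob {
  // Placeholder checkpoint path.
  val checkpointDir = "hdfs:///checkpoints/recoverable-job"

  // All DStream setup must live inside this function: on a clean start it runs and the
  // resulting graph is checkpointed; on restart with the same jar, the graph and any
  // pending batches (with their offset ranges) are rebuilt from the checkpoint instead.
  def createContext(): StreamingContext = {
    val conf = new SparkConf().setAppName("recoverable-job")
    val ssc = new StreamingContext(conf, Seconds(10))
    ssc.checkpoint(checkpointDir)
    // ... define the Kafka direct stream and transformations here ...
    ssc
  }

  def main(args: Array[String]): Unit = {
    val ssc = StreamingContext.getOrCreate(checkpointDir, createContext _)
    ssc.start()
    ssc.awaitTermination()
  }
}
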