- Checkpointing alone isn't enough to get exactly-once semantics.
Events will be replayed after a failure, so your output operations
must be idempotent (a sketch of what that can look like is below).

- Another way to handle upgrades is to just start a second app with
the new code, then stop the old one once the new one has caught up.
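
For example, something like this (sketch only; createConnection and
upsert are placeholders for whatever your sink actually provides, the
point being that the write is keyed, so replaying a record just
overwrites it):

    // assuming stream is a DStream of (id, value) pairs
    stream.foreachRDD { rdd =>
      rdd.foreachPartition { partition =>
        val conn = createConnection()    // placeholder: connect to your store
        partition.foreach { case (id, value) =>
          upsert(conn, id, value)        // keyed write: a replay is a no-op
        }
        conn.close()
      }
    }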

On Tue, Apr 12, 2016 at 1:15 AM, Soumitra Siddharth Johri
<soumitra.siddha...@gmail.com> wrote:
> I think before doing a code update you would want to gracefully shut down
> your streaming job and checkpoint the processed offsets (and any state that
> you maintain) in a database or HDFS.
> When you start the job back up, it should read this checkpoint, build the
> necessary state, and begin processing from the last offset processed.
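>
> A rough sketch of the restart side with the Kafka direct stream, assuming
> ssc and kafkaParams are already defined and loadSavedOffsets is a
> placeholder for however you stored the offsets:
>
>     import kafka.common.TopicAndPartition
>     import kafka.message.MessageAndMetadata
>     import kafka.serializer.StringDecoder
>     import org.apache.spark.streaming.kafka.KafkaUtils
>
>     // read the last committed offsets back from your database / HDFS
>     val fromOffsets: Map[TopicAndPartition, Long] = loadSavedOffsets()
>
>     val stream = KafkaUtils.createDirectStream[
>         String, String, StringDecoder, StringDecoder, (String, String)](
>       ssc, kafkaParams, fromOffsets,
>       (mmd: MessageAndMetadata[String, String]) => (mmd.key, mmd.message))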
>
> Another approach would be to save the processed offsets in the streaming
> job whenever you read from Kafka. Then, before reading the next batch,
> instead of relying on the Spark checkpoint for offsets, read from the last
> processed offset that you saved (sketch below).
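>
> The saving side could look roughly like this (saveOffset is a placeholder
> for a write to your own store, ideally done atomically with the results):
>
>     import org.apache.spark.streaming.kafka.HasOffsetRanges
>
>     stream.foreachRDD { rdd =>
>       // the Kafka offset ranges this batch covers
>       val ranges = rdd.asInstanceOf[HasOffsetRanges].offsetRanges
>       // ... process rdd and write the results ...
>       ranges.foreach { r => saveOffset(r.topic, r.partition, r.untilOffset) }
>     }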
>
> Regards
> Soumitra
>
> On Apr 11, 2016, at 8:31 PM, Siva Gudavalli <gss.su...@gmail.com> wrote:
>
> Okay, that makes sense.
>
> Any recommendations on how to manage changes to my Spark Streaming app and
> achieve fault tolerance at the same time?
>
> On Mon, Apr 11, 2016 at 8:16 PM, Shixiong(Ryan) Zhu
> <shixi...@databricks.com> wrote:
>>
>> You cannot. Streaming doesn't support it because the checkpoint stores
>> Java-serialized objects, and code changes will break their deserialization.
>>
>> On Mon, Apr 11, 2016 at 4:30 PM, Siva Gudavalli <gss.su...@gmail.com>
>> wrote:
>>>
>>> Hello,
>>>
>>> I am writing a Spark Streaming application to read data from Kafka. I am
>>> using the no-receiver (direct) approach and have enabled checkpointing to
>>> make sure I am not reading messages again in case of failure (exactly-once
>>> semantics).
>>>
>>> I have a quick question: how does checkpointing need to be configured to
>>> handle code changes in my Spark Streaming app?
>>>
>>> Can you please suggest? Hope the question makes sense.
>>>
>>> thank you
>>>
>>> regards
>>> shiv
>>
>>
>
