Do the cast to HasOffsetRanges before calling any other methods on the
direct stream. This is covered in the documentation:
http://spark.apache.org/docs/latest/streaming-kafka-integration.html
If you want to use fromOffsets, you can also just grab the highest
available offsets from Kafka and pro
Also, with the fromOffsets API, is the last saved offset fetched twice? Does
the fromOffsets API start from the Map's Long value, or from that value + 1?
If it starts from the Long value itself, that message will be read twice:
once in the application's last run before the crash, and once in the first
run after the crash?
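
As a rough illustration of the question above (plain Java, not Spark code):
assuming the starting offset is inclusive and each batch's end offset is
exclusive, which is how Spark's OffsetRange documents fromOffset and
untilOffset, resuming from the saved end offset neither repeats nor skips a
message, while resuming from the last processed offset replays it once. The
class and method names here are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class OffsetResume {
    // Returns the offsets read for a half-open range [fromOffset, untilOffset).
    // Assumption: inclusive start, exclusive end.
    static List<Long> readBatch(long fromOffset, long untilOffset) {
        List<Long> offsets = new ArrayList<>();
        for (long o = fromOffset; o < untilOffset; o++) {
            offsets.add(o);
        }
        return offsets;
    }

    public static void main(String[] args) {
        // Last batch before the crash: offsets 100..104 were processed.
        List<Long> beforeCrash = readBatch(100L, 105L);

        // Case 1: the saved value is the exclusive end offset (105).
        List<Long> resumedFromUntil = readBatch(105L, 108L);

        // Case 2: the saved value is the last processed offset (104).
        List<Long> resumedFromLast = readBatch(104L, 107L);

        System.out.println(beforeCrash);      // [100, 101, 102, 103, 104]
        System.out.println(resumedFromUntil); // [105, 106, 107]
        System.out.println(resumedFromLast);  // [104, 105, 106], so 104 is read twice
    }
}
```

Under that assumption, whether a message is read twice depends on which value
was stored in the Map, not on the API adding 1.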
On Thu, Aug 6, 2015 at 9:05 AM, Shushant
Hi,
Regarding checkpointing and the fromOffsets argument: say that the first time
my app starts I don't have any previous state stored, and I want to start
consuming from the largest offset.
1. Is it possible to specify that in the fromOffsets API? I don't want to use
another API which returns JavaPairInputDS
You can't use checkpoints across code upgrades. That may or may not change
in the future, but for now that's a limitation of Spark checkpoints
(regardless of whether you're using Kafka).
Some options:
- Start up the new job on a different cluster, then kill the old job once
it's caught up to whe
Hi,
I've read about the recent updates to the Spark Streaming integration with
Kafka (I'm referring to the new approach without receivers).
In the new approach, metadata is persisted in checkpoint folders on HDFS
so that the Spark Streaming context can be recreated in case of failures.
This means that the