Some Apache Storm users reported the same issue a few months ago
[1][2][3]. It was an unusual situation which, in our experience, only
happened when the Storm topology asked for offsets that Kafka had already
trimmed. Several pathological cases can put users in that situation: a
retention period that is too low, a topology that is too slow, or a poison
pill message that kept being retried until Kafka finally trimmed it and it
was no longer available. Storm, to an extent, lets users control what
happens in this situation; I am not so sure about Spark Streaming.
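
To make the failure mode concrete, here is a rough sketch in plain Python
(no Kafka client involved) of the decision a consumer faces when the offset
it wants is no longer retained. The names resolve_start_offset,
OffsetOutOfRange and the policy values are hypothetical, made up for
illustration; they are not Storm, Spark or Kafka APIs.

```python
class OffsetOutOfRange(Exception):
    """Raised when the requested offset is outside what the broker retains."""


def resolve_start_offset(requested, earliest, latest, policy="fail"):
    """Pick the offset to resume from (hypothetical helper).

    earliest: oldest offset the broker still retains
    latest:   offset the next produced message would get
    policy:   "fail", "earliest" or "latest" (illustrative names only)
    """
    if earliest <= requested <= latest:
        # Normal case: the data we want is still there.
        return requested
    if policy == "earliest":
        # Resume from the oldest retained data; trimmed messages are lost.
        return earliest
    if policy == "latest":
        # Skip ahead to new data only.
        return latest
    raise OffsetOutOfRange(
        "requested %d, retained range is [%d, %d]" % (requested, earliest, latest)
    )
```

With policy="fail" the job stops, which matches the behaviour described in
the original question; the other two policies keep the job running at the
cost of silently skipping whatever was trimmed.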

If Spark does not handle this right now, your best bet is to ensure your
Kafka retention period is high enough that your processing does not fall
so far behind that the data gets trimmed (you should do this anyway to
avoid data loss), and to ensure that you throw away failed messages, or
push them to a DLQ, after some small number of retries instead of retrying
forever.
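
As a minimal sketch of the bounded-retry idea, again in plain Python and
not tied to any Kafka or Spark API: process_with_dlq, handler and
max_attempts are hypothetical names, and the dead letter queue is just a
list here, where in practice it would likely be another topic.

```python
def process_with_dlq(messages, handler, dead_letters, max_attempts=3):
    """Run handler on each message, giving up after max_attempts tries.

    Messages that still fail are appended to dead_letters together with
    the last error, so one poison pill cannot stall the stream forever.
    Returns the number of messages processed successfully.
    """
    processed = 0
    for msg in messages:
        last_error = None
        for _ in range(max_attempts):
            try:
                handler(msg)
                processed += 1
                break  # success, move on to the next message
            except Exception as exc:
                last_error = exc  # remember why this attempt failed
        else:
            # Every attempt failed: park the message instead of retrying.
            dead_letters.append((msg, repr(last_error)))
    return processed
```

The point is only that the retry loop has an upper bound, so the consumer
keeps advancing its offsets and does not fall behind the retention window
because of a single bad message.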

[1] https://issues.apache.org/jira/browse/STORM-511
[2] https://issues.apache.org/jira/browse/STORM-586
[3] https://issues.apache.org/jira/browse/STORM-643

Thanks
Parth


On 8/6/15, 12:09 PM, "Grant Henke" <ghe...@cloudera.com> wrote:

>Does this Spark Jira match up with what you are seeing or sound related?
>https://issues.apache.org/jira/browse/SPARK-8474
>
>What versions of Spark and Kafka are you using? Can you include more of
>the Spark log? Any errors shown in the Kafka log?
>
>Thanks,
>Grant
>
>On Thu, Aug 6, 2015 at 1:17 PM, Cassa L <lcas...@gmail.com> wrote:
>
>> Hi,
>>  Has anyone tried the streaming API of Spark with Kafka? I am
>> experimenting with the new Spark API to read from Kafka.
>> KafkaUtils.createDirectStream(...)
>>
>> Every now and then, I get the following error, "spark
>> kafka.common.OffsetOutOfRangeException", and my Spark script stops
>> working.
>> I have simple topic with just one partition.
>>
>> I would appreciate any clues on how to debug this issue.
>>
>> Thanks,
>> LCassa
>>
>
>
>
>-- 
>Grant Henke
>Software Engineer | Cloudera
>gr...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke
