Re: Recovery for Spark Streaming Kafka Direct with OffsetOutOfRangeException

2016-01-21 Thread Cody Koeninger

Looks like this response did go to the list. As far as OffsetOutOfRange goes, right now that's an unrecoverable error, because it breaks the underlying invariants (e.g. that the number of messages in a partition is deterministic once the RDD is defined). If you want to do some hacking for your own

Recovery for Spark Streaming Kafka Direct with OffsetOutOfRangeException

2016-01-21 Thread Dan Dutrow

Hey Cody, I would have responded to the mailing list, but it looks like this thread got aged off. I have a problem where one of my topics dumps more data than my Spark job can keep up with. We limit the input rate with maxRatePerPartition. Eventually, when the data is aged off, I get the OffsetOutO
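
For anyone reading the archive later, here is a back-of-the-envelope sketch of why a capped direct stream can eventually run into this exception. It is not from the thread. It assumes a constant per-partition produce rate, a consumer that always reads at exactly the maxRatePerPartition cap, and purely time-based retention (real Kafka deletes whole log segments, so the actual timing will differ). The function name and all numbers are hypothetical.

```python
def seconds_until_out_of_range(produce_rate, max_rate_per_partition, retention_seconds):
    """Estimate when a rate-capped consumer falls behind the retention window.

    Simplified model (all assumptions, not Kafka or Spark internals):
      - the producer writes `produce_rate` messages/second to one partition
      - the consumer reads `max_rate_per_partition` messages/second
      - the broker drops any message older than `retention_seconds`

    At time t the consumer is at offset  max_rate_per_partition * t,
    while the earliest retained offset is produce_rate * (t - retention_seconds).
    The consumer's next offset is gone once the second exceeds the first.
    Returns None if the consumer keeps up and never falls out of range.
    """
    if max_rate_per_partition >= produce_rate:
        return None  # the consumer keeps pace, so the lag never grows
    backlog_growth = produce_rate - max_rate_per_partition  # messages/second
    return produce_rate * retention_seconds / backlog_growth


if __name__ == "__main__":
    # Hypothetical numbers: 1000 msg/s produced, cap of 800 msg/s, 7 days retention.
    t = seconds_until_out_of_range(1000, 800, 7 * 24 * 3600)
    print(f"next offset ages off after about {t / 86400:.0f} days")
```

Under this model, raising the cap only delays the failure as long as the cap stays below the sustained produce rate; the backlog still grows until the next requested offset has been deleted.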