I also created a JIRA for task failures
https://issues.apache.org/jira/browse/SPARK-12452

On Mon, Dec 21, 2015 at 9:54 AM, Neelesh <neele...@gmail.com> wrote:

> I am leaning towards something like that. Things get interesting when
> multiple different transformations and regrouping happen. At the end of it
> all, when the "task" is done, we no longer are sure which kafka partition
> they came from, even when all transforms/ grouping happen local to the
> original partition.
>
> Registering an onTaskCompleted listener on my custom RDDIterator seemed
> like a good choice, since the Iterator is the only one that really knows
> about the kafka offsets for that partition. Granted, the driver has a list
> of them too, but there is no way of mapping the task that failed/succeeded
> to the list offsets (HasOffsetRanges) on the driver.
>
> All of this may seem counterintuitive to how the streaming jobs are
> generated (in memory increments to offsets). One of my requirements is to
> not generate an RDD, if the previous one did not complete - because the
> business logic cannot go forward in the face of errors, and has to stop
> consuming from the kafka partition that caused failures. This is easy
> enough to implement because I have my own DStream/RDD/Iterator
> implementations - basically from DirectKafkaInputDStream et al, and
> modified to support dynamically adding/removing topics.   Now the issue has
> moved to how can I update consumer offsets for a partition when a task
> succeeds.
>
> Thanks!
>
>
>
> On Mon, Dec 21, 2015 at 8:24 AM, Cody Koeninger <c...@koeninger.org>
> wrote:
>
>> Honestly it's a lot easier to deal with this using transactions.
>>
>> Someone else would have to speak to the possibility of getting task
>> failures added to listener callbacks.
>>
>> On Sat, Dec 19, 2015 at 5:44 PM, Neelesh <neele...@gmail.com> wrote:
>>
>>> Hi,
>>>   I'm trying to build automatic Kafka watermark handling in my stream
>>> apps by overriding the KafkaRDDIterator, and adding a
>>> taskcompletionlistener and updating watermarks if task was completed (the
>>> iterator has access to offsets). But I found out that there is no way to
>>> listen to a task error inside the executors. Only the driver gets the
>>> TaskReason notification, but not the task completion listener. This
>>> basically means watermarks will be updated regardless of whether the task
>>> completed successfully or not.
>>>
>>> Is there any way to listen for the task failures on the executors?
>>>
>>> Thanks!
>>> -neelesh
>>>
>>
>>
>

Reply via email to