Re: [DISCUSS]: KIP-161: streams record processing exception handlers

2017-07-04 Thread Eno Thereska
As part of the PR review we decided to add a metric to keep track of the number of skipped records due to deserialization. I updated the KIP to reflect that. Thanks Eno > On Jun 23, 2017, at 10:59 AM, Eno Thereska wrote: > > Done, thanks. I'll open a vote thread now. > > Eno >> On 23 Jun 2017

Re: [DISCUSS]: KIP-161: streams record processing exception handlers

2017-06-23 Thread Eno Thereska
Done, thanks. I'll open a vote thread now. Eno > On 23 Jun 2017, at 02:15, Matthias J. Sax wrote: > > I also think, that one config is better, with two default > implementations: failing and log-and-continue > > However, I think we should fail by default. Similar to timestamp > extractor as "si

Re: [DISCUSS]: KIP-161: streams record processing exception handlers

2017-06-22 Thread Matthias J. Sax
I also think, that one config is better, with two default implementations: failing and log-and-continue However, I think we should fail by default. Similar to timestamp extractor as "silent" data loss is no good default behavior IMHO. -Matthias On 6/22/17 11:00 AM, Eno Thereska wrote: > Answers

Re: [DISCUSS]: KIP-161: streams record processing exception handlers

2017-06-22 Thread Eno Thereska
Answers inline: > On 22 Jun 2017, at 03:26, Guozhang Wang wrote: > > Thanks for the updated KIP, some more comments: > > 1.The config name is "default.deserialization.exception.handler" while the > interface class name is "RecordExceptionHandler", which is more general > than the intended purp

Re: [DISCUSS]: KIP-161: streams record processing exception handlers

2017-06-21 Thread Guozhang Wang
Thanks for the updated KIP, some more comments: 1.The config name is "default.deserialization.exception.handler" while the interface class name is "RecordExceptionHandler", which is more general than the intended purpose. Could we rename the class name accordingly? 2. Could you describe the full

Re: [DISCUSS]: KIP-161: streams record processing exception handlers

2017-06-21 Thread Eno Thereska
Thanks Guozhang, I’ve updated the KIP and hopefully addressed all the comments so far. In the process also changed the name of the KIP to reflect its scope better: https://cwiki.apache.org/confluence/display/KAFKA/KIP-161%3A+streams+deserialization+exception+handlers

Re: [DISCUSS]: KIP-161: streams record processing exception handlers

2017-06-11 Thread Guozhang Wang
Eno, Thanks for bringing this proposal up and sorry for getting late on this. Here are my two cents: 1. First some meta comments regarding "fail fast" v.s. "making progress". I agree that in general we should better "enforce user to do the right thing" in system design, but we also need to keep in

Re: [DISCUSS]: KIP-161: streams record processing exception handlers

2017-06-07 Thread Jan Filipiak
Hi Eno, On 07.06.2017 22:49, Eno Thereska wrote: Comments inline: On 5 Jun 2017, at 18:19, Jan Filipiak wrote: Hi just my few thoughts On 05.06.2017 11:44, Eno Thereska wrote: Hi there, Sorry for the late reply, I was out this past week. Looks like good progress was made with the discus

Re: [DISCUSS]: KIP-161: streams record processing exception handlers

2017-06-07 Thread Eno Thereska
Comments inline: > On 5 Jun 2017, at 18:19, Jan Filipiak wrote: > > Hi > > just my few thoughts > > On 05.06.2017 11:44, Eno Thereska wrote: >> Hi there, >> >> Sorry for the late reply, I was out this past week. Looks like good progress >> was made with the discussions either way. Let me rec

Fwd: Re: [DISCUSS]: KIP-161: streams record processing exception handlers

2017-06-05 Thread Matthias J. Sax
Should go to dev list, too. Forwarded Message Subject: Re: [DISCUSS]: KIP-161: streams record processing exception handlers Date: Mon, 5 Jun 2017 19:19:42 +0200 From: Jan Filipiak Reply-To: us...@kafka.apache.org To: us...@kafka.apache.org Hi just my few thoughts On

Re: [DISCUSS]: KIP-161: streams record processing exception handlers

2017-06-05 Thread Eno Thereska
Hi there, Sorry for the late reply, I was out this past week. Looks like good progress was made with the discussions either way. Let me recap a couple of points I saw into one big reply: 1. Jan mentioned CRC errors. I think this is a good point. As these happen in Kafka, before Kafka Streams g

Re: [DISCUSS]: KIP-161: streams record processing exception handlers

2017-06-03 Thread Matthias J. Sax
What I don't understand is this: > From there on its the easiest way forward: fix, redeploy, start => done If you have many producers that work fine and a new "bad" producer starts up and writes bad data into your input topic, your Streams app dies but all your producers, including the bad one,

Re: [DISCUSS]: KIP-161: streams record processing exception handlers

2017-06-03 Thread Jan Filipiak
Could not agree more! But then I think the easiest is still: print exception and die. From there on its the easiest way forward: fix, redeploy, start => done All the other ways to recover a pipeline that was processing partially all the time and suddenly went over a "I cant take it anymore" thr

Re: [DISCUSS]: KIP-161: streams record processing exception handlers

2017-06-02 Thread Jan Filipiak
Hi 1. That greatly complicates monitoring. Fail Fast gives you that when you monitor only the lag of all your apps you are completely covered. With that sort of new application Monitoring is very much more complicated as you know need to monitor fail % of some special apps aswell. In my opini

Re: [DISCUSS]: KIP-161: streams record processing exception handlers

2017-06-02 Thread Damian Guy
Jan, you have a choice to Fail fast if you want. This is about giving people options and there are times when you don't want to fail fast. On Fri, 2 Jun 2017 at 11:00 Jan Filipiak wrote: > Hi > > 1. > That greatly complicates monitoring. Fail Fast gives you that when you > monitor only the lag

Re: [DISCUSS]: KIP-161: streams record processing exception handlers

2017-06-02 Thread Damian Guy
I agree with what Matthias has said w.r.t failing fast. There are plenty of times when you don't want to fail-fast and must attempt to make progress. The dead-letter queue is exactly for these circumstances. Of course if every record is failing, then you probably do want to give up. On Fri, 2 Jun

Re: [DISCUSS]: KIP-161: streams record processing exception handlers

2017-06-01 Thread Matthias J. Sax
First a meta comment. KIP discussion should take place on the dev list -- if user list is cc'ed please make sure to reply to both lists. Thanks. Thanks for making the scope of the KIP clear. Makes a lot of sense to focus on deserialization exceptions for now. With regard to corrupted state stores

Re: [DISCUSS]: KIP-161: streams record processing exception handlers

2017-05-29 Thread Jay Kreps
Hey Eno, I think this makes sense. I do think people who spend time running production stream processing systems will, over time, end up strongly preferring the current behavior of failing and fixing the root problem rather than skipping, but we don't need to force this on people as long as the de

Re: [DISCUSS]: KIP-161: streams record processing exception handlers

2017-05-26 Thread Damian Guy
In that case, though, every access to that key is doomed to failure as the database is corrupted. So i think it should probably die in a steaming heap at that point! On Fri, 26 May 2017 at 17:33 Eno Thereska wrote: > Hi Damian, > > I was thinking of cases when there is bit-rot on the storage its

Re: [DISCUSS]: KIP-161: streams record processing exception handlers

2017-05-26 Thread Eno Thereska
Hi Damian, I was thinking of cases when there is bit-rot on the storage itself and we get a malformed record that cannot be de-serialized. There is an interesting intersection here with CRCs in both Kafka (already there, they throw on deserialization) and potentially local storage (we don't hav

Re: [DISCUSS]: KIP-161: streams record processing exception handlers

2017-05-26 Thread Damian Guy
Eno, Under what circumstances would you get a deserialization exception from the state store? I can only think of the case where someone has provided a bad deserializer to a method that creates a state store. In which case it would be a user error and probably should just abort? Thanks, Damian O

Re: [DISCUSS]: KIP-161: streams record processing exception handlers

2017-05-26 Thread Eno Thereska
See latest reply to Jan's note. I think I unnecessarily broadened the scope of this KIP to the point where it sounded like it handles all sorts of exceptions. The scope should be strictly limited to "poison pill" records for now. Will update KIP, Thanks Eno > On 26 May 2017, at 16:16, Matthias

Re: [DISCUSS]: KIP-161: streams record processing exception handlers

2017-05-26 Thread Eno Thereska
Hi Jan, You're right. I think I got carried away and broadened the scope of this KIP beyond it's original purpose. This handler will only be there for deserialization errors, i.e., "poison pills" and is not intended to be a catch-all handler for all sorts of other problems (e.g., NPE exception

Re: [DISCUSS]: KIP-161: streams record processing exception handlers

2017-05-26 Thread Matthias J. Sax
"bad" for this case would mean, that we got an `DeserializationException`. I am not sure if any other processing error should be covered? @Eno: this raises one one question. Might it be better to allow for two handlers instead of one? One for deserialization exception and one for all other excepti

Re: [DISCUSS]: KIP-161: streams record processing exception handlers

2017-05-26 Thread Jim Jagielski
> On May 26, 2017, at 5:13 AM, Eno Thereska wrote: > > >> >> >> With regard to `DeserializationException`, do you thing it might make >> sense to have a "dead letter queue" as a feature to provide out-of-the-box? > > We could provide a special topic where bad messages go to, and then we'd ha

Re: [DISCUSS]: KIP-161: streams record processing exception handlers

2017-05-26 Thread Matthias J. Sax
About `LogAndThresholdExceptionHandler`: If the handler needs to keep track of number of failed messages, than it becomes stateful -- not sure if we should do that. But maybe we can introduce 2 metrics (might be an interesting metric to report to the user anyway) and allow programmatic access to t

Re: [DISCUSS]: KIP-161: streams record processing exception handlers

2017-05-26 Thread Eno Thereska
Thanks Jan, The record passed to the handler will always be the problematic record. There are 2 cases/types of exceptions for the purposes of this KIP: 1) any exception during deserialization. The bad record + the exception (i.e. DeserializeException) will be passed to the handler. The handler

Re: [DISCUSS]: KIP-161: streams record processing exception handlers

2017-05-26 Thread Eno Thereska
Replying to Avi's and Matthias' questions in one go inline: > On 25 May 2017, at 19:27, Matthias J. Sax wrote: > > Thanks for the KIP Eno! > > Couple of comments: > > I think we don't need `RecordContext` in `RecordExceptionHandler#handle` > because the `ConsumerRecord` provides all this infor

Re: [DISCUSS]: KIP-161: streams record processing exception handlers

2017-05-25 Thread Avi Flax
> On May 25, 2017, at 14:27, Matthias J. Sax wrote: > > Thanks for the KIP Eno! Yes, thank you Eno! It’s very welcome and heartening to see work on this front. I have a few follow-ups to Matthias’ comments: > Why we introduce `ExceptionType` and not just hand in the actual exception? I was w

Re: [DISCUSS]: KIP-161: streams record processing exception handlers

2017-05-25 Thread Matthias J. Sax
Thanks for the KIP Eno! Couple of comments: I think we don't need `RecordContext` in `RecordExceptionHandler#handle` because the `ConsumerRecord` provides all this information anyway. Why we introduce `ExceptionType` and not just hand in the actual exception? As return type of `handle()` is voi

[DISCUSS]: KIP-161: streams record processing exception handlers

2017-05-25 Thread Eno Thereska
Hi there, I’ve added a KIP on improving exception handling in streams: KIP-161: streams record processing exception handlers. https://cwiki.apache.org/confluence/display/KAFKA/KIP-161%3A+streams+record+processing+exception+handlers