Bump! It's been three days here and I haven't seen any further feedback. Eager to get this resolved, approved, and merged. =)
On Tue, Nov 28, 2017 at 9:53 AM Matt Farmer <m...@frmr.me> wrote: > Hi there, sorry for the delay in responding. Last week had a holiday and > several busy work days in it so I'm just now getting around to responding. > > > We would only exclude > > exception Streams can handle itself (like ProducerFencedException) -- > > thus, if the handler has code to react to this, it would not be bad, as > > this code is just never called. > [...] > > Thus, I am still in favor of not calling the ProductionExceptionHandler > > for fatal exception. > > Acknowledged, so is ProducerFencedException the only kind of exception I > need to change my behavior on? Or are there other types I need to check? Is > there a comprehensive list somewhere? > > > About the "always continue" case. Sounds good to me to remove it (not > > sure why we need it in test package?) > > I include it in the test package because I have tests that assert that if > the record collector impl encounters an Exception and receives a CONTINUE > that it actually does CONTINUE. > > > What is there reasoning for invoking the handler only for the first > > exception? > > I didn't want to invoke the handler in places where the CONTINUE or FAIL > result would just be ignored. Presumably, after a FAIL has been returned, > subsequent exceptions are likely to be repeats or noise from my > understanding of the code paths. If this is incorrect we can revisit. > > Once I get the answers to these questions I can make another pass on the > pull request! > > Matt > > On Mon, Nov 20, 2017 at 4:07 PM Matthias J. Sax <matth...@confluent.io> > wrote: > >> Thanks for following up! >> >> One thought about an older reply from you: >> >> >>>> I strongly disagree here. The purpose of this handler isn't *just* to >> >>>> make a decision for streams. There may also be desirable side >> effects that >> >>>> users wish to cause when production exceptions occur. There may be >> >>>> side-effects that they wish to cause when AuthenticationExceptions >> occur, >> >>>> as well. We should still give them the hooks to preform those side >> effects. >> >> And your follow up: >> >> >> - I think I would rather invoke it for all exceptions that could >> occur >> >> from attempting to produce - even those exceptions were returning >> CONTINUE >> >> may not be a good idea (e.g. Authorization exception). Until there >> is a >> >> different type for exceptions that are totally fatal (for example a >> >> FatalStreamsException or some sort), maintaining a list of >> exceptions that >> >> you can intercept with this handler and exceptions you cannot would >> be >> >> really error-prone and hard to keep correct. >> >> I understand what you are saying, however, consider that Streams needs >> to die for a fatal exception. Thus, if you call the handler for a fatal >> exception, we would need to ignore the return value and fail -- this >> seems to be rather intuitive. Furthermore, users can register an >> uncaught-exception-handler and side effects for fatal errors can be >> triggered there. >> >> Btw: an AuthorizationException is fatal -- not sure what you mean by an >> "totally fatal" exception -- there is no superlative to fatal from my >> understanding. >> >> About maintaining a list of exceptions: I don't think this is too hard, >> and users also don't need to worry about it IMHO. We would only exclude >> exception Streams can handle itself (like ProducerFencedException) -- >> thus, if the handler has code to react to this, it would not be bad, as >> this code is just never called. In case that there is an exception >> Streams could actually handle but we still call the handler (ie, bug), >> there should be no harm either -- the handler needs to return either >> CONTINUE or FAIL and we would obey. It could only happen, that Streams >> dies---as request by the user(!)---even if Streams could actually handle >> the error and move on. But this seems to be not a issue from my point of >> view. >> >> Thus, I am still in favor of not calling the ProductionExceptionHandler >> for fatal exception. >> >> >> >> About the "always continue" case. Sounds good to me to remove it (not >> sure why we need it in test package?) and to rename the "failing" >> handler to "Default" (even if "default" is less descriptive and I would >> still prefer "Fail" in the name). >> >> >> Last question: >> >> >> - Continue to *only* invoke it on the first exception that we >> >> encounter (before sendException is set) >> >> >> What is there reasoning for invoking the handler only for the first >> exception? >> >> >> >> >> -Matthias >> >> On 11/20/17 10:43 AM, Matt Farmer wrote: >> > Alright, here are some updates I'm planning to make after thinking on >> this >> > for awhile: >> > >> > - Given that the "always continue" handler isn't something I'd >> recommend >> > for production use as is, I'm going to move it into the test >> namespace and >> > remove it from mention in the public API. >> > - I'm going to rename the "AlwaysFailProductionExceptionHandler" to >> > "DefaultProductionExceptionHandler" >> > - Given that the API for the exception handler involves being >> invoked by >> > streams to make a decision about whether or not to continue — I >> think that >> > we should: >> > - Continue to *only* invoke it on the first exception that we >> > encounter (before sendException is set) >> > - Stop invoking it for the self-healing fenced exceptions. >> > - I think I would rather invoke it for all exceptions that could >> occur >> > from attempting to produce - even those exceptions were returning >> CONTINUE >> > may not be a good idea (e.g. Authorization exception). Until there >> is a >> > different type for exceptions that are totally fatal (for example a >> > FatalStreamsException or some sort), maintaining a list of >> exceptions that >> > you can intercept with this handler and exceptions you cannot would >> be >> > really error-prone and hard to keep correct. >> > - I'm happy to file a KIP for the creation of this new Exception >> type >> > if there is interest. >> > >> > @ Matthias — What do you think about the above? >> > >> > On Tue, Nov 14, 2017 at 9:44 AM Matt Farmer <m...@frmr.me> wrote: >> > >> >> I responded before reading your code review and didn't see the bit >> about >> >> how ProducerFencedException is self-healing. This error handling logic >> is >> >> *quite* confusing to reason about... I think I'm going to sit down with >> >> the code a bit more today, but I'm inclined to think that if the fenced >> >> exceptions are, indeed, self healing that we still invoke the handler >> but >> >> ignore its result for those exceptions. >> >> >> >> On Tue, Nov 14, 2017 at 9:37 AM Matt Farmer <m...@frmr.me> wrote: >> >> >> >>> Hi there, >> >>> >> >>> Following up here... >> >>> >> >>>> One tiny comment: I would prefer to remove the "Always" from the >> >>> handler implementation names -- it sounds "cleaner" to me without it. >> >>> >> >>> Let me think on this. I generally prefer expressiveness to clean-ness, >> >>> and I think that comes out in the names I chose for things. The straw >> man >> >>> in my head is the person that tried to substitute in the >> "AlwaysContinue" >> >>> variant and the "Always" is a trigger to really consider whether or >> not >> >>> they *always* want to try to continue. >> >>> >> >>> To be truthful, after some thought, using the "AlwaysContinue" variant >> >>> isn't something I'd recommend generally in a production system. >> Ideally >> >>> these handlers are targeted at handling a specific series of >> exceptions >> >>> that a user wants to ignore, and not ignoring all exceptions. More on >> this >> >>> in a minute. >> >>> >> >>>> For the first category, it seems to not make sense to call the >> handle but >> >>> Streams should always fail. If we follow this design, the KIP should >> list >> >>> all exceptions for which the handler is not called. >> >>> >> >>> I strongly disagree here. The purpose of this handler isn't *just* to >> >>> make a decision for streams. There may also be desirable side effects >> that >> >>> users wish to cause when production exceptions occur. There may be >> >>> side-effects that they wish to cause when AuthenticationExceptions >> occur, >> >>> as well. We should still give them the hooks to preform those side >> effects. >> >>> >> >>> In light of the above, I'm thinking that the >> >>> "AlwaysContinueProductionExceptionHandler" variant should probably be >> >>> removed from the public API and moved into tests since that's where >> it's >> >>> most useful. The more I think on it, the more I feel that having that >> in >> >>> the public API is a landmine. If you agree, then, we can rename the >> >>> "AlwaysFailProductionExceptionHandler" to >> >>> "DefaultProductionExceptionHandler". >> >>> >> >>> Thoughts? >> >>> >> >>> On Fri, Nov 10, 2017 at 6:13 PM Matthias J. Sax < >> matth...@confluent.io> >> >>> wrote: >> >>> >> >>>> I just review the PR, and there is one thing we should discuss. >> >>>> >> >>>> There are different types of exceptions that could occur. Some apply >> to >> >>>> all records (like Authorization exception) while others are for >> single >> >>>> records only (like record too large). >> >>>> >> >>>> For the first category, it seems to not make sense to call the handle >> >>>> but Streams should always fail. If we follow this design, the KIP >> should >> >>>> list all exceptions for which the handler is not called. >> >>>> >> >>>> WDYT? >> >>>> >> >>>> >> >>>> -Matthias >> >>>> >> >>>> >> >>>> On 11/10/17 2:56 PM, Matthias J. Sax wrote: >> >>>>> Just catching up on this KIP. >> >>>>> >> >>>>> One tiny comment: I would prefer to remove the "Always" from the >> >>>> handler >> >>>>> implementation names -- it sounds "cleaner" to me without it. >> >>>>> >> >>>>> >> >>>>> -Matthias >> >>>>> >> >>>>> On 11/5/17 12:57 PM, Matt Farmer wrote: >> >>>>>> It is agreed, then. I've updated the pull request. I'm trying to >> also >> >>>>>> update the KIP accordingly, but cwiki is being slow and dropping >> >>>>>> connections..... I'll try again a bit later but please consider the >> >>>> KIP >> >>>>>> updated for all intents and purposes. heh. >> >>>>>> >> >>>>>> On Sun, Nov 5, 2017 at 3:45 PM Guozhang Wang <wangg...@gmail.com> >> >>>> wrote: >> >>>>>> >> >>>>>>> That makes sense. >> >>>>>>> >> >>>>>>> >> >>>>>>> Guozhang >> >>>>>>> >> >>>>>>> On Sun, Nov 5, 2017 at 12:33 PM, Matt Farmer <m...@frmr.me> >> wrote: >> >>>>>>> >> >>>>>>>> Interesting. I'm not sure I agree. I've been bitten many times by >> >>>>>>>> unintentionally shipping code that fails to properly implement >> >>>> logging. I >> >>>>>>>> always discover this at the exact *worst* moment, too. (Normally >> at >> >>>> 3 AM >> >>>>>>>> during an on-call shift. Hah.) However, if others feel the same >> way >> >>>> I >> >>>>>>> could >> >>>>>>>> probably be convinced to remove it. >> >>>>>>>> >> >>>>>>>> We could also meet halfway and say that when a customized >> >>>>>>>> ProductionExceptionHandler instructs Streams to CONTINUE, we log >> at >> >>>> DEBUG >> >>>>>>>> level instead of WARN level. Would that alternative be appealing >> to >> >>>> you? >> >>>>>>>> >> >>>>>>>> On Sun, Nov 5, 2017 at 12:32 PM Guozhang Wang < >> wangg...@gmail.com> >> >>>>>>> wrote: >> >>>>>>>> >> >>>>>>>>> Thanks for the updates. I made a pass over the wiki again and it >> >>>> looks >> >>>>>>>>> good. >> >>>>>>>>> >> >>>>>>>>> About whether record collector should still internally log the >> >>>> error in >> >>>>>>>>> addition to what the customized ProductionExceptionHandler >> does. I >> >>>>>>>>> personally would prefer only to log if the returned value is >> FAIL >> >>>> to >> >>>>>>>>> indicate that this thread is going to shutdown and trigger the >> >>>>>>> exception >> >>>>>>>>> handler. >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> Guozhang >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> On Sun, Nov 5, 2017 at 6:09 AM, Matt Farmer <m...@frmr.me> >> wrote: >> >>>>>>>>> >> >>>>>>>>>> Hello, a bit later than I'd anticipated, but I've updated this >> >>>> KIP as >> >>>>>>>>>> outlined above. The updated KIP is now ready for review again! >> >>>>>>>>>> >> >>>>>>>>>> On Sat, Nov 4, 2017 at 1:03 PM Matt Farmer <m...@frmr.me> >> wrote: >> >>>>>>>>>> >> >>>>>>>>>>> Ah. I actually created both of those in the PR and forgot to >> >>>>>>> mention >> >>>>>>>>> them >> >>>>>>>>>>> by name in the KIP! Thanks for pointing out the oversight. >> >>>>>>>>>>> >> >>>>>>>>>>> I’ll revise the KIP this afternoon accordingly. >> >>>>>>>>>>> >> >>>>>>>>>>> The logging is actually provided for in the record collector. >> >>>>>>>> Whenever >> >>>>>>>>> a >> >>>>>>>>>>> handler continues it’ll log a warning to ensure that it’s >> >>>>>>>> *impossible* >> >>>>>>>>> to >> >>>>>>>>>>> write a handler that totally suppresses production exceptions >> >>>> from >> >>>>>>>> the >> >>>>>>>>>> log. >> >>>>>>>>>>> As such, the default continue handler just returns the >> continue >> >>>>>>>> value. >> >>>>>>>>> I >> >>>>>>>>>>> can add details about those semantics to the KIP as well. >> >>>>>>>>>>> On Sat, Nov 4, 2017 at 12:46 PM Matthias J. Sax < >> >>>>>>>> matth...@confluent.io >> >>>>>>>>>> >> >>>>>>>>>>> wrote: >> >>>>>>>>>>> >> >>>>>>>>>>>> One more comment. >> >>>>>>>>>>>> >> >>>>>>>>>>>> You mention a default implementation for the handler that >> >>>> fails. I >> >>>>>>>>>>>> think, this should be part of the public API and thus should >> >>>> have >> >>>>>>> a >> >>>>>>>>>>>> proper defined named that is mentioned in the KIP. >> >>>>>>>>>>>> >> >>>>>>>>>>>> We could also add a second implementation for the >> >>>> log-and-move-on >> >>>>>>>>>>>> strategy, as both are the two most common cases. This class >> >>>> should >> >>>>>>>>> also >> >>>>>>>>>>>> be part of public API (so users can just set in the config) >> >>>> with a >> >>>>>>>>>>>> proper name. >> >>>>>>>>>>>> >> >>>>>>>>>>>> >> >>>>>>>>>>>> Otherwise, I like the KIP a lot! Thanks. >> >>>>>>>>>>>> >> >>>>>>>>>>>> >> >>>>>>>>>>>> -Matthias >> >>>>>>>>>>>> >> >>>>>>>>>>>> >> >>>>>>>>>>>> On 11/1/17 12:23 AM, Matt Farmer wrote: >> >>>>>>>>>>>>> Thanks for the heads up. Yes, I think my changes are >> compatible >> >>>>>>>> with >> >>>>>>>>>>>> that >> >>>>>>>>>>>>> PR, but there will be a merge conflict that happens whenever >> >>>> one >> >>>>>>>> of >> >>>>>>>>>> the >> >>>>>>>>>>>> PRs >> >>>>>>>>>>>>> is merged. Happy to reconcile the changes in my PR if 4148 >> goes >> >>>>>>> in >> >>>>>>>>>>>> first. :) >> >>>>>>>>>>>>> >> >>>>>>>>>>>>> On Tue, Oct 31, 2017 at 6:44 PM Guozhang Wang < >> >>>>>>> wangg...@gmail.com >> >>>>>>>>> >> >>>>>>>>>>>> wrote: >> >>>>>>>>>>>>> >> >>>>>>>>>>>>>> That sounds reasonable, thanks Matt. >> >>>>>>>>>>>>>> >> >>>>>>>>>>>>>> As for the implementation, please note that there is >> another >> >>>>>>>>> ongoing >> >>>>>>>>>> PR >> >>>>>>>>>>>>>> that may touch the same classes that you are working on: >> >>>>>>>>>>>>>> https://github.com/apache/kafka/pull/4148 >> >>>>>>>>>>>>>> >> >>>>>>>>>>>>>> So it may help if you can also take a look at that PR and >> see >> >>>>>>> if >> >>>>>>>> it >> >>>>>>>>>> is >> >>>>>>>>>>>>>> compatible with your changes. >> >>>>>>>>>>>>>> >> >>>>>>>>>>>>>> >> >>>>>>>>>>>>>> >> >>>>>>>>>>>>>> Guozhang >> >>>>>>>>>>>>>> >> >>>>>>>>>>>>>> >> >>>>>>>>>>>>>> On Tue, Oct 31, 2017 at 10:59 AM, Matt Farmer < >> m...@frmr.me> >> >>>>>>>>> wrote: >> >>>>>>>>>>>>>> >> >>>>>>>>>>>>>>> I've opened this pull request to implement the KIP as >> >>>>>>> currently >> >>>>>>>>>>>> written: >> >>>>>>>>>>>>>>> https://github.com/apache/kafka/pull/4165. It still needs >> >>>>>>> some >> >>>>>>>>>> tests >> >>>>>>>>>>>>>>> added, >> >>>>>>>>>>>>>>> but largely represents the shape I was going for. >> >>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>> If there are more points that folks would like to discuss, >> >>>>>>>> please >> >>>>>>>>>> let >> >>>>>>>>>>>> me >> >>>>>>>>>>>>>>> know. If I don't hear anything by tomorrow afternoon I'll >> >>>>>>>> probably >> >>>>>>>>>>>> start >> >>>>>>>>>>>>>> a >> >>>>>>>>>>>>>>> [VOTE] thread. >> >>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>> Thanks, >> >>>>>>>>>>>>>>> Matt >> >>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>> On Fri, Oct 27, 2017 at 7:33 PM Matt Farmer <m...@frmr.me >> > >> >>>>>>>> wrote: >> >>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>> I can’t think of a reason that would be problematic. >> >>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>> Most of the time I would write a handler like this, I >> either >> >>>>>>>> want >> >>>>>>>>>> to >> >>>>>>>>>>>>>>>> ignore the error or fail and bring everything down so >> that I >> >>>>>>>> can >> >>>>>>>>>> spin >> >>>>>>>>>>>>>> it >> >>>>>>>>>>>>>>>> back up later and resume from earlier offsets. When we >> start >> >>>>>>> up >> >>>>>>>>>> after >> >>>>>>>>>>>>>>>> crashing we’ll eventually try to process the message we >> >>>>>>> failed >> >>>>>>>> to >> >>>>>>>>>>>>>> produce >> >>>>>>>>>>>>>>>> again. >> >>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>> I’m concerned that “putting in a queue for later” opens >> you >> >>>>>>> up >> >>>>>>>> to >> >>>>>>>>>>>>>> putting >> >>>>>>>>>>>>>>>> messages into the destination topic in an unexpected >> order. >> >>>>>>>>> However >> >>>>>>>>>>>> if >> >>>>>>>>>>>>>>>> others feel differently, I’m happy to talk about it. >> >>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>> On Fri, Oct 27, 2017 at 7:10 PM Guozhang Wang < >> >>>>>>>>> wangg...@gmail.com> >> >>>>>>>>>>>>>>> wrote: >> >>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>>> Please correct me if I'm wrong, but my understanding is >> >>>>>>> that >> >>>>>>>>> the >> >>>>>>>>>>>>>>> record >> >>>>>>>>>>>>>>>>>> metadata is always null if an exception occurred while >> >>>>>>> trying >> >>>>>>>>> to >> >>>>>>>>>>>>>>>>> produce. >> >>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>> That is right. Thanks. >> >>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>> I looked at the example code, and one thing I realized >> that >> >>>>>>>>> since >> >>>>>>>>>> we >> >>>>>>>>>>>>>> are >> >>>>>>>>>>>>>>>>> not passing the context in the handle function, we may >> not >> >>>>>>> be >> >>>>>>>>>>>>>> implement >> >>>>>>>>>>>>>>>>> the >> >>>>>>>>>>>>>>>>> logic to send the fail records into another queue for >> >>>> future >> >>>>>>>>>>>>>> processing. >> >>>>>>>>>>>>>>>>> Would people think that would be a big issue? >> >>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>> Guozhang >> >>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>> On Thu, Oct 26, 2017 at 12:14 PM, Matt Farmer < >> >>>> m...@frmr.me >> >>>>>>>> >> >>>>>>>>>> wrote: >> >>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>>> Hello all, >> >>>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>>> I've updated the KIP based on this conversation, and >> made >> >>>>>>> it >> >>>>>>>> so >> >>>>>>>>>>>> that >> >>>>>>>>>>>>>>> its >> >>>>>>>>>>>>>>>>>> interface, config setting, and parameters line up more >> >>>>>>>> closely >> >>>>>>>>>> with >> >>>>>>>>>>>>>>> the >> >>>>>>>>>>>>>>>>>> interface in KIP-161 (deserialization handler). >> >>>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>>> I believe there are a few specific questions I need to >> >>>>>>> reply >> >>>>>>>>>>>> to..... >> >>>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>>>> The question I had about then handle parameters are >> >>>> around >> >>>>>>>> the >> >>>>>>>>>>>>>>> record, >> >>>>>>>>>>>>>>>>>>> should it be `ProducerRecord<byte[], byte[]>`, or be >> >>>>>>>> generics >> >>>>>>>>> of >> >>>>>>>>>>>>>>>>>>> `ProducerRecord<? extends K, ? extends V>` or >> >>>>>>>>> `ProducerRecord<? >> >>>>>>>>>>>>>>>>> extends >> >>>>>>>>>>>>>>>>>>> Object, ? extends Object>`? >> >>>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>>> At this point in the code we're guaranteed that this >> is a >> >>>>>>>>>>>>>>>>>> ProducerRecord<byte[], byte[]>, so the generics would >> just >> >>>>>>>> make >> >>>>>>>>>> it >> >>>>>>>>>>>>>>>>> harder >> >>>>>>>>>>>>>>>>>> to work with the key and value. >> >>>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>>>> Also, should the handle function include the >> >>>>>>>> `RecordMetadata` >> >>>>>>>>> as >> >>>>>>>>>>>>>>> well >> >>>>>>>>>>>>>>>>> in >> >>>>>>>>>>>>>>>>>>> case it is not null? >> >>>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>>> Please correct me if I'm wrong, but my understanding is >> >>>>>>> that >> >>>>>>>>> the >> >>>>>>>>>>>>>>> record >> >>>>>>>>>>>>>>>>>> metadata is always null if an exception occurred while >> >>>>>>> trying >> >>>>>>>>> to >> >>>>>>>>>>>>>>>>> produce. >> >>>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>>>> We may probably try to write down at least the >> following >> >>>>>>>>>> handling >> >>>>>>>>>>>>>>>>> logic >> >>>>>>>>>>>>>>>>>> and >> >>>>>>>>>>>>>>>>>>> see if the given API is sufficient for it >> >>>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>>> I've added some examples to the KIP. Let me know what >> you >> >>>>>>>>> think. >> >>>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>>> Cheers, >> >>>>>>>>>>>>>>>>>> Matt >> >>>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>>> On Mon, Oct 23, 2017 at 9:00 PM Matt Farmer < >> m...@frmr.me >> >>>>> >> >>>>>>>>>> wrote: >> >>>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>>>> Thanks for this feedback. I’m at a conference right >> now >> >>>>>>> and >> >>>>>>>> am >> >>>>>>>>>>>>>>>>> planning >> >>>>>>>>>>>>>>>>>> on >> >>>>>>>>>>>>>>>>>>> updating the KIP again with details from this >> >>>> conversation >> >>>>>>>>> later >> >>>>>>>>>>>>>>> this >> >>>>>>>>>>>>>>>>>> week. >> >>>>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>>>> I’ll shoot you a more detailed response then! :) >> >>>>>>>>>>>>>>>>>>> On Mon, Oct 23, 2017 at 8:16 PM Guozhang Wang < >> >>>>>>>>>> wangg...@gmail.com >> >>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>>> wrote: >> >>>>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>>>>> Thanks for the KIP Matt. >> >>>>>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>>>>> Regarding the handle interface of >> >>>>>>>>>> ProductionExceptionHandlerResp >> >>>>>>>>>>>>>>> onse, >> >>>>>>>>>>>>>>>>>>>> could >> >>>>>>>>>>>>>>>>>>>> you write it on the wiki also, along with the actual >> >>>>>>> added >> >>>>>>>>>> config >> >>>>>>>>>>>>>>>>> names >> >>>>>>>>>>>>>>>>>>>> (e.g. what >> >>>>>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>> >> >>>>>>>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP- >> >>>>>>>> 161%3A+streams+ >> >>>>>>>>>>>>>>>>>> deserialization+exception+handlers >> >>>>>>>>>>>>>>>>>>>> described). >> >>>>>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>>>>> The question I had about then handle parameters are >> >>>>>>> around >> >>>>>>>>> the >> >>>>>>>>>>>>>>>>> record, >> >>>>>>>>>>>>>>>>>>>> should it be `ProducerRecord<byte[], byte[]>`, or be >> >>>>>>>> generics >> >>>>>>>>>> of >> >>>>>>>>>>>>>>>>>>>> `ProducerRecord<? extends K, ? extends V>` or >> >>>>>>>>> `ProducerRecord<? >> >>>>>>>>>>>>>>>>> extends >> >>>>>>>>>>>>>>>>>>>> Object, ? extends Object>`? >> >>>>>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>>>>> Also, should the handle function include the >> >>>>>>>> `RecordMetadata` >> >>>>>>>>>> as >> >>>>>>>>>>>>>>>>> well in >> >>>>>>>>>>>>>>>>>>>> case it is not null? >> >>>>>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>>>>> We may probably try to write down at least the >> following >> >>>>>>>>>> handling >> >>>>>>>>>>>>>>>>> logic >> >>>>>>>>>>>>>>>>>>>> and >> >>>>>>>>>>>>>>>>>>>> see if the given API is sufficient for it: 1) throw >> >>>>>>>> exception >> >>>>>>>>>>>>>>>>>> immediately >> >>>>>>>>>>>>>>>>>>>> to fail fast and stop the world, 2) log the error and >> >>>>>>> drop >> >>>>>>>>>> record >> >>>>>>>>>>>>>>> and >> >>>>>>>>>>>>>>>>>>>> proceed silently, 3) send such errors to a specific >> >>>>>>> "error" >> >>>>>>>>>> Kafka >> >>>>>>>>>>>>>>>>> topic, >> >>>>>>>>>>>>>>>>>>>> or >> >>>>>>>>>>>>>>>>>>>> record it as an app-level metrics ( >> >>>>>>>>>>>>>>>>>>>> >> https://kafka.apache.org/documentation/#kafka_streams_ >> >>>>>>>>>> monitoring >> >>>>>>>>>>>>>> ) >> >>>>>>>>>>>>>>>>> for >> >>>>>>>>>>>>>>>>>>>> monitoring. >> >>>>>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>>>>> Guozhang >> >>>>>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>>>>> On Fri, Oct 20, 2017 at 5:47 PM, Matt Farmer < >> >>>>>>> m...@frmr.me >> >>>>>>>>> >> >>>>>>>>>>>>>> wrote: >> >>>>>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>>>>>> I did some more digging tonight. >> >>>>>>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>>>>>> @Ted: It looks like the deserialization handler uses >> >>>>>>>>>>>>>>>>>>>>> "default.deserialization.exception.handler" for the >> >>>>>>>> config >> >>>>>>>>>>>>>>> name. No >> >>>>>>>>>>>>>>>>>>>>> ".class" on the end. I'm inclined to think this >> should >> >>>>>>> use >> >>>>>>>>>>>>>>>>>>>>> "default.production.exception.handler". >> >>>>>>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>>>>>> On Fri, Oct 20, 2017 at 8:22 PM Matt Farmer < >> >>>>>>> m...@frmr.me >> >>>>>>>>> >> >>>>>>>>>>>>>>> wrote: >> >>>>>>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>>>>>>> Okay, I've dug into this a little bit. >> >>>>>>>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>>>>>>> I think getting access to the serialized record is >> >>>>>>>>> possible, >> >>>>>>>>>>>>>>> and >> >>>>>>>>>>>>>>>>>>>> changing >> >>>>>>>>>>>>>>>>>>>>>> the naming and return type is certainly doable. >> >>>>>>> However, >> >>>>>>>>>>>>>>> because >> >>>>>>>>>>>>>>>>>> we're >> >>>>>>>>>>>>>>>>>>>>>> hooking into the onCompletion callback we have no >> >>>>>>>> guarantee >> >>>>>>>>>>>>>>> that >> >>>>>>>>>>>>>>>>> the >> >>>>>>>>>>>>>>>>>>>>>> ProcessorContext state hasn't changed by the time >> this >> >>>>>>>>>>>>>>> particular >> >>>>>>>>>>>>>>>>>>>> handler >> >>>>>>>>>>>>>>>>>>>>>> runs. So I think the signature would change to >> >>>>>>> something >> >>>>>>>>>>>>>> like: >> >>>>>>>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>>>>>>> ProductionExceptionHandlerResponse handle(final >> >>>>>>>>>>>>>>>>> ProducerRecord<..> >> >>>>>>>>>>>>>>>>>>>>> record, >> >>>>>>>>>>>>>>>>>>>>>> final Exception exception) >> >>>>>>>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>>>>>>> Would this be acceptable? >> >>>>>>>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>>>>>>> On Thu, Oct 19, 2017 at 7:33 PM Matt Farmer < >> >>>>>>>> m...@frmr.me> >> >>>>>>>>>>>>>>>>> wrote: >> >>>>>>>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>>>>>>>> Ah good idea. Hmmm. I can line up the naming and >> >>>>>>> return >> >>>>>>>>> type >> >>>>>>>>>>>>>>> but >> >>>>>>>>>>>>>>>>>> I’m >> >>>>>>>>>>>>>>>>>>>> not >> >>>>>>>>>>>>>>>>>>>>>>> sure if I can get my hands on the context and the >> >>>>>>> record >> >>>>>>>>>>>>>>> itself >> >>>>>>>>>>>>>>>>>>>> without >> >>>>>>>>>>>>>>>>>>>>>>> other changes. >> >>>>>>>>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>>>>>>>> Let me dig in and follow up here tomorrow. >> >>>>>>>>>>>>>>>>>>>>>>> On Thu, Oct 19, 2017 at 7:14 PM Matthias J. Sax < >> >>>>>>>>>>>>>>>>>>>> matth...@confluent.io> >> >>>>>>>>>>>>>>>>>>>>>>> wrote: >> >>>>>>>>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>>>>>>>>> Thanks for the KIP. >> >>>>>>>>>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>>>>>>>>> Are you familiar with KIP-161? >> >>>>>>>>>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>> >> >>>>>>>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP- >> >>>>>>>> 161%3A+streams+ >> >>>>>>>>>>>>>>>>>>>>> deserialization+exception+handlers >> >>>>>>>>>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>>>>>>>>> I thinks, we should align the design (parameter >> >>>>>>> naming, >> >>>>>>>>>>>>>>> return >> >>>>>>>>>>>>>>>>>>>> types, >> >>>>>>>>>>>>>>>>>>>>>>>> class names etc) of KIP-210 to KIP-161 to get a >> >>>>>>> unified >> >>>>>>>>>>>>>> user >> >>>>>>>>>>>>>>>>>>>>> experience. >> >>>>>>>>>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>>>>>>>>> -Matthias >> >>>>>>>>>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>>>>>>>>> On 10/18/17 4:20 PM, Matt Farmer wrote: >> >>>>>>>>>>>>>>>>>>>>>>>>> I’ll create the JIRA ticket. >> >>>>>>>>>>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>>>>>>>>>> I think that config name will work. I’ll update >> the >> >>>>>>>> KIP >> >>>>>>>>>>>>>>>>>>>> accordingly. >> >>>>>>>>>>>>>>>>>>>>>>>>> On Wed, Oct 18, 2017 at 6:09 PM Ted Yu < >> >>>>>>>>>>>>>>> yuzhih...@gmail.com> >> >>>>>>>>>>>>>>>>>>>> wrote: >> >>>>>>>>>>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>>>>>>>>>>> Can you create JIRA that corresponds to the >> KIP ? >> >>>>>>>>>>>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>>>>>>>>>>> For the new config, how about naming it >> >>>>>>>>>>>>>>>>>>>>>>>>>> production.exception.processor.class >> >>>>>>>>>>>>>>>>>>>>>>>>>> ? This way it is clear that class name should >> be >> >>>>>>>>>>>>>>> specified. >> >>>>>>>>>>>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>>>>>>>>>>> Cheers >> >>>>>>>>>>>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>>>>>>>>>>> On Wed, Oct 18, 2017 at 2:40 PM, Matt Farmer < >> >>>>>>>>>>>>>>> m...@frmr.me> >> >>>>>>>>>>>>>>>>>>>> wrote: >> >>>>>>>>>>>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>>>>>>>>>>>> Hello everyone, >> >>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>>>>>>>>>>>> This is the discussion thread for the KIP >> that I >> >>>>>>>> just >> >>>>>>>>>>>>>>> filed >> >>>>>>>>>>>>>>>>>>>> here: >> >>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>>>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP- >> >>>>>>>>>>>>>>>>>>>>>>>>>>> 210+-+Provide+for+custom+ >> >>>>>>>> error+handling++when+Kafka+ >> >>>>>>>>>>>>>>>>>>>>>>>>>>> Streams+fails+to+produce >> >>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>>>>>>>>>>>> Looking forward to getting some feedback from >> >>>>>>> folks >> >>>>>>>>>>>>>> about >> >>>>>>>>>>>>>>>>> this >> >>>>>>>>>>>>>>>>>>>> idea >> >>>>>>>>>>>>>>>>>>>>>>>> and >> >>>>>>>>>>>>>>>>>>>>>>>>>>> working toward a solution we can contribute >> back. >> >>>>>>> :) >> >>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>>>>>>>>>>>> Cheers, >> >>>>>>>>>>>>>>>>>>>>>>>>>>> Matt Farmer >> >>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>>>>> -- >> >>>>>>>>>>>>>>>>>>>> -- Guozhang >> >>>>>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>> -- >> >>>>>>>>>>>>>>>>> -- Guozhang >> >>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>> >> >>>>>>>>>>>>>> >> >>>>>>>>>>>>>> >> >>>>>>>>>>>>>> >> >>>>>>>>>>>>>> -- >> >>>>>>>>>>>>>> -- Guozhang >> >>>>>>>>>>>>>> >> >>>>>>>>>>>>> >> >>>>>>>>>>>> >> >>>>>>>>>>>> >> >>>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> -- >> >>>>>>>>> -- Guozhang >> >>>>>>>>> >> >>>>>>>> >> >>>>>>> >> >>>>>>> >> >>>>>>> >> >>>>>>> -- >> >>>>>>> -- Guozhang >> >>>>>>> >> >>>>>> >> >>>>> >> >>>> >> >>>> >> > >> >>