Just catching up on this KIP.

One tiny comment: I would prefer to remove the "Always" from the handler
implementation names -- it sounds "cleaner" to me without it.

-Matthias

On 11/5/17 12:57 PM, Matt Farmer wrote:
> It is agreed, then. I've updated the pull request. I'm trying to also
> update the KIP accordingly, but cwiki is being slow and dropping
> connections..... I'll try again a bit later but please consider the KIP
> updated for all intents and purposes. heh.
> 
> On Sun, Nov 5, 2017 at 3:45 PM Guozhang Wang <wangg...@gmail.com> wrote:
> 
>> That makes sense.
>>
>>
>> Guozhang
>>
>> On Sun, Nov 5, 2017 at 12:33 PM, Matt Farmer <m...@frmr.me> wrote:
>>
>>> Interesting. I'm not sure I agree. I've been bitten many times by
>>> unintentionally shipping code that fails to properly implement logging. I
>>> always discover this at the exact *worst* moment, too. (Normally at 3 AM
>>> during an on-call shift. Hah.) However, if others feel the same way I
>> could
>>> probably be convinced to remove it.
>>>
>>> We could also meet halfway and say that when a customized
>>> ProductionExceptionHandler instructs Streams to CONTINUE, we log at DEBUG
>>> level instead of WARN level. Would that alternative be appealing to you?
>>>
>>> On Sun, Nov 5, 2017 at 12:32 PM Guozhang Wang <wangg...@gmail.com>
>> wrote:
>>>
>>>> Thanks for the updates. I made a pass over the wiki again and it looks
>>>> good.
>>>>
>>>> About whether record collector should still internally log the error in
>>>> addition to what the customized ProductionExceptionHandler does. I
>>>> personally would prefer only to log if the returned value is FAIL to
>>>> indicate that this thread is going to shutdown and trigger the
>> exception
>>>> handler.
>>>>
>>>>
>>>> Guozhang
>>>>
>>>>
>>>> On Sun, Nov 5, 2017 at 6:09 AM, Matt Farmer <m...@frmr.me> wrote:
>>>>
>>>>> Hello, a bit later than I'd anticipated, but I've updated this KIP as
>>>>> outlined above. The updated KIP is now ready for review again!
>>>>>
>>>>> On Sat, Nov 4, 2017 at 1:03 PM Matt Farmer <m...@frmr.me> wrote:
>>>>>
>>>>>> Ah. I actually created both of those in the PR and forgot to
>> mention
>>>> them
>>>>>> by name in the KIP! Thanks for pointing out the oversight.
>>>>>>
>>>>>> I’ll revise the KIP this afternoon accordingly.
>>>>>>
>>>>>> The logging is actually provided for in the record collector.
>>> Whenever
>>>> a
>>>>>> handler continues it’ll log a warning to ensure that it’s
>>> *impossible*
>>>> to
>>>>>> write a handler that totally suppresses production exceptions from
>>> the
>>>>> log.
>>>>>> As such, the default continue handler just returns the continue
>>> value.
>>>> I
>>>>>> can add details about those semantics to the KIP as well.
>>>>>> On Sat, Nov 4, 2017 at 12:46 PM Matthias J. Sax <
>>> matth...@confluent.io
>>>>>
>>>>>> wrote:
>>>>>>
>>>>>>> One more comment.
>>>>>>>
>>>>>>> You mention a default implementation for the handler that fails. I
>>>>>>> think, this should be part of the public API and thus should have
>> a
>>>>>>> proper defined named that is mentioned in the KIP.
>>>>>>>
>>>>>>> We could also add a second implementation for the log-and-move-on
>>>>>>> strategy, as both are the two most common cases. This class should
>>>> also
>>>>>>> be part of public API (so users can just set in the config) with a
>>>>>>> proper name.
>>>>>>>
>>>>>>>
>>>>>>> Otherwise, I like the KIP a lot! Thanks.
>>>>>>>
>>>>>>>
>>>>>>> -Matthias
>>>>>>>
>>>>>>>
>>>>>>> On 11/1/17 12:23 AM, Matt Farmer wrote:
>>>>>>>> Thanks for the heads up. Yes, I think my changes are compatible
>>> with
>>>>>>> that
>>>>>>>> PR, but there will be a merge conflict that happens whenever one
>>> of
>>>>> the
>>>>>>> PRs
>>>>>>>> is merged. Happy to reconcile the changes in my PR if 4148 goes
>> in
>>>>>>> first. :)
>>>>>>>>
>>>>>>>> On Tue, Oct 31, 2017 at 6:44 PM Guozhang Wang <
>> wangg...@gmail.com
>>>>
>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> That sounds reasonable, thanks Matt.
>>>>>>>>>
>>>>>>>>> As for the implementation, please note that there is another
>>>> ongoing
>>>>> PR
>>>>>>>>> that may touch the same classes that you are working on:
>>>>>>>>> https://github.com/apache/kafka/pull/4148
>>>>>>>>>
>>>>>>>>> So it may help if you can also take a look at that PR and see
>> if
>>> it
>>>>> is
>>>>>>>>> compatible with your changes.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Guozhang
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Tue, Oct 31, 2017 at 10:59 AM, Matt Farmer <m...@frmr.me>
>>>> wrote:
>>>>>>>>>
>>>>>>>>>> I've opened this pull request to implement the KIP as
>> currently
>>>>>>> written:
>>>>>>>>>> https://github.com/apache/kafka/pull/4165. It still needs
>> some
>>>>> tests
>>>>>>>>>> added,
>>>>>>>>>> but largely represents the shape I was going for.
>>>>>>>>>>
>>>>>>>>>> If there are more points that folks would like to discuss,
>>> please
>>>>> let
>>>>>>> me
>>>>>>>>>> know. If I don't hear anything by tomorrow afternoon I'll
>>> probably
>>>>>>> start
>>>>>>>>> a
>>>>>>>>>> [VOTE] thread.
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Matt
>>>>>>>>>>
>>>>>>>>>> On Fri, Oct 27, 2017 at 7:33 PM Matt Farmer <m...@frmr.me>
>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> I can’t think of a reason that would be problematic.
>>>>>>>>>>>
>>>>>>>>>>> Most of the time I would write a handler like this, I either
>>> want
>>>>> to
>>>>>>>>>>> ignore the error or fail and bring everything down so that I
>>> can
>>>>> spin
>>>>>>>>> it
>>>>>>>>>>> back up later and resume from earlier offsets. When we start
>> up
>>>>> after
>>>>>>>>>>> crashing we’ll eventually try to process the message we
>> failed
>>> to
>>>>>>>>> produce
>>>>>>>>>>> again.
>>>>>>>>>>>
>>>>>>>>>>> I’m concerned that “putting in a queue for later” opens you
>> up
>>> to
>>>>>>>>> putting
>>>>>>>>>>> messages into the destination topic in an unexpected order.
>>>> However
>>>>>>> if
>>>>>>>>>>> others feel differently, I’m happy to talk about it.
>>>>>>>>>>>
>>>>>>>>>>> On Fri, Oct 27, 2017 at 7:10 PM Guozhang Wang <
>>>> wangg...@gmail.com>
>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>>> Please correct me if I'm wrong, but my understanding is
>> that
>>>> the
>>>>>>>>>> record
>>>>>>>>>>>>> metadata is always null if an exception occurred while
>> trying
>>>> to
>>>>>>>>>>>> produce.
>>>>>>>>>>>>
>>>>>>>>>>>> That is right. Thanks.
>>>>>>>>>>>>
>>>>>>>>>>>> I looked at the example code, and one thing I realized that
>>>> since
>>>>> we
>>>>>>>>> are
>>>>>>>>>>>> not passing the context in the handle function, we may not
>> be
>>>>>>>>> implement
>>>>>>>>>>>> the
>>>>>>>>>>>> logic to send the fail records into another queue for future
>>>>>>>>> processing.
>>>>>>>>>>>> Would people think that would be a big issue?
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Guozhang
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Thu, Oct 26, 2017 at 12:14 PM, Matt Farmer <m...@frmr.me
>>>
>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hello all,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I've updated the KIP based on this conversation, and made
>> it
>>> so
>>>>>>> that
>>>>>>>>>> its
>>>>>>>>>>>>> interface, config setting, and parameters line up more
>>> closely
>>>>> with
>>>>>>>>>> the
>>>>>>>>>>>>> interface in KIP-161 (deserialization handler).
>>>>>>>>>>>>>
>>>>>>>>>>>>> I believe there are a few specific questions I need to
>> reply
>>>>>>> to.....
>>>>>>>>>>>>>
>>>>>>>>>>>>>> The question I had about then handle parameters are around
>>> the
>>>>>>>>>> record,
>>>>>>>>>>>>>> should it be `ProducerRecord<byte[], byte[]>`, or be
>>> generics
>>>> of
>>>>>>>>>>>>>> `ProducerRecord<? extends K, ? extends V>` or
>>>> `ProducerRecord<?
>>>>>>>>>>>> extends
>>>>>>>>>>>>>> Object, ? extends Object>`?
>>>>>>>>>>>>>
>>>>>>>>>>>>> At this point in the code we're guaranteed that this is a
>>>>>>>>>>>>> ProducerRecord<byte[], byte[]>, so the generics would just
>>> make
>>>>> it
>>>>>>>>>>>> harder
>>>>>>>>>>>>> to work with the key and value.
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Also, should the handle function include the
>>> `RecordMetadata`
>>>> as
>>>>>>>>>> well
>>>>>>>>>>>> in
>>>>>>>>>>>>>> case it is not null?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Please correct me if I'm wrong, but my understanding is
>> that
>>>> the
>>>>>>>>>> record
>>>>>>>>>>>>> metadata is always null if an exception occurred while
>> trying
>>>> to
>>>>>>>>>>>> produce.
>>>>>>>>>>>>>
>>>>>>>>>>>>>> We may probably try to write down at least the following
>>>>> handling
>>>>>>>>>>>> logic
>>>>>>>>>>>>> and
>>>>>>>>>>>>>> see if the given API is sufficient for it
>>>>>>>>>>>>>
>>>>>>>>>>>>> I've added some examples to the KIP. Let me know what you
>>>> think.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>> Matt
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Mon, Oct 23, 2017 at 9:00 PM Matt Farmer <m...@frmr.me>
>>>>> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks for this feedback. I’m at a conference right now
>> and
>>> am
>>>>>>>>>>>> planning
>>>>>>>>>>>>> on
>>>>>>>>>>>>>> updating the KIP again with details from this conversation
>>>> later
>>>>>>>>>> this
>>>>>>>>>>>>> week.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I’ll shoot you a more detailed response then! :)
>>>>>>>>>>>>>> On Mon, Oct 23, 2017 at 8:16 PM Guozhang Wang <
>>>>> wangg...@gmail.com
>>>>>>>>>>
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks for the KIP Matt.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Regarding the handle interface of
>>>>> ProductionExceptionHandlerResp
>>>>>>>>>> onse,
>>>>>>>>>>>>>>> could
>>>>>>>>>>>>>>> you write it on the wiki also, along with the actual
>> added
>>>>> config
>>>>>>>>>>>> names
>>>>>>>>>>>>>>> (e.g. what
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
>>> 161%3A+streams+
>>>>>>>>>>>>> deserialization+exception+handlers
>>>>>>>>>>>>>>> described).
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The question I had about then handle parameters are
>> around
>>>> the
>>>>>>>>>>>> record,
>>>>>>>>>>>>>>> should it be `ProducerRecord<byte[], byte[]>`, or be
>>> generics
>>>>> of
>>>>>>>>>>>>>>> `ProducerRecord<? extends K, ? extends V>` or
>>>> `ProducerRecord<?
>>>>>>>>>>>> extends
>>>>>>>>>>>>>>> Object, ? extends Object>`?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Also, should the handle function include the
>>> `RecordMetadata`
>>>>> as
>>>>>>>>>>>> well in
>>>>>>>>>>>>>>> case it is not null?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> We may probably try to write down at least the following
>>>>> handling
>>>>>>>>>>>> logic
>>>>>>>>>>>>>>> and
>>>>>>>>>>>>>>> see if the given API is sufficient for it: 1) throw
>>> exception
>>>>>>>>>>>>> immediately
>>>>>>>>>>>>>>> to fail fast and stop the world, 2) log the error and
>> drop
>>>>> record
>>>>>>>>>> and
>>>>>>>>>>>>>>> proceed silently, 3) send such errors to a specific
>> "error"
>>>>> Kafka
>>>>>>>>>>>> topic,
>>>>>>>>>>>>>>> or
>>>>>>>>>>>>>>> record it as an app-level metrics (
>>>>>>>>>>>>>>> https://kafka.apache.org/documentation/#kafka_streams_
>>>>> monitoring
>>>>>>>>> )
>>>>>>>>>>>> for
>>>>>>>>>>>>>>> monitoring.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Guozhang
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Fri, Oct 20, 2017 at 5:47 PM, Matt Farmer <
>> m...@frmr.me
>>>>
>>>>>>>>> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I did some more digging tonight.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> @Ted: It looks like the deserialization handler uses
>>>>>>>>>>>>>>>> "default.deserialization.exception.handler" for the
>>> config
>>>>>>>>>> name. No
>>>>>>>>>>>>>>>> ".class" on the end. I'm inclined to think this should
>> use
>>>>>>>>>>>>>>>> "default.production.exception.handler".
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Fri, Oct 20, 2017 at 8:22 PM Matt Farmer <
>> m...@frmr.me
>>>>
>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Okay, I've dug into this a little bit.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I think getting access to the serialized record is
>>>> possible,
>>>>>>>>>> and
>>>>>>>>>>>>>>> changing
>>>>>>>>>>>>>>>>> the naming and return type is certainly doable.
>> However,
>>>>>>>>>> because
>>>>>>>>>>>>> we're
>>>>>>>>>>>>>>>>> hooking into the onCompletion callback we have no
>>> guarantee
>>>>>>>>>> that
>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>> ProcessorContext state hasn't changed by the time this
>>>>>>>>>> particular
>>>>>>>>>>>>>>> handler
>>>>>>>>>>>>>>>>> runs. So I think the signature would change to
>> something
>>>>>>>>> like:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> ProductionExceptionHandlerResponse handle(final
>>>>>>>>>>>> ProducerRecord<..>
>>>>>>>>>>>>>>>> record,
>>>>>>>>>>>>>>>>> final Exception exception)
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Would this be acceptable?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Thu, Oct 19, 2017 at 7:33 PM Matt Farmer <
>>> m...@frmr.me>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Ah good idea. Hmmm. I can line up the naming and
>> return
>>>> type
>>>>>>>>>> but
>>>>>>>>>>>>> I’m
>>>>>>>>>>>>>>> not
>>>>>>>>>>>>>>>>>> sure if I can get my hands on the context and the
>> record
>>>>>>>>>> itself
>>>>>>>>>>>>>>> without
>>>>>>>>>>>>>>>>>> other changes.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Let me dig in and follow up here tomorrow.
>>>>>>>>>>>>>>>>>> On Thu, Oct 19, 2017 at 7:14 PM Matthias J. Sax <
>>>>>>>>>>>>>>> matth...@confluent.io>
>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Thanks for the KIP.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Are you familiar with KIP-161?
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
>>> 161%3A+streams+
>>>>>>>>>>>>>>>> deserialization+exception+handlers
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> I thinks, we should align the design (parameter
>> naming,
>>>>>>>>>> return
>>>>>>>>>>>>>>> types,
>>>>>>>>>>>>>>>>>>> class names etc) of KIP-210 to KIP-161 to get a
>> unified
>>>>>>>>> user
>>>>>>>>>>>>>>>> experience.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> -Matthias
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On 10/18/17 4:20 PM, Matt Farmer wrote:
>>>>>>>>>>>>>>>>>>>> I’ll create the JIRA ticket.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> I think that config name will work. I’ll update the
>>> KIP
>>>>>>>>>>>>>>> accordingly.
>>>>>>>>>>>>>>>>>>>> On Wed, Oct 18, 2017 at 6:09 PM Ted Yu <
>>>>>>>>>> yuzhih...@gmail.com>
>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Can you create JIRA that corresponds to the KIP ?
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> For the new config, how about naming it
>>>>>>>>>>>>>>>>>>>>> production.exception.processor.class
>>>>>>>>>>>>>>>>>>>>> ? This way it is clear that class name should be
>>>>>>>>>> specified.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Cheers
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> On Wed, Oct 18, 2017 at 2:40 PM, Matt Farmer <
>>>>>>>>>> m...@frmr.me>
>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Hello everyone,
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> This is the discussion thread for the KIP that I
>>> just
>>>>>>>>>> filed
>>>>>>>>>>>>>>> here:
>>>>>>>>>>>>>>>>>>>>>>
>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
>>>>>>>>>>>>>>>>>>>>>> 210+-+Provide+for+custom+
>>> error+handling++when+Kafka+
>>>>>>>>>>>>>>>>>>>>>> Streams+fails+to+produce
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Looking forward to getting some feedback from
>> folks
>>>>>>>>> about
>>>>>>>>>>>> this
>>>>>>>>>>>>>>> idea
>>>>>>>>>>>>>>>>>>> and
>>>>>>>>>>>>>>>>>>>>>> working toward a solution we can contribute back.
>> :)
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>>>>>>>>> Matt Farmer
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>> -- Guozhang
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> --
>>>>>>>>>>>> -- Guozhang
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> -- Guozhang
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> -- Guozhang
>>>>
>>>
>>
>>
>>
>> --
>> -- Guozhang
>>
> 

Attachment: signature.asc
Description: OpenPGP digital signature

Reply via email to