Re: [VOTE] KIP-334 Include partitions in exceptions raised during consumer record deserialization/validation

Sarwar Bhuiyan Wed, 14 Apr 2021 01:26:51 -0700

Forgive me while I find my feet, but since this reached a VOTE stage, is it
really required to go through a DISCUSS thread again or can we continue to
address the issues on this thread since the issues were raised on this
thread? If I do have to create a new DISCUSS thread, do I just copy the
messages from the old thread to it to address them?


Thank you.

Sarwar

On Wed, Apr 14, 2021 at 9:18 AM Stanislav Kozlovski <stanis...@confluent.io>
wrote:

> Hey all,
>
> To revive this old KIP, Sarwar Bhuiyan has volunteered to take over
> ownership.
> He will continue to drive this KIP to approval and completion - I
> understand that he will re-start the discussion with a new [DISCUSS] or
> [VOTE] thread.
>
> Thank you Sarwar!
>
> Best,
> Stanislav
>
> On Fri, Jan 10, 2020 at 5:55 PM Gwen Shapira <g...@confluent.io> wrote:
>
>> Sorry for the super late reply, but since the vote thread showed up,
>> I've read the KIP and have a concern.
>> The concern was raised by Matthias Sax earlier, but I didn't see it
>> addressed.
>>
>> Basically the current iteration of the KIP unifies deserialization
>> errors with corruption. As was pointed out, these are not the same
>> thing. Corruption means that the message itself (including envelope,
>> not just the payload) is broken. De-serialization errors mean that
>> either key or value serializers have a problem. It can even be a
>> temporary problem of connecting to schema registry, I believe. Corrupt
>> messages can only be skipped. De-serialization errors can (and
>> arguably should) be retried with a different serializer. Something
>> like Connect will need to skip corrupt messages, but messages with
>> SerDe issues should probably go into a dead-letter queue.
>>
>> Anyway, IMO we need exceptions that will let us tell the difference.
>>
>> Gwen
>>
>> On Fri, Oct 11, 2019 at 10:05 AM Stanislav Kozlovski
>> <stanis...@confluent.io> wrote:
>> >
>> > Thanks Jason. I've edited the KIP with the latest proposal.
>> >
>> > On Thu, Oct 10, 2019 at 2:00 AM Jason Gustafson <ja...@confluent.io>
>> wrote:
>> >
>> > > Hi Stanislav,
>> > >
>> > > Sorry for the late comment. I'm happy with the current proposal. Just
>> one
>> > > small request is to include an example which shows how a user could
>> use
>> > > this to skip over a record.
>> > >
>> > > I'd suggest pushing this to a vote to see if anyone else has feedback.
>> > >
>> > > Thanks,
>> > > Jason
>> > >
>> > > On Sat, Jul 13, 2019 at 2:27 PM Stanislav Kozlovski <
>> > > stanis...@confluent.io>
>> > > wrote:
>> > >
>> > > > Hello,
>> > > >
>> > > > Could we restart the discussion here again?
>> > > >
>> > > > My last message was sent on the 3rd of June but I haven't received
>> > > replies
>> > > > since then.
>> > > >
>> > > > I'd like to get this KIP to a finished state, regardless of whether
>> that
>> > > is
>> > > > merged-in or discarded. It has been almost one year since the
>> publication
>> > > > of the KIP.
>> > > >
>> > > > Thanks,
>> > > > Stanislav
>> > > >
>> > > > On Mon, Jun 3, 2019 at 11:19 AM Stanislav Kozlovski <
>> > > > stanis...@confluent.io>
>> > > > wrote:
>> > > >
>> > > > > Do people agree with the approach I outlined in my last reply?
>> > > > >
>> > > > > On Mon, May 6, 2019 at 2:12 PM Stanislav Kozlovski <
>> > > > stanis...@confluent.io>
>> > > > > wrote:
>> > > > >
>> > > > >> Hey there Kamal,
>> > > > >>
>> > > > >> I'm sincerely sorry for missing your earlier message. As I open
>> this
>> > > > >> thread up, I see I have an unsent draft message about resuming
>> > > > discussion
>> > > > >> from some time ago.
>> > > > >>
>> > > > >> In retrospect, I think I may have been too pedantic with the
>> exception
>> > > > >> naming and hierarchy.
>> > > > >> I now believe a single exception type of
>> > > > `RecordDeserializationException`
>> > > > >> is enough. Let's go with that.
>> > > > >>
>> > > > >> On Mon, May 6, 2019 at 6:40 AM Kamal Chandraprakash <
>> > > > >> kamal.chandraprak...@gmail.com> wrote:
>> > > > >>
>> > > > >>> Matthias,
>> > > > >>>
>> > > > >>> We already have CorruptRecordException which doesn't extend the
>> > > > >>> SerializationException. So, we need an alternate
>> > > > >>> name suggestion for the corrupted record error if we decide to
>> extend
>> > > > the
>> > > > >>> FaultyRecordException class.
>> > > > >>>
>> > > > >>> Stanislav,
>> > > > >>>
>> > > > >>> Our users are also facing this error. Could we bump up this
>> > > discussion?
>> > > > >>>
>> > > > >>> I think we can have a single exception type
>> > > > >>> FaultyRecordException/RecordDeserialization exception to
>> capture both
>> > > > >>> the errors. We can add an additional enum field to
>> differentiate the
>> > > > >>> errors
>> > > > >>> if required.
>> > > > >>>
>> > > > >>> Thanks,
>> > > > >>> Kamal Chandraprakash
>> > > > >>>
>> > > > >>> On Wed, Apr 24, 2019 at 1:49 PM Kamal Chandraprakash <
>> > > > >>> kamal.chandraprak...@gmail.com> wrote:
>> > > > >>>
>> > > > >>> > Stanislav,
>> > > > >>> >
>> > > > >>> > Any updates on this KIP? We have internal users who want to
>> skip
>> > > the
>> > > > >>> > corrupted message while consuming the records.
>> > > > >>> >
>> > > > >>> >
>> > > > >>> > On Fri, Oct 19, 2018 at 11:34 PM Matthias J. Sax <
>> > > > >>> matth...@confluent.io>
>> > > > >>> > wrote:
>> > > > >>> >
>> > > > >>> >> I am not 100% familiar with the details of the consumer code,
>> > > > however
>> > > > >>> I
>> > > > >>> >> tend to disagree with:
>> > > > >>> >>
>> > > > >>> >> > There's no difference between the two cases -- if (and
>> only if)
>> > > > the
>> > > > >>> >> message is corrupt, it can't be deserialized.  If (and only
>> if) it
>> > > > >>> can't be
>> > > > >>> >> deserialized, it is corrupt.
>> > > > >>> >>
>> > > > >>> >> Assume that a user configures a JSON deserializer but a
>> faulty
>> > > > >>> upstream
>> > > > >>> >> producer writes an Avro message. For this case, the message
>> is not
>> > > > >>> >> corrupted, but still can't be deserialized. And I can
>> imaging that
>> > > > >>> users
>> > > > >>> >> want to handle both cases differently.
>> > > > >>> >>
>> > > > >>> >> Thus, I think it makes sense to have two different exceptions
>> > > > >>> >> `RecordDeserializationException` and
>> `CorruptedRecordException`
>> > > that
>> > > > >>> can
>> > > > >>> >> both extend `FaultyRecordException` (don't like this name
>> too much
>> > > > >>> >> honestly, but don't have a better idea for it anyway).
>> > > > >>> >>
>> > > > >>> >> Side remark. If we introduce class
>> > > `RecordDeserializationException`
>> > > > >>> and
>> > > > >>> >> `CorruptedRecordException`, we can also add an interface
>> that both
>> > > > >>> >> implement to return partition/offset information and let both
>> > > extend
>> > > > >>> >> `SerializationException` directly without an intermediate
>> class in
>> > > > the
>> > > > >>> >> exception hierarchy.
>> > > > >>> >>
>> > > > >>> >>
>> > > > >>> >> -Matthias
>> > > > >>> >>
>> > > > >>> >> On 8/8/18 2:57 AM, Stanislav Kozlovski wrote:
>> > > > >>> >> >> If you are inheriting from SerializationException, your
>> derived
>> > > > >>> class
>> > > > >>> >> > should also be a kind of serialization exception.  Not
>> something
>> > > > >>> more
>> > > > >>> >> > general.
>> > > > >>> >> > Yeah, the reason for inheriting it would be for
>> > > > >>> backwards-compatibility.
>> > > > >>> >> >
>> > > > >>> >> >> Hmm.  Can you think of any new scenarios that would make
>> Kafka
>> > > > >>> force
>> > > > >>> >> the
>> > > > >>> >> > user need to skip a specific record?  Perhaps one scenario
>> is if
>> > > > >>> records
>> > > > >>> >> > are lost but we don't know how many.
>> > > > >>> >> > Not on the spot, but I do wonder how likely a new scenario
>> is to
>> > > > >>> >> surface in
>> > > > >>> >> > the future and how we'd handle the exceptions' class
>> hierarchy
>> > > > then.
>> > > > >>> >> >
>> > > > >>> >> >> Which offset were we planning to use in the
>> > > > >>> >> > exception?
>> > > > >>> >> > The offset of the record which caused the exception. In
>> the case
>> > > > of
>> > > > >>> >> > batches, we use the last offset of the batch. In both
>> cases,
>> > > users
>> > > > >>> >> should
>> > > > >>> >> > have to seek +1 from the given offset. You can review the
>> PR to
>> > > > >>> ensure
>> > > > >>> >> its
>> > > > >>> >> > accurate
>> > > > >>> >> >
>> > > > >>> >> >
>> > > > >>> >> > If both of you prefer `RecordDeserializationException`, we
>> can
>> > > go
>> > > > >>> with
>> > > > >>> >> > that. Please do confirm that is okay
>> > > > >>> >> >
>> > > > >>> >> > On Tue, Aug 7, 2018 at 11:35 PM Jason Gustafson <
>> > > > ja...@confluent.io
>> > > > >>> >
>> > > > >>> >> wrote:
>> > > > >>> >> >
>> > > > >>> >> >> One difference between the two cases is that we can't
>> generally
>> > > > >>> trust
>> > > > >>> >> the
>> > > > >>> >> >> offset of a corrupt message. Which offset were we
>> planning to
>> > > use
>> > > > >>> in
>> > > > >>> >> the
>> > > > >>> >> >> exception? Maybe it should be either the fetch offset or
>> one
>> > > plus
>> > > > >>> the
>> > > > >>> >> last
>> > > > >>> >> >> consumed offset? I think I'm with Colin in preferring
>> > > > >>> >> >> RecordDeserializationException for both cases if
>> possible. For
>> > > > one
>> > > > >>> >> thing,
>> > > > >>> >> >> that makes the behavior consistent whether or not
>> `check.crs`
>> > > is
>> > > > >>> >> enabled.
>> > > > >>> >> >>
>> > > > >>> >> >> -Jason
>> > > > >>> >> >>
>> > > > >>> >> >> On Tue, Aug 7, 2018 at 11:17 AM, Colin McCabe <
>> > > > cmcc...@apache.org>
>> > > > >>> >> wrote:
>> > > > >>> >> >>
>> > > > >>> >> >>> Hi Stanislav,
>> > > > >>> >> >>>
>> > > > >>> >> >>> On Sat, Aug 4, 2018, at 10:44, Stanislav Kozlovski wrote:
>> > > > >>> >> >>>> Hey Colin,
>> > > > >>> >> >>>>
>> > > > >>> >> >>>> It may be a bit vague but keep in mind we also raise the
>> > > > >>> exception in
>> > > > >>> >> >> the
>> > > > >>> >> >>>> case where the record is corrupted.
>> > > > >>> >> >>>> We discussed with Jason offline that message corruption
>> most
>> > > > >>> often
>> > > > >>> >> >>> prevents
>> > > > >>> >> >>>> deserialization itself and that may be enough of an
>> argument
>> > > to
>> > > > >>> raise
>> > > > >>> >> >>>> `RecordDeserializationException` in the case of a
>> corrupt
>> > > > >>> record. I
>> > > > >>> >> >>>> personally find that misleading.
>> > > > >>> >> >>>
>> > > > >>> >> >>> Hmm.  I think that by definition, corrupt records are
>> records
>> > > > that
>> > > > >>> >> can't
>> > > > >>> >> >>> be deserialized.  There's no difference between the two
>> cases
>> > > --
>> > > > >>> if
>> > > > >>> >> (and
>> > > > >>> >> >>> only if) the message is corrupt, it can't be
>> deserialized.  If
>> > > > >>> (and
>> > > > >>> >> only
>> > > > >>> >> >>> if) it can't be deserialized, it is corrupt.
>> > > > >>> >> >>>
>> > > > >>> >> >>>>
>> > > > >>> >> >>>> In the end, I think it might be worth it to have a bit
>> of a
>> > > > >>> >> >>>> wider-encompassing `FaultyRecordException` (or even
>> > > > >>> >> >>>> `UnconsumableRecordException`) which would allow users
>> to
>> > > skip
>> > > > >>> over
>> > > > >>> >> the
>> > > > >>> >> >>>> record.
>> > > > >>> >> >>>
>> > > > >>> >> >>> If you are inheriting from SerializationException, your
>> > > derived
>> > > > >>> class
>> > > > >>> >> >>> should also be a kind of serialization exception.  Not
>> > > something
>> > > > >>> more
>> > > > >>> >> >>> general.
>> > > > >>> >> >>>
>> > > > >>> >> >>>> We could then potentially have more specific exceptions
>> (e.g
>> > > > >>> >> >>>> deserialization) inherit that but I'm not sure if we
>> should.
>> > > > >>> >> >>>> A case for a more general exception which provides
>> access to
>> > > > the
>> > > > >>> >> >>>> partition/offset is future backwards-compatibility. If
>> there
>> > > is
>> > > > >>> ever
>> > > > >>> >> a
>> > > > >>> >> >>> new
>> > > > >>> >> >>>> scenario that would make the user need to skip a
>> specific
>> > > > record
>> > > > >>> -
>> > > > >>> >> >> there
>> > > > >>> >> >>>> would already be such an exception and this will
>> provide some
>> > > > >>> >> >>>> backward-compatibility with older clients.
>> > > > >>> >> >>>
>> > > > >>> >> >>> Hmm.  Can you think of any new scenarios that would make
>> Kafka
>> > > > >>> force
>> > > > >>> >> the
>> > > > >>> >> >>> user need to skip a specific record?  Perhaps one
>> scenario is
>> > > if
>> > > > >>> >> records
>> > > > >>> >> >>> are lost but we don't know how many.
>> > > > >>> >> >>>
>> > > > >>> >> >>> If we really want to have something more general,
>> perhaps we
>> > > > need
>> > > > >>> >> >>> something like LostDataException.  But in that case, we
>> can't
>> > > > >>> inherit
>> > > > >>> >> >> from
>> > > > >>> >> >>> SerializationException, since lost data is more general
>> than
>> > > > >>> >> >> serialization
>> > > > >>> >> >>> issues.
>> > > > >>> >> >>>
>> > > > >>> >> >>> best,
>> > > > >>> >> >>> Colin
>> > > > >>> >> >>>
>> > > > >>> >> >>>
>> > > > >>> >> >>>>
>> > > > >>> >> >>>> Best,
>> > > > >>> >> >>>> Stanislav
>> > > > >>> >> >>>>
>> > > > >>> >> >>>> On Sat, Aug 4, 2018 at 12:23 AM Colin McCabe <
>> > > > cmcc...@apache.org
>> > > > >>> >
>> > > > >>> >> >> wrote:
>> > > > >>> >> >>>>
>> > > > >>> >> >>>>> Hi Stanislav,
>> > > > >>> >> >>>>>
>> > > > >>> >> >>>>> Thanks for the KIP.
>> > > > >>> >> >>>>>
>> > > > >>> >> >>>>> As as user, "FaultyRecordException" seems a little
>> vague.
>> > > > >>> What's
>> > > > >>> >> >>> faulty
>> > > > >>> >> >>>>> about it?  Is it too big?  Does it not match the
>> schema of
>> > > the
>> > > > >>> other
>> > > > >>> >> >>>>> records?  Was it not found?
>> > > > >>> >> >>>>>
>> > > > >>> >> >>>>> Of course, we know that the specific problem is with
>> > > > >>> [de]serializing
>> > > > >>> >> >>> the
>> > > > >>> >> >>>>> record, not with any of those those things.  So we
>> should
>> > > > >>> choose a
>> > > > >>> >> >> name
>> > > > >>> >> >>>>> that reflects that serialization is the problem.  How
>> about
>> > > > >>> >> >>>>> RecordSerializationException?
>> > > > >>> >> >>>>>
>> > > > >>> >> >>>>> best,
>> > > > >>> >> >>>>> Colin
>> > > > >>> >> >>>>>
>> > > > >>> >> >>>>>
>> > > > >>> >> >>>>> On Thu, Aug 2, 2018, at 15:11, Stanislav Kozlovski
>> wrote:
>> > > > >>> >> >>>>>> Hi Jason and Ted,
>> > > > >>> >> >>>>>>
>> > > > >>> >> >>>>>> @Jason
>> > > > >>> >> >>>>>> I did not oppose the idea as I'm not too familiar
>> with Java
>> > > > >>> >> >>> conventions.
>> > > > >>> >> >>>>> I
>> > > > >>> >> >>>>>> agree it is a non-ideal way for the user to catch the
>> > > > >>> exception so
>> > > > >>> >> >> I
>> > > > >>> >> >>>>>> changed it back.
>> > > > >>> >> >>>>>>
>> > > > >>> >> >>>>>> I've updated the KIP and PR with the latest changes.
>> Now,
>> > > > >>> there is
>> > > > >>> >> >>> only
>> > > > >>> >> >>>>> one
>> > > > >>> >> >>>>>> public exception `FaultyRecordException` which is
>> thrown to
>> > > > the
>> > > > >>> >> >> user
>> > > > >>> >> >>> in
>> > > > >>> >> >>>>>> both scenarios (corrupt record and deserialization
>> error).
>> > > > >>> >> >>>>>> Please take a look again.
>> > > > >>> >> >>>>>>
>> > > > >>> >> >>>>>> Best,
>> > > > >>> >> >>>>>> Stanislav
>> > > > >>> >> >>>>>>
>> > > > >>> >> >>>>>> On Thu, Aug 2, 2018 at 5:25 PM Jason Gustafson <
>> > > > >>> ja...@confluent.io
>> > > > >>> >> >>>
>> > > > >>> >> >>>>> wrote:
>> > > > >>> >> >>>>>>
>> > > > >>> >> >>>>>>> Hey Stanislav,
>> > > > >>> >> >>>>>>>
>> > > > >>> >> >>>>>>> I should have noticed it earlier from your example,
>> but I
>> > > > just
>> > > > >>> >> >>> realized
>> > > > >>> >> >>>>>>> that interfaces don't mix well with exceptions.
>> There is
>> > > no
>> > > > >>> way
>> > > > >>> >> >> to
>> > > > >>> >> >>>>> catch
>> > > > >>> >> >>>>>>> the interface type, which means you have to depend on
>> > > > >>> instanceof
>> > > > >>> >> >>>>> checks,
>> > > > >>> >> >>>>>>> which is not very conventional. At the moment, we
>> raise
>> > > > >>> >> >>>>>>> SerializationException if there is a problem parsing
>> the
>> > > > >>> error,
>> > > > >>> >> >>> and we
>> > > > >>> >> >>>>>>> raise KafkaException if the record fails its CRC
>> check.
>> > > > Since
>> > > > >>> >> >>>>>>> SerializationException extends KafkaExeption, it
>> seems
>> > > like
>> > > > we
>> > > > >>> >> >> can
>> > > > >>> >> >>>>> handle
>> > > > >>> >> >>>>>>> both cases in a compatible way by handling both cases
>> > > with a
>> > > > >>> >> >> single
>> > > > >>> >> >>>>> type
>> > > > >>> >> >>>>>>> that extends SerializationException. Maybe something
>> like
>> > > > >>> >> >>>>>>> RecordDeserializationException?
>> > > > >>> >> >>>>>>>
>> > > > >>> >> >>>>>>> Thanks,
>> > > > >>> >> >>>>>>> Jason
>> > > > >>> >> >>>>>>>
>> > > > >>> >> >>>>>>> On Thu, Aug 2, 2018 at 5:45 AM, Ted Yu <
>> > > yuzhih...@gmail.com
>> > > > >
>> > > > >>> >> >>> wrote:
>> > > > >>> >> >>>>>>>
>> > > > >>> >> >>>>>>>> +1
>> > > > >>> >> >>>>>>>> -------- Original message --------From: Stanislav
>> > > > Kozlovski <
>> > > > >>> >> >>>>>>>> stanis...@confluent.io> Date: 8/2/18  2:41 AM
>> > > (GMT-08:00)
>> > > > >>> To:
>> > > > >>> >> >>>>>>>> dev@kafka.apache.org Subject: [VOTE] KIP-334
>> Include
>> > > > >>> >> >> partitions
>> > > > >>> >> >>> in
>> > > > >>> >> >>>>>>>> exceptions raised during consumer record
>> > > > >>> >> >>> deserialization/validation
>> > > > >>> >> >>>>>>>> Hey everybody,
>> > > > >>> >> >>>>>>>>
>> > > > >>> >> >>>>>>>> I'd like to start a vote thread for KIP-334 Include
>> > > > >>> partitions
>> > > > >>> >> >> in
>> > > > >>> >> >>>>>>>> exceptions raised during consumer record
>> > > > >>> >> >>> deserialization/validation
>> > > > >>> >> >>>>>>>> <
>> > > > >>> >> >>>>>>>
>> > > > >>> >> >>>>>
>> https://cwiki.apache.org/confluence/pages/viewpage.action?
>> > > > >>> >> >>> pageId=87297793
>> > > > >>> >> >>>>>>>>>
>> > > > >>> >> >>>>>>>>
>> > > > >>> >> >>>>>>>> --
>> > > > >>> >> >>>>>>>> Best,
>> > > > >>> >> >>>>>>>> Stanislav
>> > > > >>> >> >>>>>>>>
>> > > > >>> >> >>>>>>>
>> > > > >>> >> >>>>>>
>> > > > >>> >> >>>>>>
>> > > > >>> >> >>>>>> --
>> > > > >>> >> >>>>>> Best,
>> > > > >>> >> >>>>>> Stanislav
>> > > > >>> >> >>>>>
>> > > > >>> >> >>>>
>> > > > >>> >> >>>>
>> > > > >>> >> >>>> --
>> > > > >>> >> >>>> Best,
>> > > > >>> >> >>>> Stanislav
>> > > > >>> >> >>>
>> > > > >>> >> >>
>> > > > >>> >> >
>> > > > >>> >> >
>> > > > >>> >>
>> > > > >>> >>
>> > > > >>>
>> > > > >>
>> > > > >>
>> > > > >> --
>> > > > >> Best,
>> > > > >> Stanislav
>> > > > >>
>> > > > >
>> > > > >
>> > > > > --
>> > > > > Best,
>> > > > > Stanislav
>> > > > >
>> > > >
>> > > >
>> > > > --
>> > > > Best,
>> > > > Stanislav
>> > > >
>> > >
>> >
>> >
>> > --
>> > Best,
>> > Stanislav
>>
>
>
> --
> Best,
> Stanislav
>


-- 

<https://www.confluent.io/>

*Sarwar Bhuiyan*

Senior Customer Success Architect

+447949684437

Follow us:  Blog <http://www.confluent.io/blog> • Slack
<https://slackpass.io/confluentcommunity> • Twitter
<https://twitter.com/ConfluentInc> • YouTube <https://youtube.com/confluent>

<https://developer.confluent.io/>

Re: [VOTE] KIP-334 Include partitions in exceptions raised during consumer record deserialization/validation

Reply via email to