Hey there Kamal,

I'm sincerely sorry for missing your earlier message. As I open this thread
up, I see I have an unsent draft message about resuming discussion from
some time ago.

In retrospect, I think I may have been too pedantic with the exception
naming and hierarchy.
I now believe a single exception type of `RecordDeserializationException`
is enough. Let's go with that.

On Mon, May 6, 2019 at 6:40 AM Kamal Chandraprakash <
kamal.chandraprak...@gmail.com> wrote:

> Matthias,
>
> We already have CorruptRecordException which doesn't extend the
> SerializationException. So, we need an alternate
> name suggestion for the corrupted record error if we decide to extend the
> FaultyRecordException class.
>
> Stanislav,
>
> Our users are also facing this error. Could we bump up this discussion?
>
> I think we can have a single exception type
> FaultyRecordException/RecordDeserialization exception to capture both
> the errors. We can add an additional enum field to differentiate the errors
> if required.
>
> Thanks,
> Kamal Chandraprakash
>
> On Wed, Apr 24, 2019 at 1:49 PM Kamal Chandraprakash <
> kamal.chandraprak...@gmail.com> wrote:
>
> > Stanislav,
> >
> > Any updates on this KIP? We have internal users who want to skip the
> > corrupted message while consuming the records.
> >
> >
> > On Fri, Oct 19, 2018 at 11:34 PM Matthias J. Sax <matth...@confluent.io>
> > wrote:
> >
> >> I am not 100% familiar with the details of the consumer code, however I
> >> tend to disagree with:
> >>
> >> > There's no difference between the two cases -- if (and only if) the
> >> message is corrupt, it can't be deserialized.  If (and only if) it
> can't be
> >> deserialized, it is corrupt.
> >>
> >> Assume that a user configures a JSON deserializer but a faulty upstream
> >> producer writes an Avro message. For this case, the message is not
> >> corrupted, but still can't be deserialized. And I can imaging that users
> >> want to handle both cases differently.
> >>
> >> Thus, I think it makes sense to have two different exceptions
> >> `RecordDeserializationException` and `CorruptedRecordException` that can
> >> both extend `FaultyRecordException` (don't like this name too much
> >> honestly, but don't have a better idea for it anyway).
> >>
> >> Side remark. If we introduce class `RecordDeserializationException` and
> >> `CorruptedRecordException`, we can also add an interface that both
> >> implement to return partition/offset information and let both extend
> >> `SerializationException` directly without an intermediate class in the
> >> exception hierarchy.
> >>
> >>
> >> -Matthias
> >>
> >> On 8/8/18 2:57 AM, Stanislav Kozlovski wrote:
> >> >> If you are inheriting from SerializationException, your derived class
> >> > should also be a kind of serialization exception.  Not something more
> >> > general.
> >> > Yeah, the reason for inheriting it would be for
> backwards-compatibility.
> >> >
> >> >> Hmm.  Can you think of any new scenarios that would make Kafka force
> >> the
> >> > user need to skip a specific record?  Perhaps one scenario is if
> records
> >> > are lost but we don't know how many.
> >> > Not on the spot, but I do wonder how likely a new scenario is to
> >> surface in
> >> > the future and how we'd handle the exceptions' class hierarchy then.
> >> >
> >> >> Which offset were we planning to use in the
> >> > exception?
> >> > The offset of the record which caused the exception. In the case of
> >> > batches, we use the last offset of the batch. In both cases, users
> >> should
> >> > have to seek +1 from the given offset. You can review the PR to ensure
> >> its
> >> > accurate
> >> >
> >> >
> >> > If both of you prefer `RecordDeserializationException`, we can go with
> >> > that. Please do confirm that is okay
> >> >
> >> > On Tue, Aug 7, 2018 at 11:35 PM Jason Gustafson <ja...@confluent.io>
> >> wrote:
> >> >
> >> >> One difference between the two cases is that we can't generally trust
> >> the
> >> >> offset of a corrupt message. Which offset were we planning to use in
> >> the
> >> >> exception? Maybe it should be either the fetch offset or one plus the
> >> last
> >> >> consumed offset? I think I'm with Colin in preferring
> >> >> RecordDeserializationException for both cases if possible. For one
> >> thing,
> >> >> that makes the behavior consistent whether or not `check.crs` is
> >> enabled.
> >> >>
> >> >> -Jason
> >> >>
> >> >> On Tue, Aug 7, 2018 at 11:17 AM, Colin McCabe <cmcc...@apache.org>
> >> wrote:
> >> >>
> >> >>> Hi Stanislav,
> >> >>>
> >> >>> On Sat, Aug 4, 2018, at 10:44, Stanislav Kozlovski wrote:
> >> >>>> Hey Colin,
> >> >>>>
> >> >>>> It may be a bit vague but keep in mind we also raise the exception
> in
> >> >> the
> >> >>>> case where the record is corrupted.
> >> >>>> We discussed with Jason offline that message corruption most often
> >> >>> prevents
> >> >>>> deserialization itself and that may be enough of an argument to
> raise
> >> >>>> `RecordDeserializationException` in the case of a corrupt record. I
> >> >>>> personally find that misleading.
> >> >>>
> >> >>> Hmm.  I think that by definition, corrupt records are records that
> >> can't
> >> >>> be deserialized.  There's no difference between the two cases -- if
> >> (and
> >> >>> only if) the message is corrupt, it can't be deserialized.  If (and
> >> only
> >> >>> if) it can't be deserialized, it is corrupt.
> >> >>>
> >> >>>>
> >> >>>> In the end, I think it might be worth it to have a bit of a
> >> >>>> wider-encompassing `FaultyRecordException` (or even
> >> >>>> `UnconsumableRecordException`) which would allow users to skip over
> >> the
> >> >>>> record.
> >> >>>
> >> >>> If you are inheriting from SerializationException, your derived
> class
> >> >>> should also be a kind of serialization exception.  Not something
> more
> >> >>> general.
> >> >>>
> >> >>>> We could then potentially have more specific exceptions (e.g
> >> >>>> deserialization) inherit that but I'm not sure if we should.
> >> >>>> A case for a more general exception which provides access to the
> >> >>>> partition/offset is future backwards-compatibility. If there is
> ever
> >> a
> >> >>> new
> >> >>>> scenario that would make the user need to skip a specific record -
> >> >> there
> >> >>>> would already be such an exception and this will provide some
> >> >>>> backward-compatibility with older clients.
> >> >>>
> >> >>> Hmm.  Can you think of any new scenarios that would make Kafka force
> >> the
> >> >>> user need to skip a specific record?  Perhaps one scenario is if
> >> records
> >> >>> are lost but we don't know how many.
> >> >>>
> >> >>> If we really want to have something more general, perhaps we need
> >> >>> something like LostDataException.  But in that case, we can't
> inherit
> >> >> from
> >> >>> SerializationException, since lost data is more general than
> >> >> serialization
> >> >>> issues.
> >> >>>
> >> >>> best,
> >> >>> Colin
> >> >>>
> >> >>>
> >> >>>>
> >> >>>> Best,
> >> >>>> Stanislav
> >> >>>>
> >> >>>> On Sat, Aug 4, 2018 at 12:23 AM Colin McCabe <cmcc...@apache.org>
> >> >> wrote:
> >> >>>>
> >> >>>>> Hi Stanislav,
> >> >>>>>
> >> >>>>> Thanks for the KIP.
> >> >>>>>
> >> >>>>> As as user, "FaultyRecordException" seems a little vague.  What's
> >> >>> faulty
> >> >>>>> about it?  Is it too big?  Does it not match the schema of the
> other
> >> >>>>> records?  Was it not found?
> >> >>>>>
> >> >>>>> Of course, we know that the specific problem is with
> [de]serializing
> >> >>> the
> >> >>>>> record, not with any of those those things.  So we should choose a
> >> >> name
> >> >>>>> that reflects that serialization is the problem.  How about
> >> >>>>> RecordSerializationException?
> >> >>>>>
> >> >>>>> best,
> >> >>>>> Colin
> >> >>>>>
> >> >>>>>
> >> >>>>> On Thu, Aug 2, 2018, at 15:11, Stanislav Kozlovski wrote:
> >> >>>>>> Hi Jason and Ted,
> >> >>>>>>
> >> >>>>>> @Jason
> >> >>>>>> I did not oppose the idea as I'm not too familiar with Java
> >> >>> conventions.
> >> >>>>> I
> >> >>>>>> agree it is a non-ideal way for the user to catch the exception
> so
> >> >> I
> >> >>>>>> changed it back.
> >> >>>>>>
> >> >>>>>> I've updated the KIP and PR with the latest changes. Now, there
> is
> >> >>> only
> >> >>>>> one
> >> >>>>>> public exception `FaultyRecordException` which is thrown to the
> >> >> user
> >> >>> in
> >> >>>>>> both scenarios (corrupt record and deserialization error).
> >> >>>>>> Please take a look again.
> >> >>>>>>
> >> >>>>>> Best,
> >> >>>>>> Stanislav
> >> >>>>>>
> >> >>>>>> On Thu, Aug 2, 2018 at 5:25 PM Jason Gustafson <
> ja...@confluent.io
> >> >>>
> >> >>>>> wrote:
> >> >>>>>>
> >> >>>>>>> Hey Stanislav,
> >> >>>>>>>
> >> >>>>>>> I should have noticed it earlier from your example, but I just
> >> >>> realized
> >> >>>>>>> that interfaces don't mix well with exceptions. There is no way
> >> >> to
> >> >>>>> catch
> >> >>>>>>> the interface type, which means you have to depend on instanceof
> >> >>>>> checks,
> >> >>>>>>> which is not very conventional. At the moment, we raise
> >> >>>>>>> SerializationException if there is a problem parsing the error,
> >> >>> and we
> >> >>>>>>> raise KafkaException if the record fails its CRC check. Since
> >> >>>>>>> SerializationException extends KafkaExeption, it seems like we
> >> >> can
> >> >>>>> handle
> >> >>>>>>> both cases in a compatible way by handling both cases with a
> >> >> single
> >> >>>>> type
> >> >>>>>>> that extends SerializationException. Maybe something like
> >> >>>>>>> RecordDeserializationException?
> >> >>>>>>>
> >> >>>>>>> Thanks,
> >> >>>>>>> Jason
> >> >>>>>>>
> >> >>>>>>> On Thu, Aug 2, 2018 at 5:45 AM, Ted Yu <yuzhih...@gmail.com>
> >> >>> wrote:
> >> >>>>>>>
> >> >>>>>>>> +1
> >> >>>>>>>> -------- Original message --------From: Stanislav Kozlovski <
> >> >>>>>>>> stanis...@confluent.io> Date: 8/2/18  2:41 AM  (GMT-08:00) To:
> >> >>>>>>>> dev@kafka.apache.org Subject: [VOTE] KIP-334 Include
> >> >> partitions
> >> >>> in
> >> >>>>>>>> exceptions raised during consumer record
> >> >>> deserialization/validation
> >> >>>>>>>> Hey everybody,
> >> >>>>>>>>
> >> >>>>>>>> I'd like to start a vote thread for KIP-334 Include partitions
> >> >> in
> >> >>>>>>>> exceptions raised during consumer record
> >> >>> deserialization/validation
> >> >>>>>>>> <
> >> >>>>>>>
> >> >>>>> https://cwiki.apache.org/confluence/pages/viewpage.action?
> >> >>> pageId=87297793
> >> >>>>>>>>>
> >> >>>>>>>>
> >> >>>>>>>> --
> >> >>>>>>>> Best,
> >> >>>>>>>> Stanislav
> >> >>>>>>>>
> >> >>>>>>>
> >> >>>>>>
> >> >>>>>>
> >> >>>>>> --
> >> >>>>>> Best,
> >> >>>>>> Stanislav
> >> >>>>>
> >> >>>>
> >> >>>>
> >> >>>> --
> >> >>>> Best,
> >> >>>> Stanislav
> >> >>>
> >> >>
> >> >
> >> >
> >>
> >>
>


-- 
Best,
Stanislav

Reply via email to