One difference between the two cases is that we can't generally trust the
offset of a corrupt message. Which offset were we planning to use in the
exception? Maybe it should be either the fetch offset or one plus the last
consumed offset? I think I'm with Colin in preferring
RecordDeserializationException for both cases if possible. For one thing,
that makes the behavior consistent whether or not `check.crcs` is enabled.
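
For what it's worth, the caller-side skip-over pattern this enables might look roughly like the sketch below. Every type and accessor here is a hypothetical stand-in (the real API is exactly what this KIP would decide), and the only point is that the exception must carry a usable partition and offset to seek past:

```java
// Hypothetical stand-ins for illustration only -- not the real Kafka client API.
class TopicPartition {
    final String topic;
    final int partition;
    TopicPartition(String topic, int partition) {
        this.topic = topic;
        this.partition = partition;
    }
}

class RecordDeserializationException extends RuntimeException {
    private final TopicPartition tp;
    private final long offset;  // offset of the record that failed
    RecordDeserializationException(TopicPartition tp, long offset, String message) {
        super(message);
        this.tp = tp;
        this.offset = offset;
    }
    TopicPartition topicPartition() { return tp; }
    long offset() { return offset; }
}

class SkipHelper {
    // On failure, the caller would seek to one past the failed record and
    // resume polling -- which is why the offset in the exception must be
    // trustworthy.
    static long nextOffsetAfterFailure(RecordDeserializationException e) {
        return e.offset() + 1;
    }
}
```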

-Jason

On Tue, Aug 7, 2018 at 11:17 AM, Colin McCabe <cmcc...@apache.org> wrote:

> Hi Stanislav,
>
> On Sat, Aug 4, 2018, at 10:44, Stanislav Kozlovski wrote:
> > Hey Colin,
> >
> > It may be a bit vague but keep in mind we also raise the exception in the
> > case where the record is corrupted.
> > We discussed with Jason offline that message corruption most often prevents
> > deserialization itself and that may be enough of an argument to raise
> > `RecordDeserializationException` in the case of a corrupt record. I
> > personally find that misleading.
>
> Hmm.  I think that by definition, corrupt records are records that can't
> be deserialized.  There's no difference between the two cases -- if (and
> only if) the message is corrupt, it can't be deserialized.  If (and only
> if) it can't be deserialized, it is corrupt.
>
> >
> > In the end, I think it might be worth it to have a bit of a
> > wider-encompassing `FaultyRecordException` (or even
> > `UnconsumableRecordException`) which would allow users to skip over the
> > record.
>
> If you are inheriting from SerializationException, your derived class
> should also be a kind of serialization exception.  Not something more
> general.
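
To illustrate the point, a hierarchy along these lines keeps the derived type a genuine serialization exception, so any existing `catch (SerializationException e)` handler still sees it. The class bodies below are simplified stand-ins for the discussion, not the actual Kafka classes:

```java
// Simplified stand-ins mirroring the hierarchy under discussion; the real
// KafkaException and SerializationException live in the Kafka client library.
class KafkaException extends RuntimeException {
    KafkaException(String message) { super(message); }
}

class SerializationException extends KafkaException {
    SerializationException(String message) { super(message); }
}

// A deserialization-specific subclass remains a kind of SerializationException,
// so it is caught by any existing SerializationException handler.
class RecordDeserializationException extends SerializationException {
    private final long offset;
    RecordDeserializationException(long offset, String message) {
        super(message);
        this.offset = offset;
    }
    long offset() { return offset; }
}
```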
>
> > We could then potentially have more specific exceptions (e.g
> > deserialization) inherit that but I'm not sure if we should.
> > A case for a more general exception which provides access to the
> > partition/offset is future backwards-compatibility. If there is ever a new
> > scenario that would make the user need to skip a specific record - there
> > would already be such an exception and this will provide some
> > backward-compatibility with older clients.
>
> Hmm.  Can you think of any new scenarios that would make Kafka force the
> user to skip a specific record?  Perhaps one scenario is if records
> are lost but we don't know how many.
>
> If we really want to have something more general, perhaps we need
> something like LostDataException.  But in that case, we can't inherit from
> SerializationException, since lost data is more general than serialization
> issues.
>
> best,
> Colin
>
>
> >
> > Best,
> > Stanislav
> >
> > On Sat, Aug 4, 2018 at 12:23 AM Colin McCabe <cmcc...@apache.org> wrote:
> >
> > > Hi Stanislav,
> > >
> > > Thanks for the KIP.
> > >
> > > As a user, "FaultyRecordException" seems a little vague.  What's faulty
> > > about it?  Is it too big?  Does it not match the schema of the other
> > > records?  Was it not found?
> > >
> > > Of course, we know that the specific problem is with [de]serializing the
> > > record, not with any of those things.  So we should choose a name
> > > that reflects that serialization is the problem.  How about
> > > RecordSerializationException?
> > >
> > > best,
> > > Colin
> > >
> > >
> > > On Thu, Aug 2, 2018, at 15:11, Stanislav Kozlovski wrote:
> > > > Hi Jason and Ted,
> > > >
> > > > @Jason
> > > > I did not oppose the idea as I'm not too familiar with Java conventions.
> > > > I agree it is a non-ideal way for the user to catch the exception so I
> > > > changed it back.
> > > >
> > > > I've updated the KIP and PR with the latest changes. Now, there is only
> > > > one public exception `FaultyRecordException` which is thrown to the user
> > > > in both scenarios (corrupt record and deserialization error).
> > > > Please take a look again.
> > > >
> > > > Best,
> > > > Stanislav
> > > >
> > > > On Thu, Aug 2, 2018 at 5:25 PM Jason Gustafson <ja...@confluent.io>
> > > > wrote:
> > > >
> > > > > Hey Stanislav,
> > > > >
> > > > > I should have noticed it earlier from your example, but I just realized
> > > > > that interfaces don't mix well with exceptions. There is no way to catch
> > > > > the interface type, which means you have to depend on instanceof checks,
> > > > > which is not very conventional. At the moment, we raise
> > > > > SerializationException if there is a problem parsing the record, and we
> > > > > raise KafkaException if the record fails its CRC check. Since
> > > > > SerializationException extends KafkaException, it seems like we can
> > > > > handle both cases in a compatible way with a single type
> > > > > that extends SerializationException. Maybe something like
> > > > > RecordDeserializationException?
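
The difference can be sketched concretely: an interface cannot appear in a `catch` clause (only `Throwable` subclasses can), whereas a single concrete subclass of `SerializationException` lets one handler cover both failure modes. The classes below are simplified stand-ins for illustration:

```java
// Simplified stand-ins: with a concrete exception class, one catch block
// handles both corruption and deserialization failures. An interface type
// could not be caught directly and would force instanceof checks instead.
class SerializationException extends RuntimeException {
    SerializationException(String message) { super(message); }
}

class RecordDeserializationException extends SerializationException {
    RecordDeserializationException(String message) { super(message); }
}

class Demo {
    static String handle(Runnable poll) {
        try {
            poll.run();
            return "ok";
        } catch (SerializationException e) {
            // Both corrupt-record and deserialization errors land here.
            return "skipped: " + e.getMessage();
        }
    }
}
```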
> > > > >
> > > > > Thanks,
> > > > > Jason
> > > > >
> > > > > On Thu, Aug 2, 2018 at 5:45 AM, Ted Yu <yuzhih...@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > +1
> > > > > > -------- Original message --------
> > > > > > From: Stanislav Kozlovski <stanis...@confluent.io>
> > > > > > Date: 8/2/18  2:41 AM  (GMT-08:00)
> > > > > > To: dev@kafka.apache.org
> > > > > > Subject: [VOTE] KIP-334 Include partitions in exceptions raised
> > > > > > during consumer record deserialization/validation
> > > > > >
> > > > > > Hey everybody,
> > > > > >
> > > > > > I'd like to start a vote thread for KIP-334 Include partitions in
> > > > > > exceptions raised during consumer record deserialization/validation
> > > > > > <https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=87297793>
> > > > > >
> > > > > > --
> > > > > > Best,
> > > > > > Stanislav
> > > > > >
> > > > >
> > > >
> > > >
> > > > --
> > > > Best,
> > > > Stanislav
> > >
> >
> >
> > --
> > Best,
> > Stanislav
>
