One difference between the two cases is that we can't generally trust the offset of a corrupt message. Which offset were we planning to use in the exception? Maybe it should be either the fetch offset or one plus the last consumed offset? I think I'm with Colin in preferring RecordDeserializationException for both cases if possible. For one thing, that makes the behavior consistent whether or not `check.crcs` is enabled.
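[Editor's note: the single-exception idea discussed above could be sketched as follows. This is a minimal, hypothetical sketch, not the final KIP-334 API: the class names, fields, and the stand-in SerializationException (which mimics org.apache.kafka.common.errors.SerializationException) are all illustrative.]

```java
// Stand-in for org.apache.kafka.common.errors.SerializationException,
// so the sketch compiles without the kafka-clients jar.
class SerializationException extends RuntimeException {
    SerializationException(String message) { super(message); }
}

// One type covers both corrupt records and deserialization failures,
// and carries the partition/offset a caller would need to skip the record.
class RecordDeserializationException extends SerializationException {
    private final String topic;
    private final int partition;
    private final long offset; // e.g. the fetch offset, per the question above

    RecordDeserializationException(String topic, int partition, long offset, String message) {
        super(message);
        this.topic = topic;
        this.partition = partition;
        this.offset = offset;
    }

    String topic() { return topic; }
    int partition() { return partition; }
    long offset() { return offset; }
}

public class Demo {
    public static void main(String[] args) {
        try {
            throw new RecordDeserializationException("events", 3, 42L, "corrupt record");
        } catch (RecordDeserializationException e) {
            // New code can recover the position of the bad record; existing
            // code that catches SerializationException keeps working, since
            // the new type is a subclass.
            System.out.println(e.topic() + "-" + e.partition() + "@" + e.offset());
        }
    }
}
```

A consumer could use the reported partition and offset to seek past the bad record and keep consuming.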
-Jason

On Tue, Aug 7, 2018 at 11:17 AM, Colin McCabe <cmcc...@apache.org> wrote:
> Hi Stanislav,
>
> On Sat, Aug 4, 2018, at 10:44, Stanislav Kozlovski wrote:
> > Hey Colin,
> >
> > It may be a bit vague, but keep in mind we also raise the exception in the
> > case where the record is corrupted.
> > We discussed with Jason offline that message corruption most often prevents
> > deserialization itself, and that may be enough of an argument to raise
> > `RecordDeserializationException` in the case of a corrupt record. I
> > personally find that misleading.
>
> Hmm. I think that by definition, corrupt records are records that can't
> be deserialized. There's no difference between the two cases -- if (and
> only if) the message is corrupt, it can't be deserialized. If (and only
> if) it can't be deserialized, it is corrupt.
>
> > In the end, I think it might be worth it to have a bit of a
> > wider-encompassing `FaultyRecordException` (or even
> > `UnconsumableRecordException`) which would allow users to skip over the
> > record.
>
> If you are inheriting from SerializationException, your derived class
> should also be a kind of serialization exception, not something more
> general.
>
> > We could then potentially have more specific exceptions (e.g.
> > deserialization) inherit from it, but I'm not sure if we should.
> > A case for a more general exception which provides access to the
> > partition/offset is future backwards-compatibility. If there is ever a new
> > scenario that would make the user need to skip a specific record, there
> > would already be such an exception, and this will provide some
> > backward-compatibility with older clients.
>
> Hmm. Can you think of any new scenarios that would make Kafka force the
> user to skip a specific record? Perhaps one scenario is if records are
> lost but we don't know how many.
>
> If we really want to have something more general, perhaps we need
> something like LostDataException. But in that case, we can't inherit from
> SerializationException, since lost data is more general than serialization
> issues.
>
> best,
> Colin
>
> > Best,
> > Stanislav
> >
> > On Sat, Aug 4, 2018 at 12:23 AM Colin McCabe <cmcc...@apache.org> wrote:
> >
> > > Hi Stanislav,
> > >
> > > Thanks for the KIP.
> > >
> > > As a user, "FaultyRecordException" seems a little vague. What's faulty
> > > about it? Is it too big? Does it not match the schema of the other
> > > records? Was it not found?
> > >
> > > Of course, we know that the specific problem is with [de]serializing the
> > > record, not with any of those things. So we should choose a name that
> > > reflects that serialization is the problem. How about
> > > RecordSerializationException?
> > >
> > > best,
> > > Colin
> > >
> > > On Thu, Aug 2, 2018, at 15:11, Stanislav Kozlovski wrote:
> > > > Hi Jason and Ted,
> > > >
> > > > @Jason
> > > > I did not oppose the idea, as I'm not too familiar with Java conventions.
> > > > I agree it is a non-ideal way for the user to catch the exception, so I
> > > > changed it back.
> > > >
> > > > I've updated the KIP and PR with the latest changes. Now, there is only
> > > > one public exception, `FaultyRecordException`, which is thrown to the
> > > > user in both scenarios (corrupt record and deserialization error).
> > > > Please take a look again.
> > > >
> > > > Best,
> > > > Stanislav
> > > >
> > > > On Thu, Aug 2, 2018 at 5:25 PM Jason Gustafson <ja...@confluent.io> wrote:
> > > >
> > > > > Hey Stanislav,
> > > > >
> > > > > I should have noticed it earlier from your example, but I just realized
> > > > > that interfaces don't mix well with exceptions. There is no way to catch
> > > > > the interface type, which means you have to depend on instanceof checks,
> > > > > which is not very conventional. At the moment, we raise
> > > > > SerializationException if there is a problem parsing the record, and we
> > > > > raise KafkaException if the record fails its CRC check. Since
> > > > > SerializationException extends KafkaException, it seems like we can
> > > > > handle both cases in a compatible way with a single type that extends
> > > > > SerializationException. Maybe something like
> > > > > RecordDeserializationException?
> > > > >
> > > > > Thanks,
> > > > > Jason
> > > > >
> > > > > On Thu, Aug 2, 2018 at 5:45 AM, Ted Yu <yuzhih...@gmail.com> wrote:
> > > > >
> > > > > > +1
> > > > > > -------- Original message --------
> > > > > > From: Stanislav Kozlovski <stanis...@confluent.io>
> > > > > > Date: 8/2/18 2:41 AM (GMT-08:00)
> > > > > > To: dev@kafka.apache.org
> > > > > > Subject: [VOTE] KIP-334 Include partitions in exceptions raised during
> > > > > > consumer record deserialization/validation
> > > > > >
> > > > > > Hey everybody,
> > > > > >
> > > > > > I'd like to start a vote thread for KIP-334 Include partitions in
> > > > > > exceptions raised during consumer record deserialization/validation
> > > > > > <https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=87297793>
> > > > > >
> > > > > > --
> > > > > > Best,
> > > > > > Stanislav
> > > >
> > > > --
> > > > Best,
> > > > Stanislav
> >
> > --
> > Best,
> > Stanislav
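[Editor's note: Jason's observation in the quoted thread, that an interface type cannot appear in a catch clause and therefore forces instanceof checks, can be illustrated with a short sketch. All names here are hypothetical, not the KIP-334 API.]

```java
// A Java catch clause must name a Throwable subclass, so
// `catch (PartitionAware e)` would not compile -- only the
// concrete (or base) exception class can be caught.
interface PartitionAware {
    long offset();
}

class FaultyRecordException extends RuntimeException implements PartitionAware {
    private final long offset;
    FaultyRecordException(long offset) { this.offset = offset; }
    public long offset() { return offset; }
}

public class CatchDemo {
    public static void main(String[] args) {
        try {
            throw new FaultyRecordException(7L);
        } catch (RuntimeException e) {
            // With an interface-typed design, callers are forced into
            // instanceof checks like this one -- the unconventional pattern
            // Jason describes. Catching a shared base class avoids it.
            if (e instanceof PartitionAware) {
                System.out.println("offset=" + ((PartitionAware) e).offset());
            }
        }
    }
}
```

This is why a single concrete class extending SerializationException is the more conventional choice: it can be caught directly, and existing handlers of the base class keep working.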