Hi Chris,

Thanks a lot for your comments.
1. The complexity comes from maintaining an additional topic and connector, rather than configuring them. Users have to spend extra time and money maintaining the additional connectors. Imagine a case where a user has 3 topics consumed by S3, HDFS, and JDBC sink connectors respectively. To put broken records where they should go, the user has to maintain 3 more connectors to consume the three DLQs. This new option gives users the choice to maintain only half as many connectors, while still having broken records stored in each destination system.

2. This is a great question. I updated my KIP to reflect the most recent plan. We can add a new method to SinkTask called "putBrokenRecord", so that sink connectors are able to differentiate between well-formed records and broken records. The default implementation of this method should throw an error, to indicate that the connector does not support broken-record handling yet.

3. I think the schema should be Optional Byte Array, in order to handle all possibilities. But I'm open to suggestions on that.

4. Yes, this rejected alternative makes sense to me. I'll put it into the KIP. Compared with this alternative, the point of this proposal is to save the effort of maintaining twice as many connectors as necessary.

Thanks again. Looking forward to the discussion!

Best,
Zihan

On 2020/04/13 22:35:56, Christopher Egerton <c...@confluent.io> wrote:
> Hi Zihan,
>
> Thanks for the KIP! I have some questions that I'm hoping we can address
> to help better understand the motivation for this proposal.
>
> 1. In the "Motivation" section it's written that "If users want to store
> their broken records, they have to config a broken record queue, which is
> too much work for them in some cases." Could you elaborate on what makes
> this a lot of work?
> Ideally, users should be able to configure the dead
> letter queue by specifying a value for the
> "errors.deadletterqueue.topic.name" property in their sink connector
> config; this doesn't seem like a lot of work on the surface.
>
> 2. If the "errors.tolerance" property is set to "continue", would sink
> connectors be able to differentiate between well-formed records whose
> successfully-deserialized contents are byte arrays and malformed records
> whose contents are the still-serialized byte arrays of the Kafka message
> from which they came?
>
> 3. I think it's somewhat implied by the KIP, but it'd be nice to see what
> the schema for a malformed record would be. Null? Byte array? Optional
> byte array?
>
> 4. This is somewhat covered by the first question, but it seems worth
> pointing out that this exact functionality can already be achieved by
> using features already provided by the framework. Configure your
> connector to send malformed records to a dead letter queue topic, and
> configure a separate connector to consume from that dead letter queue
> topic, use the ByteArrayConverter to deserialize records, and send those
> records to the destination sink.
> It'd be nice if this were called out in the "Rejected
> Alternatives" section with a reason why the changes proposed in the KIP
> are preferable, especially since it may still work as a viable workaround
> for users who are working on older versions of the Connect framework.
>
> Looking forward to the discussion!
>
> Cheers,
>
> Chris
>
> On Tue, Mar 24, 2020 at 11:50 AM Zihan Li <li...@umich.edu> wrote:
>
> > Hi,
> >
> > I just want to re-up this discussion thread about KIP-582 Add a
> > "continue" option for Kafka Connect error handling.
> >
> > Wiki page: https://cwiki.apache.org/confluence/x/XRvcC
> >
> > JIRA: https://issues.apache.org/jira/browse/KAFKA-9740
> >
> > Please share your thoughts about adding this new error handling option
> > to Kafka Connect.
> >
> > Best,
> > Zihan
> >
> > > On Mar 18, 2020, at 12:55 PM, Zihan Li <li...@umich.edu> wrote:
> > >
> > > Hi all,
> > >
> > > I'd like to use this thread to discuss KIP-582 Add a "continue"
> > > option for Kafka Connect error handling, please see detail at:
> > > https://cwiki.apache.org/confluence/x/XRvcC
> > >
> > > Best,
> > > Zihan Li
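P.S. To make the proposed SinkTask change concrete, here is a rough sketch of what the "putBrokenRecord" default could look like. This is only my reading of the plan, not the final API: the method name comes from this discussion, and the minimal SinkRecord stand-in below is purely illustrative (the real class lives in org.apache.kafka.connect.sink).

```java
import java.util.Collection;
import java.util.List;

public class PutBrokenRecordSketch {

    // Minimal stand-in for org.apache.kafka.connect.sink.SinkRecord,
    // for illustration only.
    record SinkRecord(String topic, byte[] value) {}

    abstract static class SinkTask {
        // Existing contract: deliver well-formed records to the sink.
        abstract void put(Collection<SinkRecord> records);

        // Proposed addition: deliver records that failed conversion.
        // The default throws, so connectors that have not opted in
        // fail fast instead of silently dropping broken records.
        void putBrokenRecord(Collection<SinkRecord> brokenRecords) {
            throw new UnsupportedOperationException(
                "This connector does not support broken-record handling");
        }
    }

    public static void main(String[] args) {
        // A connector that has not overridden putBrokenRecord.
        SinkTask task = new SinkTask() {
            @Override
            void put(Collection<SinkRecord> records) { /* write to sink */ }
        };
        try {
            task.putBrokenRecord(List.of(
                new SinkRecord("events", new byte[] {0x00})));
        } catch (UnsupportedOperationException e) {
            System.out.println("default rejects broken records");
        }
    }
}
```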
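P.P.S. For anyone following along, the workaround Chris describes in point 4 can be sketched with today's configs roughly as follows (connector names, topic names, and the S3 connector class are just examples; the "errors.*" and converter properties are the existing framework ones):

```properties
# --- Connector 1: the real sink; malformed records go to a DLQ topic ---
name=s3-sink
connector.class=io.confluent.connect.s3.S3SinkConnector
topics=events
errors.tolerance=all
errors.deadletterqueue.topic.name=events-dlq

# --- Connector 2 (separate config): drains the DLQ into the same
# --- destination as raw bytes, skipping deserialization entirely ---
name=s3-sink-dlq
connector.class=io.confluent.connect.s3.S3SinkConnector
topics=events-dlq
key.converter=org.apache.kafka.connect.converters.ByteArrayConverter
value.converter=org.apache.kafka.connect.converters.ByteArrayConverter
```

The proposal in the KIP would collapse these two connectors into one, which is exactly the maintenance saving described in answer 1 above.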