Miguel Costa created KAFKA-14699:
------------------------------------

             Summary: Kafka Connect framework errors.tolerance improvement 
                 Key: KAFKA-14699
                 URL: https://issues.apache.org/jira/browse/KAFKA-14699
             Project: Kafka
          Issue Type: Improvement
          Components: KafkaConnect
    Affects Versions: 3.3.2
            Reporter: Miguel Costa


TL;DR: improve errors.tolerance from [none, all] to [none, deserialization, 
transformation, put, all]

 

Hi all, it's my first time requesting an improvement here, so sorry if my 
request is not clear, if it has already been considered and disregarded, or if 
it's incomplete.

I am currently experiencing some issues with Kafka Connect error handling and 
the DLQ setup. It may just be that my setup or my understanding of it is wrong, 
but it leads me to believe that the current options provided by Kafka Connect 
are insufficient.

I'll start with the currently documented behavior:

h4. [errors.tolerance|https://kafka.apache.org/documentation/#sourceconnectorconfigs_errors.tolerance]

Behavior for tolerating errors during connector operation. 'none' is the 
default value and signals that any error will result in an immediate connector 
task failure; 'all' changes the behavior to skip over problematic records.
||Type:|string|
||Default:|none|
||Valid Values:|[none, all]|
||Importance:|medium|
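
For context, a typical sink connector DLQ setup today looks something like the 
sketch below (the topic name and replication factor are illustrative; the keys 
themselves are the existing framework options):

{code}
# Tolerate every failed record and route it to a DLQ topic
errors.tolerance=all
errors.deadletterqueue.topic.name=my-connector-dlq
errors.deadletterqueue.topic.replication.factor=3
# Attach failure context (stage, exception, etc.) as record headers
errors.deadletterqueue.context.headers.enable=true
# Also log failed operations
errors.log.enable=true
{code}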

 

My understanding is that the Kafka Connect framework currently only lets you 
tolerate either all errors or none, and leaves any finer-grained handling to 
the individual connector plugin implementations themselves.

My experience is mainly with Kafka sink connectors.

What I've experienced recently is something that has also been reported as a 
possible improvement on the individual connectors themselves:

[https://github.com/confluentinc/kafka-connect-jdbc/issues/721]
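
As far as I understand, since Kafka 2.6 (KIP-610) a sink connector can already 
do this kind of per-record reporting itself from put() via the 
ErrantRecordReporter API, but every connector has to implement it separately, 
which is exactly what makes a framework-level option attractive. A minimal 
sketch (ExampleSinkTask and writeToTarget are hypothetical names):

{code:java}
import java.util.Collection;
import java.util.Map;

import org.apache.kafka.connect.errors.ConnectException;
import org.apache.kafka.connect.sink.ErrantRecordReporter;
import org.apache.kafka.connect.sink.SinkRecord;
import org.apache.kafka.connect.sink.SinkTask;

public abstract class ExampleSinkTask extends SinkTask {

    private ErrantRecordReporter reporter;

    @Override
    public void start(Map<String, String> props) {
        try {
            // Only available on workers >= 2.6 with error reporting configured.
            reporter = context.errantRecordReporter();
        } catch (NoSuchMethodError | NoClassDefFoundError e) {
            reporter = null; // older worker: no per-record reporting
        }
    }

    @Override
    public void put(Collection<SinkRecord> records) {
        for (SinkRecord record : records) {
            try {
                writeToTarget(record); // hypothetical per-record write
            } catch (Exception e) {
                if (reporter != null) {
                    reporter.report(record, e); // record goes to the DLQ
                } else {
                    throw new ConnectException("Failed record", e); // fail the task
                }
            }
        }
    }

    // Hypothetical: write a single record to the target system.
    protected abstract void writeToTarget(SinkRecord record) throws Exception;
}
{code}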

I think the Kafka Connect framework could provide an option to better control 
the scenarios in which we want records in the DLQ and those in which we want 
the connector to fail.

In my opinion, failures in deserialization (key, header, and value converters) 
or in the transformation chain are good candidates for the DLQ.

Errors in the sink/put stage, on the other hand, should generally not go to 
the DLQ and should instead make the connector fail, because these errors are 
not (or may not be) transient.

To explain better: if I have a connectivity issue or a tablespace issue, it 
makes no sense to skip to the next records and send everything to the DLQ, 
because until the target is up and running smoothly there would be no way to 
continue processing data.

In a JDBC scenario, though, I can also imagine errors such as constraint 
violations that only affect some records; those we would still want in the DLQ 
instead of failing the whole pipeline, which is why I think a configuration 
for the "put" stage should also exist.

Let me know if this is clear, or if any of my understanding is completely 
wrong.

Best regards,

Miguel

 



