Hi, we are seeing data loss (dropped records) when running Debezium on Kafka Connect with the Apicurio Schema Registry. Specifically, on multiple occasions we have observed that a single record is dropped when we get the exception below (full stack trace: <https://gist.github.com/twthorn/917bf3cc576f2b486dde04b16a60d681>).
Failed to send HTTP request to endpoint: http://schema-registry.service.prod-us-east-1-dw1.consul:8080/apis/ccompat/v6/subjects/prod.<keyspace>.<table>-key/versions?normalize=false

This exception is raised by the Kafka Connect worker, which receives it from the Confluent schema registry client. It appears to be a transient network blip: afterwards the worker logs no further errors and continues processing data without issue. However, one record is lost each time, and that record was received almost exactly one minute before the exception is logged. We have observed the same behavior, with the same timeline, on different days several weeks apart.

Our key Kafka Connect worker settings (full configs here: <https://gist.github.com/twthorn/78c2ac329a46ce1baa820753daad47dd>):

- producer.batch.size=524288
- producer.linger.ms=100
- producer.acks=-1
- producer.compression.type=snappy
- producer.buffer.memory=268435456
- config.storage.replication.factor=4
- offset.storage.replication.factor=4
- status.storage.replication.factor=4
- scheduled.rebalance.max.delay.ms=180000

Other version info:

- Kafka version 3.8.1
- Confluent version 7.5.2 (e.g., for kafka-schema-registry-client, kafka-schema-registry-converter, etc.)
- Avro version 1.11.4

Questions we have:

- Are there any known issues with the schema registry's interaction with Kafka Connect that can cause data loss?
- If we drop a record, does that mean the offsets stored by the Kafka Connect worker source task are incorrect? I.e., are we committing offsets for data that we have not yet finished sending to Kafka?
- Are there any recommended debug steps to root-cause this issue?

Thank you for the help. For reference, the mitigation and debugging sketches we have drafted so far follow; corrections welcome.
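On the mitigation side, since the failure looks transient, we are considering Kafka Connect's connector-level error handling (KIP-298), which can retry failed operations (including key/value conversion, where the registry call happens) instead of giving up on the first attempt. A minimal sketch for our Debezium connector config; the property names are standard Connect configs, but the values are just our assumptions:

```properties
# Sketch: Kafka Connect connector-level error handling (KIP-298).
# Property names are standard Connect configs; the values are assumptions.
errors.retry.timeout=300000        # retry a failed operation for up to 5 minutes
errors.retry.delay.max.ms=30000    # cap the backoff between retries at 30 seconds
errors.tolerance=none              # the default: fail the task rather than skip records
errors.log.enable=true             # log details of each failed operation
errors.log.include.messages=true   # include record contents in those logs
```

Our understanding is that with errors.tolerance left at none, exhausting the retry timeout fails the task instead of dropping the record, which would at least turn a silent loss into a visible failure.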
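For debugging, one low-risk step we plan to take is raising log levels at runtime through the Connect admin REST API (KIP-495), so the next occurrence is captured in more detail. A sketch, where connect:8083 is a placeholder for one of our workers:

```sh
# Raise log verbosity at runtime via the Connect admin REST API (KIP-495);
# no worker restart needed. "connect:8083" is a placeholder for one of our workers.
curl -s -X PUT -H "Content-Type: application/json" \
  -d '{"level": "DEBUG"}' \
  http://connect:8083/admin/loggers/io.confluent.kafka.schemaregistry.client

# DEBUG on the Connect runtime shows per-record send and offset-commit activity.
curl -s -X PUT -H "Content-Type: application/json" \
  -d '{"level": "DEBUG"}' \
  http://connect:8083/admin/loggers/org.apache.kafka.connect.runtime
```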
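We also drafted a standalone probe that issues the same register call the Avro converter makes, against the ccompat endpoint, to see whether registrations ever stall for about a minute. A sketch only: the registry URL and subject are placeholders, and the client classes come from kafka-schema-registry-client 7.5.2:

```java
import io.confluent.kafka.schemaregistry.avro.AvroSchema;
import io.confluent.kafka.schemaregistry.client.CachedSchemaRegistryClient;
import org.apache.avro.SchemaBuilder;

public class RegistryProbe {
    public static void main(String[] args) throws Exception {
        // Placeholder URL; points at the same ccompat endpoint our converter uses.
        CachedSchemaRegistryClient client = new CachedSchemaRegistryClient(
                "http://schema-registry:8080/apis/ccompat/v6", 100);

        // A throwaway Avro schema under a throwaway subject (placeholder name).
        AvroSchema schema = new AvroSchema(
                SchemaBuilder.record("Probe").fields().requiredString("id").endRecord());

        // Time the register call; a ~60 s stall here would match our incident timeline.
        long start = System.nanoTime();
        int id = client.register("probe-subject-key", schema);
        System.out.printf("registered id=%d in %d ms%n", id,
                (System.nanoTime() - start) / 1_000_000);
    }
}
```

Creating a fresh client per attempt (the cached client memoizes successful registrations) and running this periodically from the Connect worker hosts might tell us whether the blip is in DNS resolution of the .consul name, connection setup, or the registry itself.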