[ https://issues.apache.org/jira/browse/KAFKA-9744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jurgis Pods updated KAFKA-9744:
-------------------------------
Description:
_Note_: This bug report is for CP 5.3.1 / Kafka 2.3.1, but it most likely affects all versions.

We recently changed the namespace of inner records in Avro schemas in the Confluent Schema Registry. Those changes were accepted as backwards-compatible. However, when redeploying the Kafka S3 connectors consuming the relevant topics, we received an error from SchemaProjector.project(), causing the connectors to crash and stop producing data:
{code:java}
org.apache.kafka.connect.errors.SchemaProjectorException: Schema name mismatch. source name: my.example.Record and target name: my.example.sub.Record
{code}
A change of a record's namespace is compatible according to the Schema Registry (which internally uses a check from the Avro library), but not for the Connect API. I would argue that the namespace/package name should not affect compatibility, as it says nothing about the contained data and its schema.

Would it be possible to have a more consistent (and less restrictive) check in the Connect API, so that a namespace change on the producer side can be made confidently, without fear of breaking the consuming connectors? If, on the other hand, you want to be strict about namespaces, then at least the Schema Registry should report a namespace change as incompatible.

was:
_Note_: This bug report is for CP 5.3.1 / Kafka 2.3.1, but it most likely affects all versions.

We recently made a number of backwards-compatible changes to our Avro schemas in the Confluent Schema Registry. Those changes were accepted as backwards-compatible. However, when redeploying the Kafka S3 connectors consuming the relevant topics, we noticed two separate kinds of failures in SchemaProjector.project(), causing the connectors to crash and stop producing data:

1) Changed namespace of a record:
{code:java}
org.apache.kafka.connect.errors.SchemaProjectorException: Schema name mismatch. source name: my.example.Record and target name: my.example.sub.Record
{code}
A change of a record's namespace is compatible according to the Schema Registry, but not for the Connect API. I would argue that the namespace/package name should not affect compatibility, as it says nothing about the contained data and its schema.

2) Change of type from a 1-element union to a primitive field:
{code:java}
Schema type mismatch. source type: STRUCT and target type: STRING
{code}
This happened when changing the corresponding field's Avro schema from
{code:java}
"name": "myfield", "type": ["string"]
{code}
to
{code:java}
"name": "myfield", "type": "string"
{code}
In this case, I am less convinced that those two schemas should be compatible (they are semantically identical - however, a union is not a string). But it is unfortunate that the Schema Registry sees the above change as compatible, while the Connect API does not.

*Summary*: We made two Avro schema changes which were accepted as compatible by the Schema Registry, but were rejected at runtime by the Kafka S3 connectors.

Would it be possible to have a more consistent (and less restrictive) check in the Connect API, so that a schema change on the producer side can be made confidently, without fear of breaking the consuming connectors?
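For reference, a minimal sketch (not part of the original report) that appears to reproduce the name check in the Connect API, using only the public org.apache.kafka.connect.data classes. The record names are taken from the error message above; the class name NamespaceMismatchRepro and the field name myfield are purely illustrative:
{code:java}
import org.apache.kafka.connect.data.Schema;
import org.apache.kafka.connect.data.SchemaBuilder;
import org.apache.kafka.connect.data.SchemaProjector;
import org.apache.kafka.connect.data.Struct;
import org.apache.kafka.connect.errors.SchemaProjectorException;

public class NamespaceMismatchRepro {
    public static void main(String[] args) {
        // Connect schema as derived from the old Avro record (namespace my.example)
        Schema source = SchemaBuilder.struct()
                .name("my.example.Record")
                .field("myfield", Schema.STRING_SCHEMA)
                .build();

        // Connect schema as derived from the new Avro record (namespace my.example.sub);
        // the field layout is identical, only the schema name differs
        Schema target = SchemaBuilder.struct()
                .name("my.example.sub.Record")
                .field("myfield", Schema.STRING_SCHEMA)
                .build();

        Struct record = new Struct(source).put("myfield", "value");

        try {
            SchemaProjector.project(source, record, target);
        } catch (SchemaProjectorException e) {
            // Expected: "Schema name mismatch. source name: my.example.Record
            //            and target name: my.example.sub.Record"
            System.out.println(e.getMessage());
        }
    }
}
{code}
Since the field layouts are identical, the projection fails purely on the schema name comparison, which is where the Avro record's full name (including its namespace) seems to end up after the Avro schema is converted to a Connect schema.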
> SchemaProjector fails to handle backwards-compatible schema change
> ------------------------------------------------------------------
>
>                 Key: KAFKA-9744
>                 URL: https://issues.apache.org/jira/browse/KAFKA-9744
>             Project: Kafka
>          Issue Type: Bug
>          Components: KafkaConnect
>    Affects Versions: 2.3.1
>            Reporter: Jurgis Pods
>            Priority: Major
>

--
This message was sent by Atlassian Jira
(v8.3.4#803005)