Yohei Yoshimuta created FLINK-36945:
---------------------------------------

             Summary: MySQL CDC internal schema representation becomes out of 
sync with the real database schema when restarting a job
                 Key: FLINK-36945
                 URL: https://issues.apache.org/jira/browse/FLINK-36945
             Project: Flink
          Issue Type: Bug
          Components: Flink CDC
            Reporter: Yohei Yoshimuta


[The Vitess schema migration 
tool|https://vitess.io/docs/user-guides/schema-changes/ddl-strategies/] uses 
`RENAME TABLE` to perform schema changes.

However, the MySQL CDC connector does not account for these renames, so the 
schema history topic in Debezium becomes stale. A running job is not affected 
immediately, but it can no longer restart successfully from a checkpoint and 
fails with the following error:

```
Caused by: com.ververica.cdc.connectors.shaded.org.apache.kafka.connect.errors.ConnectException: Data row is smaller than a column index, internal schema representation is probably out of sync with real database schema
    at io.debezium.relational.TableSchemaBuilder.validateIncomingRowToInternalMetadata(TableSchemaBuilder.java:254)
    at io.debezium.relational.TableSchemaBuilder.lambda$createValueGenerator$5(TableSchemaBuilder.java:283)
    at io.debezium.relational.TableSchema.valueFromColumnData(TableSchema.java:141)
    at io.debezium.relational.RelationalChangeRecordEmitter.emitUpdateRecord(RelationalChangeRecordEmitter.java:139)
    at io.debezium.relational.RelationalChangeRecordEmitter.emitChangeRecords(RelationalChangeRecordEmitter.java:60)
    at io.debezium.pipeline.EventDispatcher.dispatchDataChangeEvent(EventDispatcher.java:209)
    ... 12 more
```

When this happens, the schema history topic has to be rebuilt, and the job 
cannot recover automatically.
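
For readers unfamiliar with the error, the check that raises it can be pictured 
roughly as follows. This is a toy model, not Debezium's actual code: the class 
`StaleSchemaCheck`, its methods, and the column names are hypothetical, and it 
assumes for illustration that the migration removed a column, so the cached 
column list is longer than the rows now arriving from the binlog.

```java
import java.util.List;

public class StaleSchemaCheck {

    // Simplified stand-in for the row-width validation named in the stack trace.
    // Illustration only; this is not Debezium's implementation.
    static void validateRow(List<String> cachedColumns, Object[] row) {
        for (int position = 0; position < cachedColumns.size(); position++) {
            if (position >= row.length) {
                throw new IllegalStateException(
                        "Data row is smaller than a column index: cached column '"
                                + cachedColumns.get(position)
                                + "' at position " + position
                                + ", but the row has only " + row.length + " values");
            }
        }
    }

    // Returns true when the cached column list fits the incoming row.
    static boolean isInSync(List<String> cachedColumns, Object[] row) {
        try {
            validateRow(cachedColumns, row);
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Hypothetical table: the cached schema still lists a column that
        // the migration (assumed here) removed.
        List<String> staleColumns = List.of("id", "name", "legacy_flag");
        List<String> currentColumns = List.of("id", "name");
        Object[] rowAfterMigration = {42L, "alice"};

        System.out.println(isInSync(staleColumns, rowAfterMigration));   // false
        System.out.println(isInSync(currentColumns, rowAfterMigration)); // true
    }
}
```

The point of the sketch is only that the failure is detected when a row is 
decoded, which may be why a stale schema restored from a checkpoint surfaces 
on the first change event after restart rather than earlier.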

The current workaround is to set `scan.startup.mode` to `specific-offset`, 
which makes the connector pass the `schema_only_recovery` snapshot mode to 
Debezium. However, this requires manual intervention on every occurrence.
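
As a rough sketch of what the workaround looks like in a table definition, the 
snippet below assembles the startup options and prints them as a `WITH` 
fragment. The helper class `SpecificOffsetOptions` is hypothetical, the binlog 
file name and position are placeholders that must be replaced with the offset 
the job should resume from, and the remaining connector options (hostname, 
credentials, table name, and so on) are omitted.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

public class SpecificOffsetOptions {

    // Builds only the startup-related options of the MySQL CDC connector.
    static Map<String, String> startupOptions(String binlogFile, long binlogPos) {
        Map<String, String> options = new LinkedHashMap<>();
        options.put("scan.startup.mode", "specific-offset");
        options.put("scan.startup.specific-offset.file", binlogFile);
        options.put("scan.startup.specific-offset.pos", Long.toString(binlogPos));
        return options;
    }

    // Renders the options as a Flink SQL WITH (...) fragment.
    static String toWithClause(Map<String, String> options) {
        return options.entrySet().stream()
                .map(e -> "  '" + e.getKey() + "' = '" + e.getValue() + "'")
                .collect(Collectors.joining(",\n", "WITH (\n", "\n)"));
    }

    public static void main(String[] args) {
        // Placeholder binlog coordinates; use the real offset to resume from.
        System.out.println(toWithClause(startupOptions("mysql-bin.000003", 4L)));
    }
}
```

The binlog coordinates have to be looked up by hand each time, which is the 
manual step the issue asks to eliminate.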

Examples of potential solutions include:

- Enhancing the schema history mechanism to capture and process `RENAME TABLE` 
events during schema migrations.

- Implementing a fallback mechanism to reconcile schema differences during 
recovery, reducing the need for manual intervention.



