kevinwangcs opened a new pull request, #3548:
URL: https://github.com/apache/flink-cdc/pull/3548

   **Problem Description:**
   
   When a table is added to a running job, the Flink CDC MySQL source connector 
emits records that are missing some columns of the newly added table.
   
   **Reproduction Scenario:**
   
   1. Remove a table from a normally running CDC job, then restart the job so 
that it resumes from its previous state.
   2. Add a column to the removed table.
   3. Add the table back to the job. The job keeps running without 
interruption after the table is added, but the synchronized data is missing 
the values of the newly added column.
   
   **Cause Analysis:**
   
   The MySQL CDC Source keeps table schemas in its state. When a table is 
added, the source restores that table's schema from the previous state if one 
exists. Here the stored schema describes the table as it was before the column 
was added, so the source builds downstream records from the stale schema. As a 
result, the records emitted downstream lack the fields for the newly added 
columns.
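
The effect described above can be illustrated with a minimal, self-contained sketch. It is not the connector's code: `StaleSchemaDemo`, `project`, and the column names are hypothetical, and a schema is reduced to a plain list of column names. It only shows how building a record from a schema cached before the column addition silently drops the new column.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class StaleSchemaDemo {

    /** Builds an output record containing only the columns the schema knows about. */
    public static Map<String, Object> project(List<String> schemaColumns, Map<String, Object> row) {
        Map<String, Object> out = new LinkedHashMap<>();
        for (String column : schemaColumns) {
            out.put(column, row.get(column));
        }
        return out;
    }

    public static void main(String[] args) {
        // Hypothetical schema stored in state before the table was removed from the job.
        List<String> cachedSchema = List.of("id", "name");

        // Row read after `email` was added while the table was not being captured.
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("id", 1);
        row.put("name", "a");
        row.put("email", "a@example.com");

        // The stale schema has no `email` column, so that value never reaches the output.
        System.out.println(project(cachedSchema, row));
    }
}
```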
   
   **Proposed Solution:**
   
   When a table is removed from the CDC job, it must also be removed from the 
`MySqlBinlogSplit`, so that no stale schema is restored if the table is added 
back later.
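
A rough sketch of the idea, under stated assumptions: the class `SchemaPruningSketch`, the method `pruneRemovedTables`, and the use of plain strings for table identifiers and schemas are all hypothetical and do not reflect the connector's actual types or this PR's diff. It shows only the pruning step: on restore, schemas for tables that are no longer captured are dropped from the cached map, so a table added back later has its schema read fresh.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

public class SchemaPruningSketch {

    /**
     * Returns a copy of the cached schemas that keeps only tables still captured by the job.
     * The input map is left unmodified.
     */
    public static <S> Map<String, S> pruneRemovedTables(
            Map<String, S> cachedSchemas, Set<String> capturedTables) {
        Map<String, S> kept = new HashMap<>();
        for (Map.Entry<String, S> entry : cachedSchemas.entrySet()) {
            if (capturedTables.contains(entry.getKey())) {
                kept.put(entry.getKey(), entry.getValue());
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Hypothetical state restored from the previous run: two tables with cached schemas.
        Map<String, String> cached = new HashMap<>();
        cached.put("db.orders", "id,amount");
        cached.put("db.users", "id,name");

        // `db.users` was removed from the job configuration before the restart.
        Map<String, String> kept = pruneRemovedTables(cached, Set.of("db.orders"));

        // Only `db.orders` remains; `db.users` has no cached schema to be reused later.
        System.out.println(kept.keySet());
    }
}
```

Returning a new map rather than mutating the restored one keeps the sketch free of side effects; the real change may well update the split in place.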
   


