zhaoruifeng01 opened a new pull request, #4338: URL: https://github.com/apache/flink-cdc/pull/4338
This closes [FLINK-39321](https://issues.apache.org/jira/browse/FLINK-39321).

## What is the purpose of the change

This pull request fixes a bug where Flink CDC fails to write to Paimon dynamic bucket tables when `dynamic-bucket.initial-buckets=1` and the Flink parallelism is greater than 1.

The failure is caused by a parameter mismatch between the routing calculation in `PaimonHashFunction` (via `RowAssignerChannelComputer`) and the assigner validation in `BucketAssignOperator` (via `HashBucketAssigner`). When `numAssigners=1`, all data should route to assigner 0, but `BucketAssignOperator` creates `HashBucketAssigner` instances with a different `assignId` for each subtask, so validation fails.

## Brief change log

- Modified `BucketAssignOperator.java` to set `assignId=0` and `numChannels=1` when `minAssigners=1`
- This ensures that all subtasks use consistent assigner parameters when `numAssigners=1`
- Aligns with the design principle that all data should route to assigner 0 when `numAssigners=1`

## Verifying this change

Manually verified the change by:

1. Creating a Paimon table in dynamic bucket mode (`bucket=-1`) with `dynamic-bucket.initial-buckets=1`
2. Configuring the Flink CDC pipeline with `parallelism=4`
3. Starting the Flink CDC job to write data to the Paimon table
4. Verifying that data is written successfully without the `IllegalArgumentException`
5. Verifying that all data routes to subtask 0 when `numAssigners=1`

## Does this pull request potentially affect one of the following parts

- Dependencies (does it add or upgrade a dependency): no
- The public API, i.e., is any changed class annotated with `@Public(Evolving)`: no
- The serializers: no
- The runtime per-record code paths (performance sensitive): no
- Anything that affects deployment or recovery: no
- The S3 file system connector: no

## Documentation

- Does this pull request introduce a new feature? no
- If yes, how is the feature documented? not applicable
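The parameter mismatch described above can be illustrated with a small, self-contained model. This is only a sketch under stated assumptions: `AssignerParamsSketch`, `Params`, `computeAssignId`, and the hash formula are hypothetical stand-ins chosen for illustration and are not copied from Paimon or Flink CDC. The sketch models a validating assigner that accepts a record only when the assigner id it computes for that record equals its own `assignId`, and compares per-subtask parameters before and after the change.

```java
public class AssignerParamsSketch {

    /** Hypothetical holder for the parameters one subtask passes to its assigner. */
    static final class Params {
        final int numChannels;
        final int numAssigners;
        final int assignId;

        Params(int numChannels, int numAssigners, int assignId) {
            this.numChannels = numChannels;
            this.numAssigners = numAssigners;
            this.assignId = assignId;
        }
    }

    /** Assumed hash-to-assigner formula, for illustration only. */
    static int computeAssignId(int partitionHash, int keyHash, int numChannels, int numAssigners) {
        int start = Math.abs(partitionHash % numChannels);
        int id = Math.abs(keyHash % numAssigners);
        return (start + id) % numChannels;
    }

    /** Before the change: every subtask uses its own index as assignId. */
    static Params paramsBeforeFix(int subtaskIndex, int parallelism, int numAssigners) {
        return new Params(parallelism, numAssigners, subtaskIndex);
    }

    /** After the change: with a single assigner, every subtask uses assignId=0 and numChannels=1. */
    static Params paramsAfterFix(int subtaskIndex, int parallelism, int numAssigners) {
        if (numAssigners == 1) {
            return new Params(1, 1, 0);
        }
        return new Params(parallelism, numAssigners, subtaskIndex);
    }

    /** Models the validation step: the computed id must equal the assigner's own id. */
    static boolean accepts(Params p, int partitionHash, int keyHash) {
        return computeAssignId(partitionHash, keyHash, p.numChannels, p.numAssigners) == p.assignId;
    }

    /** Counts how many subtasks would accept the given record. */
    static int countAccepting(boolean fixed, int parallelism, int numAssigners,
                              int partitionHash, int keyHash) {
        int accepted = 0;
        for (int subtask = 0; subtask < parallelism; subtask++) {
            Params p = fixed
                    ? paramsAfterFix(subtask, parallelism, numAssigners)
                    : paramsBeforeFix(subtask, parallelism, numAssigners);
            if (accepts(p, partitionHash, keyHash)) {
                accepted++;
            }
        }
        return accepted;
    }

    public static void main(String[] args) {
        int parallelism = 4;
        int numAssigners = 1;
        // Arbitrary sample hashes for a single record.
        int partitionHash = 0;
        int keyHash = 12345;
        System.out.println("accepting subtasks before: "
                + countAccepting(false, parallelism, numAssigners, partitionHash, keyHash));
        System.out.println("accepting subtasks after:  "
                + countAccepting(true, parallelism, numAssigners, partitionHash, keyHash));
    }
}
```

In this model, with parallelism 4 and a single assigner, only the subtask whose `assignId` is 0 accepts the sample record before the change, while all four subtasks hold identical parameters and accept it afterwards. The actual routing and validation logic lives in the classes named in the pull request and may differ in detail.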
