FangYongs opened a new pull request, #522: URL: https://github.com/apache/flink-table-store/pull/522
Currently, the sink operator in Flink shuffles data by bucket id only, which causes data skew when a table has only 1 bucket but multiple partitions. This PR adds support for shuffling data by both bucket id and partition when `sink.shuffle-by-partition.enable` is set.

The main changes are:
1. Added the config option `sink.shuffle-by-partition.enable` to support shuffling data by partition
2. Added `PartitionComputer` to get the partition from row data
3. Added shuffling data by partition in `BucketStreamPartitioner`

The main tests are:
1. Added `FileStoreShuffleBucketTest` to shuffle data by bucket and partition
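
To illustrate the skew described above, here is a minimal sketch of the two channel-selection strategies. It is not the code from this PR: the class `ShuffleSketch`, both method names, and the use of a plain `String` as the partition value are hypothetical and chosen only for illustration. The real `BucketStreamPartitioner` and `PartitionComputer` may compute the channel differently.

```java
import java.util.Objects;

// Hypothetical sketch, not the PR's implementation.
class ShuffleSketch {

    // Shuffle by bucket id only: with a single bucket (id 0), every record
    // goes to the same channel regardless of its partition.
    static int channelByBucket(int bucket, int numChannels) {
        return Math.floorMod(bucket, numChannels);
    }

    // Shuffle by partition and bucket id: records from different partitions
    // can land on different channels even when they share a bucket id.
    // Assumption: the partition is represented here as a String.
    static int channelByPartitionAndBucket(String partition, int bucket, int numChannels) {
        return Math.floorMod(Objects.hash(partition, bucket), numChannels);
    }

    public static void main(String[] args) {
        int numChannels = 4;
        for (String partition : new String[] {"a", "b", "c", "d"}) {
            System.out.println(
                    partition
                            + ": bucket-only channel = " + channelByBucket(0, numChannels)
                            + ", partition+bucket channel = "
                            + channelByPartitionAndBucket(partition, 0, numChannels));
        }
    }
}
```

`Math.floorMod` keeps the channel index non-negative even when the combined hash is negative, which a plain `%` would not guarantee.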