zhangdingxin created FLINK-35237:
------------------------------------

             Summary: Allow Custom HashFunction in PrePartitionOperator for Flink Sink Customization
                 Key: FLINK-35237
                 URL: https://issues.apache.org/jira/browse/FLINK-35237
             Project: Flink
          Issue Type: Improvement
          Components: Flink CDC
            Reporter: zhangdingxin

The {{PrePartitionOperator}} currently supports only a fixed {{HashFunction}} ({{org.apache.flink.cdc.runtime.partitioning.PrePartitionOperator.HashFunction}}). This prevents Sink implementations from customizing the partitioning logic for {{DataChangeEvent}}s. For partitioned tables, for example, it would be useful to hash on partition keys, to hash on table names, or to reuse the database engine's internal primary-key hash function (as with the MaxCompute DataSink).

Users who need such custom partitioning logic currently have to implement their own partition operator, which undermines the usefulness of {{PrePartitionOperator}}.

To address this limitation, {{PrePartitionOperator}} should support user-specified custom {{HashFunction}}s ({{Function<DataChangeEvent, Integer>}}). One possible solution is a mechanism analogous to the {{DataSink}} interface, in which the class path of a {{HashFunctionFactory}} is specified in the configuration file. This would let users tailor the partitioning strategy to the needs of their application.
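
To make the proposal concrete, here is a minimal sketch of how a configurable hash function might be wired up. Everything in it is an assumption for illustration: {{DataChangeEvent}} is a simplified stand-in and not the real Flink CDC class, the shape of {{HashFunctionFactory}} is hypothetical, and {{loadFactory}} and {{selectSubtask}} are invented helper names, not existing Flink CDC APIs.

```java
import java.util.Objects;
import java.util.function.Function;

// Hypothetical sketch: none of these types are the real Flink CDC classes.
class CustomHashSketch {

    /** Simplified stand-in for Flink CDC's DataChangeEvent (illustration only). */
    static final class DataChangeEvent {
        final String tableName;
        final String partitionKey;
        final String primaryKey;

        DataChangeEvent(String tableName, String partitionKey, String primaryKey) {
            this.tableName = tableName;
            this.partitionKey = partitionKey;
            this.primaryKey = primaryKey;
        }
    }

    /** Hypothetical factory that a sink would implement to supply its own hash function. */
    interface HashFunctionFactory {
        Function<DataChangeEvent, Integer> create();
    }

    /** Example factory: hash on table name and partition key, ignoring the primary key. */
    static final class PartitionKeyHashFunctionFactory implements HashFunctionFactory {
        public PartitionKeyHashFunctionFactory() {}

        @Override
        public Function<DataChangeEvent, Integer> create() {
            return event -> Objects.hash(event.tableName, event.partitionKey);
        }
    }

    /** Instantiates a factory from a class name, as a configuration entry might supply it. */
    static HashFunctionFactory loadFactory(String className) {
        try {
            Class<?> clazz = Class.forName(className);
            return (HashFunctionFactory) clazz.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot load hash function factory: " + className, e);
        }
    }

    /** Maps a hash to a downstream subtask index; floorMod keeps negative hashes in range. */
    static int selectSubtask(
            Function<DataChangeEvent, Integer> hashFunction, DataChangeEvent event, int parallelism) {
        return Math.floorMod(hashFunction.apply(event), parallelism);
    }
}
```

With a factory like this, all events of one table partition are routed to the same subtask regardless of primary key. A sink that needs the engine's own primary-key hash would supply a different factory instead of replacing the whole operator.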