zhangdingxin created FLINK-35237:
------------------------------------

             Summary: Allow Custom HashFunction in PrePartitionOperator for 
Flink Sink Customization
                 Key: FLINK-35237
                 URL: https://issues.apache.org/jira/browse/FLINK-35237
             Project: Flink
          Issue Type: Improvement
          Components: Flink CDC
            Reporter: zhangdingxin


The {{PrePartitionOperator}} in its current implementation only supports a 
fixed {{HashFunction}} 
({{org.apache.flink.cdc.runtime.partitioning.PrePartitionOperator.HashFunction}}).
This limits the ability of Sink implementations to customize the partitioning 
logic for {{DataChangeEvent}}s. For example, in the case of partitioned 
tables, it would be advantageous to allow hashing based on partition keys, 
hashing according to table names, or using the database engine's internal 
primary key hash functions (such as with MaxCompute DataSink).

When users need such custom partitioning logic, they are forced to 
implement their own partition operator, which undermines the utility of 
{{PrePartitionOperator}}.

To address this limitation, {{PrePartitionOperator}} should support 
user-specified custom {{HashFunction}}s 
({{Function<DataChangeEvent, Integer>}}). One possible solution is a 
mechanism analogous to the {{DataSink}} interface, in which the class path of a 
{{HashFunctionFactory}} is specified in the configuration file. This 
would let users tailor partitioning strategies to their specific application 
needs.
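
As a rough illustration of the idea, a minimal sketch might look like the following. Everything in it is hypothetical and not part of the current Flink CDC API: {{Event}} is a simplified stand-in for {{DataChangeEvent}}, and the {{HashFunctionFactory}} interface, the {{loadFactory}} helper, and the {{targetSubtask}} helper are assumed names chosen only for this example. It assumes the factory is instantiated reflectively from a class name read from configuration, and that the hash value is mapped to a downstream subtask by a non-negative modulo.

```java
import java.util.function.Function;

public class HashFunctionSketch {

    /** Simplified stand-in for DataChangeEvent; carries only what this sketch needs. */
    public static final class Event {
        public final String tableId;
        public final String partitionKey;

        public Event(String tableId, String partitionKey) {
            this.tableId = tableId;
            this.partitionKey = partitionKey;
        }
    }

    /** Hypothetical factory that a sink could provide to supply its own hash function. */
    public interface HashFunctionFactory {
        Function<Event, Integer> create();
    }

    /** Example factory: hash on the table name only, so each table maps to one subtask. */
    public static class TableNameHashFunctionFactory implements HashFunctionFactory {
        public TableNameHashFunctionFactory() {}

        @Override
        public Function<Event, Integer> create() {
            return event -> event.tableId.hashCode();
        }
    }

    /** Instantiates a factory from a class name, e.g. one read from the configuration file. */
    public static HashFunctionFactory loadFactory(String className) {
        try {
            return Class.forName(className)
                    .asSubclass(HashFunctionFactory.class)
                    .getDeclaredConstructor()
                    .newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("Cannot load hash function factory: " + className, e);
        }
    }

    /** Maps a hash value to a subtask index in [0, parallelism), even for negative hashes. */
    public static int targetSubtask(Function<Event, Integer> hashFunction, Event event, int parallelism) {
        return Math.floorMod(hashFunction.apply(event), parallelism);
    }

    public static void main(String[] args) {
        HashFunctionFactory factory = loadFactory(TableNameHashFunctionFactory.class.getName());
        Function<Event, Integer> hashFunction = factory.create();
        Event event = new Event("db.orders", "2024-01");
        System.out.println("subtask = " + targetSubtask(hashFunction, event, 4));
    }
}
```

With a factory like this, a sink that needs partition-key or engine-specific hashing would only have to supply a different {{HashFunctionFactory}} implementation rather than replace the whole operator.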



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
