Yanquan Lv created FLINK-36682:
----------------------------------

             Summary: Add split assign strategy to avoid OOM error in TaskManager
                 Key: FLINK-36682
                 URL: https://issues.apache.org/jira/browse/FLINK-36682
             Project: Flink
          Issue Type: Bug
          Components: Flink CDC
    Affects Versions: cdc-3.3.0
            Reporter: Yanquan Lv
             Fix For: cdc-3.3.0

During the snapshot reading phase, we split each table into chunks and assign them to the split reader in the TaskManager. With even chunk splitting, the chunks are assigned in ascending order. For example, a table whose primary key is id may be split into chunks such as [-∞, 10000), [10000, 20000), [20000, 30000), ..., [1500000, +∞).

However, more records may be inserted while the snapshot is being read, so id can grow well beyond the boundaries computed at split time. Because the last chunk has no upper bound and is assigned last, it may then need to fetch far too many records, for example the records in the range [1500000, 3000000], which can cause the TaskManager to run out of memory.

So I propose adding a strategy that lets users configure how splits are assigned. By default, we can send the last split to the split reader first, so that the unbounded chunk is read before it grows further.
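
To make the proposal concrete, here is a minimal sketch of the two assignment orders. It is an illustration only: SplitAssignSketch, ChunkRange, AssignStrategy, and the strategy names ASCENDING_ORDER and LAST_SPLIT_FIRST are hypothetical names chosen for this example, not existing Flink CDC classes or options.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only; none of these names come from Flink CDC. */
final class SplitAssignSketch {

    /** Hypothetical values for the proposed, user-configurable strategy. */
    enum AssignStrategy { ASCENDING_ORDER, LAST_SPLIT_FIRST }

    /** Half-open key range [start, end); null means unbounded on that side. */
    static final class ChunkRange {
        final Long start;
        final Long end;

        ChunkRange(Long start, Long end) {
            this.start = start;
            this.end = end;
        }

        @Override
        public String toString() {
            return "[" + (start == null ? "-inf" : start) + ", "
                    + (end == null ? "+inf" : end) + ")";
        }
    }

    /** Cuts the key space at firstBoundary, firstBoundary + chunkSize, ... up to lastBoundary. */
    static List<ChunkRange> splitEvenly(long firstBoundary, long lastBoundary, long chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<ChunkRange> chunks = new ArrayList<>();
        Long previous = null;
        for (long boundary = firstBoundary; boundary <= lastBoundary; boundary += chunkSize) {
            chunks.add(new ChunkRange(previous, boundary));
            previous = boundary;
        }
        // The final chunk has no upper bound, so rows inserted later all land in it.
        chunks.add(new ChunkRange(previous, null));
        return chunks;
    }

    /** Returns a new list in assignment order; the input list is left unchanged. */
    static List<ChunkRange> assignmentOrder(List<ChunkRange> chunks, AssignStrategy strategy) {
        List<ChunkRange> ordered = new ArrayList<>(chunks);
        if (strategy == AssignStrategy.LAST_SPLIT_FIRST && ordered.size() > 1) {
            // Move the unbounded tail chunk to the front so it is read first.
            ChunkRange last = ordered.remove(ordered.size() - 1);
            ordered.add(0, last);
        }
        return ordered;
    }

    public static void main(String[] args) {
        List<ChunkRange> chunks = splitEvenly(10_000L, 30_000L, 10_000L);
        System.out.println(assignmentOrder(chunks, AssignStrategy.ASCENDING_ORDER));
        System.out.println(assignmentOrder(chunks, AssignStrategy.LAST_SPLIT_FIRST));
    }
}
```

With LAST_SPLIT_FIRST, the unbounded chunk [30000, +inf) moves to the front and the remaining chunks keep their ascending order. A real implementation would apply the same reordering wherever the pending snapshot splits are handed out.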