[ https://issues.apache.org/jira/browse/FLINK-36682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ruan Hang updated FLINK-36682:
------------------------------
    Fix Version/s: cdc-3.3.1
                       (was: cdc-3.3.0)

> Add split assign strategy to avoid OOM error in TaskManager
> -----------------------------------------------------------
>
>                 Key: FLINK-36682
>                 URL: https://issues.apache.org/jira/browse/FLINK-36682
>             Project: Flink
>          Issue Type: Bug
>          Components: Flink CDC
>    Affects Versions: cdc-3.3.0
>            Reporter: Yanquan Lv
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: cdc-3.3.1
>
>
> During the snapshot reading phase, we split the table into chunks and assign
> them to the split readers in the TaskManager.
> With even chunk splitting, the chunks are assigned in ascending order. For
> example, a table whose primary key is id may be split into chunks like
> [-∞, 10000), [10000, 20000), [20000, 30000), ... [1500000, +∞). However,
> during the snapshot reading phase more records may be inserted, so the
> maximum id can grow much higher. Because the last chunk has no upper bound,
> the last split may then need to fetch too many records, for example the
> records in range [1500000, 3000000], which can cause the TaskManager to run
> out of memory.
> So I propose adding a strategy that lets users configure how splits are
> assigned, and by default we can send the last split to the split reader
> first.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
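
The assignment order proposed in the issue can be illustrated with a minimal sketch. This is not the Flink CDC implementation: the function names and the strategy names "ASCENDING" and "LAST_FIRST" are hypothetical and chosen only for illustration. Chunks are modeled as half-open key ranges, with None standing in for -∞ or +∞.

```python
from typing import List, Optional, Tuple

# A chunk is a half-open key range [start, end); None means unbounded.
Chunk = Tuple[Optional[int], Optional[int]]


def even_chunks(first_bound: int, last_bound: int, size: int) -> List[Chunk]:
    """Evenly split the key space, leaving the first and last chunks unbounded."""
    bounds = list(range(first_bound, last_bound + 1, size))
    edges: List[Optional[int]] = [None, *bounds, None]
    return list(zip(edges[:-1], edges[1:]))


def assignment_order(chunks: List[Chunk], strategy: str = "LAST_FIRST") -> List[Chunk]:
    """Return the chunks in the order they would be handed to split readers.

    "ASCENDING" and "LAST_FIRST" are hypothetical strategy names.
    """
    if strategy == "ASCENDING":
        return list(chunks)
    if strategy == "LAST_FIRST":
        # Hand out the unbounded tail chunk first, before concurrent
        # inserts have had much time to grow it.
        return chunks[-1:] + chunks[:-1]
    raise ValueError(f"unknown strategy: {strategy}")


# The example from the issue: chunks of 10000 ids up to 1500000.
chunks = even_chunks(10000, 1500000, 10000)
order = assignment_order(chunks)
```

With the assumed "LAST_FIRST" default, order[0] is the open-ended chunk starting at 1500000 and the remaining chunks follow in ascending order, while "ASCENDING" reproduces the current behavior described in the issue.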