[ 
https://issues.apache.org/jira/browse/FLINK-37120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

JunboWang updated FLINK-37120:
------------------------------
    Description: 
When synchronizing a large table, the ending chunk is always executed last, and 
its splitEnd is null. This causes the task to scan too much data, and the 
TaskManager eventually runs out of memory (OOM).
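
One way to read the proposal in the issue title is as a reordering of the 
chunks before they are handed out: the chunk whose end is null goes first, the 
rest keep their order. The sketch below only illustrates that idea. The Chunk 
class and the endingChunkFirst method are hypothetical names made up for this 
example; they are not the MySqlSnapshotSplitAssigner API, and the real fix may 
look quite different.

```java
import java.util.ArrayList;
import java.util.List;

public class EndingChunkFirst {

    /** Hypothetical stand-in for a snapshot split; not the Flink CDC class. */
    public static final class Chunk {
        final String id;
        final Long start; // null means no lower bound (first chunk)
        final Long end;   // null means no upper bound (ending chunk)

        public Chunk(String id, Long start, Long end) {
            this.id = id;
            this.start = start;
            this.end = end;
        }

        boolean isEndingChunk() {
            return end == null;
        }

        @Override
        public String toString() {
            return id;
        }
    }

    /**
     * Returns a new list with the unbounded-end chunk(s) first and all other
     * chunks after them in their original order. The input is not modified.
     */
    public static List<Chunk> endingChunkFirst(List<Chunk> chunks) {
        List<Chunk> ordered = new ArrayList<>(chunks.size());
        for (Chunk c : chunks) {
            if (c.isEndingChunk()) {
                ordered.add(c);
            }
        }
        for (Chunk c : chunks) {
            if (!c.isEndingChunk()) {
                ordered.add(c);
            }
        }
        return ordered;
    }

    public static void main(String[] args) {
        List<Chunk> chunks = new ArrayList<>();
        chunks.add(new Chunk("t:0", null, 100L));
        chunks.add(new Chunk("t:1", 100L, 200L));
        chunks.add(new Chunk("t:2", 200L, null)); // ending chunk, end == null
        System.out.println(endingChunkFirst(chunks)); // prints [t:2, t:0, t:1]
    }
}
```

Reading the open-ended chunk early would presumably bound it by whatever the 
table contains at that moment, rather than by everything written while the 
other chunks were being read. That motivation is an assumption; the issue text 
does not spell it out.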

 

Related log:
{code:java}
2025-01-13T20:52:01.926+0800: 136.022: [Full GC (Allocation Failure) 
2025-01-13T20:52:01.926+0800: 136.022: [Tenured: 1713535K->1713535K(1713536K), 
3.6111578 secs] 2484607K->2482139K(2484608K), [Metaspace: 
89121K->89121K(1130496K)], 3.6113026 secs] [Times: user=3.20 sys=0.40, 
real=3.61 secs]
2025-01-13T20:52:05.555+0800: 139.651: [Full GC (Allocation Failure) 
2025-01-13T20:52:05.555+0800: 139.651: [Tenured: 1713535K->1713535K(1713536K), 
3.9733441 secs] 2484607K->2482375K(2484608K), [Metaspace: 
89133K->89133K(1130496K)], 3.9734511 secs] [Times: user=3.52 sys=0.45, 
real=3.98 secs]
2025-01-13T20:52:09.548+0800: 143.644: [Full GC (Allocation Failure) 
2025-01-13T20:52:09.548+0800: 143.644: [Tenured: 1713535K->1713535K(1713536K), 
3.3805432 secs] 2484607K->2482897K(2484608K), [Metaspace: 
89134K->89134K(1130496K)], 3.3806496 secs] [Times: user=3.36 sys=0.02, 
real=3.38 secs] {code}
{code:java}
2025-01-13 20:49:54,563 INFO 
org.apache.flink.cdc.connectors.mysql.debezium.task.context.StatefulTaskContext 
[] - Starting offset is initialized to {ts_sec=0, file=, pos=0, kind=EARLIEST, 
row=0, event=0}
2025-01-13 20:49:54,631 INFO 
org.apache.flink.cdc.connectors.mysql.debezium.task.MySqlSnapshotSplitReadTask 
[] - Snapshot step 1 - Determining low watermark {ts_sec=0, 
file=mysql-bin.xxxxx, pos=xxxxx, kind=SPECIFIC, gtids=xxxxxxxxx, row=0, 
event=0} for split MySqlSnapshotSplit{tableId=xxxxdb.xxxxx_test_table, 
splitId='xxxxdb.xxxxx_test_table:159959', splitKeyType=[`id` BIGINT NOT NULL], 
splitStart=[1333738235], splitEnd=null, highWatermark=null}
2025-01-13 20:49:54,636 INFO 
org.apache.flink.cdc.connectors.mysql.debezium.task.MySqlSnapshotSplitReadTask 
[] - Snapshot step 2 - Snapshotting data
2025-01-13 20:49:54,636 INFO 
org.apache.flink.cdc.connectors.mysql.debezium.task.MySqlSnapshotSplitReadTask 
[] - Exporting data from split 'xxxxdb.xxxxx_test_table:159959' of table 
xxxxdb.xxxxx_test_table
2025-01-13 20:49:54,637 INFO 
org.apache.flink.cdc.connectors.mysql.debezium.task.MySqlSnapshotSplitReadTask 
[] - For split 'xxxxdb.xxxxx_test_table:159959' of table 
xxxxdb.xxxxx_test_table using select statement: 'SELECT * FROM 
`xxxxdb`.`xxxxx_test_table` WHERE `id` >= ?'
2025-01-13 20:50:17,482 INFO 
org.apache.flink.cdc.connectors.mysql.debezium.task.MySqlSnapshotSplitReadTask 
[] - Exported 167463 records for split 'xxxxdb.xxxxx_test_table:159959' after 
00:00:22.846
2025-01-13 20:50:31,627 INFO 
org.apache.flink.cdc.connectors.mysql.debezium.task.MySqlSnapshotSplitReadTask 
[] - Exported 409419 records for split 'xxxxdb.xxxxx_test_table:159959' after 
00:00:36.991
2025-01-13 20:50:41,805 INFO 
org.apache.flink.cdc.connectors.mysql.debezium.task.MySqlSnapshotSplitReadTask 
[] - Exported 510663 records for split 'xxxxdb.xxxxx_test_table:159959' after 
00:00:47.169
2025-01-13 20:50:55,184 INFO 
org.apache.flink.cdc.connectors.mysql.debezium.task.MySqlSnapshotSplitReadTask 
[] - Exported 588220 records for split 'xxxxdb.xxxxx_test_table:159959' after 
00:01:00.548
2025-01-13 20:51:05,580 INFO 
org.apache.flink.cdc.connectors.mysql.debezium.task.MySqlSnapshotSplitReadTask 
[] - Exported 615374 records for split 'xxxxdb.xxxxx_test_table:159959' after 
00:01:10.944 {code}
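
The log above shows that a split with splitEnd=null is read with a statement 
that has only a lower bound (WHERE `id` >= ?). A minimal sketch of how such a 
statement could be derived from the split bounds follows. The class and method 
names are hypothetical, and the half-open range used for bounded chunks is a 
simplification assumed for this example, not the SQL the connector actually 
generates.

```java
public class SplitScanQuery {

    /**
     * Builds a scan statement for one chunk. A null bound means the chunk is
     * open on that side, so the corresponding predicate is left out.
     */
    public static String buildScanQuery(
            String table, String key, Object splitStart, Object splitEnd) {
        StringBuilder sql = new StringBuilder("SELECT * FROM ").append(table);
        if (splitStart != null && splitEnd != null) {
            // Bounded chunk: assumed half-open range [start, end).
            sql.append(" WHERE ").append(key).append(" >= ? AND ")
               .append(key).append(" < ?");
        } else if (splitStart != null) {
            // Ending chunk: no upper bound, so every row from start onward is read.
            sql.append(" WHERE ").append(key).append(" >= ?");
        } else if (splitEnd != null) {
            // First chunk: no lower bound.
            sql.append(" WHERE ").append(key).append(" < ?");
        }
        // Both bounds null: the whole table is a single chunk, no WHERE clause.
        return sql.toString();
    }

    public static void main(String[] args) {
        // Bounds taken from the split in the log: splitStart=[1333738235], splitEnd=null.
        System.out.println(
                buildScanQuery("`xxxxdb`.`xxxxx_test_table`", "`id`", 1333738235L, null));
        // prints SELECT * FROM `xxxxdb`.`xxxxx_test_table` WHERE `id` >= ?
    }
}
```

With no upper bound, the number of rows this statement returns depends on how 
far the table extends past splitStart, which matches the steadily growing 
"Exported ... records" counts in the log.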
 
 

 

  was: When synchronizing a large table, the ending chunk is always executed 
last, and its splitEnd is null. This causes the task to scan too much data, and 
the TaskManager eventually runs out of memory (OOM).


> MySqlSnapshotSplitAssigner should assign the ending chunk early to avoid 
> TaskManager OOM
> ---------------------------------------------------------------------------------
>
>                 Key: FLINK-37120
>                 URL: https://issues.apache.org/jira/browse/FLINK-37120
>             Project: Flink
>          Issue Type: Improvement
>          Components: Flink CDC
>    Affects Versions: cdc-3.2.1
>            Reporter: JunboWang
>            Priority: Minor
>
> When synchronizing a large table, the ending chunk is always executed last, 
> and its splitEnd is null. This causes the task to scan too much data, and the 
> TaskManager eventually runs out of memory (OOM).



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
