[ https://issues.apache.org/jira/browse/FLINK-36270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Abhi Gupta updated FLINK-36270:
-------------------------------
    Description: 
In the DDB Streams connector, testing showed that we spend a lot of time in the markAsFinished function because it calls splitsAvailableForAssignment, which is O(n). Since n shards can be marked as finished concurrently, the overall cost becomes O(n^2). Change the algorithm to assign only child shards when a parent shard is finished. We can start tracking the child shards of each shard in SplitTracker.

  (was: In DDB Streams connector, while testing we found out that when we are spending a lot of time in markAsFinished function because we are calling splitsAvailableForAssignment which is O(n), and given n shards can be marked as finished concurrently, the algorithm becomes O(n^2). Change the algo to assign only child shards when a parent is finished. We can start tracking child shards of a shard in SplitTracker)

> DDB Streams Connector performance issue due to splitsAvailableForAssignment function
> ------------------------------------------------------------------------------------
>
>                 Key: FLINK-36270
>                 URL: https://issues.apache.org/jira/browse/FLINK-36270
>             Project: Flink
>          Issue Type: Bug
>          Components: Connectors / DynamoDB
>            Reporter: Abhi Gupta
>            Priority: Major

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
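
The change proposed in the description could be sketched roughly as follows. This is a minimal illustration and not the connector's code: the class ChildIndexedSplitTracker, its method signatures, and the use of plain shard-ID strings are assumptions made for the example. Only the names markAsFinished and SplitTracker come from the description. The idea is to index children by parent when splits are registered, so that finishing a shard touches only its own children instead of rescanning every tracked split.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical, simplified stand-in for a split tracker that indexes child shards. */
class ChildIndexedSplitTracker {
    // parent shard ID -> IDs of its child shards (assumed structure)
    private final Map<String, List<String>> childrenOf = new HashMap<>();
    private final Set<String> assigned = new HashSet<>();
    private final Set<String> finished = new HashSet<>();

    /** Registers a shard; parentShardId may be null for a shard without a tracked parent. */
    void addSplit(String shardId, String parentShardId) {
        if (parentShardId != null) {
            childrenOf.computeIfAbsent(parentShardId, k -> new ArrayList<>()).add(shardId);
        }
    }

    void markAsAssigned(String shardId) {
        assigned.add(shardId);
    }

    /**
     * Marks a shard as finished and returns only its not-yet-assigned children.
     * The cost is proportional to the number of children of this shard,
     * not to the total number of tracked splits.
     */
    List<String> markAsFinished(String shardId) {
        finished.add(shardId);
        List<String> ready = new ArrayList<>();
        for (String child : childrenOf.getOrDefault(shardId, Collections.emptyList())) {
            if (!assigned.contains(child)) {
                ready.add(child);
            }
        }
        return ready;
    }

    boolean isFinished(String shardId) {
        return finished.contains(shardId);
    }
}
```

With this layout, n shards finishing concurrently cost work proportional to the total number of child links rather than n full scans. Whether the real SplitTracker needs extra checks before assigning a child (for example, state restored from a checkpoint) is not covered by this sketch.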