[ 
https://issues.apache.org/jira/browse/FLINK-36270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhi Gupta updated FLINK-36270:
-------------------------------
    Description: In the DDB Streams connector, testing showed that a lot of 
time is spent in the markAsFinished function because it calls 
splitsAvailableForAssignment, which is O(n). Since n shards can be marked as 
finished concurrently, the overall cost becomes O(n^2). Change the algorithm 
to assign only the child shards when a parent is finished. To do this, we can 
start tracking the child shards of each shard in SplitTracker.
(was: In DDB Streams connector, while testing we 
found out that when we are spending a lot of time in markAsFinished function 
because we are calling splitsAvailableForAssignment which is O(n), and given n 
shards can be marked as finished concurrently, the algorithm becomes O(n^2). 
Change the algo to assign only child shards when a parent is finished. We can 
start tracking child shards of a shard in SplitTracker)

> DDB Streams Connector performance issue due to splitsAvailableForAssignment 
> function
> ------------------------------------------------------------------------------------
>
>                 Key: FLINK-36270
>                 URL: https://issues.apache.org/jira/browse/FLINK-36270
>             Project: Flink
>          Issue Type: Bug
>          Components: Connectors / DynamoDB
>            Reporter: Abhi Gupta
>            Priority: Major
>
> In the DDB Streams connector, testing showed that a lot of time is spent in 
> the markAsFinished function because it calls splitsAvailableForAssignment, 
> which is O(n). Since n shards can be marked as finished concurrently, the 
> overall cost becomes O(n^2). Change the algorithm to assign only the child 
> shards when a parent is finished. To do this, we can start tracking the 
> child shards of each shard in SplitTracker.
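
The proposed change could look roughly like the sketch below. It is only an
illustration and not the connector's actual code: the class
ChildShardTrackingSketch, the method addSplit, and the use of plain String
shard ids are all hypothetical, and it assumes each shard has at most one
parent. The point is that markAsFinished looks up the children of the finished
shard directly instead of rescanning every tracked split.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for SplitTracker; all names here are illustrative.
class ChildShardTrackingSketch {
    // Parent shard id -> ids of its child shards, filled in as splits are added.
    private final Map<String, List<String>> childrenByParent = new HashMap<>();
    // Ids of shards that have been fully read.
    private final Set<String> finished = new HashSet<>();

    // Registers a shard; parentShardId may be null for a shard without a parent.
    void addSplit(String shardId, String parentShardId) {
        if (parentShardId != null) {
            childrenByParent
                    .computeIfAbsent(parentShardId, k -> new ArrayList<>())
                    .add(shardId);
        }
    }

    // Marks a shard as finished and returns only its children, which are the
    // splits that become assignable. The cost depends on the number of
    // children of this shard, not on the total number of tracked shards.
    List<String> markAsFinished(String shardId) {
        finished.add(shardId);
        return new ArrayList<>(childrenByParent.getOrDefault(shardId, List.of()));
    }

    boolean isFinished(String shardId) {
        return finished.contains(shardId);
    }
}
```

With this layout, n concurrent markAsFinished calls touch each parent-child
link once in total, rather than each call scanning all n shards.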



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
