[jira] [Comment Edited] (FLINK-19994) All vertices in an DataSet iteration job will be eagerly scheduled

Andrey Zagrebin (Jira) Thu, 05 Nov 2020 04:50:41 -0800


    [ 
https://issues.apache.org/jira/browse/FLINK-19994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17226698#comment-17226698
 ]


Andrey Zagrebin edited comment on FLINK-19994 at 11/5/20, 12:49 PM:
--------------------------------------------------------------------

This makes sense to me. We already have computeStronglyConnectedComponents to 
put the vertices of one iteration into one region. I am running 
[CI|https://dev.azure.com/azagrebin/azagrebin/_build/results?buildId=340&view=results]
 to see whether any existing iteration tests fail. We probably also need a 
failover test for iterations if there is none yet.

[~zhuzh] was buildOneRegionForAllVertices introduced because it was not clear 
whether one iteration can contain blocking edges? Or was there another 
possible reason for the head/tail not being in the same pipelined failover region?


was (Author: azagrebin):
This makes sense to me. We already have computeStronglyConnectedComponents to 
put the vertices of one iteration into one region.

[~zhuzh] was buildOneRegionForAllVertices introduced because it was not clear 
whether one iteration can contain blocking edges? Or was there another 
possible reason for the head/tail not being in the same pipelined failover region?

> All vertices in an DataSet iteration job will be eagerly scheduled
> ------------------------------------------------------------------
>
>                 Key: FLINK-19994
>                 URL: https://issues.apache.org/jira/browse/FLINK-19994
>             Project: Flink
>          Issue Type: Bug
>          Components: Runtime / Coordination
>    Affects Versions: 1.12.0
>            Reporter: Zhu Zhu
>            Priority: Blocker
>             Fix For: 1.12.0
>
>
> After switching to pipelined region scheduling, all vertices in a DataSet 
> iteration job are eagerly scheduled, which means that consumers of BLOCKING 
> results can be deployed before the result is finished, wasting resources. 
> This happens because all vertices are put into one pipelined region 
> if the job contains a {{ColocationConstraint}}, see 
> [PipelinedRegionComputeUtil|https://github.com/apache/flink/blob/c0f382f5f0072441ef8933f6993f1c34168004d6/flink-runtime/src/main/java/org/apache/flink/runtime/executiongraph/failover/flip1/PipelinedRegionComputeUtil.java#L52].
> IIUC, this {{makeAllOneRegion()}} behavior was introduced to ensure that the 
> co-located iteration head and tail are restarted together in pipelined 
> region failover. However, given that edges within an iteration will always be 
> PIPELINED 
> ([ref|https://github.com/apache/flink/blob/0523ef6451a93da450c6bdf5dd4757c3702f3962/flink-optimizer/src/main/java/org/apache/flink/optimizer/plantranslate/JobGraphGenerator.java#L1188]),
>  the co-located iteration head and tail will always be in the same region. So I 
> think we can drop the {{PipelinedRegionComputeUtil#makeAllOneRegion()}} code 
> path and build regions in the same way regardless of whether there are 
> co-location constraints.
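
The argument in the description can be illustrated with a small sketch. It is 
not Flink's PipelinedRegionComputeUtil; the class name, the Edge type, the 
ResultType enum and the vertex names are all hypothetical, and it assumes 
(as the description states) that every edge inside an iteration is PIPELINED. 
Regions are built by merging vertices connected by PIPELINED edges, with no 
special case for co-location, and the iteration head and tail still end up in 
the same region while BLOCKING consumers stay in their own regions.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch, not Flink code: groups vertices into pipelined regions
// by merging the two ends of every PIPELINED edge (union-find).
class PipelinedRegionSketch {

    // Assumed stand-in for the PIPELINED/BLOCKING distinction in the issue.
    enum ResultType { PIPELINED, BLOCKING }

    // Assumed edge representation: producer -> consumer with a result type.
    static final class Edge {
        final String producer;
        final String consumer;
        final ResultType type;

        Edge(String producer, String consumer, ResultType type) {
            this.producer = producer;
            this.consumer = consumer;
            this.type = type;
        }
    }

    // Returns a map from a representative vertex to the vertices of its region.
    static Map<String, Set<String>> computeRegions(List<String> vertices, List<Edge> edges) {
        Map<String, String> parent = new HashMap<>();
        for (String v : vertices) {
            parent.put(v, v); // every vertex starts in its own region
        }
        for (Edge e : edges) {
            if (e.type == ResultType.PIPELINED) {
                // Only PIPELINED edges merge regions; BLOCKING edges are region boundaries.
                parent.put(find(parent, e.producer), find(parent, e.consumer));
            }
        }
        Map<String, Set<String>> regions = new HashMap<>();
        for (String v : vertices) {
            regions.computeIfAbsent(find(parent, v), k -> new HashSet<>()).add(v);
        }
        return regions;
    }

    // True if both vertices were placed into the same region.
    static boolean sameRegion(Map<String, Set<String>> regions, String a, String b) {
        for (Set<String> region : regions.values()) {
            if (region.contains(a) && region.contains(b)) {
                return true;
            }
        }
        return false;
    }

    // Finds the representative of v, compressing the path along the way.
    private static String find(Map<String, String> parent, String v) {
        String root = v;
        while (!parent.get(root).equals(root)) {
            root = parent.get(root);
        }
        String cur = v;
        while (!cur.equals(root)) {
            String next = parent.get(cur);
            parent.put(cur, root);
            cur = next;
        }
        return root;
    }

    public static void main(String[] args) {
        List<String> vertices = Arrays.asList("source", "head", "body", "tail", "sink");
        List<Edge> edges = Arrays.asList(
                new Edge("source", "head", ResultType.BLOCKING),
                new Edge("head", "body", ResultType.PIPELINED),
                new Edge("body", "tail", ResultType.PIPELINED),
                new Edge("head", "sink", ResultType.BLOCKING));
        Map<String, Set<String>> regions = computeRegions(vertices, edges);
        System.out.println("regions: " + regions.size());
        System.out.println("head and tail together: " + sameRegion(regions, "head", "tail"));
    }
}
```

In this toy graph the BLOCKING edges keep "source" and "sink" in separate 
regions, so a region-based scheduler could defer the sink until its input is 
finished, while the iteration vertices are still restarted together.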



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
