[jira] [Commented] (FLINK-16001) Avoid using Java Streams in construction of ExecutionGraph

Jiayi Liao (Jira) Sat, 14 Mar 2020 07:34:00 -0700


    [ https://issues.apache.org/jira/browse/FLINK-16001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17059375#comment-17059375 ]


Jiayi Liao commented on FLINK-16001:
------------------------------------

[~gjy] Thanks for the reminder.

Yesterday I finished a [JMH benchmark|https://github.com/Jiayi-Liao/jmh-flink-test/blob/master/src/main/java/org/sample/MyBenchmark.java]
that tests the inner logic of {{toPipelinedRegionsSet}} with two 
implementations: the current Java Streams implementation and a non-stream 
implementation. The results are in the attachment [^benchmark.csv]. 
According to the test, the performance degradation becomes more pronounced 
as the cardinality of the distinct regions grows.
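
For illustration only, the two kinds of implementation being compared could be sketched roughly as below. This is a simplified stand-in, not the Flink code or the linked benchmark: the class {{RegionSetSketch}}, the {{Region}} type, and the use of plain {{String}} vertex IDs are all assumptions made to keep the example self-contained.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical sketch; names and types are illustrative, not Flink's.
final class RegionSetSketch {

    /** Simplified stand-in for a pipelined region: a wrapper around vertex IDs. */
    static final class Region {
        final Set<String> vertexIds;

        Region(Set<String> vertexIds) {
            this.vertexIds = vertexIds;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof Region && vertexIds.equals(((Region) o).vertexIds);
        }

        @Override
        public int hashCode() {
            return Objects.hash(vertexIds);
        }
    }

    /** Stream-based variant: one pipeline over all distinct regions. */
    static Set<Region> viaStreams(Set<Set<String>> distinctRegions) {
        return distinctRegions.stream()
                .map(Region::new)
                .collect(Collectors.toSet());
    }

    /** Loop-based variant: same result, built with a plain for-each loop. */
    static Set<Region> viaLoop(Set<Set<String>> distinctRegions) {
        Set<Region> result = new HashSet<>();
        for (Set<String> vertexIds : distinctRegions) {
            result.add(new Region(vertexIds));
        }
        return result;
    }
}
```

Both methods return equal sets for the same input, so a benchmark can swap one for the other and measure only the cost of the iteration style.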

I think the cardinality of the distinct regions can be very large, especially 
in batch jobs that use {{BLOCKING}} result partitions (more than 10k in our 
production environment). But you're right that this is not the main 
bottleneck in job submission. I'm just trying to improve performance a 
little from the code style side.

> Avoid using Java Streams in construction of ExecutionGraph
> ----------------------------------------------------------
>
>                 Key: FLINK-16001
>                 URL: https://issues.apache.org/jira/browse/FLINK-16001
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / Coordination
>    Affects Versions: 1.10.0
>            Reporter: Jiayi Liao
>            Priority: Major
>         Attachments: benchmark.csv
>
>
> I think we should avoid {{Java Streams}} in the construction of 
> {{ExecutionGraph}}, for example in the function {{toPipelinedRegionsSet}} 
> in {{PipelinedRegionComputeUtil}}, because job submission is performance 
> sensitive, especially when {{distinctRegions}} has a large cardinality.
> The same applies to some other places in the package 
> {{org.apache.flink.runtime.executiongraph}}.
> cc [~trohrmann] [~gjy] [~zhuzh] 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
