[jira] [Commented] (FLINK-16057) Performance regression in ContinuousFileReaderOperator

Roman Khachatryan (Jira) Fri, 22 May 2020 02:31:31 -0700


    [ 
https://issues.apache.org/jira/browse/FLINK-16057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17113890#comment-17113890
 ]


Roman Khachatryan commented on FLINK-16057:
-------------------------------------------

Got unexpected results when benchmarking on actual files: the newer version is 
faster (double-checking).

 

Old version 
(http://codespeed.dak8s.net:8080/job/flink-benchmark-request/162/):

 
{code:java}
"Benchmark","Mode","Threads","Samples","Score","Score Error (99.9%)","Unit","Param: folder"
"org.apache.flink.benchmark.ContinuousFileReaderOperatorIoBenchmark.readFiles","thrpt",1,30,7352.674778,240.954385,"ops/ms",txt-100-1000-10
"org.apache.flink.benchmark.ContinuousFileReaderOperatorIoBenchmark.readFiles","thrpt",1,30,5783.989828,102.949992,"ops/ms",txt-1000-100-10
{code}
 

New version 
(http://codespeed.dak8s.net:8080/job/flink-benchmark-request/163/):

 
{code:java}
"Benchmark","Mode","Threads","Samples","Score","Score Error (99.9%)","Unit","Param: folder"
"org.apache.flink.benchmark.ContinuousFileReaderOperatorIoBenchmark.readFiles","thrpt",1,30,16931.351736,551.851266,"ops/ms",txt-100-1000-10
"org.apache.flink.benchmark.ContinuousFileReaderOperatorIoBenchmark.readFiles","thrpt",1,30,6156.304362,92.567005,"ops/ms",txt-1000-100-10
{code}
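
To put the two runs side by side, here is a minimal sketch that computes the throughput ratio per folder and checks whether the score intervals overlap. The class and method names are made up for illustration, and it assumes the "Score Error (99.9%)" column is the half-width of the confidence interval around the score.

```java
/** Hypothetical helper for comparing two JMH throughput scores; not part of flink-benchmarks. */
final class BenchmarkSpeedup {

    /** Ratio of new to old throughput; values above 1.0 mean the new version is faster. */
    static double speedup(double oldScore, double newScore) {
        return newScore / oldScore;
    }

    /**
     * True if [oldScore - oldErr, oldScore + oldErr] and
     * [newScore - newErr, newScore + newErr] share at least one point.
     * Assumes each error value is the half-width of the interval.
     */
    static boolean intervalsOverlap(double oldScore, double oldErr,
                                    double newScore, double newErr) {
        return oldScore - oldErr <= newScore + newErr
            && newScore - newErr <= oldScore + oldErr;
    }

    public static void main(String[] args) {
        // Scores and errors copied from the two result tables above (ops/ms).
        System.out.printf("txt-100-1000-10: x%.2f, overlap=%b%n",
            speedup(7352.674778, 16931.351736),
            intervalsOverlap(7352.674778, 240.954385, 16931.351736, 551.851266));
        System.out.printf("txt-1000-100-10: x%.2f, overlap=%b%n",
            speedup(5783.989828, 6156.304362),
            intervalsOverlap(5783.989828, 102.949992, 6156.304362, 92.567005));
    }
}
```

With the numbers above this gives roughly 2.3x for txt-100-1000-10 and roughly 1.06x for txt-1000-100-10, with non-overlapping intervals in both cases.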
 

 

> Performance regression in ContinuousFileReaderOperator
> ------------------------------------------------------
>
>                 Key: FLINK-16057
>                 URL: https://issues.apache.org/jira/browse/FLINK-16057
>             Project: Flink
>          Issue Type: Bug
>          Components: API / DataStream, Runtime / Task
>    Affects Versions: 1.11.0
>            Reporter: Roman Khachatryan
>            Assignee: Roman Khachatryan
>            Priority: Blocker
>              Labels: pull-request-available
>             Fix For: 1.11.0
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> After switching CFRO to a single-threaded execution model, the performance 
> regression was expected to be about 15-20% (benchmarked in November).
> But after merging to master it turned out to be about 50%.
>   
> One reason is that the chaining strategy isn't set by default in the CFRO factory.
> Without that, even reading and outputting all records of a split in a single 
> mail action doesn't reverse the regression (only about half of it).
> However, setting the strategy AND enabling batching fixes the regression 
> (starting from batch size 6).
> Batching can't be used in practice, though, because it can significantly delay 
> checkpointing.
>  
> Another approach would be to process one record and then repeat until 
> defaultMailboxActionAvailable OR haveNewMail.
> This reverses the regression and even improves performance by about 50% 
> compared to the old version.
>  
> The final solution could also be FLIP-27.
>  
> Other things tried (didn't help):
>  * CFRO rework without subsequent commits (removing checkpoint lock)
>  * different batch sizes, including the whole split, without chaining 
> strategy fixed - partial improvement only
>  * disabling close
>  * disabling checkpointing
>  * disabling output (serialization)
>  * using LinkedList instead of PriorityQueue
>  
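
The "process one record and then repeat" idea from the description can be sketched as a toy model. This is only an illustration under stated assumptions: MailboxLoopSketch and all of its methods are hypothetical and are not Flink's mailbox API, and the defaultMailboxActionAvailable condition is passed in as a plain BooleanSupplier.

```java
import java.util.ArrayDeque;
import java.util.Iterator;
import java.util.Queue;
import java.util.function.BooleanSupplier;
import java.util.function.Consumer;

/** Toy single-threaded mailbox; hypothetical, NOT Flink's MailboxProcessor. */
final class MailboxLoopSketch {
    private final Queue<Runnable> mailbox = new ArrayDeque<>();

    /** Enqueues a mail (for example a checkpoint trigger) for the task thread. */
    void sendMail(Runnable mail) {
        mailbox.add(mail);
    }

    boolean hasNewMail() {
        return !mailbox.isEmpty();
    }

    /**
     * One mail action: emit a record, then repeat until the split is exhausted,
     * the default action becomes available, or new mail is waiting.
     * Returns the number of records emitted in this action.
     */
    <T> int readUntilInterrupted(Iterator<T> split, Consumer<T> output,
                                 BooleanSupplier defaultActionAvailable) {
        int emitted = 0;
        while (split.hasNext()) {
            output.accept(split.next());
            emitted++;
            if (defaultActionAvailable.getAsBoolean() || hasNewMail()) {
                break; // yield so pending work is not delayed by the rest of the split
            }
        }
        return emitted;
    }

    /** Runs all pending mails in FIFO order. */
    void drainMail() {
        Runnable mail;
        while ((mail = mailbox.poll()) != null) {
            mail.run();
        }
    }
}
```

Unlike a fixed batch size, the loop in this sketch stops after the current record as soon as mail is pending, so a checkpoint mail would wait for at most one record, while an idle mailbox lets the whole split be read in one action.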



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
