[ https://issues.apache.org/jira/browse/FLINK-30623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17682466#comment-17682466 ]
Rui Fan commented on FLINK-30623:
---------------------------------

{quote}
What do you think is the actual reason behind the regression? That now we have to enqueue writes from a couple of subtasks one after another, so for example with 2 subtasks, the second has to wait until the first completes its writes?
{quote}
When two subtasks share the same file, only one subtask can write to the file at a time. That is, subtask 2 must wait while subtask 1 is writing to the file, so the unaligned checkpoint (UC) time increases.

{quote}
And what do you think is the impact of this setting in production setups?
{quote}
I think the impact on production is minimal. If HDFS writes are fast, the checkpoint time will not increase significantly. If HDFS writes are slow, there may be an impact; a reasonable solution in that case is for the Flink or HDFS SREs to improve the stability and performance of HDFS.

> Performance regression in checkpointSingleInput.UNALIGNED on 04.01.2023
> -----------------------------------------------------------------------
>
>                 Key: FLINK-30623
>                 URL: https://issues.apache.org/jira/browse/FLINK-30623
>             Project: Flink
>          Issue Type: Bug
>          Components: Benchmarks, Runtime / Checkpointing
>            Reporter: Martijn Visser
>            Assignee: Rui Fan
>            Priority: Blocker
>              Labels: pull-request-available
>             Fix For: 1.17.0
>
>
> Performance regression
> checkpointSingleInput.UNALIGNED median=338.1445195 recent_median=67.6453005
> checkpointSingleInput.UNALIGNED_1 median=213.230041 recent_median=39.830277
> deployAllTasks.STREAMING median=168.533106 recent_median=159.8534395
> stateBackends.MEMORY median=3229.0248875 recent_median=2985.782919
> tupleKeyBy median=4155.684199 recent_median=3987.5812305
> http://codespeed.dak8s.net:8000/timeline/#/?exe=1&ben=checkpointSingleInput.UNALIGNED&extr=on&quarts=on&equid=off&env=2&revs=200
> http://codespeed.dak8s.net:8000/timeline/#/?exe=1&ben=checkpointSingleInput.UNALIGNED_1&extr=on&quarts=on&equid=off&env=2&revs=200
> http://codespeed.dak8s.net:8000/timeline/#/?exe=8&ben=deployAllTasks.STREAMING&extr=on&quarts=on&equid=off&env=2&revs=200
> http://codespeed.dak8s.net:8000/timeline/#/?exe=6&ben=stateBackends.MEMORY&extr=on&quarts=on&equid=off&env=2&revs=200
> http://codespeed.dak8s.net:8000/timeline/#/?exe=6&ben=tupleKeyBy&extr=on&quarts=on&equid=off&env=2&revs=200



--
This message was sent by Atlassian Jira
(v8.20.10#820010)