xintongsong commented on code in PR #20553:
URL: https://github.com/apache/flink/pull/20553#discussion_r947407205


##########
flink-runtime/src/main/java/org/apache/flink/runtime/io/network/partition/hybrid/HsFileDataManager.java:
##########
@@ -147,6 +148,17 @@ public void setup() {
     // concurrent `run()` executions. Concurrent calls to other methods are allowed.
     public synchronized void run() {
         int numBuffersRead = tryRead();
+        if (numBuffersRead == 0) {
+            try {
+                // When fileReader has no data to read, for example, most of the data is consumed
+                // from memory. HsFileDataManager will encounter busy-loop problem, which will lead
+                // to a meaningless surge in CPU utilization and seriously affect performance. Sleep
+                // for a very short time to avoid this.
+                TimeUnit.MILLISECONDS.sleep(5);
+            } catch (InterruptedException exp) {
+                throw new RuntimeException("FileDataManager's sleep is interrupted.", exp);
+            }
+        }

Review Comment:
   I'm not sure this is the right place to sleep. Ideally, we should first end the current round, then sleep if needed before triggering the next round, rather than sleep before ending the current round. Moreover, it sleeps while holding the lock.
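
   To illustrate the ordering suggested above, here is a minimal sketch, not the actual `HsFileDataManager` code. It assumes the rounds are driven by a `ScheduledExecutorService`; `DelayedRoundReader`, `delayFor`, `release`, `getRounds`, and the `IntSupplier` standing in for `tryRead()` are all hypothetical names introduced only for this example. The round finishes and releases the lock first, and any idle wait is then expressed as a scheduling delay for the next round.

```java
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.IntSupplier;

/** Hypothetical stand-in for the reader; NOT the actual HsFileDataManager. */
final class DelayedRoundReader implements Runnable {
    /** Idle back-off; 5 ms mirrors the value used in the diff above. */
    static final long IDLE_DELAY_MS = 5;

    private final ScheduledExecutorService executor;
    private final IntSupplier tryRead; // assumed: returns the number of buffers read
    private final AtomicInteger rounds = new AtomicInteger();
    private final Object lock = new Object();
    private boolean released; // guarded by lock

    DelayedRoundReader(ScheduledExecutorService executor, IntSupplier tryRead) {
        this.executor = executor;
        this.tryRead = tryRead;
    }

    /** Delay before the next round: back off only if the last round read nothing. */
    static long delayFor(int numBuffersRead) {
        return numBuffersRead == 0 ? IDLE_DELAY_MS : 0;
    }

    void start() {
        executor.execute(this);
    }

    void release() {
        synchronized (lock) {
            released = true;
        }
    }

    int getRounds() {
        return rounds.get();
    }

    @Override
    public void run() {
        int numBuffersRead;
        synchronized (lock) {
            if (released) {
                return; // stop rescheduling once released
            }
            numBuffersRead = tryRead.getAsInt();
            rounds.incrementAndGet();
        }
        // The round has ended and the lock is free. The wait, if any, happens in
        // the executor's delay queue rather than inside the synchronized block.
        executor.schedule(this, delayFor(numBuffersRead), TimeUnit.MILLISECONDS);
    }
}
```

   With this shape, other callers that need the lock are not blocked during the idle wait, and no thread has to handle `InterruptedException` from a sleep inside `run()`.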