scxwhite commented on a change in pull request #5030:
URL: https://github.com/apache/hudi/pull/5030#discussion_r827820051

##########
File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/compact/HoodieCompactor.java
##########
@@ -280,8 +281,11 @@ HoodieCompactionPlan generateCompactionPlan(
               .getLatestFileSlices(partitionPath)
               .filter(slice -> !fgIdsInPendingCompactionAndClustering.contains(slice.getFileGroupId()))
               .map(s -> {
+                // In most business scenarios, the latest data is in the latest delta log file, so we sort it from large
+                // to small according to the instance time, which can largely avoid rewriting the data in the
+                // compact process, and then optimize the compact time
                 List<HoodieLogFile> logFiles =

Review comment:
   > Kind of got your idea, then i think we should always use the reverse order and the comparing sequence in merge reader should also be reversed to keep the process time semantics.

   Yes. Do you mean here? (https://github.com/apache/hudi/pull/5030/files#diff-c2f73f1ce4c0687cffa73e96b82514aca3a930ec1a8bc0c2efd73d7cf869c883R150)
   If so, that part has already been modified.

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org