[ https://issues.apache.org/jira/browse/HUDI-8532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17939093#comment-17939093 ]
sivabalan narayanan edited comment on HUDI-8532 at 3/27/25 11:42 PM:
---------------------------------------------------------------------

Analyzed the usage of
` Option<String> getCompletionTime(String baseInstant, String instantTime); `
especially with respect to HoodieFileGroup and file slicing.

For every log file, the delta commit (DC) time refers to the base instant time. So the completion time for a given log file always maps to the completion time of the base instant. Later, when the log file goes through the file slice fitting logic, it is fitted into the right file slice.

{code:java}
Option<String> completionTimeOpt = completionTimeQueryView.getCompletionTime(fileSlices.firstKey(), logFile.getDeltaCommitTime());
if (completionTimeOpt.isPresent()) {
  for (String commitTime : fileSlices.keySet()) {
    // find the largest commit time that is smaller than the log delta commit completion time
    if (compareTimestamps(completionTimeOpt.get(), GREATER_THAN_OR_EQUALS, commitTime)) {
      return commitTime;
    }
  }
  // no base file that starts earlier than the log delta commit completion time,
  // use the log file delta commit time.
  return logFile.getDeltaCommitTime();
}
{code}

Also tested the following scenario. FG1 has FS1 with a base file. C2 added lf1. C3 added lf2 to FS1, and C3 crashed. Disabled rollback and triggered compaction. FG1_FS2 was created while C3 was still inflight, and the compaction completed as well. Added a new DC and unblocked rollback. The rollback added a new log file to FS1, as expected. Validated that during FSV building, the log file added by the rollback is correctly fitted to FS1 and not FS2, even though the rollback instant time itself is after the compaction's instant time.

Based on this, we are good to return the state transition time, and our file slicing logic is intact.

> Decide if CompletionTimeQuery for table version 6 needs to return begin time or state transition time
> -----------------------------------------------------------------------------------------------------
>
>                 Key: HUDI-8532
>                 URL: https://issues.apache.org/jira/browse/HUDI-8532
>             Project: Apache Hudi
>          Issue Type: Sub-task
>          Components: core
>            Reporter: Balaji Varadarajan
>            Assignee: sivabalan narayanan
>            Priority: Blocker
>             Fix For: 1.0.2
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> Currently, we return state transition time for table version 6.
> As we are not allowing NBCC, our initial thinking was that this should be OK. But
> we should revisit this question before release.
> cc [~vinothchandar]

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
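
The fitting behavior validated in the comment can be reproduced with a small, self-contained Java sketch. It is an illustration under stated assumptions, not Hudi code: the class `FileSliceFittingSketch`, the method `fitLogFile`, the completion-time map, and the instant values (`010`, `011`, `040`) are all hypothetical. Plain string comparison stands in for `compareTimestamps`, and the file slices are assumed to be keyed by base instant time in descending order. When no completion time is known, the sketch simply falls back to the log file's delta commit time, which simplifies whatever the real code does outside the quoted snippet.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.TreeMap;

public class FileSliceFittingSketch {

  // Hypothetical stand-in for the completion time query view:
  // maps an instant time to its completion (state transition) time.
  static Optional<String> getCompletionTime(Map<String, String> completionTimes, String instantTime) {
    return Optional.ofNullable(completionTimes.get(instantTime));
  }

  // Returns the base instant time of the file slice the log file is fitted into.
  // fileSlices is assumed to iterate from the latest base instant to the earliest.
  static String fitLogFile(TreeMap<String, String> fileSlices,
                           Map<String, String> completionTimes,
                           String logDeltaCommitTime) {
    Optional<String> completionTimeOpt = getCompletionTime(completionTimes, logDeltaCommitTime);
    if (completionTimeOpt.isPresent()) {
      for (String commitTime : fileSlices.keySet()) {
        // First slice (latest first) whose base instant is not after the completion time.
        if (completionTimeOpt.get().compareTo(commitTime) >= 0) {
          return commitTime;
        }
      }
    }
    // Simplified fallback: use the log file's own delta commit time.
    return logDeltaCommitTime;
  }

  public static void main(String[] args) {
    // FS1 has base instant 010; compaction at instant 040 created FS2.
    TreeMap<String, String> fileSlices = new TreeMap<>(Collections.reverseOrder());
    fileSlices.put("010", "FS1");
    fileSlices.put("040", "FS2");

    // The base instant 010 completed at 011, well before the compaction instant.
    Map<String, String> completionTimes = new HashMap<>();
    completionTimes.put("010", "011");

    // A log file written by a late rollback still carries the base instant time 010
    // as its delta commit time, so it resolves to completion time 011.
    String fitted = fitLogFile(fileSlices, completionTimes, "010");
    System.out.println(fileSlices.get(fitted)); // prints FS1
  }
}
```

In this sketch the rollback's own instant time never enters the lookup. Only the delta commit time carried by the log file does, which is why the log file lands in FS1 even though the rollback ran after the compaction.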