[ 
https://issues.apache.org/jira/browse/HUDI-8532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17939093#comment-17939093
 ] 

sivabalan narayanan edited comment on HUDI-8532 at 3/27/25 11:42 PM:
---------------------------------------------------------------------

Analyzed the usage of 

`

Option<String> getCompletionTime(String baseInstant, String instantTime);

`

especially with respect to HoodieFileGroup and file slicing.

 

For every log file, the delta commit (DC) time refers to the base instant time. 
So the completion time for a given log file always maps to the completion time 
of the base instant. Later, when we go through the file slice fitting logic, 
the log file gets fitted into the right file slice. 

 
{code:java}
Option<String> completionTimeOpt =
    completionTimeQueryView.getCompletionTime(fileSlices.firstKey(), logFile.getDeltaCommitTime());
if (completionTimeOpt.isPresent()) {
  for (String commitTime : fileSlices.keySet()) {
    // find the largest commit time that is smaller than the log delta commit completion time
    if (compareTimestamps(completionTimeOpt.get(), GREATER_THAN_OR_EQUALS, commitTime)) {
      return commitTime;
    }
  }
  // no base file that starts earlier than the log delta commit completion time,
  // use the log file delta commit time.
  return logFile.getDeltaCommitTime();
}
{code}
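
To illustrate the fitting loop above, here is a minimal, self-contained sketch. It assumes the file slices are keyed by base instant time in descending order, so the first match is the largest base instant that is not after the completion time, and that timestamps compare lexicographically as fixed-width strings. The class FileSliceFitSketch, the method fitBaseInstant, the use of java.util.Optional in place of Hudi's Option, and the instant values are all hypothetical stand-ins, not Hudi code.

```java
import java.util.Comparator;
import java.util.Optional;
import java.util.TreeMap;

// Hypothetical stand-in for the fitting loop; not Hudi's actual class.
class FileSliceFitSketch {

  // Picks the base instant of the file slice that a log file belongs to.
  // slicesNewestFirst: existing slices keyed by base instant time, newest first (assumed).
  // completionTime: completion time resolved for the log file; empty if none is known.
  static Optional<String> fitBaseInstant(TreeMap<String, String> slicesNewestFirst,
                                         Optional<String> completionTime,
                                         String deltaCommitTime) {
    if (!completionTime.isPresent()) {
      // The quoted snippet does not show this case, so the sketch leaves it open.
      return Optional.empty();
    }
    for (String baseInstant : slicesNewestFirst.keySet()) {
      // First key that is <= completion time is the largest such base instant.
      if (completionTime.get().compareTo(baseInstant) >= 0) {
        return Optional.of(baseInstant);
      }
    }
    // No slice starts at or before the completion time: fall back to the delta commit time.
    return Optional.of(deltaCommitTime);
  }

  public static void main(String[] args) {
    TreeMap<String, String> slices = new TreeMap<>(Comparator.reverseOrder());
    slices.put("001", "FS1");
    slices.put("004", "FS2");
    System.out.println(fitBaseInstant(slices, Optional.of("002"), "001"));
    System.out.println(fitBaseInstant(slices, Optional.of("005"), "001"));
    System.out.println(fitBaseInstant(slices, Optional.of("000"), "000"));
  }
}
```

With these made-up instants, a completion time of "002" fits the slice based at "001", "005" fits the slice based at "004", and "000" falls through to the delta commit time.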
 

Also tested the following scenario. 

FG1, FS1: base file 

C2: added lf1. 

C3: added lf2 to FS1, and C3 crashed. 

Disabled rollback and triggered compaction. 

FG1_FS2 is created. 

C3 is still inflight, and the compaction completed as well. 

Added a new DC and unblocked rollback. 

Rollback added a new log file to FS1 as expected. 

Validated that during FSV building, the log file added by the rollback is 
correctly fitted to FS1 and not FS2, even though the rollback instant time 
itself is after the compaction's instant time. 
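
The scenario can be walked through with a small model. This is only a sketch under stated assumptions: the instants ("010" for the base commit, "040" for the compaction, "060" for the rollback) and completion times are made-up ticks on a single time axis, and RollbackFitScenario with its methods is a hypothetical name, not part of Hudi. It shows why the rollback's log file lands in FS1: its delta commit time is FS1's base instant, so the completion time that gets compared is that of the base instant, not the rollback's own instant time.

```java
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical model of the tested scenario; all instants are made-up ticks.
class RollbackFitScenario {

  // Returns the newest slice whose base instant is <= the given time.
  static String fit(TreeMap<String, String> slicesNewestFirst, String time) {
    for (Map.Entry<String, String> slice : slicesNewestFirst.entrySet()) {
      if (time.compareTo(slice.getKey()) >= 0) {
        return slice.getValue();
      }
    }
    return "no matching slice";
  }

  // File group FG1 after compaction: two slices, keyed newest first.
  static TreeMap<String, String> slices() {
    TreeMap<String, String> slices = new TreeMap<>(Comparator.reverseOrder());
    slices.put("010", "FS1"); // base file
    slices.put("040", "FS2"); // created by compaction
    return slices;
  }

  // The rollback's log file carries FS1's base instant as its delta commit time,
  // so its completion time resolves to the completion time of that base instant.
  static String sliceForRollbackLog() {
    Map<String, String> completionTimes = new HashMap<>();
    completionTimes.put("010", "015"); // base commit completed before compaction began
    completionTimes.put("040", "045"); // compaction
    String logDeltaCommitTime = "010";
    return fit(slices(), completionTimes.get(logDeltaCommitTime));
  }

  // Contrast: fitting by the rollback's own instant time ("060") would pick FS2.
  static String sliceIfFittedByRollbackInstant() {
    return fit(slices(), "060");
  }

  public static void main(String[] args) {
    System.out.println(sliceForRollbackLog());
    System.out.println(sliceIfFittedByRollbackInstant());
  }
}
```

In this model the first call yields FS1 and the second yields FS2, which matches the observation that the rollback's log file stays with FS1 even though the rollback instant is later than the compaction instant.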

 

Based on this, we are good to return the state transition time and our file 
slicing logic is intact. 



> Decide if CompletionTimeQuery for table version 6 needs to return begin time 
> or state transition time
> -----------------------------------------------------------------------------------------------------
>
>                 Key: HUDI-8532
>                 URL: https://issues.apache.org/jira/browse/HUDI-8532
>             Project: Apache Hudi
>          Issue Type: Sub-task
>          Components: core
>            Reporter: Balaji Varadarajan
>            Assignee: sivabalan narayanan
>            Priority: Blocker
>             Fix For: 1.0.2
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> Currently, we return state transition time for table version 6. As we are not 
> allowing NBCC, our initial thought was that this should be ok. But 
> we should revisit this question before release.
> cc [~vinothchandar] 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
