zhangyue19921010 commented on code in PR #13060:
URL: https://github.com/apache/hudi/pull/13060#discussion_r2028871226
##########
hudi-common/src/main/java/org/apache/hudi/common/model/PartitionBucketIndexHashingConfig.java:
##########
@@ -196,24 +196,51 @@ public static Option<PartitionBucketIndexHashingConfig> loadHashingConfig(Hoodie
   /**
    * Get Latest committed hashing config instant to load.
+   * If instant is empty, then return latest hashing config instant
    */
-  public static String getLatestHashingConfigInstantToLoad(HoodieTableMetaClient metaClient) {
+  public static Option<String> getHashingConfigInstantToLoad(HoodieTableMetaClient metaClient, Option<String> instant) {
     try {
       List<String> allCommittedHashingConfig = getCommittedHashingConfigInstants(metaClient);
-      return allCommittedHashingConfig.get(allCommittedHashingConfig.size() - 1);
+      if (instant.isPresent()) {
+        Option<String> res = getHashingConfigInstantToLoadBeforeOrOn(allCommittedHashingConfig, instant.get());
+        // fall back to look up archived hashing config instant before return empty
+        return res.isPresent() ? res : getHashingConfigInstantToLoadBeforeOrOn(getArchiveHashingConfigInstants(metaClient), instant.get());
+      } else {
+        return Option.of(allCommittedHashingConfig.get(allCommittedHashingConfig.size() - 1));
+      }
     } catch (Exception e) {
       throw new HoodieException("Failed to get hashing config instant to load.", e);
     }
   }
+  private static Option<String> getHashingConfigInstantToLoadBeforeOrOn(List<String> hashingConfigInstants, String instant) {
Review Comment:
For example, consider this timeline:
DeltaCommit1 ==> writes `C1_File1, C1_File2`
Bucket-Rescale Commit2 ==> writes `C2_File1, C2_File2, C2_File3` (replacing `C1_File1, C1_File2`)
DeltaCommit3 ==> writes `C3_File1_Log1`
Bucket-Rescale Commit4 ==> writes `C4_File1` (replacing `C2_File1, C2_File2, C2_File3` and `C3_File1_Log1`)
For the SQL query `SELECT * FROM hudi_table TIMESTAMP AS OF <DeltaCommit3>`, we need to load the hashing config from Bucket-Rescale Commit2, the most recent one committed on or before DeltaCommit3, rather than the latest hashing config from Bucket-Rescale Commit4.
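
The "before or on" lookup described above can be sketched roughly as follows. This is only an illustration, not the code in this PR: the class and method names are hypothetical, `java.util.Optional` stands in for Hudi's `Option` so the snippet is self-contained, and it assumes the instant list is sorted ascending and that instant times compare lexicographically. The instant values `"002"`, `"003"`, `"004"` are placeholders for Commit2, DeltaCommit3 and Commit4.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

public class HashingConfigLookupSketch {

  /**
   * Hypothetical helper: returns the latest hashing config instant that is
   * less than or equal to the target instant, or empty if none qualifies.
   * Assumes sortedInstants is in ascending order.
   */
  public static Optional<String> latestBeforeOrOn(List<String> sortedInstants, String target) {
    Optional<String> result = Optional.empty();
    for (String instant : sortedInstants) {
      if (instant.compareTo(target) <= 0) {
        // Still on or before the target, so remember it as the best candidate.
        result = Optional.of(instant);
      } else {
        // The list is sorted, so every remaining instant is after the target.
        break;
      }
    }
    return result;
  }

  public static void main(String[] args) {
    // Hashing configs were committed by Bucket-Rescale Commit2 and Commit4.
    List<String> hashingConfigInstants = Arrays.asList("002", "004");

    // Time travel to DeltaCommit3 should resolve to Commit2, not Commit4.
    System.out.println(latestBeforeOrOn(hashingConfigInstants, "003")); // Optional[002]

    // Querying before any rescale yields empty, which is where a fallback
    // such as the archived-instant lookup would apply.
    System.out.println(latestBeforeOrOn(hashingConfigInstants, "001")); // Optional.empty
  }
}
```

An empty result from the active timeline is the case where the fallback to archived hashing config instants matters.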