nsivabalan commented on code in PR #13402:
URL: https://github.com/apache/hudi/pull/13402#discussion_r2136891309


##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/metadata/HoodieBackedTableMetadataWriter.java:
##########
@@ -1096,6 +1119,126 @@ public void buildMetadataPartitions(HoodieEngineContext engineContext, List<Hood
     initializeFromFilesystem(instantTime, partitionTypes, Option.empty());
   }
 
+  public void startCommit(String instantTime) {
+    ValidationUtils.checkState(streamingWritesEnabled, "Streaming writes should be enabled for startCommit API");
+
+    if (!metadataMetaClient.getActiveTimeline().getCommitsTimeline().containsInstant(instantTime)) {
+      // if this is a new commit being applied to metadata for the first time
+      LOG.info("New commit at {} being applied to MDT.", instantTime);
+    } else {
+      throw new HoodieMetadataException("Starting the same commit in Metadata table more than once w/o rolling back : " + instantTime);
+    }
+
+    // this is where we might instantiate the write client to metadata table for the first time.
+    getWriteClient().startCommitForMetadataTable(metadataMetaClient, instantTime, HoodieTimeline.DELTA_COMMIT_ACTION);
+  }
+
+  @Override
+  public HoodieData<WriteStatus> streamWriteToMetadataPartitions(HoodieData<WriteStatus> writeStatus, String instantTime) {
+    List<MetadataPartitionType> mdtPartitionsToTag = new ArrayList<>(enabledPartitionTypes);
+    mdtPartitionsToTag.remove(FILES);
+    mdtPartitionsToTag.retainAll(STREAMING_WRITES_SUPPORTED_PARTITIONS);

Review Comment:
   Note to reviewer:
   In this patch we add only the RLI partition to the streaming flow. All other 
partitions are handled by the second write to the metadata table, i.e. the 
non-streaming write. 
   Taken together, the two writes to the MDT ensure that all enabled 
partitions are written. 
   
   In the future, we will move one partition at a time from the non-streaming 
write to the streaming write flow. It is designed this way so that this patch 
does not have to wait until every partition supports streaming writes to the 
MDT. 
   
   Also, from a consistency standpoint, only RLI and the secondary index are 
critical and definitely need to move to the streaming flow. All other 
partitions rely solely on HoodieCommitMetadata, so it does not matter much 
whether they are written the streaming way or the non-streaming way. 
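
A minimal sketch of how the enabled partitions split between the two writes, mirroring the three lines in the hunk above. The `PartitionType` enum is a simplified stand-in for `MetadataPartitionType`. The contents of `STREAMING_WRITES_SUPPORTED_PARTITIONS` (RLI only) and the `nonStreamingPartitions` helper are assumptions for illustration, not code from this PR.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.EnumSet;
import java.util.List;
import java.util.Set;

public class PartitionSplitSketch {

  // Simplified stand-in for Hudi's MetadataPartitionType (hypothetical subset).
  public enum PartitionType { FILES, COLUMN_STATS, RECORD_INDEX, SECONDARY_INDEX }

  // Assumption: only RLI (record index) is supported by the streaming flow in this patch.
  public static final Set<PartitionType> STREAMING_WRITES_SUPPORTED_PARTITIONS =
      EnumSet.of(PartitionType.RECORD_INDEX);

  // Same shape as the hunk: copy the enabled list, drop FILES, keep only supported partitions.
  public static List<PartitionType> streamingPartitions(List<PartitionType> enabled) {
    List<PartitionType> toTag = new ArrayList<>(enabled);
    toTag.remove(PartitionType.FILES);
    toTag.retainAll(STREAMING_WRITES_SUPPORTED_PARTITIONS);
    return toTag;
  }

  // Hypothetical helper: everything not streamed is left for the second, non-streaming write.
  public static List<PartitionType> nonStreamingPartitions(List<PartitionType> enabled) {
    List<PartitionType> rest = new ArrayList<>(enabled);
    rest.removeAll(streamingPartitions(enabled));
    return rest;
  }

  public static void main(String[] args) {
    List<PartitionType> enabled = Arrays.asList(
        PartitionType.FILES, PartitionType.COLUMN_STATS,
        PartitionType.RECORD_INDEX, PartitionType.SECONDARY_INDEX);
    System.out.println("streaming write:     " + streamingPartitions(enabled));
    System.out.println("non-streaming write: " + nonStreamingPartitions(enabled));
  }
}
```

Under these assumptions the two lists are disjoint and their union equals the enabled list, which is the "both writes put together" property described above. Moving another partition to the streaming flow later would only require adding it to the supported set.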


