codope commented on code in PR #9517:
URL: https://github.com/apache/hudi/pull/9517#discussion_r1328380671


##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/metadata/HoodieTableMetadataWriter.java:
##########
@@ -59,7 +60,18 @@ public interface HoodieTableMetadataWriter extends Serializable, AutoCloseable {
    * @param commitMetadata commit metadata of the operation of interest.
    * @param instantTime    instant time of the commit.
    */
-  void update(HoodieCommitMetadata commitMetadata, HoodieData<WriteStatus> writeStatuses, String instantTime);
+  void updateFromWriteStatuses(HoodieCommitMetadata commitMetadata, HoodieData<WriteStatus> writeStatuses, String instantTime);
+
+  /**
+   * Update the metadata table due to a COMMIT or REPLACECOMMIT operation.
+   * As compared to {@link #updateFromWriteStatuses(HoodieCommitMetadata, HoodieData, String)}, this method
+   * directly updates metadata with the given records, instead of first converting {@link WriteStatus} to {@link HoodieRecord}.
+   *
+   * @param commitMetadata commit metadata of the operation of interest.
+   * @param records        records to update metadata with.
+   * @param instantTime    instant time of the commit.
+   */
+  void update(HoodieCommitMetadata commitMetadata, HoodieData<HoodieRecord> records, String instantTime);

Review Comment:
   Yes, you could do that as well. The main reason for adding another update 
method to the metadata writer interface was to decouple metadata updates from 
`WriteStatus`. Write statuses are convenient in some cases, but they are not 
sufficient to build all types of index, and they are only available during the 
write, not post-write. Accepting `HoodieData<HoodieRecord>` adds flexibility 
and unlocks the optimization you suggested.
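
   To illustrate the shape of that decoupling, here is a minimal, self-contained 
sketch. It is not the Hudi implementation: `MetadataWriterSketch`, its nested 
`WriteStatus` and `MetadataRecord` classes, and `InMemoryMetadataWriter` are 
hypothetical stand-ins, and plain `List` replaces `HoodieData`. The only point 
is that the write-status path converts and then delegates, while a post-write 
caller can hand records to `update` directly.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; every type here is a hypothetical stand-in,
// not a class from Apache Hudi.
class MetadataWriterSketch {

  // Stand-in for a write status: only exists while a write is running.
  static final class WriteStatus {
    final String fileId;
    final List<String> writtenKeys;

    WriteStatus(String fileId, List<String> writtenKeys) {
      this.fileId = fileId;
      this.writtenKeys = writtenKeys;
    }
  }

  // Stand-in for a record destined for the metadata table.
  static final class MetadataRecord {
    final String key;
    final String fileId;

    MetadataRecord(String key, String fileId) {
      this.key = key;
      this.fileId = fileId;
    }
  }

  interface MetadataWriter {
    // Path used during a write, when write statuses are at hand.
    void updateFromWriteStatuses(List<WriteStatus> writeStatuses, String instantTime);

    // Path usable at any time, given already-prepared records.
    void update(List<MetadataRecord> records, String instantTime);
  }

  static final class InMemoryMetadataWriter implements MetadataWriter {
    final List<String> applied = new ArrayList<>();

    @Override
    public void updateFromWriteStatuses(List<WriteStatus> writeStatuses, String instantTime) {
      // Convert write statuses to records, then reuse the record-based path.
      List<MetadataRecord> records = new ArrayList<>();
      for (WriteStatus status : writeStatuses) {
        for (String key : status.writtenKeys) {
          records.add(new MetadataRecord(key, status.fileId));
        }
      }
      update(records, instantTime);
    }

    @Override
    public void update(List<MetadataRecord> records, String instantTime) {
      // No conversion step: the caller supplies the records directly.
      for (MetadataRecord record : records) {
        applied.add(instantTime + ":" + record.key + "->" + record.fileId);
      }
    }
  }
}
```

   In this sketch a caller that has no write statuses, for example one running 
after the commit has completed, can still build records some other way and call 
`update`, which is the flexibility the record-based overload is meant to give.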



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
