codope commented on code in PR #9517:
URL: https://github.com/apache/hudi/pull/9517#discussion_r1328380671
##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/metadata/HoodieTableMetadataWriter.java:
##########
@@ -59,7 +60,18 @@ public interface HoodieTableMetadataWriter extends Serializable, AutoCloseable {
* @param commitMetadata commit metadata of the operation of interest.
* @param instantTime instant time of the commit.
*/
- void update(HoodieCommitMetadata commitMetadata, HoodieData<WriteStatus> writeStatuses, String instantTime);
+ void updateFromWriteStatuses(HoodieCommitMetadata commitMetadata, HoodieData<WriteStatus> writeStatuses, String instantTime);
+
+ /**
+ * Update the metadata table due to a COMMIT or REPLACECOMMIT operation.
+ * As compared to {@link #updateFromWriteStatuses(HoodieCommitMetadata, HoodieData, String)}, this method
+ * directly updates metadata with the given records, instead of first converting {@link WriteStatus} to {@link HoodieRecord}.
+ *
+ * @param commitMetadata commit metadata of the operation of interest.
+ * @param records records to update metadata with.
+ * @param instantTime instant time of the commit.
+ */
+ void update(HoodieCommitMetadata commitMetadata, HoodieData<HoodieRecord> records, String instantTime);
Review Comment:
Yes, you could do that as well. The main reason for adding another update
method to the metadata writer interface was to decouple metadata writes from
`WriteStatus`. Write statuses are convenient in some cases, but they are not
sufficient to build every type of index. They are also available only during
the write, not post-write. Accepting `HoodieData<HoodieRecord>` adds
flexibility and unlocks the optimization you suggested.
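
To make the decoupling concrete, here is a minimal, self-contained sketch. It
does not use the real Hudi classes: `WriteStatusStub`, `MetadataRecordStub`,
`MetadataWriterStub`, and `InMemoryWriter` are hypothetical stand-ins, and
routing the write-status path through the records path via a default method is
an assumption made for illustration, not something this PR does.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified sketch; none of these types are Hudi's real classes.
class MetadataWriterSketch {

  // Stand-in for WriteStatus: only exists while a write is in flight.
  static final class WriteStatusStub {
    final String fileId;
    final List<String> recordKeys;

    WriteStatusStub(String fileId, List<String> recordKeys) {
      this.fileId = fileId;
      this.recordKeys = recordKeys;
    }
  }

  // Stand-in for a metadata record that can be built at any time, including post-write.
  static final class MetadataRecordStub {
    final String key;
    final String fileId;

    MetadataRecordStub(String key, String fileId) {
      this.key = key;
      this.fileId = fileId;
    }
  }

  interface MetadataWriterStub {
    // Records-based entry point: callers hand over ready-made metadata records.
    void update(List<MetadataRecordStub> records, String instantTime);

    // Write-status entry point (assumed here to delegate): convert, then reuse update().
    default void updateFromWriteStatuses(List<WriteStatusStub> statuses, String instantTime) {
      List<MetadataRecordStub> records = new ArrayList<>();
      for (WriteStatusStub status : statuses) {
        for (String key : status.recordKeys) {
          records.add(new MetadataRecordStub(key, status.fileId));
        }
      }
      update(records, instantTime);
    }
  }

  // Toy writer that remembers what it was asked to apply.
  static final class InMemoryWriter implements MetadataWriterStub {
    final List<String> applied = new ArrayList<>();

    @Override
    public void update(List<MetadataRecordStub> records, String instantTime) {
      for (MetadataRecordStub r : records) {
        applied.add(instantTime + ":" + r.key + "@" + r.fileId);
      }
    }
  }

  public static void main(String[] args) {
    InMemoryWriter writer = new InMemoryWriter();
    // During a write: write statuses are at hand.
    writer.updateFromWriteStatuses(
        Arrays.asList(new WriteStatusStub("f1", Arrays.asList("k1", "k2"))), "001");
    // Post-write: no write statuses, but records can still be supplied directly.
    writer.update(Arrays.asList(new MetadataRecordStub("k3", "f2")), "002");
    System.out.println(writer.applied);
  }
}
```

In this sketch, a caller that has no write statuses (for example, one building
an index after the write has finished) can still call `update` directly with
records it produced some other way.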