zhangyue19921010 commented on code in PR #13066:
URL: https://github.com/apache/hudi/pull/13066#discussion_r2026456692


##########
hudi-spark-datasource/hudi-spark-common/src/main/java/org/apache/hudi/commit/DatasetBucketRescaleCommitActionExecutor.java:
##########
@@ -73,10 +66,4 @@ protected void preExecute() {
     ValidationUtils.checkArgument(res);
     LOG.info("Finish to save hashing config " + hashingConfig);
   }
-
-  @Override
-  protected Map<String, List<String>> getPartitionToReplacedFileIds(HoodieData<WriteStatus> writeStatuses) {

Review Comment:
   We still need to override `getPartitionToReplacedFileIds` in
`DatasetBucketRescaleCommitActionExecutor` to get rid of the influence of the
`String staticOverwritePartition =
config.getStringOrDefault(HoodieInternalConfig.STATIC_OVERWRITE_PARTITION_PATHS);`
parameter.
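
   A minimal sketch of the idea, not the actual Hudi code: every class, field,
and method signature below is a hypothetical stand-in. It assumes a parent
executor that lets a static overwrite partition setting decide which partitions
have their file groups replaced, and a rescale executor that overrides
`getPartitionToReplacedFileIds` so the setting has no effect on it. What the
real rescale executor should return is an assumption here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical parent: stands in for an insert-overwrite style executor.
class OverwriteExecutorSketch {
  // Hypothetical config value: comma-separated partition paths, empty when unset.
  protected final String staticOverwritePartitions;
  // Hypothetical view of the table: partition path -> existing file group ids.
  protected final Map<String, List<String>> existingFileIds;

  OverwriteExecutorSketch(String staticOverwritePartitions,
                          Map<String, List<String>> existingFileIds) {
    this.staticOverwritePartitions = staticOverwritePartitions;
    this.existingFileIds = existingFileIds;
  }

  // When the static setting is present it wins over the partitions actually written.
  protected Map<String, List<String>> getPartitionToReplacedFileIds(List<String> writtenPartitions) {
    Set<String> targets = new LinkedHashSet<>();
    if (staticOverwritePartitions.isEmpty()) {
      targets.addAll(writtenPartitions);
    } else {
      targets.addAll(Arrays.asList(staticOverwritePartitions.split(",")));
    }
    return collect(targets);
  }

  // Looks up the existing file group ids for each target partition.
  protected Map<String, List<String>> collect(Set<String> partitions) {
    Map<String, List<String>> result = new TreeMap<>();
    for (String partition : partitions) {
      result.put(partition,
          new ArrayList<>(existingFileIds.getOrDefault(partition, Collections.emptyList())));
    }
    return result;
  }
}

// Hypothetical rescale executor: the override ignores the static setting entirely.
class BucketRescaleExecutorSketch extends OverwriteExecutorSketch {
  BucketRescaleExecutorSketch(String staticOverwritePartitions,
                              Map<String, List<String>> existingFileIds) {
    super(staticOverwritePartitions, existingFileIds);
  }

  @Override
  protected Map<String, List<String>> getPartitionToReplacedFileIds(List<String> writtenPartitions) {
    // Only the partitions that were actually written are considered.
    return collect(new LinkedHashSet<>(writtenPartitions));
  }
}
```

   Without the override, the subclass would inherit the parent's branch on the
static setting, which is the influence this comment asks to remove.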



##########
hudi-spark-datasource/hudi-spark-common/src/main/java/org/apache/hudi/internal/DataSourceInternalWriterHelper.java:
##########
@@ -85,14 +97,49 @@ public void commit(List<WriteStatus> writeStatuses) {
     try {
       List<HoodieWriteStat> writeStatList = writeStatuses.stream().map(WriteStatus::getStat).collect(Collectors.toList());
       writeClient.commitStats(instantTime, writeStatList, Option.of(extraMetadata),
-          CommitUtils.getCommitActionType(operationType, metaClient.getTableType()));
+          CommitUtils.getCommitActionType(operationType, metaClient.getTableType()), getReplacedFileIds(writeStatuses), Option.empty());
     } catch (Exception ioe) {
       throw new HoodieException(ioe.getMessage(), ioe);
     } finally {
       writeClient.close();
     }
   }
 
+  private Map<String, List<String>> getReplacedFileIds(List<WriteStatus> writeStatuses) {

Review Comment:
    +1 to `Can we move the writes and commits inside the executors?` It is
better to let each executor take care of its own replace logic instead of
merging everything into one big switch or if-else.
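
    A rough sketch of that shape, using hypothetical types rather than Hudi's
real classes: the commit path asks the executor for its replaced file ids, so
no switch on the operation type is needed. All names here
(`SketchWriteStatus`, `SketchCommitExecutor`, and so on) are assumptions made
for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a write result: the partition and file group written.
final class SketchWriteStatus {
  final String partitionPath;
  final String fileId;

  SketchWriteStatus(String partitionPath, String fileId) {
    this.partitionPath = partitionPath;
    this.fileId = fileId;
  }
}

// Hypothetical base executor: by default a commit replaces nothing.
abstract class SketchCommitExecutor {
  protected Map<String, List<String>> getPartitionToReplacedFileIds(
      List<SketchWriteStatus> writeStatuses) {
    return Collections.emptyMap();
  }

  // The commit path delegates to the executor; the returned map stands in for
  // what a real commit would record as replaced file groups.
  final Map<String, List<String>> commit(List<SketchWriteStatus> writeStatuses) {
    return getPartitionToReplacedFileIds(writeStatuses);
  }
}

// Plain insert: inherits the default and replaces nothing.
final class SketchInsertExecutor extends SketchCommitExecutor {
}

// Overwrite-style executor: replaces every existing file group in each
// partition it wrote to.
final class SketchOverwriteExecutor extends SketchCommitExecutor {
  private final Map<String, List<String>> existingFileIds;

  SketchOverwriteExecutor(Map<String, List<String>> existingFileIds) {
    this.existingFileIds = existingFileIds;
  }

  @Override
  protected Map<String, List<String>> getPartitionToReplacedFileIds(
      List<SketchWriteStatus> writeStatuses) {
    Map<String, List<String>> replaced = new HashMap<>();
    for (SketchWriteStatus status : writeStatuses) {
      replaced.computeIfAbsent(status.partitionPath,
          p -> new ArrayList<>(existingFileIds.getOrDefault(p, Collections.emptyList())));
    }
    return replaced;
  }
}
```

    Adding a new operation then means adding a new executor subclass, and the
shared commit helper stays unchanged.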


