zhangyue19921010 commented on code in PR #13066:
URL: https://github.com/apache/hudi/pull/13066#discussion_r2026456692
##########
hudi-spark-datasource/hudi-spark-common/src/main/java/org/apache/hudi/commit/DatasetBucketRescaleCommitActionExecutor.java:
##########
@@ -73,10 +66,4 @@ protected void preExecute() {
     ValidationUtils.checkArgument(res);
     LOG.info("Finish to save hashing config " + hashingConfig);
   }
-
-  @Override
-  protected Map<String, List<String>> getPartitionToReplacedFileIds(HoodieData<WriteStatus> writeStatuses) {

Review Comment:
   We still need to override `getPartitionToReplacedFileIds` in `DatasetBucketRescaleCommitActionExecutor`, so that it is not affected by the parameter read via `String staticOverwritePartition = config.getStringOrDefault(HoodieInternalConfig.STATIC_OVERWRITE_PARTITION_PATHS);`.

##########
hudi-spark-datasource/hudi-spark-common/src/main/java/org/apache/hudi/internal/DataSourceInternalWriterHelper.java:
##########
@@ -85,14 +97,49 @@ public void commit(List<WriteStatus> writeStatuses) {
     try {
       List<HoodieWriteStat> writeStatList = writeStatuses.stream().map(WriteStatus::getStat).collect(Collectors.toList());
       writeClient.commitStats(instantTime, writeStatList, Option.of(extraMetadata),
-          CommitUtils.getCommitActionType(operationType, metaClient.getTableType()));
+          CommitUtils.getCommitActionType(operationType, metaClient.getTableType()), getReplacedFileIds(writeStatuses), Option.empty());
     } catch (Exception ioe) {
       throw new HoodieException(ioe.getMessage(), ioe);
     } finally {
       writeClient.close();
     }
   }
+  private Map<String, List<String>> getReplacedFileIds(List<WriteStatus> writeStatuses) {

Review Comment:
   +1 to `Can we move the writes and commits inside the executors?`
   It is better to let each executor take care of its own replace logic than to combine all of it in one big switch or if-else.
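
To make the suggestion concrete, here is a minimal sketch of per-executor replace logic. Every class name, constructor, and method signature below is a hypothetical stand-in chosen for illustration, not Hudi's actual API; in particular, the input map of existing file ids and the "empty list means all partitions" rule are assumptions. The point is only that each executor overrides one method, and the rescale executor overrides it again so that a static overwrite partition list has no effect on it.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-ins for illustration only; not Hudi's real classes or signatures.
public class ReplaceLogicSketch {

  /** Default executor: a plain commit replaces no file groups. */
  public static class BaseExecutor {
    // existingFileIds: partition path -> file ids already present in that partition (assumed input).
    public Map<String, List<String>> getPartitionToReplacedFileIds(Map<String, List<String>> existingFileIds) {
      return Collections.emptyMap();
    }
  }

  /** Overwrite executor: restricts replacement to a static partition list when one is set. */
  public static class OverwriteExecutor extends BaseExecutor {
    private final List<String> staticOverwritePartitions;

    public OverwriteExecutor(List<String> staticOverwritePartitions) {
      this.staticOverwritePartitions = staticOverwritePartitions;
    }

    @Override
    public Map<String, List<String>> getPartitionToReplacedFileIds(Map<String, List<String>> existingFileIds) {
      Map<String, List<String>> replaced = new TreeMap<>();
      for (Map.Entry<String, List<String>> e : existingFileIds.entrySet()) {
        // Assumption: an empty static list means "no restriction".
        if (staticOverwritePartitions.isEmpty() || staticOverwritePartitions.contains(e.getKey())) {
          replaced.put(e.getKey(), e.getValue());
        }
      }
      return replaced;
    }
  }

  /** Rescale executor: overrides again so the static partition list has no effect. */
  public static class BucketRescaleExecutor extends OverwriteExecutor {
    public BucketRescaleExecutor(List<String> staticOverwritePartitions) {
      super(staticOverwritePartitions);
    }

    @Override
    public Map<String, List<String>> getPartitionToReplacedFileIds(Map<String, List<String>> existingFileIds) {
      // Every existing file group in every supplied partition is replaced.
      return new TreeMap<>(existingFileIds);
    }
  }

  public static void main(String[] args) {
    Map<String, List<String>> existing = new TreeMap<>();
    existing.put("p1", Arrays.asList("f1", "f2"));
    existing.put("p2", Arrays.asList("f3"));
    List<String> staticPartitions = Arrays.asList("p1");

    // The caller only sees BaseExecutor; no switch on the operation type is needed.
    List<BaseExecutor> executors = Arrays.asList(
        new BaseExecutor(), new OverwriteExecutor(staticPartitions), new BucketRescaleExecutor(staticPartitions));
    for (BaseExecutor executor : executors) {
      System.out.println(executor.getClass().getSimpleName() + " -> "
          + executor.getPartitionToReplacedFileIds(existing));
    }
  }
}
```

With this shape the commit path would call one method on whichever executor it holds, rather than branching on the operation type in a shared helper.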