nsivabalan commented on code in PR #11581:
URL: https://github.com/apache/hudi/pull/11581#discussion_r1673013661


##########
hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/table/action/commit/UpsertPartitioner.java:
##########
@@ -86,16 +87,21 @@ public class UpsertPartitioner<T> extends SparkHoodiePartitioner<T> {
   private HashMap<Integer, BucketInfo> bucketInfoMap;
 
   protected final HoodieWriteConfig config;
+  private final WriteOperationType operationType;
 
   public UpsertPartitioner(WorkloadProfile profile, HoodieEngineContext context, HoodieTable table,
-      HoodieWriteConfig config) {
+                           HoodieWriteConfig config, WriteOperationType operationType) {
     super(profile, table);
     updateLocationToBucket = new HashMap<>();
     partitionPathToInsertBucketInfos = new HashMap<>();
     bucketInfoMap = new HashMap<>();
     this.config = config;
+    this.operationType = operationType;
     assignUpdates(profile);
-    assignInserts(profile, context);
+    long totalInserts = profile.getInputPartitionPathStatMap().values().stream().mapToLong(stat -> stat.getNumInserts()).sum();
+    if (!WriteOperationType.isPreppedWriteOperation(operationType) || totalInserts > 0) { // skip only when this is a prepped write operation and totalInserts == 0.
+      assignInserts(profile, context);

Review Comment:
   Yes, that's the main purpose of a prepped write operation: we expect the record locations to already be set on the incoming records, so we do not invoke tag location. Hence assignInserts() is unnecessary overhead there and is effectively a no-op.
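   
   For anyone reading this later, a minimal sketch of what "prepped" means in practice. `PreppedSketch` and `asPrepped` are hypothetical names made up for illustration; `unseal()`, `setCurrentLocation()`, and `seal()` are the real HoodieRecord calls used to attach a known location:
   
   ```java
   import org.apache.hudi.common.model.HoodieRecord;
   import org.apache.hudi.common.model.HoodieRecordLocation;
   
   // Hypothetical helper, not Hudi source, only to illustrate "prepped":
   final class PreppedSketch {
     // A "prepped" record already carries the file location that an earlier
     // index lookup (tagging) produced. With every record located up front,
     // the workload profile reports zero inserts, so the guarded
     // assignInserts(profile, context) call in the hunk above has no work to do.
     static <T> HoodieRecord<T> asPrepped(HoodieRecord<T> record, String instantTime, String fileId) {
       record.unseal(); // records are sealed by default; unseal before mutating
       record.setCurrentLocation(new HoodieRecordLocation(instantTime, fileId));
       record.seal();
       return record;
     }
   }
   ```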


