Davis-Zhang-Onehouse commented on code in PR #13489:
URL: https://github.com/apache/hudi/pull/13489#discussion_r2180621088


##########
hudi-common/src/main/java/org/apache/hudi/common/engine/HoodieEngineContext.java:
##########
@@ -129,4 +129,21 @@ public abstract <I, K, V> List<V> reduceByKey(
   public abstract <I, O> O aggregate(HoodieData<I> data, O zeroValue, Functions.Function2<O, I, O> seqOp, Functions.Function2<O, O, O> combOp);
 
   public abstract <T> ReaderContextFactory<T> getReaderContextFactory(HoodieTableMetaClient metaClient);
+
+  /**
+   * Groups values by key and applies a function to each group of values.
+   * [1 iterator maps to 1 key] It only guarantees that items returned by the same iterator shares to the same key.
+   * [exact once across iterators] The item returned by the same iterator will not be returned by other iterators.
+   * [1 key maps to >= 1 iterators] Items belong to the same shard can be load-balanced across multiple iterators. It's up to API implementations to decide
+   *                                load balancing pattern and how many iterators to split into.
+   *
+   * @param data The input pair<ShardIndex, Item> to process.
+   * @param func Function to apply to each group of items with the same shard
+   * @param maxShardIndex The range of ShardIndex in data parameter. If data contain ShardIndex 1,2,6, any maxShardIndex >=6 is valid.
+   * @param preservesPartitioning whether to preserve partitioning in the resulting collection.

Review Comment:
   It would be great to have some further clarity here on the criteria that code contributors can operate with.
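
   To make the contract described in the Javadoc above easier to discuss, here is a minimal in-memory sketch in plain Java. It is not Hudi code: the class name, `processShards`, its signature, and the `Function<Iterator<T>, Iterator<R>>` shape of `func` are all assumptions made for illustration, since the hunk does not show the actual method declaration. The sketch hands each shard to exactly one iterator, which is the simplest case the Javadoc allows (one key may also be split across several iterators), and it leaves out `preservesPartitioning` because a local list has no partitioning to preserve.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;

// Illustrative only: an in-memory model of the documented contract, not Hudi code.
class ShardGroupingSketch {

  // Hypothetical name and signature; the PR's actual method is not shown in the hunk.
  static <T, R> List<R> processShards(List<Map.Entry<Integer, T>> data,
                                      Function<Iterator<T>, Iterator<R>> func,
                                      int maxShardIndex) {
    // Bucket items by shard index; TreeMap keeps the output order deterministic.
    Map<Integer, List<T>> groups = new TreeMap<>();
    for (Map.Entry<Integer, T> entry : data) {
      int shard = entry.getKey();
      if (shard > maxShardIndex) {
        throw new IllegalArgumentException("Shard index above maxShardIndex: " + shard);
      }
      groups.computeIfAbsent(shard, k -> new ArrayList<>()).add(entry.getValue());
    }

    // One iterator per shard: every item an iterator yields shares one key,
    // and no item is handed to more than one iterator.
    List<R> results = new ArrayList<>();
    for (List<T> group : groups.values()) {
      func.apply(group.iterator()).forEachRemaining(results::add);
    }
    return results;
  }

  // Example input using the shard indices 1, 2 and 6 mentioned in the Javadoc.
  static List<Integer> demoCounts() {
    List<Map.Entry<Integer, String>> data = List.of(
        Map.entry(1, "a"), Map.entry(2, "c"), Map.entry(1, "b"), Map.entry(6, "d"));

    // Emits a single value per group: the number of items in that group.
    Function<Iterator<String>, Iterator<Integer>> countItems = items -> {
      int n = 0;
      while (items.hasNext()) {
        items.next();
        n++;
      }
      return List.of(n).iterator();
    };
    return processShards(data, countItems, 6);
  }

  public static void main(String[] args) {
    System.out.println(demoCounts()); // prints [2, 1, 1]
  }
}
```

   In this sketch, passing a `maxShardIndex` below 6 for the example data throws, which matches the Javadoc's statement that any value >= 6 is valid for shard indices 1, 2 and 6. How a distributed implementation should treat that case, and when `preservesPartitioning` is safe to set, is not something the sketch can answer.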



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

