nsivabalan commented on a change in pull request #1704:
URL: https://github.com/apache/hudi/pull/1704#discussion_r536932812



##########
File path: hudi-client/src/main/java/org/apache/hudi/config/HoodieWriteConfig.java
##########
@@ -113,6 +113,9 @@
   public static final String MAX_CONSISTENCY_CHECKS_PROP = "hoodie.consistency.check.max_checks";
   public static int DEFAULT_MAX_CONSISTENCY_CHECKS = 7;
 
+  private static final String PAYLOAD_ORDERING_FIELD_PROP = "hoodie.payload.ordering.field";

Review comment:
       Gotcha. Here is my take.
   We will introduce a config called "honorOrderingToCombineRecordsAcrossBatches"
(we can work out a better name), with a default value of false. We will also pass
orderingFieldKey (same as preCombineFieldKey) as an argument to
OverwriteWithLatestAvroPayload.
   
   ```
   public OverwriteWithLatestAvroPayload(GenericRecord record, Comparable orderingVal,
       String orderingFieldKey, boolean honorOrderingToCombineRecordsAcrossBatches) {
   .
   .
   ```
   Based on the value of "honorOrderingToCombineRecordsAcrossBatches", the
combineAndGetUpdateValue() impl will go with either the existing logic or the
new one (as we see in this patch).
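
To make the proposed switch concrete, here is a minimal sketch of how the flag could gate the combine logic. It is not the code from this patch: the class name OrderingAwarePayloadSketch is hypothetical, a plain Map stands in for Avro's GenericRecord so the example compiles on its own, and letting the incoming record win on ties is an assumption.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for OverwriteWithLatestAvroPayload. A plain Map replaces
// Avro's GenericRecord so the sketch compiles without Hudi or Avro on the classpath.
@SuppressWarnings({"rawtypes", "unchecked"})
class OrderingAwarePayloadSketch {
  private final Map<String, Object> record;
  private final Comparable orderingVal;
  private final String orderingFieldKey;
  private final boolean honorOrderingToCombineRecordsAcrossBatches;

  OrderingAwarePayloadSketch(Map<String, Object> record, Comparable orderingVal,
      String orderingFieldKey, boolean honorOrderingToCombineRecordsAcrossBatches) {
    this.record = record;
    this.orderingVal = orderingVal;
    this.orderingFieldKey = orderingFieldKey;
    this.honorOrderingToCombineRecordsAcrossBatches = honorOrderingToCombineRecordsAcrossBatches;
  }

  // Returns the record that should be written, given the record already in storage.
  Map<String, Object> combineAndGetUpdateValue(Map<String, Object> currentValue) {
    if (!honorOrderingToCombineRecordsAcrossBatches) {
      // Flag off (the proposed default): the incoming record always overwrites.
      return record;
    }
    Object storedOrderingVal = currentValue.get(orderingFieldKey);
    // Flag on: keep the stored record only when its ordering value is strictly
    // greater than the incoming one. Ties go to the incoming record (assumption).
    if (storedOrderingVal instanceof Comparable
        && ((Comparable) storedOrderingVal).compareTo(orderingVal) > 0) {
      return currentValue;
    }
    return record;
  }

  public static void main(String[] args) {
    Map<String, Object> stored = new HashMap<>();
    stored.put("ts", 10L);
    Map<String, Object> incoming = new HashMap<>();
    incoming.put("ts", 5L);

    // Same late-arriving record, combined with the flag off and then on.
    System.out.println(
        new OrderingAwarePayloadSketch(incoming, 5L, "ts", false).combineAndGetUpdateValue(stored));
    System.out.println(
        new OrderingAwarePayloadSketch(incoming, 5L, "ts", true).combineAndGetUpdateValue(stored));
  }
}
```

With the flag off, a late-arriving record with an older ordering value still replaces the stored one; with the flag on, the stored record is kept. If the stored record has no value under orderingFieldKey, the sketch falls back to the incoming record.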
   
   I know we have a gap here: if the ordering field has changed over time, we
can't do much. But we can't afford to store the ordering field as a separate
column in the dataset either, since different commits could theoretically have
different ordering fields. I guess we can call out that we may not support such
evolution of the ordering field.
   
   
   
   
   



