Re: [PR] [HUDI-9526] Use HoodieFileGroupReader throughout the CDC flow [hudi]

via GitHub Mon, 07 Jul 2025 21:46:59 -0700


danny0405 commented on code in PR #13444:
URL: https://github.com/apache/hudi/pull/13444#discussion_r2191460353



##########
hudi-common/src/main/java/org/apache/hudi/common/table/read/FileGroupRecordBuffer.java:
##########
@@ -565,27 +570,62 @@ protected boolean hasNextBaseRecord(T baseRecord, BufferedRecord<T> logRecordInf
       Pair<Boolean, T> isDeleteAndRecord = merge(baseRecordInfo, logRecordInfo);
       if (!isDeleteAndRecord.getLeft()) {
         // Updates
-        nextRecord = readerContext.seal(isDeleteAndRecord.getRight());
+        nextRecord = readerContext.seal(applyOutputSchemaConversion(isDeleteAndRecord.getRight()));

Review Comment:
   > Don't forget SortedKeyBasedFileGroupRecordBuffer
   
   This is used for MDT, right? Do we need to log CDC in such scenarios?
   
   > If we really care about a single check that happens for a subset of rows,
   > why isn't the same scrutiny applied to other PRs regardless of whether it is
   > new logic or not
   
   We did apply it when Lin introduced more merging modes to the FG reader; that
   is why we added the `BufferedRecordMerger` abstractions. Sub-classing the file
   group record buffer is just a suggestion; you can consider other alternatives
   that get rid of the newly introduced per-row handling and keep the core
   merging logic clean and clear.
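   
   As a rough illustration of the sub-classing idea (a sketch only:
   `SketchRecordBuffer`, `SketchCdcRecordBuffer`, and `postMerge` are
   hypothetical names, not existing Hudi classes, and the merge shown is a
   placeholder rather than the real merging logic), the conversion could sit
   behind an overridable hook so the default path does no extra per-row work:
   
   ```java
   import java.util.function.UnaryOperator;

   // Hypothetical stand-in for the core buffer; NOT the real FileGroupRecordBuffer.
   class SketchRecordBuffer<T> {
     // Placeholder merge: the log record wins when present, otherwise the base record.
     // A null result stands for "no record to emit" (for example, a delete).
     final T mergeAndSeal(T baseRecord, T logRecord) {
       T merged = logRecord != null ? logRecord : baseRecord;
       return merged == null ? null : postMerge(merged);
     }

     // Hook: identity by default, so the common path has no CDC-specific branching.
     protected T postMerge(T merged) {
       return merged;
     }
   }

   // Hypothetical CDC variant: only this subclass pays for the schema conversion.
   class SketchCdcRecordBuffer<T> extends SketchRecordBuffer<T> {
     private final UnaryOperator<T> outputSchemaConverter;

     SketchCdcRecordBuffer(UnaryOperator<T> outputSchemaConverter) {
       this.outputSchemaConverter = outputSchemaConverter;
     }

     @Override
     protected T postMerge(T merged) {
       return outputSchemaConverter.apply(merged);
     }
   }
   ```
   
   With this shape, the CDC flow would construct the subclass with its
   converter, while other readers keep using the base class unchanged.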



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
