[I] Support reader caching inside FG reader [hudi]

via GitHub Sun, 30 Nov 2025 04:39:33 -0800


hudi-bot opened a new issue, #17079:
URL: https://github.com/apache/hudi/issues/17079


   Currently, the `reader reuse` feature of `HoodieBackedMetadata` is not used 
because of the integration of the FG reader.
   
   As a result, when the same MDT file slice is read multiple times within a 
single read query, Hudi has to re-open the file each time. This may cause a 
performance regression because:
   1. Each re-open issues one more GET request to S3, which adds latency and 
monetary cost.
   2. While a file is not closed, its cached data blocks and metadata blocks 
stay in memory and can potentially be scanned again; closing and re-opening 
the file gives up that cache.
   
    
   
   Therefore, we should implement a caching feature for the underlying readers 
used by the FG reader, either for MDT scenarios only or for generic DT/MDT 
scenarios.
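   
   As a rough illustration of the idea, and not Hudi's actual API, a 
per-query cache could hand out one open reader per file path and close them 
all when the query finishes. `ReaderCache`, `getOrOpen`, and the `opener` 
callback are hypothetical names introduced only for this sketch.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/**
 * Hypothetical per-query cache of open file readers, keyed by file path.
 * Not part of Hudi; a minimal sketch of the reuse pattern described above.
 */
final class ReaderCache<R extends Closeable> implements Closeable {
  private final Map<String, R> openReaders = new HashMap<>();
  private final Function<String, R> opener;

  /** @param opener assumed callback that opens a reader for a path (e.g. one GET to storage) */
  ReaderCache(Function<String, R> opener) {
    this.opener = opener;
  }

  /** Returns the already-open reader for the path, opening it only on first use. */
  synchronized R getOrOpen(String path) {
    return openReaders.computeIfAbsent(path, opener);
  }

  /** Number of readers currently held open. */
  synchronized int size() {
    return openReaders.size();
  }

  /** Closes every cached reader once; the first failure is rethrown, later ones are suppressed. */
  @Override
  public synchronized void close() throws IOException {
    IOException first = null;
    for (R reader : openReaders.values()) {
      try {
        reader.close();
      } catch (IOException e) {
        if (first == null) {
          first = e;
        } else {
          first.addSuppressed(e);
        }
      }
    }
    openReaders.clear();
    if (first != null) {
      throw first;
    }
  }
}
```

   In this sketch the cache is scoped to a single query and closed at its 
end, so repeated reads of the same file slice share one open call and any 
blocks the reader has cached. Whether the real implementation should be 
scoped to MDT readers only, and how it should bound memory, is left open by 
the issue.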
   
   ## JIRA info
   
   - Link: https://issues.apache.org/jira/browse/HUDI-9578
   - Type: New Feature
   - Fix version(s):
     - 1.1.0


