hudi-bot opened a new issue, #17079: URL: https://github.com/apache/hudi/issues/17079
Currently, the `reader reuse` feature of `HoodieBackedMetadata` is not used because of the integration of the file group (FG) reader. As a result, when one read query reads the same metadata table (MDT) file slice multiple times, Hudi has to re-open the file each time. This may cause a performance regression because:

1. Each re-open issues one more GET request to S3, which adds latency and monetary cost.
2. While a file stays open, its cached data blocks and metadata blocks remain in memory. Once the file is closed and re-opened, those blocks may have to be rescanned.

Therefore, we should implement a caching feature for the underlying readers used by the FG reader, either for MDT scenarios only or for generic DT/MDT scenarios.

## JIRA info

- Link: https://issues.apache.org/jira/browse/HUDI-9578
- Type: New Feature
- Fix version(s):
  - 1.1.0
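
To make the proposal concrete, here is a minimal sketch of what a per-query reader cache might look like. It is not Hudi code: `ReaderCache`, `getOrOpen`, and `openCount` are hypothetical names, and the type parameter `R` stands in for whatever underlying reader the FG reader opens. The sketch only assumes that a reader is `Closeable` and can be opened from a file path.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/**
 * Hypothetical per-query cache of open file readers, keyed by file path.
 * Illustrative only; this is not an existing Hudi class.
 */
final class ReaderCache<R extends Closeable> implements Closeable {
  // Opens a reader for a path; on S3 this is where the extra GET would happen.
  private final Function<String, R> opener;
  private final Map<String, R> readers = new HashMap<>();
  private int openCount = 0;

  ReaderCache(Function<String, R> opener) {
    this.opener = opener;
  }

  /** Returns the cached reader for the path, opening the file only on first use. */
  synchronized R getOrOpen(String path) {
    R reader = readers.get(path);
    if (reader == null) {
      reader = opener.apply(path);
      readers.put(path, reader);
      openCount++;
    }
    return reader;
  }

  /** Number of times a file was actually opened through this cache. */
  synchronized int openCount() {
    return openCount;
  }

  /** Closes every cached reader, for example at the end of the read query. */
  @Override
  public synchronized void close() throws IOException {
    IOException first = null;
    for (R reader : readers.values()) {
      try {
        reader.close();
      } catch (IOException e) {
        if (first == null) {
          first = e; // remember the first failure, keep closing the rest
        }
      }
    }
    readers.clear();
    if (first != null) {
      throw first;
    }
  }
}
```

In this sketch, repeated lookups of the same file slice within one query would share a single open reader, and all readers are released together when the cache is closed. A real implementation would also have to decide on cache scope, eviction, and thread safety of the shared readers, none of which the issue specifies.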
