yihua opened a new pull request, #13254: URL: https://github.com/apache/hudi/pull/13254
### Change Logs

This PR unifies the logic for fetching files and file slices in the metadata table writer, so that index initialization is based on only two types of information from the file system view:

- (1) `partitionIdToAllFilesMap`: all the files in a table, used by the `FILES`, `BLOOM_FILTERS`, and `COLUMN_STATS` partitions;
- (2) `lazyLatestMergedPartitionFileSliceList` (lazily evaluated only if needed): the latest merged file slices, used by the `RECORD_INDEX`, `EXPRESSION_INDEX`, `PARTITION_STATS`, and `SECONDARY_INDEX` partitions.

Note that these two may be further unified, which is out of the scope of this PR. These two types of information are sufficient for the two types of indexes: one type based on all files, and the other based on the latest merged file slices.

### Impact

Code simplification for MDT writer refactoring

### Risk level

none

### Documentation Update

none

### Contributor's checklist

- [ ] Read through [contributor's guide](https://hudi.apache.org/contribute/how-to-contribute)
- [ ] Change Logs and Impact were stated clearly
- [ ] Adequate tests were added if applicable
- [ ] CI passed
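
For readers unfamiliar with the pattern, the split between the two inputs named under Change Logs can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in this PR: `MdtInitSketch`, `Lazy`, `IndexPartition`, and `initialize` are hypothetical names, files and file slices are modeled as plain strings, and only the two variable names and the partition-to-input mapping are taken from the description.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

/** Hypothetical sketch; none of these types are actual Hudi classes. */
final class MdtInitSketch {

  /** Memoizing supplier: runs the initializer on the first get() and caches the result. */
  static final class Lazy<T> implements Supplier<T> {
    private Supplier<T> initializer;
    private T value;
    private boolean initialized;

    Lazy(Supplier<T> initializer) {
      this.initializer = initializer;
    }

    @Override
    public synchronized T get() {
      if (!initialized) {
        value = initializer.get();
        initialized = true;
        initializer = null; // release the initializer once it has run
      }
      return value;
    }

    synchronized boolean isInitialized() {
      return initialized;
    }
  }

  /** Partitions named in the PR description, tagged by which input each one reads. */
  enum IndexPartition {
    FILES(false), BLOOM_FILTERS(false), COLUMN_STATS(false),
    RECORD_INDEX(true), EXPRESSION_INDEX(true), PARTITION_STATS(true), SECONDARY_INDEX(true);

    final boolean needsLatestMergedFileSlices;

    IndexPartition(boolean needsLatestMergedFileSlices) {
      this.needsLatestMergedFileSlices = needsLatestMergedFileSlices;
    }
  }

  /** Describes the input each requested partition would be initialized from. */
  static List<String> initialize(List<IndexPartition> partitions,
                                 Map<String, List<String>> partitionIdToAllFilesMap,
                                 Lazy<List<String>> lazyLatestMergedPartitionFileSliceList) {
    List<String> plan = new ArrayList<>();
    for (IndexPartition partition : partitions) {
      if (partition.needsLatestMergedFileSlices) {
        // The (assumed expensive) merged file slice listing is computed here at most once.
        int sliceCount = lazyLatestMergedPartitionFileSliceList.get().size();
        plan.add(partition + ": " + sliceCount + " file slices");
      } else {
        int fileCount = 0;
        for (List<String> files : partitionIdToAllFilesMap.values()) {
          fileCount += files.size();
        }
        plan.add(partition + ": " + fileCount + " files");
      }
    }
    return plan;
  }

  public static void main(String[] args) {
    Map<String, List<String>> allFiles = new LinkedHashMap<>();
    allFiles.put("2024/01", Arrays.asList("f1.parquet", "f2.parquet"));
    Lazy<List<String>> slices = new Lazy<>(() -> Arrays.asList("slice-1"));

    // Only all-files partitions are requested, so the lazy list is never evaluated.
    System.out.println(initialize(Arrays.asList(IndexPartition.FILES), allFiles, slices));
    System.out.println("file slices evaluated: " + slices.isInitialized());
  }
}
```

In this sketch, the lazy wrapper means that initializing only `FILES`, `BLOOM_FILTERS`, or `COLUMN_STATS` never pays for building the merged file slice list, while several file-slice-based partitions initialized together share a single evaluation.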
