parisni commented on issue #6373: URL: https://github.com/apache/hudi/issues/6373#issuecomment-1218559061
> But as you can imagine, this is going to result in a huge number of file groups in general and puts a lot of pressure on the system.

Do you mean pressure when cleaning, pressure when reading, or in general?

Also, insert produces the same number of file groups, since I am in the case of an append-only table with no new data arriving in the past. In any case, cleaning is much faster without the metadata table, so it would help to be able to configure cleaning to work on disk only.

On August 17, 2022 9:59:31 PM UTC, Sivabalan Narayanan ***@***.***> wrote:

> wrt bulk_insert, I understand cleaning is not going to be of any use, coz every new commit goes into new file groups. Hence there won't be any file groups which will have more file slices which might be eligible for cleaning. But as you can imagine, this is going to result in a huge number of file groups in general and puts a lot of pressure on the system.
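The setup being discussed could be sketched with standard Hudi write configs; this is a minimal, hedged example (exact keys and defaults may vary by Hudi version, and the retention values here are placeholders, not recommendations):

```properties
# Append-only ingestion: bulk_insert writes each commit into new file groups
hoodie.datasource.write.operation=bulk_insert

# Disable the metadata table so the cleaner (and listing) works against
# the file system directly, i.e. "on disk only"
hoodie.metadata.enable=false

# Cleaner policy: keep the latest N commits' file slices
hoodie.cleaner.policy=KEEP_LATEST_COMMITS
hoodie.cleaner.commits.retained=10
```

Note that with bulk_insert on an append-only table, each file group tends to have a single file slice, so the cleaner finds little to reclaim, which is consistent with the point made in the quoted reply.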
