nsivabalan commented on issue #3975: URL: https://github.com/apache/hudi/issues/3975#issuecomment-1246008157
Hey @dmenin, sorry to have dropped the ball on this. I don't think we have anything concrete towards your proposal yet. It might be tricky to make it work for arbitrary partitioning logic: for day-based partitioning we can choose the latest 10 partitions, but if the table is partitioned by product_id or something else, choosing the "latest 10" partitions may not be feasible. One option I can think of is to increase the file size. You can also enable clustering to batch 100MB files into, say, 500MB files and see how your index lookup improves.
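
To make the file-size and clustering suggestion concrete, here is a minimal sketch of the writer options involved. The helper name `hudi_file_sizing_options`, its parameters, and all the numeric values (500MB target, 120MB clustering threshold, clustering every 4 commits) are assumptions for illustration, not tuned recommendations. Please check the config keys against the Hudi version you are running.

```python
# Hypothetical helper: builds Hudi writer options for larger base files
# plus inline clustering. All default values are illustrative assumptions.
MB = 1024 * 1024


def hudi_file_sizing_options(target_file_mb=500,
                             cluster_files_below_mb=120,
                             commits_between_clustering=4):
    """Return a dict of Hudi options as strings, ready to pass to a writer."""
    return {
        # Upper bound on the size of parquet base files written by Hudi.
        "hoodie.parquet.max.file.size": str(target_file_mb * MB),
        # Run clustering inline, as part of the write.
        "hoodie.clustering.inline": "true",
        # Schedule clustering after this many commits.
        "hoodie.clustering.inline.max.commits": str(commits_between_clustering),
        # Files smaller than this are candidates for clustering; set above
        # 100MB here so that existing ~100MB files get picked up.
        "hoodie.clustering.plan.strategy.small.file.limit": str(cluster_files_below_mb * MB),
        # Target size of the files produced by clustering.
        "hoodie.clustering.plan.strategy.target.file.max.bytes": str(target_file_mb * MB),
    }


options = hudi_file_sizing_options()
for key, value in sorted(options.items()):
    print(f"{key}={value}")
```

With PySpark, a dict like this would typically be merged with the usual table options and passed via `df.write.format("hudi").options(**options)`. Comparing index lookup time before and after is the only way to tell whether the larger files help for a given table.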
