szetszwo commented on PR #8243: URL: https://github.com/apache/ozone/pull/8243#issuecomment-2790268291
> ... Thus indirectly the table would be iterated based on the number of sst files as there are on the DB. ...

What are the assumptions behind the performance improvement?
- Is reading multiple files with multiple threads faster than reading them with a single thread?

We have
- an iterator reading files (RocksDB) -> processing the entries (Ozone)

Instead of having multiple threads read files, it is better to have multiple threads process the data. RocksDB itself is already very good at parallelism. It is unlikely that Ozone could use RocksDB's internal details to improve performance.

Also, Ozone should use only the public RocksDB APIs. Code that depends on internals is hard to maintain, and it may even cause silent data corruption.

BTW, you may consider parallelizing your pull requests: submit multiple small PRs instead of a single large one. Then different people can review different PRs at the same time.
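
The split suggested above, a single reader feeding several processing threads, can be sketched roughly as follows. This is only an illustration under stated assumptions: a `TreeMap` iterator stands in for the RocksDB table iterator, and `ParallelProcessingSketch`, `process`, the batch size, and the per-entry work function are hypothetical names and values, not Ozone or RocksDB APIs.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.ToLongFunction;

/** Hypothetical sketch: one thread iterates, a pool processes batches of entries. */
public class ParallelProcessingSketch {

  /**
   * Reads entries sequentially from a single iterator (the stand-in for a
   * table iterator) and hands fixed-size batches to a thread pool.
   * Returns the sum of the per-entry results.
   */
  public static long process(Iterator<Map.Entry<String, String>> iterator,
      int threads, int batchSize,
      ToLongFunction<Map.Entry<String, String>> work)
      throws InterruptedException, ExecutionException {
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    try {
      List<Future<Long>> results = new ArrayList<>();
      List<Map.Entry<String, String>> batch = new ArrayList<>(batchSize);
      // Only this thread touches the iterator.
      while (iterator.hasNext()) {
        batch.add(iterator.next());
        if (batch.size() == batchSize) {
          results.add(submit(pool, batch, work));
          batch = new ArrayList<>(batchSize);
        }
      }
      if (!batch.isEmpty()) {
        results.add(submit(pool, batch, work));
      }
      // Note: the number of pending batches is unbounded here; a real
      // implementation would likely limit how many are in flight.
      long total = 0;
      for (Future<Long> result : results) {
        total += result.get();
      }
      return total;
    } finally {
      pool.shutdown();
    }
  }

  /** Submits one batch; each worker processes only its own batch. */
  private static Future<Long> submit(ExecutorService pool,
      List<Map.Entry<String, String>> batch,
      ToLongFunction<Map.Entry<String, String>> work) {
    Callable<Long> task = () -> {
      long sum = 0;
      for (Map.Entry<String, String> entry : batch) {
        sum += work.applyAsLong(entry);
      }
      return sum;
    };
    return pool.submit(task);
  }

  public static void main(String[] args) throws Exception {
    // Stand-in data; a sorted map mimics ordered key iteration.
    TreeMap<String, String> table = new TreeMap<>();
    for (int i = 0; i < 1000; i++) {
      table.put(String.format("key-%04d", i), "v");
    }
    long total = process(table.entrySet().iterator(), 4, 64,
        entry -> entry.getValue().length());
    System.out.println("total = " + total);
  }
}
```

The reading side stays single-threaded and uses only an ordinary iterator, so nothing depends on how the underlying store lays out its files; only the per-entry work is spread across threads.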
