guozhangwang opened a new pull request #11252: URL: https://github.com/apache/kafka/pull/11252
This is an alternative to #11235, developed in parallel. After several unsuccessful attempts to improve the efficiency of that approach, I've come up with a "slightly" larger change: use a KV store as the shared store, with each value stored as a `list<v>`.

The benefits of this approach are:

1) We only serde once, at the outer metered stores, to compose `<timestamp, byte, key>`, which means fewer byte-array copies.
2) Deletes are straightforward and need no scan reads: a single delete call removes all duplicated values stored under a `<timestamp, byte, key>` key.
3) A KV store has less space amplification than a segmented window store.

The cons, though:

1) Each put call becomes a get-then-write in order to append to the list. We also spend a few more bytes to store the list (most likely a singleton list, and hence just 4 more bytes).
2) It's definitely more complicated. :)

The main idea is that the shared store is actively GC'ed by the expiration logic rather than by time-based retention, and since the key format is `<timestamp, byte, key>`, the range query used for expiration is quite efficient as well.

This is not a final PR: as you can see, I used many quick hacks on serdes etc., and it is only meant to illustrate the idea. I plan to run the benchmarks to see how it behaves compared with the other approach, and if people agree with this approach, I will refine it to be cleaner.

### Committer Checklist (excluded from commit message)
- [ ] Verify design and implementation
- [ ] Verify test coverage and CI build status
- [ ] Verify documentation (including upgrade notes)
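
To make the trade-offs above concrete, here is a minimal in-memory sketch of the idea, not the code in this PR. It assumes a `TreeMap` standing in for the KV store, `String` keys and values, and treats the byte in `<timestamp, byte, key>` as an opaque flag. The names `ListValueStoreSketch`, `CompositeKey`, `put`, `delete` and `expireBefore` are hypothetical and chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for the shared store; a TreeMap plays the role of the KV store.
class ListValueStoreSketch {

    // Composite key ordered by <timestamp, flag byte, record key>.
    static final class CompositeKey implements Comparable<CompositeKey> {
        final long timestamp;
        final byte flag;   // assumption: treated as an opaque byte here
        final String key;

        CompositeKey(long timestamp, byte flag, String key) {
            this.timestamp = timestamp;
            this.flag = flag;
            this.key = key;
        }

        @Override
        public int compareTo(CompositeKey other) {
            int c = Long.compare(timestamp, other.timestamp);
            if (c != 0) return c;
            c = Byte.compare(flag, other.flag);
            if (c != 0) return c;
            return key.compareTo(other.key);
        }
    }

    private final TreeMap<CompositeKey, List<String>> store = new TreeMap<>();

    // Con 1: every put is a get-then-write that appends to the stored list.
    void put(CompositeKey k, String value) {
        List<String> values = store.get(k);
        if (values == null) {
            values = new ArrayList<>();
        }
        values.add(value);
        store.put(k, values);
    }

    // Benefit 2: one delete removes all duplicates under the key, with no scan.
    List<String> delete(CompositeKey k) {
        return store.remove(k);
    }

    // Expiration as a range over the timestamp-prefixed keys:
    // drops every entry with timestamp < cutoff and returns how many values were dropped.
    int expireBefore(long cutoff) {
        CompositeKey bound = new CompositeKey(cutoff, Byte.MIN_VALUE, "");
        Map<CompositeKey, List<String>> expired = store.headMap(bound, false);
        int dropped = 0;
        for (List<String> values : expired.values()) {
            dropped += values.size();
        }
        expired.clear(); // clearing the view removes the entries from the backing map
        return dropped;
    }

    int size() {
        return store.size();
    }
}
```

In this sketch, two puts with the same `<timestamp, byte, key>` end up as one entry holding a two-element list, and `expireBefore` only touches the contiguous key range below the cutoff. A real store would work on serialized bytes, so the ordering would have to hold for the byte encoding of the key as well.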