smjn opened a new pull request, #18157:
URL: https://github.com/apache/kafka/pull/18157

   * We have implemented the share state batch combining logic in 
`PersisterStateBatchCombiner`; it can be broken into two parts: batch pruning 
(based on the start offset) and overlapping batch combination.
   * While batch pruning is a simple O(n) operation, combination is O(n log n) 
with the additional overhead of a `TreeSet` data structure (a simplified sketch 
of both phases follows this list).
   * Callers of the read and write RPCs only receive batches from the 
coordinator when they issue a read state call. Additionally, writes are very 
frequent compared to reads, which happen only on share partition 
initialization.
   * In light of the above points, we can trade memory for better running time.
   * In this PR we have tuned the combine logic as follows:
     * Allow `PersisterStateBatchCombiner` to combine batches with pruning only, 
selected via a boolean flag (illustrative call sites are sketched after this 
list).
     * For share update calls (much more frequent than share snapshots: by 
default, 500 updates per snapshot), batch combining happens with prune only.
     * For share snapshot calls, combining happens with deep merging.
     * For read calls, combining happens with deep merging.
     * For replay calls, share update records are merged with prune only, 
while for snapshot records no merging is required.
     * There is a corner case when we receive a write RPC for the first time 
for a new share partition; here we have chosen prune-only merging.
   * So, with these changes, compute is distributed such that frequent 
operations are fast while infrequent ones bear the burden of the heavy lifting, 
and the cheap pruning is done in all operations to optimize for memory.
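
The sketch below is a much-simplified illustration of the two phases described 
above, not the actual `PersisterStateBatchCombiner` code: class and field names 
such as `StateBatch`, `deliveryState` and `deliveryCount` are assumptions, and 
the real combiner also has to resolve overlaps between batches carrying 
different delivery state.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;

// Illustrative stand-in for a share state batch; the real type lives in the
// share coordinator module and carries more context.
class StateBatch {
    final long firstOffset;
    final long lastOffset;
    final byte deliveryState;
    final short deliveryCount;

    StateBatch(long firstOffset, long lastOffset, byte deliveryState, short deliveryCount) {
        this.firstOffset = firstOffset;
        this.lastOffset = lastOffset;
        this.deliveryState = deliveryState;
        this.deliveryCount = deliveryCount;
    }
}

class BatchCombinerSketch {

    // Phase 1 (O(n) prune): drop batches that end before the start offset and
    // clip batches that straddle it.
    static List<StateBatch> prune(List<StateBatch> batches, long startOffset) {
        List<StateBatch> pruned = new ArrayList<>();
        for (StateBatch b : batches) {
            if (b.lastOffset < startOffset) {
                continue;   // fully below the start offset, nothing to keep
            } else if (b.firstOffset < startOffset) {
                pruned.add(new StateBatch(startOffset, b.lastOffset,
                    b.deliveryState, b.deliveryCount));      // clip the head
            } else {
                pruned.add(b);
            }
        }
        return pruned;
    }

    // Phase 2 (O(n log n) combine): sort batches into a TreeSet by offset and
    // merge adjacent/overlapping batches with identical state. The real logic
    // additionally handles overlapping batches whose delivery state differs;
    // that detail is omitted here.
    static List<StateBatch> combine(List<StateBatch> batches) {
        TreeSet<StateBatch> sorted = new TreeSet<>(
            Comparator.comparingLong((StateBatch b) -> b.firstOffset)
                .thenComparingLong(b -> b.lastOffset));
        sorted.addAll(batches);

        List<StateBatch> merged = new ArrayList<>();
        StateBatch current = null;
        for (StateBatch b : sorted) {
            boolean sameState = current != null
                && b.deliveryState == current.deliveryState
                && b.deliveryCount == current.deliveryCount;
            if (current != null && sameState && b.firstOffset <= current.lastOffset + 1) {
                current = new StateBatch(current.firstOffset,
                    Math.max(current.lastOffset, b.lastOffset),
                    current.deliveryState, current.deliveryCount);
            } else {
                if (current != null) merged.add(current);
                current = b;
            }
        }
        if (current != null) merged.add(current);
        return merged;
    }

    // Entry point with the boolean flag from the description above: prune-only
    // mode for the cheap path, prune followed by deep merge otherwise.
    static List<StateBatch> combineStateBatches(List<StateBatch> batches,
                                                long startOffset,
                                                boolean pruneOnly) {
        List<StateBatch> pruned = prune(batches, startOffset);
        return pruneOnly ? pruned : combine(pruned);
    }
}
```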
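
And a hypothetical sketch of how the different call paths might pick the flag, 
building on the `BatchCombinerSketch` above (method names are illustrative 
only, not the actual share coordinator API):

```java
import java.util.List;

class CombinerCallSitesSketch {

    // Frequent path: share update writes stay O(n) with prune-only combining.
    static List<StateBatch> onShareUpdateWrite(List<StateBatch> batches, long startOffset) {
        return BatchCombinerSketch.combineStateBatches(batches, startOffset, true);
    }

    // Infrequent path: share snapshots (by default one per 500 updates) pay
    // for the full deep merge.
    static List<StateBatch> onShareSnapshot(List<StateBatch> batches, long startOffset) {
        return BatchCombinerSketch.combineStateBatches(batches, startOffset, false);
    }

    // Reads happen only on share partition initialization, so deep merging is
    // acceptable here as well.
    static List<StateBatch> onReadState(List<StateBatch> batches, long startOffset) {
        return BatchCombinerSketch.combineStateBatches(batches, startOffset, false);
    }
}
```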

