[ https://issues.apache.org/jira/browse/KUDU-613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17848725#comment-17848725 ]
Ben Manes commented on KUDU-613:
--------------------------------

[~mreddy] if you are able to capture workload traces (the key hash on each cache query, plus the item size if it varies), then we can run them through a [simulator|https://github.com/ben-manes/caffeine/wiki/Simulator]. That would give you a rough expectation of the hit rate, the ability to try tuning options, etc. Let me know if I can help.

The authors of SLRU ([paper|https://ieeexplore.ieee.org/document/268884]) recommended a 20/80 split, which works well in practice. ARC improved on that by adapting the region sizes. TinyLFU provided a better promotion mechanism, and hill-climbing W-TinyLFU added adaptivity in a more robust fashion. Those enhancements rely on maintaining historic knowledge, whereas SLRU is quite nice as a base structure without history. One can of course extend or modify SLRU to use CLOCK regions instead for concurrency (to avoid locking on access), so there is a lot of room to play based on your goals.

> Scan-resistant cache replacement algorithm for the block cache
> --------------------------------------------------------------
>
>                 Key: KUDU-613
>                 URL: https://issues.apache.org/jira/browse/KUDU-613
>             Project: Kudu
>          Issue Type: Improvement
>          Components: perf
>    Affects Versions: M4.5
>            Reporter: Andrew Wang
>            Assignee: Mahesh Reddy
>            Priority: Major
>              Labels: performance, roadmap-candidate
>
> The block cache currently uses LRU, which is vulnerable to large scan
> workloads. It'd be good to implement something like 2Q.
> ARC (patent encumbered, but good for ideas):
> https://www.usenix.org/conference/fast-03/arc-self-tuning-low-overhead-replacement-cache
> HBase (2Q like):
> https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/LruBlockCache.java

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
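
To illustrate the SLRU structure discussed in the comment above, here is a minimal single-threaded sketch in Python. It is not Kudu's or Caffeine's implementation. The names `SLRUCache` and `protected_fraction` are hypothetical, and it assumes the larger (80%) share of the 20/80 split goes to the protected segment, which the comment does not state.

```python
from collections import OrderedDict


class SLRUCache:
    """Segmented LRU sketch: new keys enter probation, a hit promotes them."""

    def __init__(self, capacity, protected_fraction=0.8):
        if capacity < 2:
            raise ValueError("capacity must be at least 2")
        # Assumption: the 80% share is the protected segment, 20% is probation.
        self.protected_cap = min(capacity - 1,
                                 max(1, int(capacity * protected_fraction)))
        self.probation_cap = capacity - self.protected_cap
        self.probation = OrderedDict()  # least recently used entry first
        self.protected = OrderedDict()  # least recently used entry first

    def get(self, key):
        if key in self.protected:
            self.protected.move_to_end(key)  # refresh recency
            return self.protected[key]
        if key in self.probation:
            value = self.probation.pop(key)
            self._promote(key, value)  # second access: move to protected
            return value
        return None  # miss

    def put(self, key, value):
        if key in self.protected:
            self.protected[key] = value
            self.protected.move_to_end(key)
        elif key in self.probation:
            # Design choice in this sketch: a re-put counts as an access.
            self.probation.pop(key)
            self._promote(key, value)
        else:
            self._insert_probation(key, value)

    def _promote(self, key, value):
        self.protected[key] = value
        if len(self.protected) > self.protected_cap:
            # Demote the protected LRU entry to probation rather than drop it.
            old_key, old_value = self.protected.popitem(last=False)
            self._insert_probation(old_key, old_value)

    def _insert_probation(self, key, value):
        self.probation[key] = value
        self.probation.move_to_end(key)
        if len(self.probation) > self.probation_cap:
            self.probation.popitem(last=False)  # evict the probation LRU entry
```

With a capacity of 10, a key that has been hit once sits in the protected segment and survives a scan of 100 one-off keys, because the scan only churns the two probation slots. A plain LRU of the same size would have evicted it.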