[ https://issues.apache.org/jira/browse/KAFKA-8027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17169189#comment-17169189 ]
Guozhang Wang commented on KAFKA-8027:
--------------------------------------

We encountered similar issues in our benchmarks, which are based on recent Kafka versions as well. Looking at the profiler graph, there are three big buckets:

1) byte-buffer allocation for concatenating the segmented key from the raw key / timestamp. ~10%
2) synchronization on the cache layer to access the cache and get the iterator. ~20%
3) putting all the range keys into a tree-map (i.e. a putAll will be called) before iterating them, to achieve thread safety. ~60%

Among those, I've had some ideas to optimize 1), and am still digging into how to make 2) / 3) less costly. I will try to prepare a PR in our benchmarks and post the results here.

> Gradual decline in performance of CachingWindowStore provider when number of
> keys grow
> --------------------------------------------------------------------------------------
>
>                 Key: KAFKA-8027
>                 URL: https://issues.apache.org/jira/browse/KAFKA-8027
>             Project: Kafka
>          Issue Type: Bug
>          Components: streams
>    Affects Versions: 2.1.0
>            Reporter: Prashant
>            Priority: Major
>              Labels: interactivequ, kafka-streams
>
> We observed this during a performance test of our streams application, which
> tracks users' activity and provides a REST interface to query the window state
> store. We used the default configuration of Materialized, i.e. withCachingEnabled,
> for storing user behaviour stats in a window state store
> (CompositeWindowStore with CachingWindowStore as the underlying store, which
> internally uses RocksDBStore for persistence).
> While querying the window store with store.fetch(key, long, long), it internally
> tries to fetch the range from ThreadCache, which uses a byte iterator to
> search for a key in the cache, and on a cache miss it goes to RocksDBStore for
> the result. So, when the number of keys in the cache becomes large, this ThreadCache
> search starts taking time (range iterator on all keys), which impacts
> WindowStore query performance.
>
> Workaround: If we disable the cache with the switch on the Materialized
> instance, i.e. withCachingDisabled, key search is delegated directly to
> RocksDBStore, which is much faster and completes the search in microseconds,
> against milliseconds in the case of CachingWindowStore.
>
> Stats: With unique users > 0.5M, random search for a key, i.e. UserId:
>
> withCachingEnabled: 40 < t < 80ms (upper bound increases as unique users grow)
> withCachingDisabled: t < 1ms (almost constant time)

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
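
To make bucket 3) from the comment more concrete, here is a minimal sketch of the general "copy the range into a tree-map, then iterate" pattern, next to iterating a sorted view directly. It is a simplified model using only java.util collections, not Kafka's ThreadCache code: the class RangeScanSketch, its methods, the String keys, and the "user-%07d" key layout are all hypothetical names chosen for illustration.

```java
import java.util.NavigableMap;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentSkipListMap;

// Simplified model only; this is NOT the Kafka Streams ThreadCache implementation.
final class RangeScanSketch {

    // Hypothetical stand-in for a cache: sorted keys mapped to values.
    static NavigableMap<String, Long> buildCache(int users) {
        NavigableMap<String, Long> cache = new ConcurrentSkipListMap<>();
        for (int i = 0; i < users; i++) {
            cache.put(String.format("user-%07d", i), (long) i);
        }
        return cache;
    }

    // Pattern from bucket 3): take a lock, copy every key in the range into a
    // fresh TreeMap with putAll, then iterate the private snapshot.
    // The copy costs time and memory proportional to the size of the range.
    static long sumViaCopy(NavigableMap<String, Long> cache, String from, String to) {
        TreeMap<String, Long> snapshot = new TreeMap<>();
        synchronized (cache) {
            snapshot.putAll(cache.subMap(from, true, to, true));
        }
        long sum = 0;
        for (long value : snapshot.values()) {
            sum += value;
        }
        return sum;
    }

    // Contrast: iterate the sub-map view directly, with no intermediate copy.
    // With a ConcurrentSkipListMap the iterator is weakly consistent, so it
    // never throws ConcurrentModificationException, but it may or may not
    // reflect concurrent updates made during the scan.
    static long sumViaView(NavigableMap<String, Long> cache, String from, String to) {
        long sum = 0;
        for (long value : cache.subMap(from, true, to, true).values()) {
            sum += value;
        }
        return sum;
    }

    public static void main(String[] args) {
        NavigableMap<String, Long> cache = buildCache(1000);
        System.out.println(sumViaCopy(cache, "user-0000010", "user-0000019"));
        System.out.println(sumViaView(cache, "user-0000010", "user-0000019"));
    }
}
```

Both methods return the same result on a quiescent map; the difference is that the first one pays for a full copy of the range on every query, which is the kind of cost the profiler attributed to the putAll call. Whether a copy-free view is acceptable depends on the consistency guarantees the caller needs, which this sketch does not try to settle.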