[
https://issues.apache.org/jira/browse/KAFKA-8027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17169189#comment-17169189
]
Guozhang Wang commented on KAFKA-8027:
--------------------------------------
We encountered similar issues in our benchmarks, which are based on recent Kafka
versions as well. Looking at the profiler graph, there are three big buckets:
1) byte-buffer allocation for concatenating the segmented key from raw key /
timestamp. ~10%
2) synchronization on the cache layer to access cache to get the iterator. ~20%
3) putting all the range keys into a tree-map (i.e. a putAll will be called)
before iterating them to achieve thread safety. ~60%
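To make buckets 2) and 3) concrete, here is a minimal sketch of the pattern, assuming a cache that takes a lock and copies the requested range into a new TreeMap before handing it to the caller. RangeSnapshotSketch and its methods are hypothetical names for illustration only; this is not the actual ThreadCache code.

```java
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical stand-in for a thread-safe cache; NOT the Kafka Streams ThreadCache.
final class RangeSnapshotSketch {
    private final NavigableMap<String, Long> entries = new TreeMap<>();

    synchronized void put(String key, long value) {
        entries.put(key, value);
    }

    // Takes the lock (bucket 2), then copies every entry in [from, to] into a
    // fresh TreeMap via putAll (bucket 3), so the caller can iterate the result
    // without holding the lock. The copy cost grows with the size of the range.
    synchronized NavigableMap<String, Long> snapshotRange(String from, String to) {
        NavigableMap<String, Long> copy = new TreeMap<>();
        copy.putAll(entries.subMap(from, true, to, true));
        return copy;
    }

    public static void main(String[] args) {
        RangeSnapshotSketch cache = new RangeSnapshotSketch();
        for (int i = 0; i < 5; i++) {
            cache.put("user-" + i, i);
        }
        NavigableMap<String, Long> range = cache.snapshotRange("user-1", "user-3");
        cache.put("user-2", 99L); // a later write does not change the snapshot
        for (Map.Entry<String, Long> e : range.entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue());
        }
    }
}
```

The copy is what buys safe iteration, and it is also why the cost scales with the number of keys in the range rather than with the number of keys actually read.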
Among those, I have some ideas for optimizing 1), and I am still digging into
how to make 2) / 3) less costly. I will try to prepare a PR, test it in our
benchmarks, and post the results here.
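For reference, bucket 1) can be pictured with the sketch below, which builds a composite key by allocating a new ByteBuffer on every call. The layout (raw key bytes followed by an 8-byte timestamp) and the names WindowKeySketch, toCompositeKey and extractTimestamp are assumptions made for illustration; the actual key schema in Kafka Streams may differ.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration of building a composite key; NOT the Kafka Streams key schema.
final class WindowKeySketch {
    static final int TIMESTAMP_SIZE = Long.BYTES;

    // Allocates a new buffer per call: raw key bytes, then an 8-byte big-endian timestamp.
    // On a hot path, this per-lookup allocation is the cost described in bucket 1).
    static byte[] toCompositeKey(byte[] rawKey, long timestamp) {
        ByteBuffer buf = ByteBuffer.allocate(rawKey.length + TIMESTAMP_SIZE);
        buf.put(rawKey);
        buf.putLong(timestamp);
        return buf.array();
    }

    // Reads the timestamp back from the last 8 bytes of the composite key.
    static long extractTimestamp(byte[] compositeKey) {
        return ByteBuffer.wrap(compositeKey).getLong(compositeKey.length - TIMESTAMP_SIZE);
    }

    public static void main(String[] args) {
        byte[] rawKey = "user-42".getBytes(StandardCharsets.UTF_8);
        byte[] composite = toCompositeKey(rawKey, 1_596_240_000_000L);
        System.out.println("composite key length: " + composite.length);
        System.out.println("timestamp: " + extractTimestamp(composite));
    }
}
```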
> Gradual decline in performance of CachingWindowStore provider when number of
> keys grow
> --------------------------------------------------------------------------------------
>
> Key: KAFKA-8027
> URL: https://issues.apache.org/jira/browse/KAFKA-8027
> Project: Kafka
> Issue Type: Bug
> Components: streams
> Affects Versions: 2.1.0
> Reporter: Prashant
> Priority: Major
> Labels: interactivequ, kafka-streams
>
> We observed this during a performance test of our streams application, which
> tracks users' activity and provides a REST interface to query the window state
> store. We used the default configuration of Materialized, i.e.
> withCachingEnabled, for storing user behaviour stats in a window state store
> (CompositeWindowStore with CachingWindowStore as the underlying store, which
> internally uses RocksDBStore for persistence).
> When querying the window store with store.fetch(key, long, long), it internally
> tries to fetch the range from ThreadCache, which uses a byte iterator to
> search for the key in the cache; on a cache miss it goes to RocksDBStore for
> the result. So, when the number of keys in the cache becomes large, this
> ThreadCache search (a range iterator over all keys) starts taking longer,
> which impacts WindowStore query performance.
>
> Workaround: If we disable the cache with the switch on the Materialized
> instance, i.e. withCachingDisabled, the key search is delegated directly to
> RocksDBStore, which is much faster and completes the search in microseconds,
> versus milliseconds with CachingWindowStore.
>
> Stats: With unique users > 0.5M, random search for a key, i.e. UserId:
>
> withCachingEnabled: 40 < t < 80ms (upper bound increases as unique users
> grow)
> withCachingDisabled: t < 1ms (almost constant time)