[
https://issues.apache.org/jira/browse/KAFKA-4212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17497236#comment-17497236
]
Zhongda Zhao commented on KAFKA-4212:
-------------------------------------
We had hoped to use {{Materialized.withRetention}} on a normal
{{KeyValueStore}} to get a compacted change-log topic with a matching
{{retention.ms}} and a RocksDB instance with a matching TTL in seconds.
Our use case: we want to use kafka-streams to count a huge number of records,
based on different key combinations, but on a monthly basis. Because months
differ in length, we did not use time windows, whose size is fixed (we would
also have to deal with out-of-order records without a deterministic grace
period). Our current workaround is to make the year-month of the event time
part of the group key. For interactive queries we just do a prefix scan. This
worked for our purpose until we found out that {{withRetention}} is not
applicable to a normal {{KeyValueStore}} and that the change-log topic doesn't
have a matching retention either (the latter might be possible with the
Processor API).
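The workaround described above can be sketched roughly as follows. This is only an illustration, not the actual topology: a {{TreeMap}} stands in for the sorted state store, and the class name, the method names, the {{|}} separator and the use of UTC are all assumptions made for the example.

```java
import java.time.Instant;
import java.time.YearMonth;
import java.time.ZoneOffset;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical illustration; a TreeMap stands in for the sorted state store.
class MonthlyCounts {
    private final TreeMap<String, Long> store = new TreeMap<>();

    // Prefix the group key with the event's year-month, e.g. "2022-02|user-a".
    // UTC and the '|' separator are assumptions made for this sketch.
    static String compositeKey(long eventTimeMs, String groupKey) {
        YearMonth month = YearMonth.from(
                Instant.ofEpochMilli(eventTimeMs).atZone(ZoneOffset.UTC));
        return month + "|" + groupKey;
    }

    // Increment the count for this group key within the event's month.
    void count(long eventTimeMs, String groupKey) {
        store.merge(compositeKey(eventTimeMs, groupKey), 1L, Long::sum);
    }

    // Prefix scan: every key from the prefix (inclusive) up to the prefix
    // followed by the largest char value (exclusive).
    SortedMap<String, Long> scanMonth(YearMonth month) {
        String prefix = month + "|";
        return store.subMap(prefix, prefix + Character.MAX_VALUE);
    }

    public static void main(String[] args) {
        MonthlyCounts counts = new MonthlyCounts();
        long feb = Instant.parse("2022-02-24T10:00:00Z").toEpochMilli();
        long mar = Instant.parse("2022-03-01T00:00:00Z").toEpochMilli();
        counts.count(feb, "user-a");
        counts.count(feb, "user-a");
        counts.count(feb, "user-b");
        counts.count(mar, "user-a");
        // prints {2022-02|user-a=2, 2022-02|user-b=1}
        System.out.println(counts.scanMonth(YearMonth.of(2022, 2)));
    }
}
```

Because the month is part of the key, a late record simply lands in its own month's bucket, so no grace period is needed. The cost is that nothing in this scheme ever expires old months, which is the gap described above.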
That said, we would prefer a Kafka-level "retention" setting that keeps the
change-log topic and the underlying store consistent. Implementation details
such as RocksDB's TTL in seconds can stay hidden.
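To make "consistent settings" concrete, here is a minimal sketch of how both values could be derived from a single retention {{Duration}}. The class and method names are hypothetical, and combining {{cleanup.policy=compact,delete}} with {{retention.ms}} is an assumption about how such a change-log topic would be configured. A map like this could be passed to {{Materialized.withLoggingEnabled}}, but that alone does not make the store itself expire anything.

```java
import java.time.Duration;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: derives both settings from one retention value.
class RetentionSettings {

    // Topic-level overrides for the change-log topic.
    // "compact,delete" is an assumption: with plain "compact",
    // retention.ms would not remove old records.
    static Map<String, String> changelogConfig(Duration retention) {
        Map<String, String> config = new LinkedHashMap<>();
        config.put("cleanup.policy", "compact,delete");
        config.put("retention.ms", Long.toString(retention.toMillis()));
        return config;
    }

    // The same retention expressed as a TTL in whole seconds.
    // Throws ArithmeticException if the value does not fit in an int.
    static int ttlSeconds(Duration retention) {
        return Math.toIntExact(retention.getSeconds());
    }

    public static void main(String[] args) {
        // 62 days is an arbitrary example value covering two long months.
        Duration retention = Duration.ofDays(62);
        System.out.println(changelogConfig(retention));
        System.out.println(ttlSeconds(retention));
    }
}
```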
Suggestions for alternative solutions to our use case are more than welcome.
> Add a key-value store that is a TTL persistent cache
> ----------------------------------------------------
>
> Key: KAFKA-4212
> URL: https://issues.apache.org/jira/browse/KAFKA-4212
> Project: Kafka
> Issue Type: Improvement
> Components: streams
> Affects Versions: 0.10.0.1
> Reporter: Elias Levy
> Priority: Major
> Labels: api
>
> Some jobs need to maintain as state a large set of key-values for some
> period of time. I.e. they need to maintain a TTL cache of values potentially
> larger than memory.
> Currently Kafka Streams provides non-windowed and windowed key-value stores.
> Neither is an exact fit to this use case.
> The {{RocksDBStore}}, a {{KeyValueStore}}, stores one value per key as
> required, but does not support expiration. The TTL option of RocksDB is
> explicitly not used.
> The {{RocksDBWindowStore}}, a {{WindowStore}}, can expire items via segment
> dropping, but it stores multiple items per key, based on their timestamp.
> But this store can be repurposed as a cache by fetching the items in reverse
> chronological order and returning the first item found.
> KAFKA-2594 introduced a fixed-capacity in-memory LRU caching store, but here
> we desire a variable-capacity memory-overflowing TTL caching store.
> Although {{RocksDBWindowStore}} can be repurposed as a cache, it would be
> useful to have an official and proper TTL cache API and implementation.