Mikael,

by "growing out of bounds" we refer to the fact, that the changelog
encodes the keys as pair of <record-key,window-id>. Thus, over time as
we create more and more window, storage requirement grows and grows and
will eventually hit a wall. How fast this happens, depends mainly on
your window advance time (and number of distinct keys as we have windows
for each unique key).
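
For a rough feeling (numbers made up for illustration): with a window
advance of 1 minute and 1,000 distinct keys, a single day already
creates 1,440 * 1,000 = 1,440,000 windowed keys.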

To avoid infinite store growth, we apply a retention period (default 1
day; you can set it via the Windows#until() method) and drop old windows
after this time. This dropping applies to the state store as well as to
the changelog topic (changelog topics can have an additional retention
time, similar to the topic retention time of non-changelog topics).
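
As a sketch against the 0.10.1 API (topic and store names are made up):

import java.util.concurrent.TimeUnit;
import org.apache.kafka.streams.kstream.*;

KStreamBuilder builder = new KStreamBuilder();
KStream<String, String> input = builder.stream("input-topic");

// 1-hour tumbling windows, retained for 7 days instead of the 1-day default
KTable<Windowed<String>, Long> counts = input
    .groupByKey()
    .count(TimeWindows.of(TimeUnit.HOURS.toMillis(1))
                      .until(TimeUnit.DAYS.toMillis(7)),
           "hourly-counts");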


For a single window, the storage size is not too large: it is the number
of serialized bytes for the record key, plus the windowId (basically a
timestamp, ie, 8 bytes for a long), plus the window aggregate.
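
For a rough number (sizes assumed for illustration): with a 20-byte
serialized record key and an 8-byte Long aggregate, a single window
costs about 20 + 8 + 8 = 36 bytes, plus some per-entry overhead.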

> For example, if data is received at a rate of 1 message per second and
> messages are aggregated to a single key using a tumbling window of 1 hour,
> would the size of the compacted changelog (and window store) be 24 records
> after 24 hours?

Basically yes (but per key! -- if you have a single unique key it will
be just 24 windows, as you suggested). After compaction has kicked in,
there will be only one record per window (per key) left. So the final
size of the changelog is independent of your input rate -- only keep in
mind that if your input rate is high and compaction has not happened
yet, you might temporarily need much more space. (How often compaction
is triggered can be configured on the brokers.)
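
For example, the topic-level settings min.cleanable.dirty.ratio and
segment.ms influence how quickly the log cleaner picks up the changelog
(just as a pointer; the defaults are usually fine).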


About segments and querying: yes, a segment is a RocksDB instance. We
use segments to efficiently drop old windows (ie, we drop a whole
segment once we know that we do not need any of its windows anymore --
as long as a single window in a segment has not expired, we keep the
whole segment). Thus, segments store windows based on non-overlapping
time ranges, and the window start time determines the segment a window
belongs to.
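
To illustrate the idea (a simplified sketch, not the actual internal
code; numbers assumed):

// e.g., 1-day retention split over 3 segments (the default)
static final long RETENTION_MS = 24 * 60 * 60 * 1000L;
static final int NUM_SEGMENTS = 3;
static final long SEGMENT_INTERVAL_MS = RETENTION_MS / (NUM_SEGMENTS - 1);

// each window start timestamp maps to exactly one segment
static long segmentId(final long windowStartMs) {
    return windowStartMs / SEGMENT_INTERVAL_MS;
}

// expiring old data = dropping every segment with
// segmentId < segmentId(streamTime - RETENTION_MS)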

For querying, this is not a performance issue, because we require you to
specify the timestamp and can map it to the corresponding segments.
Thus, we only access those RocksDB instances that cover the time range
of your query. Of course, if your query's time range is large, we need
to access multiple RocksDB instances -- but for "point queries" only a
single one.
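
For example, with interactive queries (a minimal sketch; "streams" is a
running KafkaStreams instance and the store name is taken from the
example above):

import org.apache.kafka.streams.state.QueryableStoreTypes;
import org.apache.kafka.streams.state.ReadOnlyWindowStore;
import org.apache.kafka.streams.state.WindowStoreIterator;

ReadOnlyWindowStore<String, Long> store =
    streams.store("hourly-counts", QueryableStoreTypes.<String, Long>windowStore());

// a "point query": the range covers a single window start,
// so only one segment (one RocksDB instance) is touched
long hourStart = (System.currentTimeMillis() / 3600000L) * 3600000L;
WindowStoreIterator<Long> iter = store.fetch("some-key", hourStart, hourStart);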


-Matthias


On 12/14/16 1:46 PM, Mikael Högqvist wrote:
> Hi Matthias,
> 
> kind of :)
> 
> I'm interested in the retention mechanisms and my use case is to keep old
> windows around for a long time (up to a year or longer) and access them via
> interactive queries. As I understand from the documentation, the retention
> mechanism is used to avoid changelogs from "growing out of bounds". This is
> a bit unclear to me, what are the storage costs from using a window store?
> For example, if data is received at a rate of 1 message per second and
> messages are aggregated to a single key using a tumbling window of 1 hour,
> would the size of the compacted changelog (and window store) be 24 records
> after 24 hours?
> 
> Are there other potential tradeoffs when using the window store with a long
> retention? E.g., looking at the rocksdb implementation, there is something
> called a segment which seems to correspond to a single rocksdb instance.
> Does that have an effect on querying?
> 
> Best,
> Mikael
> 
> 
> On Wed, Dec 14, 2016 at 6:44 PM Matthias J. Sax <matth...@confluent.io>
> wrote:
> 
> I am not sure if I can follow.
> 
> However, in Kafka Streams using window aggregation, the windowed KTable
> uses a key-value store internally -- it's only called windowed store
> because it encodes the key for the store as pair of
> <record-key:windowId> and also applies a couple of other mechanism with
> regard to retention time to delete old windows.
> 
> Does this answer your question?
> 
> 
> -Matthias
> 
> On 12/14/16 6:46 AM, Mikael Högqvist wrote:
>> Hi,
>>
>> I'm wondering about the tradeoffs when implementing a tumbling window with
>> a long retention, e.g. 1 year. Is it better to use a normal key value
>> store
>> and aggregate the time bucket using a group by instead of a window store?
>>
>> Best,
>> Mikael
>>
> 
