Hi Upesh,

The answers to your questions are:

1.
The configs cleanup.policy and retention.ms are topic configs. Hence, they only affect the changelog topic of a state store, not the local state store in a Kafka Streams client.

Locally, window and session stores remove data they do not need anymore. Window and session stores are segmented stores. That means they consist of segments that are ordered by the windows they contain. Once the segment that contains the oldest windows is not needed anymore, i.e., the data exceeded the retention time of the state store, the segment is removed.
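
To make the segment mechanics above more concrete, here is a minimal toy model in plain Java. It is only a sketch of the idea, not the actual Kafka Streams implementation: the class name SegmentedStoreSketch, the string key encoding, and the parameters retentionMs and segmentIntervalMs are assumptions made up for illustration. It keeps segments ordered by the window start times they cover and drops a whole segment once it lies entirely behind the retention boundary.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

/** Toy model of a segmented window store. Hypothetical; not Kafka Streams code. */
public class SegmentedStoreSketch {
    private final long retentionMs;        // assumed name: retention time of the store
    private final long segmentIntervalMs;  // assumed name: time span covered by one segment

    // Segments keyed by segment id, ordered from oldest to newest.
    private final TreeMap<Long, Map<String, String>> segments = new TreeMap<>();

    // Highest window start seen so far (timestamps assumed to be non-negative).
    private long observedStreamTime = 0L;

    public SegmentedStoreSketch(long retentionMs, long segmentIntervalMs) {
        this.retentionMs = retentionMs;
        this.segmentIntervalMs = segmentIntervalMs;
    }

    public void put(String key, String value, long windowStartMs) {
        observedStreamTime = Math.max(observedStreamTime, windowStartMs);

        // Oldest segment id that may still hold data within the retention time.
        long minLiveSegment = Math.floorDiv(observedStreamTime - retentionMs, segmentIntervalMs);
        long segmentId = Math.floorDiv(windowStartMs, segmentIntervalMs);

        // Records that belong to an already expired segment are not stored.
        if (segmentId >= minLiveSegment) {
            segments.computeIfAbsent(segmentId, id -> new HashMap<>())
                    .put(key + "@" + windowStartMs, value);
        }

        // Remove every segment that is older than the oldest live segment.
        segments.headMap(minLiveSegment, false).clear();
    }

    public String get(String key, long windowStartMs) {
        Map<String, String> segment = segments.get(Math.floorDiv(windowStartMs, segmentIntervalMs));
        return segment == null ? null : segment.get(key + "@" + windowStartMs);
    }

    public int segmentCount() {
        return segments.size();
    }
}
```

Note that in this sketch data is removed per segment, not per record, so a record can stay around somewhat longer than the retention time until its whole segment has expired.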

Non-windowed state stores do not remove data on their own.

Worth noting here: If you change retention.ms directly on the brokers, it will not affect the behavior of local state stores.

2.
Yes, this behavior is the same for in-memory state stores and persistent state stores.

3.
Window and session state stores do remove data, as described under 1.


Best,
Bruno



On 18.04.21 18:18, Upesh Desai wrote:
Hello, I have not been able to find a concrete answer on if/how state stores on a running Kafka Streams instance remove data once it has passed the configured retention.ms. So a couple of clarifying questions:

 1. If the stores are configured with: cleanup.policy=compact,delete AND
    retention.ms=N, will the stores remove data automatically over time
    in the running stream instance stores?
 2. Is this behavior the same for in-memory stores and persistent
    RocksDB stores?
 3. If they do not remove data that has passed the retention.ms period,
    is there a different way to periodically remove old data from the
    stores?

I’m using Kafka 2.7.0 components across the board (broker, connect, etc.).

Thanks in advance,
Upesh

Upesh Desai​
Senior Software Developer

*ude...@itrsgroup.com* <mailto:ude...@itrsgroup.com>
*www.itrsgroup.com* <https://www.itrsgroup.com/>
