[ 
https://issues.apache.org/jira/browse/KAFKA-5256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16013159#comment-16013159
 ] 

Guozhang Wang commented on KAFKA-5256:
--------------------------------------

Thanks for the explanation. I agree that ideally we should remove the local 
files before replaying the changelog topic from scratch, since the state of 
the local files is "unknown".
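
A minimal sketch of that idea follows, for illustration only; it is not the 
actual Kafka Streams code. The class name, the method name, and the 
assumption that a checkpoint is a file named ".checkpoint" in the task 
directory are all hypothetical here. If no checkpoint exists, the store 
directory is deleted and recreated empty so that restoring from the changelog 
starts from a known state.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

final class StoreRestorePrep {
    // Assumed checkpoint file name; hypothetical for this sketch.
    static final String CHECKPOINT_FILE = ".checkpoint";

    /**
     * Wipes the local store directory when the task directory has no
     * checkpoint file. Returns true if the store was wiped and recreated,
     * false if a checkpoint exists and the store was left untouched.
     */
    static boolean wipeIfNotCheckpointed(Path taskDir, String storeName) throws IOException {
        Path checkpoint = taskDir.resolve(CHECKPOINT_FILE);
        Path storeDir = taskDir.resolve(storeName);

        if (Files.exists(checkpoint)) {
            // A checkpoint exists, so the local contents are at a known offset.
            return false;
        }

        if (Files.exists(storeDir)) {
            // Delete children before parents by visiting paths in reverse order.
            try (Stream<Path> paths = Files.walk(storeDir)) {
                paths.sorted(Comparator.reverseOrder()).forEach(p -> {
                    try {
                        Files.delete(p);
                    } catch (IOException e) {
                        throw new UncheckedIOException(e);
                    }
                });
            }
        }

        // Recreate an empty directory so the restore can start from scratch.
        Files.createDirectories(storeDir);
        return true;
    }
}
```

A caller would invoke this once per store before starting to replay the 
changelog; when it returns true, the replay has to begin from the earliest 
available offset.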

Regarding the general issue of an application being down for longer than the 
tombstone retention period: that is an interesting question. Generally 
speaking, I think log compaction should not go beyond the smallest 
corresponding checkpoint (note that different instances may be fetching from 
the changelog at different times due to task migration). I think this issue 
is itself worth further discussion on how to resolve it.

> Non-checkpointed state stores should be deleted before restore
> --------------------------------------------------------------
>
>                 Key: KAFKA-5256
>                 URL: https://issues.apache.org/jira/browse/KAFKA-5256
>             Project: Kafka
>          Issue Type: Bug
>          Components: streams
>    Affects Versions: 0.10.2.1
>            Reporter: Tommy Becker
>
> Currently, Kafka Streams will re-use an existing state store even if there is 
> no checkpoint for it. This seems both inefficient (because duplicate inserts 
> can be made on restore) and incorrect (records which have been deleted from 
> the backing topic may still exist in the store). Since the contents of a 
> store with no checkpoint are unknown, the best way to proceed would be to 
> delete the store and recreate before restoring.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
