Elias Levy created KAFKA-4683: --------------------------------- Summary: Mismatch between Stream windowed store and broker log retention logic Key: KAFKA-4683 URL: https://issues.apache.org/jira/browse/KAFKA-4683 Project: Kafka Issue Type: Bug Components: log, streams Affects Versions: 0.10.1.1 Reporter: Elias Levy
The RocksDBWindowStore keeps key-value entries for a configurable retention period. The leading edge of the time period kept is determined the newest timestamp of an inserted KV. The trailing edge is this leading edge minus the requested retention period. If logging is enabled, changes to the store are written to a change log topic that is configured with a retention.ms value equal to the store retention period. The leading edge of the time period kept by the log is the current time. The trailing edge is the leading edge minus the requested retention period. The difference on how the leading edge is determined can result in unexpected behavior. If the stream application is processing data older than the retention period and storing it in a windowed store, the store will have data for the retention period looking back from the newest timestamp of the processed message. But the messages written to the state changeling will almost immediately be deleted by the broker, as they will fall outside of the retention window as it computes it. If the application is stopped and restarted in this state, and if the local state has been lost of some reason, the application won't be able to recover the sate from the broker, as it broker has deleted it. In addition, I've noticed that there is a discrepancy on what timestamp is used between the store and the change log. The store will use the timestamp passed as an argument to {{put}}, or if no timestamp is passed, fallback to {{context.timestamp}}. But {{StoreChangeLogger.logChange}} does not take a timestamp. Instead is always uses {{context.timestamp}} to write the change to the broker. Thus it is possible that the state store and the change log to use different timestamps for the same KV. -- This message was sent by Atlassian JIRA (v6.3.4#6332)