Re: Log segment deletion

2018-01-30 Thread Guozhang Wang
Hi Martin, That is a good point. In fact, in the coming release we have made such repartition topics really "transient" by periodically purging them with the embedded admin client, so we can actually set their retention to -1: https://cwiki.apache.org/confluence/display/KAFKA/KIP-220%3A+Add+AdminClien

Re: Log segment deletion

2018-01-30 Thread Martin Kleppmann
Hi Guozhang, Thanks very much for your reply. I am inclined to consider this a bug, since Kafka Streams in the default configuration is likely to run into this problem while reprocessing old messages, and in most cases the problem wouldn't be noticed (since there is no error -- the job just pro

Re: Log segment deletion

2018-01-29 Thread Guozhang Wang
Hello Martin, What you've observed is correct. More generally speaking, for various broker-side operations that are based on record timestamps and treat them as wall-clock time, there is a mismatch between the stream records' timestamp, which is basically "event time", and the broker's system wa

Re: Log segment deletion

2018-01-29 Thread Martin Kleppmann
Follow-up: I think we figured out what was happening. Setting the broker config log.message.timestamp.type=LogAppendTime (instead of the default value CreateTime) stopped the messages disappearing. The messages in the Streams app's input topic are older than the 24 hours default retention perio
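
To make the mechanism in this thread concrete, here is a minimal sketch (not Kafka's actual code) of how a time-based retention check behaves under the two timestamp types. It assumes the broker compares its wall clock against the largest record timestamp in a segment; the names Segment, stored_timestamp and is_expired are hypothetical, and the 24-hour figure is simply the retention period mentioned in the message above.

```python
from dataclasses import dataclass

HOUR_MS = 60 * 60 * 1000
RETENTION_MS = 24 * HOUR_MS  # retention period mentioned in the thread


@dataclass
class Segment:
    """Hypothetical stand-in for a log segment."""
    max_timestamp_ms: int  # largest record timestamp stored in the segment


def stored_timestamp(event_time_ms: int, broker_now_ms: int, timestamp_type: str) -> int:
    """Timestamp kept with the record for the given log.message.timestamp.type."""
    if timestamp_type == "LogAppendTime":
        return broker_now_ms  # broker overwrites with its own wall-clock time
    return event_time_ms      # CreateTime: keep the producer-supplied event time


def is_expired(segment: Segment, broker_now_ms: int, retention_ms: int = RETENTION_MS) -> bool:
    """Assumed retention rule: segment is deletable once its newest record is too old."""
    return broker_now_ms - segment.max_timestamp_ms > retention_ms


broker_now_ms = 1_000 * 24 * HOUR_MS           # arbitrary wall-clock instant
event_time_ms = broker_now_ms - 72 * HOUR_MS   # reprocessed record that is 3 days old

create_time_segment = Segment(stored_timestamp(event_time_ms, broker_now_ms, "CreateTime"))
append_time_segment = Segment(stored_timestamp(event_time_ms, broker_now_ms, "LogAppendTime"))

# With CreateTime, the freshly written segment already looks older than the retention period.
expired_with_create_time = is_expired(create_time_segment, broker_now_ms)
# With LogAppendTime, the same record counts as new and the segment is kept.
expired_with_append_time = is_expired(append_time_segment, broker_now_ms)
```

Under these assumptions, a record with an old event time is eligible for deletion as soon as it is written when CreateTime is used, which matches the disappearing messages described above, while LogAppendTime keeps it for the full retention period.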