[ https://issues.apache.org/jira/browse/KAFKA-7190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16888543#comment-16888543 ]
Raman Gupta commented on KAFKA-7190:
------------------------------------

I want to mention another case in which this happened. I'm not sure whether it has been discussed above; the discussion was quite technical, so I may have missed it. If it makes sense to put this into a separate issue, let me know.

In any case, the situation is:

1. A topic with messages that are over 7 days old.
2. A stream that transforms messages on that topic and writes different messages back to the same topic (though I suspect that doesn't matter; it could be any topic).
3. Writes to the topic get `UnknownProducerIdException`.

The default for Streams is to write the transformed record with the same timestamp as the input record. The deletion of the producer id seems to be based on the timestamp of that transformed record, which is more than 7 days old, even though the record was actually written *right now*.

However, it seems very wrong to delete a producer id that was just created, just because the producer with that id happened to produce a message with an old timestamp. Why not track when the producer id was last used, and garbage collect it based on that?

The workaround in this case is to use a transformer to set the produced record timestamp.
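To make the difference between the two expiry policies concrete, here is a minimal sketch in plain Java. It is not the broker's actual code: the class name, method names, and the 7-day retention constant are all hypothetical, chosen only to mirror the scenario described above (a record written now that carries a timestamp 8 days in the past).

```java
import java.time.Duration;

// Hypothetical model of two ways to decide whether a producer id has expired.
// This does not reproduce broker internals; it only illustrates the argument.
class ProducerIdExpirySketch {
    // Assumed retention window, taken from the "7 days" in the scenario.
    static final long RETENTION_MS = Duration.ofDays(7).toMillis();

    // Policy A: expiry keyed on the timestamp carried by the last record.
    static boolean expiredByRecordTimestamp(long lastRecordTimestampMs, long nowMs) {
        return nowMs - lastRecordTimestampMs > RETENTION_MS;
    }

    // Policy B: expiry keyed on the wall-clock time of the last append.
    static boolean expiredByLastUse(long lastAppendWallClockMs, long nowMs) {
        return nowMs - lastAppendWallClockMs > RETENTION_MS;
    }

    public static void main(String[] args) {
        long now = 1_000_000_000_000L;                          // fixed "now" for a repeatable run
        long recordTs = now - Duration.ofDays(8).toMillis();    // record inherits an 8-day-old timestamp
        long appendedAt = now;                                  // but it was appended just now

        System.out.println("timestamp-based policy expires id: "
                + expiredByRecordTimestamp(recordTs, now));     // true
        System.out.println("last-use policy expires id: "
                + expiredByLastUse(appendedAt, now));           // false
    }
}
```

Under the first policy the freshly used producer id is already eligible for deletion, while under the second it is not, which is the behaviour the question above is asking for.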

> Under low traffic conditions purging repartition topics cause WARN statements about UNKNOWN_PRODUCER_ID
> ---------------------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-7190
>                 URL: https://issues.apache.org/jira/browse/KAFKA-7190
>             Project: Kafka
>          Issue Type: Improvement
>          Components: core, streams
>    Affects Versions: 1.1.0, 1.1.1
>            Reporter: Bill Bejeck
>            Assignee: Guozhang Wang
>            Priority: Major
>
> When a streams application has little traffic, it is possible that consumer purging would delete even the last message sent by a producer (i.e., all the messages sent by this producer have been consumed and committed), and as a result, the broker would delete that producer's ID. The next time this producer tries to send, it will get the UNKNOWN_PRODUCER_ID error code, but in this case the error is retriable: the producer would just get a new producer id and retry, and this time it will succeed.
>
> Possible fixes could be on the broker side, i.e., delaying the deletion of the producerIDs for a more extended period, or on the streams side, developing a more conservative approach to deleting offsets from repartition topics.

--
This message was sent by Atlassian JIRA
(v7.6.14#76016)