[ https://issues.apache.org/jira/browse/KAFKA-4099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15447253#comment-15447253 ]
Jiangjie Qin commented on KAFKA-4099:
-------------------------------------

[~junrao] I have been thinking about this solution, and it still seems less than ideal. For some low-volume topics, if we roll the log based on the segment create time, then during partition relocation we may keep sensitive data for much longer than we want to: all the data may end up in the same segment, and the old data cannot be deleted because it shares a segment with the new data.

It seems the root cause of the unnecessary log rolling is that we are comparing the timestamp in the message with the wall clock time. This makes log rolling sensitive to the wall clock time.

I am thinking maybe we should always use the timestamp in the message, i.e. we roll the log segment if the timestamp in the current message is greater than the timestamp of the first message in the segment by more than log.roll.ms. This approach is independent of the wall clock and should solve the problem. With the message.timestamp.difference.max.ms configuration, we can achieve 1) the log segment will be rolled within a bounded time, and 2) no excessively large timestamp will be accepted and cause frequent log rolling.

What do you think?

> Change the time based log rolling to base on the file create time instead of timestamp of the first message.
> ------------------------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-4099
>                 URL: https://issues.apache.org/jira/browse/KAFKA-4099
>             Project: Kafka
>          Issue Type: Bug
>          Components: core
>            Reporter: Jiangjie Qin
>            Assignee: Jiangjie Qin
>             Fix For: 0.10.1.0
>
>
> This is an issue introduced in KAFKA-3163. When partition relocation occurs, the newly created replica may have messages with old timestamps, which causes a log segment roll for each message. The fix is to change the log rolling behavior back to being based on the segment create time.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
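
For illustration, the message-timestamp-based rolling rule proposed in the comment could be sketched roughly as below. This is only a minimal sketch, not Kafka's actual Log or LogSegment code: the class MessageTimeRollSketch and its methods are hypothetical names, rollMs stands in for log.roll.ms, and the message.timestamp.difference.max.ms check is not modeled.

```java
import java.util.OptionalLong;

// Hypothetical sketch of the proposed rule; not Kafka's real implementation.
class MessageTimeRollSketch {
    private final long rollMs;                        // stands in for log.roll.ms
    private OptionalLong firstTimestampMs = OptionalLong.empty();
    private int rollCount = 0;                        // number of segments rolled so far

    MessageTimeRollSketch(long rollMs) {
        this.rollMs = rollMs;
    }

    // True if the active segment should be rolled before appending a message
    // with this timestamp. Only message timestamps are compared; the wall
    // clock is never consulted.
    boolean shouldRoll(long messageTimestampMs) {
        return firstTimestampMs.isPresent()
                && messageTimestampMs - firstTimestampMs.getAsLong() > rollMs;
    }

    // Appends a message, rolling to a new (empty) segment first if needed.
    void append(long messageTimestampMs) {
        if (shouldRoll(messageTimestampMs)) {
            rollCount++;
            firstTimestampMs = OptionalLong.empty();  // new segment has no messages yet
        }
        if (!firstTimestampMs.isPresent()) {
            // The first message in a segment fixes the segment's base timestamp.
            firstTimestampMs = OptionalLong.of(messageTimestampMs);
        }
    }

    int rollCount() {
        return rollCount;
    }
}
```

Under this sketch, a relocated replica that receives old messages rolls only when the spread of message timestamps within a segment exceeds rollMs, no matter how far those timestamps lag behind the current wall clock time.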