[ 
https://issues.apache.org/jira/browse/KAFKA-4099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15596965#comment-15596965
 ] 

Jiangjie Qin commented on KAFKA-4099:
-------------------------------------

[~junrao] Thanks for the explanation. I agree that it is reasonable to roll the 
log segment based on create time. However, I have a few concerns about using 
the original proposal:
1. How rare replica movement is seems to depend on scale. For example, today we 
have over 1800 brokers at LI and 1-2 brokers die every day, so partition 
reassignment happens almost every day. I think there is a difference between 
"rare at small scale" and "rare regardless of scale". 
2. An incorrect create time does not only occur when partitions are moved. 
It seems most Linux systems do not keep a create time for files, so the 
create time of a segment would be lost when the brokers are rebooted.

Actually, after thinking about the oscillating timestamp case again, I am not 
sure whether it would actually cause frequent log rolling. Let's say we have 
two producers: one produces messages with the current timestamp, and the other 
produces messages with timestamps that are 7 days old. Assume the current 
active segment is segment 0 and the current time is T. Because log rolling is 
based on the timestamp of the first message in a log segment, it is possible 
that the first timestamp in segment 0 is 7 days ago (T - 7 days). Once we 
append a message with the current timestamp T, segment 1 is rolled out and its 
first timestamp will be T. Segment 1 therefore won't roll immediately like the 
previous one, i.e. segment 2 will only be rolled out when segment 1 sees a 
timestamp greater than (T + log.roll.ms), and so on.
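
To make the scenario above concrete, here is a minimal sketch of the rule as 
I described it: roll when an incoming timestamp exceeds the first timestamp 
in the active segment by more than log.roll.ms. It is a simplified model, not 
the broker code. The function name simulate_rolls, the 1-day roll_ms value 
and the fixed value of T are assumptions made for illustration only.

```python
DAY_MS = 24 * 60 * 60 * 1000


def simulate_rolls(timestamps, roll_ms):
    """Simplified model: return the segments produced by time-based rolling.

    A new segment is rolled when the incoming timestamp is more than
    roll_ms ahead of the first timestamp in the active segment.
    """
    segments = [[]]
    for ts in timestamps:
        active = segments[-1]
        if active and ts - active[0] > roll_ms:
            segments.append([])  # roll out a new active segment
        segments[-1].append(ts)
    return segments


T = 100 * DAY_MS          # arbitrary "current time" (assumption)
old = T - 7 * DAY_MS      # producer sending 7-day-old timestamps
new = T                   # producer sending current timestamps

# The two producers alternate: old, new, old, new, ...
stream = [old, new] * 50
segments = simulate_rolls(stream, roll_ms=DAY_MS)

print("segments:", len(segments))
print("messages per segment:", [len(s) for s in segments])
```

In this model only one roll happens: segment 0 starts with the old timestamp 
and is rolled by the first current timestamp, after which segment 1 starts at 
T and absorbs both old and current timestamps.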

In the above example, it is possible that segment 2 is rolled out because of 
the segment size. In that case, segment 2 may have a first timestamp of (T - 
7 days), so segment 3 may get rolled out immediately. But segment 3 will again 
wait until either the segment is full or it sees a larger timestamp that 
triggers log rolling. So in the worst case, we may roll out two new segments 
in a row. I am not sure how bad that would be in terms of performance.
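
The size-triggered case can be sketched the same way. This is again a 
simplified model under stated assumptions, not the broker code: segment size 
is approximated by a message count (max_msgs), log.roll.ms is assumed to be 
1 day, and the names simulate_rolls, max_msgs and roll_ms are hypothetical.

```python
DAY_MS = 24 * 60 * 60 * 1000


def simulate_rolls(timestamps, roll_ms, max_msgs):
    """Simplified model of size-based plus time-based rolling.

    Returns (segments, reasons), where reasons[i] records why the
    i-th roll happened ("size" or "time").
    """
    segments = [[]]
    reasons = []
    for ts in timestamps:
        active = segments[-1]
        reason = None
        if active:
            if len(active) >= max_msgs:
                reason = "size"   # active segment is full
            elif ts - active[0] > roll_ms:
                reason = "time"   # timestamp too far past the first one
        if reason is not None:
            segments.append([])
            reasons.append(reason)
        segments[-1].append(ts)
    return segments, reasons


T = 100 * DAY_MS          # arbitrary "current time" (assumption)
old = T - 7 * DAY_MS      # producer sending 7-day-old timestamps
new = T                   # producer sending current timestamps

# Alternating producers; an odd max_msgs makes every size-triggered
# segment start with an old timestamp, which is the worst case here.
stream = [old, new] * 6
segments, reasons = simulate_rolls(stream, roll_ms=DAY_MS, max_msgs=3)

print("roll reasons:", reasons)
print("messages per segment:", [len(s) for s in segments])
```

In this model every size-triggered roll that starts with an old timestamp is 
followed by exactly one immediate time-triggered roll, and the segment after 
that fills up normally. That is the "two new segments in a row" worst case, 
not a roll per message.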

Admittedly, with certain timestamp patterns, frequent log rolling may still 
happen. Have you seen any real timestamp pattern that caused frequent log 
rolling?

> Change the time based log rolling to only based on the message timestamp.
> -------------------------------------------------------------------------
>
>                 Key: KAFKA-4099
>                 URL: https://issues.apache.org/jira/browse/KAFKA-4099
>             Project: Kafka
>          Issue Type: Bug
>          Components: core
>            Reporter: Jiangjie Qin
>            Assignee: Jiangjie Qin
>             Fix For: 0.10.1.0
>
>
> This is an issue introduced in KAFKA-3163. When partition relocation occurs, 
> the newly created replica may have messages with old timestamps, causing a 
> log segment roll for each message. The fix is to base the log rolling 
> behavior only on the message timestamp when the messages are in message 
> format 0.10.0 or above. If the first message in the segment does not have a 
> timestamp, we will fall back to using the wall clock time for log rolling.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
