Frank Varnavas created KAFKA-1339:
-------------------------------------

             Summary: Time based offset retrieval seems broken
                 Key: KAFKA-1339
                 URL: https://issues.apache.org/jira/browse/KAFKA-1339
             Project: Kafka
          Issue Type: Bug
          Components: core
    Affects Versions: 0.8.1
         Environment: Linux
            Reporter: Frank Varnavas
            Priority: Minor

The kafka PartitionOffsetRequest takes a time parameter. It seems broken to me.

There are two magic values:
  -2 returns the oldest available offset
  -1 returns the newest available offset

Otherwise the value is time since epoch in milliseconds (System.currentTimeMillis()).
The granularity is limited to the granularity of the log files.

These are the log segments for the partition I tested.
  Time now is about 17:07
  Time shown is last modify time
  File name has the starting offset number
  You can see that the current one started about 13:40

1073742047 Mar 24 02:52 00000000000004740823.log
1073759588 Mar 24 11:25 00000000000004831581.log
1073782532 Mar 24 16:31 00000000000004916313.log
1073741985 Mar 25 09:11 00000000000005066939.log
1073743756 Mar 25 13:39 00000000000005158529.log
 778424349 Mar 25 17:07 00000000000005214225.log

The below shows the returned offset for an input time = (current time - [0..23] hours).
Even 1 second less than the current time returns the previous segment, even though that segment ended 2.5 hours earlier.
I think the result is off by 1 log segment, i.e. offset 1-3 should have been from 5214225, 4-7 should have been from 5158529.

 0 -> 5214225
 1 -> 5158529
 2 -> 5158529
 3 -> 5158529
 4 -> 5066939
 5 -> 5066939
 6 -> 5066939
 7 -> 5066939
 8 -> 4973490
 9 -> 4973490
10 -> 4973490
11 -> 4973490
12 -> 4973490
13 -> 4973490
14 -> 4973490
15 -> 4973490
16 -> 4916313
17 -> 4916313
18 -> 4916313
19 -> 4916313
20 -> 4916313
21 -> 4916313
22 -> 4916313
23 -> 4916313
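
To make the suspected off-by-one concrete, here is a minimal sketch in Java that replays the segment listing above against two lookup rules. It is not Kafka's code. The class `SegmentLookup` and both method names are hypothetical. The first rule (pick the newest segment whose last-modified time is at or before the requested time) is only an assumption about what the broker might be doing, chosen because it reproduces the returned offsets for hours 0 to 7. The second rule (pick the segment whose time range contains the requested time) is the behavior the report expects. Times are given as minutes since Mar 24 00:00 rather than epoch milliseconds, and only the six listed segments are used, so hours 8 and later (which return 4973490, an offset not in the listing) are not checked.

```java
import java.util.OptionalLong;

// Hypothetical illustration only; none of these names come from Kafka.
class SegmentLookup {
    // Base offsets taken from the file names in the segment listing.
    static final long[] BASE_OFFSETS =
        {4740823L, 4831581L, 4916313L, 5066939L, 5158529L, 5214225L};

    // Last-modified times from the listing, in minutes since Mar 24 00:00.
    static final long[] LAST_MODIFIED = {172L, 685L, 991L, 1991L, 2259L, 2467L};

    // "Time now is about 17:07" on Mar 25.
    static final long NOW = 2467L;

    // ASSUMED rule that matches the observed results: newest segment
    // whose last-modified time is less than or equal to t.
    static OptionalLong byLastModified(long t) {
        for (int i = LAST_MODIFIED.length - 1; i >= 0; i--) {
            if (LAST_MODIFIED[i] <= t) {
                return OptionalLong.of(BASE_OFFSETS[i]);
            }
        }
        return OptionalLong.empty();
    }

    // Rule the report expects: the segment that was being written at time t,
    // i.e. the oldest segment whose last-modified time is at or after t.
    static OptionalLong byContainingSegment(long t) {
        for (int i = 0; i < LAST_MODIFIED.length; i++) {
            if (LAST_MODIFIED[i] >= t) {
                return OptionalLong.of(BASE_OFFSETS[i]);
            }
        }
        return OptionalLong.empty();
    }

    public static void main(String[] args) {
        // Only hours 0..7 are covered by the listed segments.
        for (int h = 0; h <= 7; h++) {
            long t = NOW - 60L * h;
            System.out.println(h + " -> observed rule: " + byLastModified(t).getAsLong()
                + ", expected rule: " + byContainingSegment(t).getAsLong());
        }
    }
}
```

Under these assumptions, the first rule yields 5158529 for hours 1 to 3 and 5066939 for hours 4 to 7, as in the results above, while the second rule yields 5214225 for hours 1 to 3 and 5158529 for hours 4 to 7, which is what the report argues the correct answer should be.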