Frank Varnavas created KAFKA-1339:
-------------------------------------

             Summary: Time based offset retrieval seems broken
                 Key: KAFKA-1339
                 URL: https://issues.apache.org/jira/browse/KAFKA-1339
             Project: Kafka
          Issue Type: Bug
          Components: core
    Affects Versions: 0.8.1
         Environment: Linux
            Reporter: Frank Varnavas
            Priority: Minor


The Kafka PartitionOffsetRequest takes a time parameter.  The offsets it
returns for a given time seem broken to me.

There are two magic values:

  -2 returns the oldest available offset
  -1 returns the newest available offset
  Otherwise the value is the time since the epoch in milliseconds
  (System.currentTimeMillis())
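
For illustration, here is a minimal sketch of how the time argument can be
built. The constant names and the helper hours_ago_ms are hypothetical and only
mirror the values listed above; they are not taken from any Kafka client.

```python
import time

# Hypothetical names for the two magic values described above.
EARLIEST_TIME = -2  # ask for the oldest available offset
LATEST_TIME = -1    # ask for the newest available offset


def hours_ago_ms(hours, now_ms=None):
    """Return a timestamp in milliseconds since the epoch, `hours` before now.

    `now_ms` can be passed explicitly to make the result reproducible;
    by default the current wall-clock time is used.
    """
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    return now_ms - hours * 60 * 60 * 1000
```

Any value other than the two magic values is treated as such a millisecond
timestamp.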

The granularity is limited to the granularity of the log segment files.
These are the log segments for the partition I tested:

  Time now is about 17:07
  First column is the file size in bytes
  Time shown is the last-modified time
  File name contains the starting offset number
  You can see that the current one started at about 13:40

1073742047 Mar 24 02:52 00000000000004740823.log
1073759588 Mar 24 11:25 00000000000004831581.log
1073782532 Mar 24 16:31 00000000000004916313.log
1073741985 Mar 25 09:11 00000000000005066939.log
1073743756 Mar 25 13:39 00000000000005158529.log
 778424349 Mar 25 17:07 00000000000005214225.log
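
As a small sketch of the naming convention above, the starting offset can be
read back from a segment file name like this. The helper name base_offset is
hypothetical.

```python
def base_offset(segment_file_name):
    """Return the starting offset encoded in a segment file name.

    The name is expected to be a zero-padded decimal offset followed
    by the ".log" extension, as in the listing above.
    """
    stem, _, extension = segment_file_name.rpartition(".")
    if extension != "log" or not stem.isdigit():
        raise ValueError("not a log segment file name: %r" % segment_file_name)
    return int(stem)
```

For example, base_offset("00000000000005214225.log") gives 5214225, the
starting offset of the current segment.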

The list below shows the returned offset for an input time of
(current time - N hours), for N in [0..23].
Even an input 1 second before the current time returns the previous segment,
even though that segment ended about 3.5 hours earlier (at 13:39).

I think the result is off by 1 log segment, i.e. inputs 1-3 should have
returned 5214225 and inputs 4-7 should have returned 5158529.

0 -> 5214225
1 -> 5158529
2 -> 5158529
3 -> 5158529
4 -> 5066939
5 -> 5066939
6 -> 5066939
7 -> 5066939
8 -> 4973490
9 -> 4973490
10 -> 4973490
11 -> 4973490
12 -> 4973490
13 -> 4973490
14 -> 4973490
15 -> 4973490
16 -> 4916313
17 -> 4916313
18 -> 4916313
19 -> 4916313
20 -> 4916313
21 -> 4916313
22 -> 4916313
23 -> 4916313
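
To make the off-by-one concrete, here is a minimal sketch that models two
possible lookup rules over the segments listed above. It is only a model built
from the listing; I have not checked it against the broker code. The function
names are hypothetical, and the year is an assumption because the listing only
shows month, day and time. The results for inputs 8-15 refer to a segment
(4973490) that does not appear in the listing, so the sketch only covers
inputs 0-7.

```python
from datetime import datetime, timedelta

# (last-modified time, starting offset) for each listed segment, oldest first.
# The year is an assumption; the listing only gives month, day and time.
YEAR = 2014
SEGMENTS = [
    (datetime(YEAR, 3, 24, 2, 52), 4740823),
    (datetime(YEAR, 3, 24, 11, 25), 4831581),
    (datetime(YEAR, 3, 24, 16, 31), 4916313),
    (datetime(YEAR, 3, 25, 9, 11), 5066939),
    (datetime(YEAR, 3, 25, 13, 39), 5158529),
    (datetime(YEAR, 3, 25, 17, 7), 5214225),
]
NOW = datetime(YEAR, 3, 25, 17, 7)


def observed_lookup(t):
    """Starting offset of the newest segment last modified at or before t.

    This rule reproduces the returned offsets for inputs 0-7.
    """
    candidates = [offset for mtime, offset in SEGMENTS if mtime <= t]
    return candidates[-1] if candidates else None


def expected_lookup(t):
    """Starting offset of the oldest segment last modified at or after t,
    i.e. the segment that was being written at time t.

    This rule gives the offsets I expected to get back.
    """
    candidates = [offset for mtime, offset in SEGMENTS if mtime >= t]
    return candidates[0] if candidates else None


for hours in range(8):
    t = NOW - timedelta(hours=hours)
    print(hours, "->", observed_lookup(t), "expected", expected_lookup(t))
```

Under the first rule the current segment only matches once the input time
reaches its last-modified time, which is why anything earlier falls back to
the previous segment.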



