[ 
https://issues.apache.org/jira/browse/KAFKA-4398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15658960#comment-15658960
 ] 

Jiangjie Qin commented on KAFKA-4398:
-------------------------------------

I see, you are saying that the broker should deliver messages in timestamp 
order, right? That essentially requires Kafka to behave like a distributed 
priority queue, which is not what it was designed for. It may not even be 
feasible: the broker would either have to scan the entire log for each read, 
or index every message by timestamp and jump between offsets all the time, so 
throughput could drop to close to 0. In addition, any later insertion of a 
message with an earlier timestamp would change the log order. I agree that a 
distributed priority queue would be a good product to have, but I do not see 
how Kafka could support that. 
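
As a rough illustration of the reordering problem, here is a toy Python model 
(not Kafka code; the names `timestamp_order`, `log`, `order_before` and 
`order_after` are made up for this sketch). The log is a list of timestamps in 
hours, and the list position plays the role of the offset. Sorting on every 
read stands in for the "scan the entire log" cost.

```python
def timestamp_order(log):
    """Return offsets in the order a timestamp-ordered reader would see them.

    Sorting the whole log on each call models the full scan a broker would
    need if it had no timestamp-ordered index.
    """
    return sorted(range(len(log)), key=lambda offset: (log[offset], offset))


# Append-only log: index = offset, value = CreateTime in hours.
log = [1.0, 3.0]
order_before = timestamp_order(log)

# A message with an earlier timestamp arrives late and gets the next offset.
log.append(2.5)
order_after = timestamp_order(log)

print(order_before)
print(order_after)
```

In this model the late message at offset 2 has to be served before offset 1, 
so a reader that already consumed offset 1 has seen the messages out of 
timestamp order, and reads no longer walk the log sequentially.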

> offsetsForTimes returns false starting offset when timestamp of messages are 
> not monotonically increasing
> ---------------------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-4398
>                 URL: https://issues.apache.org/jira/browse/KAFKA-4398
>             Project: Kafka
>          Issue Type: Bug
>          Components: consumer, core
>    Affects Versions: 0.10.1.0
>            Reporter: huxi
>            Assignee: huxi
>
> After a code walk-through of KIP-33 (Add a time-based log index), I found a 
> use case where the method 'offsetsForTimes' fails to return the correct 
> offset if a series of messages is created without monotonically increasing 
> timestamps (CreateTime is used).
> Say T0 is the hour when the first message is created, and Tn means the 
> (T0+n)th hour. Then I created another two messages, at T1 and T3 
> respectively. At this moment, the <baseoffset>.timeindex should contain two 
> items:
> T1 --->  1
> T3 ----> 2  (whether it contains T0 does not matter to this problem)
> Later, for some reason, I want to insert a third message in between T1 and 
> T3, say T2.5, but the time index file is not changed because of the 
> restriction that timestamps must be monotonically increasing within each 
> segment.
> After generating the message with T2.5, I invoke 
> KafkaConsumer.offsetsForTimes("tp" -> T2.5), hoping to get the first offset 
> with a timestamp greater than or equal to T2.5, which should be the third 
> message in this case, but the consumer returns the second message with T3.
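
The scenario quoted above can be reproduced with a simplified model of one log 
segment and its time index. This is only a sketch under stated assumptions: 
the `Segment` class and its methods are hypothetical, timestamps are plain 
hour numbers, every append may add an index entry (the real index is sparse), 
and the lookup is assumed to pick the last index entry at or before the 
target and then scan forward in offset order.

```python
from bisect import bisect_right


class Segment:
    """Toy model of a log segment with a time index (not Kafka's code)."""

    def __init__(self):
        self.log = []         # position = offset, value = CreateTime (hours)
        self.time_index = []  # (timestamp, offset), timestamps increasing

    def append(self, timestamp):
        offset = len(self.log)
        self.log.append(timestamp)
        # Assumed rule: only index timestamps larger than the last entry.
        if not self.time_index or timestamp > self.time_index[-1][0]:
            self.time_index.append((timestamp, offset))
        return offset

    def offset_for_time(self, target):
        # Last index entry with timestamp <= target gives the scan start.
        keys = [ts for ts, _ in self.time_index]
        i = bisect_right(keys, target) - 1
        start = self.time_index[i][1] if i >= 0 else 0
        # Scan forward in offset order for the first timestamp >= target.
        for offset in range(start, len(self.log)):
            if self.log[offset] >= target:
                return offset
        return None


segment = Segment()
for hour in (0, 1, 3):            # T0, T1, T3
    segment.append(hour)
late_offset = segment.append(2.5)  # T2.5 arrives after T3

found = segment.offset_for_time(2.5)
print(segment.time_index)
print(late_offset, found)
```

In this model the T2.5 message never enters the index, and the lookup stops at 
the T3 message because it comes first in offset order, which matches the 
behaviour described in the report.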



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
