[ 
https://issues.apache.org/jira/browse/KAFKA-7994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sophie Blee-Goldman updated KAFKA-7994:
---------------------------------------
    Summary: Improve Partition-Time for rebalances and restarts  (was: Improve 
Stream-Time for rebalances and restarts)

> Improve Partition-Time for rebalances and restarts
> --------------------------------------------------
>
>                 Key: KAFKA-7994
>                 URL: https://issues.apache.org/jira/browse/KAFKA-7994
>             Project: Kafka
>          Issue Type: Bug
>          Components: streams
>            Reporter: Matthias J. Sax
>            Assignee: Richard Yu
>            Priority: Major
>             Fix For: 2.4.0
>
>         Attachments: possible-patch.diff
>
>
> We compute a per-partition partition-time as the maximum timestamp over all 
> records processed so far. Before 2.3, this was used to determine the logical 
> stream-time, which decides whether out-of-order records are processed or 
> dropped as late (i.e., timestamp < stream-time - grace-period). 
> Preserving the stream-time is necessary to ensure deterministic results (see 
> KAFKA-9368), and although the processor-time is now used instead of 
> partition-time, preserving the partition-time is a first step towards 
> improving the overall stream-time semantics.
> The partition-time is also used by the TimestampExtractor. It gets passed in 
> to #extract and can be used to determine a rough timestamp estimate if the 
> actual timestamp is missing, corrupt, etc. Because partition-time is not 
> preserved across a rebalance/restart, in the corner case where the next 
> record to be processed afterwards cannot have its actual timestamp 
> determined, we have no way of coming up with a reasonable guess and the 
> record will likely have to be dropped.
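
To illustrate the corner case described above, here is a minimal sketch of an
extractor that falls back to partition-time. It only mirrors the shape of
TimestampExtractor#extract(record, partitionTime): the class name, the UNKNOWN
constant, and passing the raw record timestamp instead of a ConsumerRecord are
assumptions made so that the example runs without a Kafka dependency.

```java
class FallbackTimestampSketch {
    // Assumption: -1 marks "unknown" for both the record timestamp and partition-time.
    static final long UNKNOWN = -1L;

    // Returns the record's own timestamp when it is valid, otherwise partition-time.
    static long extract(long recordTimestamp, long partitionTime) {
        if (recordTimestamp >= 0) {
            return recordTimestamp;
        }
        // Right after a rebalance/restart partition-time may itself be UNKNOWN,
        // in which case the caller has nothing to estimate from.
        return partitionTime;
    }

    public static void main(String[] args) {
        System.out.println(extract(1_000L, 900L));     // record timestamp is valid
        System.out.println(extract(UNKNOWN, 900L));    // falls back to partition-time
        System.out.println(extract(UNKNOWN, UNKNOWN)); // no estimate available
    }
}
```

The third call is the problematic one: with no preserved partition-time, the
result stays unknown and the record would likely be dropped.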
>  
> A potential fix would be to store the latest observed partition-time in the 
> metadata of committed offsets. This way, on restart/rebalance we can 
> re-initialize partition-time correctly.
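
The proposed fix could look roughly like the sketch below: partition-time is
written into the metadata string that accompanies a committed offset and read
back on restart. This is not the actual patch. The Commit class, the
plain-decimal encoding, and the UNKNOWN fallback are all hypothetical and exist
only to show the round trip without a Kafka dependency.

```java
import java.util.HashMap;
import java.util.Map;

class PartitionTimeMetadataSketch {
    // Assumption: -1 marks an unknown partition-time.
    static final long UNKNOWN = -1L;

    // Hypothetical stand-in for a committed offset plus its metadata string.
    static final class Commit {
        final long offset;
        final String metadata;

        Commit(long offset, String metadata) {
            this.offset = offset;
            this.metadata = metadata;
        }
    }

    // Encode partition-time as a decimal string (an illustrative format only).
    static String encode(long partitionTime) {
        return Long.toString(partitionTime);
    }

    // Decode partition-time; missing or malformed metadata yields UNKNOWN.
    static long decode(String metadata) {
        if (metadata == null || metadata.isEmpty()) {
            return UNKNOWN;
        }
        try {
            return Long.parseLong(metadata);
        } catch (NumberFormatException e) {
            return UNKNOWN;
        }
    }

    public static void main(String[] args) {
        Map<String, Commit> committed = new HashMap<>();
        // On commit: store the offset together with the latest partition-time.
        committed.put("topic-0", new Commit(42L, encode(1_500L)));
        // On restart/rebalance: re-initialize partition-time from the metadata.
        long restored = decode(committed.get("topic-0").metadata);
        System.out.println(restored);
    }
}
```

Falling back to UNKNOWN for empty or malformed metadata keeps offsets that were
committed before such a change readable.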



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
