[ 
https://issues.apache.org/jira/browse/KAFKA-13289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17473149#comment-17473149
 ] 

Matthias J. Sax commented on KAFKA-13289:
-----------------------------------------

{quote}The app uses a 0ms join window with a 0ms
{quote}
That is pretty aggressive and will result in a retention of 0ms, too. Thus, 
every out-of-order record will be considered late and would be dropped (and you 
would see the corresponding logging).
{quote}Here's a question - could a large difference between message timestamp 
(used for joining) and system time be causing the "Skipping record" messages?
{quote}
It should not. System time should be totally out of the picture.
{quote}I (accidentally) noticed that one way to reproduce "Skipping record for 
expired segment" messages seems to be adding messages to the input topics with 
older timestamps than those of earlier messages, i.e. going back in time.
{quote}
That is expected behavior. – Not sure what you mean by "would not mind". Kafka 
Streams can process out-of-order data, but of course if will drop data that is 
older than the configured retention period (for this case, you see the 
corresponding logs).

> Bulk processing correctly ordered input data through a join with 
> kafka-streams results in `Skipping record for expired segment`
> -------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-13289
>                 URL: https://issues.apache.org/jira/browse/KAFKA-13289
>             Project: Kafka
>          Issue Type: Bug
>          Components: streams
>    Affects Versions: 2.8.0
>            Reporter: Matthew Sheppard
>            Priority: Minor
>
> When pushing bulk data through a kafka-steams app, I see it log the following 
> message many times...
> {noformat}
> WARN 
> org.apache.kafka.streams.state.internals.AbstractRocksDBSegmentedBytesStore - 
> Skipping record for expired segment.
> {noformat}
> ...and data which I expect to have been joined through a leftJoin step 
> appears to be lost.
> I've seen this in practice either when my application has been shut down for 
> a while and then is brought back up, or when I've used something like the 
> [app-reset-rool](https://docs.confluent.io/platform/current/streams/developer-guide/app-reset-tool.html)
>  in an attempt to have the application reprocess past data.
> I was able to reproduce this behaviour in isolation by generating 1000 
> messages to two topics spaced an hour apart (with the original timestamps in 
> order), then having kafka streams select a key for them and try to leftJoin 
> the two rekeyed streams.
> Self contained source code for that reproduction is available at 
> https://github.com/mattsheppard/ins14809/blob/main/src/test/java/ins14809/Ins14809Test.java
> The actual kafka-streams topology in there looks like this.
> {code:java}
>             final StreamsBuilder builder = new StreamsBuilder();
>             final KStream<String, String> leftStream = 
> builder.stream(leftTopic);
>             final KStream<String, String> rightStream = 
> builder.stream(rightTopic);
>             final KStream<String, String> rekeyedLeftStream = leftStream
>                     .selectKey((k, v) -> v.substring(0, v.indexOf(":")));
>             final KStream<String, String> rekeyedRightStream = rightStream
>                     .selectKey((k, v) -> v.substring(0, v.indexOf(":")));
>             JoinWindows joinWindow = JoinWindows.of(Duration.ofSeconds(5));
>             final KStream<String, String> joined = rekeyedLeftStream.leftJoin(
>                     rekeyedRightStream,
>                     (left, right) -> left + "/" + right,
>                     joinWindow
>             );
> {code}
> ...and the eventual output I produce looks like this...
> {code}
> ...
> 523 [523,left/null]
> 524 [524,left/null, 524,left/524,right]
> 525 [525,left/525,right]
> 526 [526,left/null]
> 527 [527,left/null]
> 528 [528,left/528,right]
> 529 [529,left/null]
> 530 [530,left/null]
> 531 [531,left/null, 531,left/531,right]
> 532 [532,left/null]
> 533 [533,left/null]
> 534 [534,left/null, 534,left/534,right]
> 535 [535,left/null]
> 536 [536,left/null]
> 537 [537,left/null, 537,left/537,right]
> 538 [538,left/null]
> 539 [539,left/null]
> 540 [540,left/null]
> 541 [541,left/null]
> 542 [542,left/null]
> 543 [543,left/null]
> ...
> {code}
> ...where as, given the input data, I expect to see every row end with the two 
> values joined, rather than the right value being null.
> Note that I understand it's expected that we initially get the left/null 
> values for many values since that's the expected semantics of kafka-streams 
> left join, at least until 
> https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Streams+Join+Semantics#KafkaStreamsJoinSemantics-ImprovedLeft/OuterStream-StreamJoin(v3.1.xandnewer)spurious
> I've noticed that if I set a very large grace value on the join window the 
> problem is solved, but since the input I provide is not out of order I did 
> not expect to need to do that, and I'm weary of the resource requirements 
> doing so in practice on an application with a lot of volume.
> My suspicion is that something is happening such that when one partition is 
> processed it causes the stream time to be pushed forward to the newest 
> message in that partition, meaning when the next partition is then examined 
> it is found to contain many records which are 'too old' compared to the 
> stream time. 
> I ran across this discussion thread which seems to cover the same issue 
> http://mail-archives.apache.org/mod_mbox/kafka-users/202002.mbox/%3cCAB0tB9p_vijMS18jWXBqp7TQozL__ANoo3=h57q6z3y4hzt...@mail.gmail.com%3e
>  and had a request from [~cadonna] for a reproduction case, so I'm hoping my 
> example above might make the issue easier to tackle!



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

Reply via email to