[ https://issues.apache.org/jira/browse/KAFKA-10847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17273421#comment-17273421 ]

Guozhang Wang commented on KAFKA-10847:
---------------------------------------

Also, I'm wondering whether we have considered using the additional store to temporarily hold records that do not have a join result, or instead using that store to hold records that DO have a join result that has already been emitted.

With the latter approach, we can still emit results immediately when a match is found, but the tradeoff may be that, in practice, the likelihood that a new record finds a matching record from the other side right upon reception could be higher than the likelihood that no match is found, and hence the size of that store would be larger.

> Avoid spurious left/outer join results in stream-stream join
> -------------------------------------------------------------
>
>                 Key: KAFKA-10847
>                 URL: https://issues.apache.org/jira/browse/KAFKA-10847
>             Project: Kafka
>          Issue Type: Improvement
>          Components: streams
>            Reporter: Matthias J. Sax
>            Assignee: Sergio Peña
>            Priority: Major
>
> KafkaStreams follows an eager execution model, i.e., it never buffers input
> records but processes them right away. For left/outer stream-stream joins,
> this implies that a left/outer join result might be emitted before the window
> end (or window close) time is reached. Thus, a record that will be an
> inner-join result might produce an eager (and spurious) left/outer join
> result.
> We should change the implementation of the join to not emit eager left/outer
> join results, but instead delay the emission of such results until after the
> window grace period has passed.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
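
The difference between eager emission and delayed emission described in the issue can be illustrated with a small simulation. This is only a sketch in plain Python, not Kafka Streams code: the function `left_join`, the constants `WINDOW_MS` and `GRACE_MS`, the record layout `(side, key, value, ts)`, and the `pending` list (standing in for the additional store that holds unmatched records) are all hypothetical names chosen for illustration. It assumes in-order input and ignores late records.

```python
from collections import defaultdict

WINDOW_MS = 10  # hypothetical join window: |left_ts - right_ts| <= WINDOW_MS
GRACE_MS = 0    # hypothetical grace period


def left_join(records, delay_unmatched):
    """Simulate a left stream-stream join over (side, key, value, ts) records.

    delay_unmatched=False: emit (key, left, None) as soon as no match is found.
    delay_unmatched=True:  hold unmatched left records until stream time has
                           passed their window end plus the grace period.
    """
    seen = {"L": defaultdict(list), "R": defaultdict(list)}
    pending = []      # unmatched left records: (key, value, ts)
    out = []
    stream_time = 0   # highest timestamp observed so far

    for side, key, value, ts in records:
        stream_time = max(stream_time, ts)
        other = "R" if side == "L" else "L"
        matches = [(v, t) for v, t in seen[other][key]
                   if abs(t - ts) <= WINDOW_MS]
        seen[side][key].append((value, ts))

        if side == "L":
            for right_value, _ in matches:
                out.append((key, value, right_value))
            if not matches:
                if delay_unmatched:
                    pending.append((key, value, ts))
                else:
                    out.append((key, value, None))  # eager, possibly spurious
        else:
            for left_value, left_ts in matches:
                out.append((key, left_value, value))
                # The left record now has an inner-join result, so it must
                # not produce a left-join result later.
                if (key, left_value, left_ts) in pending:
                    pending.remove((key, left_value, left_ts))

        # Emit left-join results for records whose window has closed.
        closed = [p for p in pending
                  if p[2] + WINDOW_MS + GRACE_MS < stream_time]
        for p in closed:
            pending.remove(p)
            out.append((p[0], p[1], None))

    return out


records = [
    ("L", "a", "a1", 0),
    ("R", "a", "A1", 5),   # matches a1 within the window
    ("L", "b", "b1", 6),   # never finds a match
    ("L", "c", "c1", 30),  # advances stream time past b1's window
]

eager = left_join(records, delay_unmatched=False)
delayed = left_join(records, delay_unmatched=True)
print(eager)
print(delayed)
```

In this run the eager variant emits `("a", "a1", None)` before the inner-join result `("a", "a1", "A1")`, which is the spurious result the issue describes. The delayed variant emits only the inner-join result for `a1`, and emits `("b", "b1", None)` once the record at timestamp 30 moves stream time past the end of `b1`'s window. The record `c1` stays in `pending` because stream time has not yet advanced past its window. The alternative raised in the comment (storing records that already have an emitted join result instead) is not modeled here.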