Sergio Peña created KAFKA-13216:
-----------------------------------

             Summary: Streams left/outer joins cause new internal changelog 
topic to grow unbounded
                 Key: KAFKA-13216
                 URL: https://issues.apache.org/jira/browse/KAFKA-13216
             Project: Kafka
          Issue Type: Bug
          Components: streams
    Affects Versions: 3.0.0
            Reporter: Sergio Peña


This bug is caused by the improvements made in 
https://issues.apache.org/jira/browse/KAFKA-10847, which fixes an issue with 
stream-stream left/outer joins. The issue is only caused when a stream-stream 
left/outer join is used with the new `JoinWindows.ofTimeDifferenceAndGrace()` 
API that specifies the window time + grace period. This new API was added in AK 
3.0. No previous users are affected.

The issue causes that the internal changelog topic used by the new OUTERSHARED 
window store keeps growing unbounded as new records come. The topic is never 
cleaned up nor compacted even if tombstones are written to delete the joined 
and/or expired records from the window store. The problem is caused by a 
parameter required in the window store to retain duplicates. This config causes 
that tombstones records have a new sequence ID as part of the key ID in the 
changelog making those keys unique. Thus causing the cleanup policy not working.

In 3.0, we deprecated {{JoinWindows.of(size)}} in favor of 
{{JoinWindows.ofTimeDifferenceAndGrace()}} -- the old API uses the old 
semantics and is thus not affected while the new API enable the new semantics; 
the problem is that we deprecated the old API and thus tell users that they 
should switch to the new broken API.

We have two ways forward:
 * Fix the bug (non trivial)
 * Un-deprecate the old {{JoinWindow.of(size)}} API (and tell users not to use 
the new but broken API)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to