[ 
https://issues.apache.org/jira/browse/KAFKA-10137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17169045#comment-17169045
 ] 

Sophie Blee-Goldman commented on KAFKA-10137:
---------------------------------------------

Yeah, I thought that might be the case, although I'm not sure I agree with the 
reasoning behind it. You shouldn't be switching `retainDuplicates` on and off 
for an existing store; the changelog bytes should match the local store format. 

Maybe the theory was that users might try to build two stores off of the same 
changelog, one with duplicates and one without. I don't think we should support 
that either.

On the other hand, it's reasonable to write the output of a windowed aggregation 
to a topic and then use that as the source topic for a table/store/global store 
with or without duplicates. Unfortunately, that is exactly [the 
case|https://issues.apache.org/jira/browse/KAFKA-10322] that is broken by this 
"bug" (whether intentional or not).
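
To make the format concern concrete, here is a minimal sketch of how a reader that 
expects a trailing sequence number can misread a key that was written without one. 
The byte layout ([key][8-byte timestamp][optional 4-byte seqnum]) and every name 
below are assumptions for illustration only, not the actual Kafka Streams 
serialization code.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration; layout and names are assumed, not taken from Kafka.
class KeyFormatMismatchSketch {

    // Writer that does NOT append a sequence number: [key][timestamp]
    static byte[] encodePlain(String key, long timestamp) {
        byte[] k = key.getBytes(StandardCharsets.UTF_8);
        return ByteBuffer.allocate(k.length + Long.BYTES)
                .put(k).putLong(timestamp).array();
    }

    // Writer that appends a sequence number: [key][timestamp][seqnum]
    static byte[] encodeSequenced(String key, long timestamp, int seqnum) {
        byte[] k = key.getBytes(StandardCharsets.UTF_8);
        return ByteBuffer.allocate(k.length + Long.BYTES + Integer.BYTES)
                .put(k).putLong(timestamp).putInt(seqnum).array();
    }

    // Reader that always assumes a trailing 4-byte sequence number is present.
    static long timestampAssumingSeqnum(byte[] stored) {
        int offset = stored.length - Integer.BYTES - Long.BYTES;
        return ByteBuffer.wrap(stored).getLong(offset);
    }

    public static void main(String[] args) {
        // Matching formats: the timestamp is recovered.
        System.out.println(timestampAssumingSeqnum(encodeSequenced("user", 100L, 0)));
        // Mismatched formats: the reader picks up key bytes as part of the timestamp.
        System.out.println(timestampAssumingSeqnum(encodePlain("user", 100L)));
    }
}
```

Under these assumptions the second read returns a value other than 100, which is 
the kind of silent mismatch that arises when the bytes on the topic and the 
store's expected format disagree.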

> Clean-up retain Duplicate logic in Window Stores
> ------------------------------------------------
>
>                 Key: KAFKA-10137
>                 URL: https://issues.apache.org/jira/browse/KAFKA-10137
>             Project: Kafka
>          Issue Type: Task
>          Components: streams
>    Affects Versions: 2.5.0
>            Reporter: Bruno Cadonna
>            Priority: Minor
>
> Stream-stream joins use the regular `WindowStore` implementation but with 
> `retainDuplicates` set to true. To allow for duplicates while using the same 
> unique-key underlying stores we just wrap the key with an incrementing 
> sequence number before inserting it.
> The logic to maintain and append the sequence number is present in multiple 
> locations, namely in the changelogging window store and in its underlying 
> window stores. We should consolidate this code to one single location.  
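
For readers unfamiliar with the trick described above, the following is a rough 
sketch of wrapping a key with an incrementing sequence number so that a unique-key 
store can hold duplicates. The class name, the byte layout, and the use of a 
`TreeMap` as the underlying store are all assumptions made for illustration; this 
is not the `WindowStore` implementation itself.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.TreeMap;

// Hypothetical sketch; not the Kafka Streams window store code.
class SequencedKeySketch {
    private final boolean retainDuplicates;
    private int seqnum = 0;
    // Stand-in for a unique-key underlying store, sorted by wrapped key bytes.
    private final TreeMap<ByteBuffer, String> store = new TreeMap<>();

    SequencedKeySketch(boolean retainDuplicates) {
        this.retainDuplicates = retainDuplicates;
    }

    // Assumed layout: [key bytes][8-byte timestamp][4-byte sequence number]
    static ByteBuffer wrap(byte[] key, long timestamp, int seqnum) {
        ByteBuffer buf = ByteBuffer.allocate(key.length + Long.BYTES + Integer.BYTES);
        buf.put(key).putLong(timestamp).putInt(seqnum);
        buf.flip(); // make the written bytes the buffer's readable content
        return buf;
    }

    void put(String key, long timestamp, String value) {
        if (retainDuplicates) {
            // Advance the counter so each insert gets a distinct wrapped key;
            // the mask keeps the value non-negative on overflow.
            seqnum = (seqnum + 1) & 0x7FFFFFFF;
        }
        store.put(wrap(key.getBytes(StandardCharsets.UTF_8), timestamp, seqnum), value);
    }

    int size() {
        return store.size();
    }

    public static void main(String[] args) {
        SequencedKeySketch withDuplicates = new SequencedKeySketch(true);
        withDuplicates.put("k", 100L, "a");
        withDuplicates.put("k", 100L, "b");
        System.out.println(withDuplicates.size()); // 2: both records are kept

        SequencedKeySketch withoutDuplicates = new SequencedKeySketch(false);
        withoutDuplicates.put("k", 100L, "a");
        withoutDuplicates.put("k", 100L, "b");
        System.out.println(withoutDuplicates.size()); // 1: second put overwrites
    }
}
```

Keeping the counter and the wrapping in one class, as in this sketch, is the sort 
of consolidation the ticket asks for: if two layers each maintain their own 
counter, they have to agree on the layout and on when to increment.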



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
