[
https://issues.apache.org/jira/browse/KAFKA-10383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17175074#comment-17175074
]
John Roesler commented on KAFKA-10383:
--------------------------------------
Thanks for the report, [~marcolotz].
This seems like a design oversight. It does seem desirable to plug in different
stores as the subscription store.
I'm not sure if I'd piggy-back on the existing Materialized argument, as the
subscription state would have a completely different shape and dynamic from the
join result (which is what Materialized configures). Plus, you may want to (e.g.)
set the subscription state to in-memory without materializing the join result.
If we piggy-back, there would be no way to express this.
At a glance, it seems like we should have a separate argument to the join:
a new object that lets you configure the things that make sense
for a subscription store:
* KeyValueBytesStoreSupplier: the kind of store to use
* withLoggingEnabled(final Map<String, String> config) / withLoggingDisabled():
the changelog configs
* withCachingEnabled() / withCachingDisabled(): the caching configs
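
To make the list above a bit more concrete, here is a rough sketch of what such an options object could look like. Everything in it is an assumption for illustration: the class name SubscriptionStoreOptions, the StoreKind enum (a stand-in for passing a real KeyValueBytesStoreSupplier), the factory methods, and the defaults do not exist in Kafka Streams, and the actual shape would be decided in the KIP.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: none of these names are part of the Kafka Streams API.
final class SubscriptionStoreOptions {

    // Stand-in for a KeyValueBytesStoreSupplier: only records the kind of store.
    enum StoreKind { PERSISTENT, IN_MEMORY }

    private final StoreKind storeKind;
    private final boolean loggingEnabled;
    private final Map<String, String> changelogConfig;
    private final boolean cachingEnabled;

    private SubscriptionStoreOptions(final StoreKind storeKind,
                                     final boolean loggingEnabled,
                                     final Map<String, String> changelogConfig,
                                     final boolean cachingEnabled) {
        this.storeKind = storeKind;
        this.loggingEnabled = loggingEnabled;
        // Defensive copy so later changes to the caller's map have no effect.
        this.changelogConfig = Collections.unmodifiableMap(new HashMap<>(changelogConfig));
        this.cachingEnabled = cachingEnabled;
    }

    // Assumed defaults for this sketch: logging and caching both enabled.
    static SubscriptionStoreOptions persistent() {
        return new SubscriptionStoreOptions(StoreKind.PERSISTENT, true, Collections.emptyMap(), true);
    }

    static SubscriptionStoreOptions inMemory() {
        return new SubscriptionStoreOptions(StoreKind.IN_MEMORY, true, Collections.emptyMap(), true);
    }

    SubscriptionStoreOptions withLoggingEnabled(final Map<String, String> config) {
        return new SubscriptionStoreOptions(storeKind, true, config, cachingEnabled);
    }

    SubscriptionStoreOptions withLoggingDisabled() {
        return new SubscriptionStoreOptions(storeKind, false, Collections.emptyMap(), cachingEnabled);
    }

    SubscriptionStoreOptions withCachingEnabled() {
        return new SubscriptionStoreOptions(storeKind, loggingEnabled, changelogConfig, true);
    }

    SubscriptionStoreOptions withCachingDisabled() {
        return new SubscriptionStoreOptions(storeKind, loggingEnabled, changelogConfig, false);
    }

    StoreKind storeKind() { return storeKind; }
    boolean loggingEnabled() { return loggingEnabled; }
    Map<String, String> changelogConfig() { return changelogConfig; }
    boolean cachingEnabled() { return cachingEnabled; }

    @Override
    public String toString() {
        return "SubscriptionStoreOptions{storeKind=" + storeKind
            + ", loggingEnabled=" + loggingEnabled
            + ", changelogConfig=" + changelogConfig
            + ", cachingEnabled=" + cachingEnabled + "}";
    }

    public static void main(final String[] args) {
        // The case from the report: in-memory subscription state, no changelog, no cache.
        final SubscriptionStoreOptions options =
            SubscriptionStoreOptions.inMemory().withLoggingDisabled().withCachingDisabled();
        System.out.println(options);
    }
}
```

Keeping this separate from Materialized means the subscription state could be set to in-memory regardless of whether the join result is materialized at all, which is the case that piggy-backing cannot express.
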
This would require a KIP, of course. Are you open to contributing this feature?
I think a lot of people would find it helpful as the feature becomes more
popular. I'd be happy to help you with the process if you're willing.
Thanks,
-John
> KTable Join on Foreign key is opinionated
> ------------------------------------------
>
> Key: KAFKA-10383
> URL: https://issues.apache.org/jira/browse/KAFKA-10383
> Project: Kafka
> Issue Type: Improvement
> Components: streams
> Affects Versions: 2.4.1
> Reporter: Marco Lotz
> Priority: Major
>
> *Status Quo:*
> The current implementation of KIP-213
> (https://cwiki.apache.org/confluence/display/KAFKA/KIP-213+Support+non-key+joining+in+KTable)
> of Foreign Key Join between two KTables is _opinionated_ in terms of storage
> layer.
> Independently of the Materialization method provided in the method argument,
> it generates an intermediary RocksDB state store. Thus, even when the
> Materialization method provided is "in memory", it will use RocksDB
> under-the-hood for this internal state-store.
>
> *Related problems:*
> * IT Tests: Having an implicit materialization method for the state-store
> affects tests using foreign key state-stores. On Windows-based systems
> (https://stackoverflow.com/questions/50602512/failed-to-delete-the-state-directory-in-ide-for-kafka-stream-application),
> which are affected by the RocksDB filesystem removal problem, one way to
> avoid the bug is to use in-memory state-stores (rather than exception
> swallowing). Because the intermediate RocksDB store is created
> regardless of the materialization method, every IT test is forced to use
> the manual FS deletion with exception swallowing hack.
> * Short-lived Streams: KTables can be short-lived in a way that neither
> persistent storage nor changelog creation is desired. The current
> implementation prevents this.
>
> *Suggestion:*
> One possible solution is to use a similar materialization method (to the one
> provided in the argument) when creating the intermediary Foreign Key
> state-store. If the Materialization is in memory and without changelog, the
> same happens in the intermediate state-store.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)