Hi Bill,

Thanks for the KIP! Awesome job catching this unexpected consequence of the prior KIPs before it was released.

The proposal looks good to me. On top of just fixing the problem, it seems to address two other pain points:

* that naming a state store automatically causes it to become queryable.
* that there's currently no way to configure the bytes store for join windows.

It's awesome that we can fix this issue and two others with one feature.

I'm wondering about a missing quadrant from the truth table involving whether a Materialized is stored or not and whether querying is enabled or disabled. What should the behavior be if there is no store configured (e.g., a Materialized with only serdes) and querying is enabled?

It seems we have two choices:

1. we can force creation of a state store in this case, so the store can be used to serve the queries
2. we can provide just a queryable view, basically letting IQ query into the "KTableValueGetter", which would transparently construct the query response by applying the operator logic to the upstream state if the operator state isn't already stored.

Offhand, it seems like the second is actually a pretty awesome capability. But it might have an awkward interaction with the current semantics. Presently, if I provide a Materialized.withName, it implies that querying should be enabled AND that the view should actually be stored in a state store. Under option 2 above, this behavior would change to NOT provision a state store and instead just consult the ValueGetter. To get back to the current behavior, users would have to add a "bytes store supplier" to the Materialized to indicate that, yes, they really want a state store there.

Behavior changes are always kind of scary, but I think in this case, it might actually be preferable. When only the name is provided, it most likely means that people just wanted to make the operation result queryable. If we automatically convert this to a non-stored view, then simply upgrading results in the same observable behavior and semantics, but a linear reduction in local storage requirements and disk I/O, as well as a corresponding linear reduction in memory usage both on and off heap.

What do you think?
-John

On Tue, Jun 18, 2019 at 9:21 PM Bill Bejeck <bbej...@gmail.com> wrote:

> All,
>
> I'd like to start a discussion for adding a Materialized configuration
> object to KStream.join for naming state stores involved in joins.
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-479%3A+Add+Materialized+to+Join
>
> Your comments and suggestions are welcome.
>
> Thanks,
> Bill