guozhangwang commented on pull request #11242: URL: https://github.com/apache/kafka/pull/11242#issuecomment-914789772
Thanks for the POC @showuon ! I made a pass and here are some high-level thoughts: 1) On the PAPI layer, users today can only add stores via Topology#addStateStore in which a `StoreBuilder` is required, and from the public APIs they are most likely going to create one from the `Stores` factory. Note that `KeyValueStoreBuilder` is not a public API and hence cannot be really used by PAPI users. In other words, in PAPI today users would have to specify the store types explicitly via the `Topology#addStateStore` API anyways, and that's also why we only consider this KIP for DSL originally. If we want to benefit the PAPI users, we would probably need to add new functions in the `Stores` factory which would not specify if it is persistent or in-memory, but only the type (kv, window, etc). That would require a larger KIP and maybe we can exclude that for this one. 2) I personally would prefer as less public API changes/additions as possible for KIPs, since it's always easier to add later when we found it's not sufficient, than add first and regret later :) Plus with more public APIs it is always getting clumsier. For that, my original thought would be, from user's perspective, if `Materialized` construct does not have a supplier set, then Streams would use the supplier as configured by the config. I.e. we should work with all thee cases below: ``` // explicitly set the store stream.groupByKey().count(Materialized.as(Stores.inMemoryKeyValueStore("some-name")); // only set the store type stream.groupByKey().count(Materialized.as(StoreType.IN_MEMORY)); // do not set the store type at all, rely on the config stream.groupByKey().count(Materialized.as("some-name")); ``` I understand that today the config is not available yet during the DSL's StreamsBuilder phase, but only during the build() phase. What I was thinking is that during the StreamsBuidler's topology-construction phase, e.g. inside `materialize()` function, instead of ``` if (supplier == null) { final String name = materialized.storeName(); supplier = Stores.persistentTimestampedKeyValueStore(name); } ``` That means, for the third case above, we would need some "placeholder" StoreSupplier as it is "to be decided", and then during `buildAndOptimizeTopology` we would fill-in the placeholders based on the configs. For the other two cases, we would derive the supplier immediately. With that we would not need the `StoreImplementation` interface. In general I'm leaning towards we, as the developers of KS, eat more complexity bullets so that we can leave the public interface as succinct as possible to the users. As for passing in the config at the `StreamsBuilder` constructor instead of the build phase, I think that would work too, and would be much simpler, but again it would change the public API pattern as well which makes the `build(properties)` function deprecated, and the other deprecated `build()` back to normal? Maybe cc @ableegoldman as well since she's working on another project which would benefit if we pass in the config at the constructor phase). -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org