guozhangwang commented on pull request #11242:
URL: https://github.com/apache/kafka/pull/11242#issuecomment-914789772


   Thanks for the POC @showuon ! I made a pass and here are some high-level 
thoughts:
   
   1) On the PAPI layer, users today can only add stores via 
Topology#addStateStore in which a `StoreBuilder` is required, and from the 
public APIs they are most likely going to create one from the `Stores` factory. 
Note that `KeyValueStoreBuilder` is not a public API and hence cannot be really 
used by PAPI users. In other words, in PAPI today users would have to specify 
the store types explicitly via the `Topology#addStateStore` API anyways, and 
that's also why we only consider this KIP for DSL originally. If we want to 
benefit the PAPI users, we would probably need to add new functions in the 
`Stores` factory which would not specify if it is persistent or in-memory, but 
only the type (kv, window, etc). That would require a larger KIP and maybe we 
can exclude that for this one.
   
   2) I personally would prefer as less public API changes/additions as 
possible for KIPs, since it's always easier to add later when we found it's not 
sufficient, than add first and regret later :) Plus with more public APIs it is 
always getting clumsier. For that, my original thought would be, from user's 
perspective, if `Materialized` construct does not have a supplier set, then 
Streams would use the supplier as configured by the config.
   
   I.e. we should work with all thee cases below:
   
   ```
   // explicitly set the store
   
stream.groupByKey().count(Materialized.as(Stores.inMemoryKeyValueStore("some-name"));
    
   // only set the store type
   stream.groupByKey().count(Materialized.as(StoreType.IN_MEMORY));
   
   // do not set the store type at all, rely on the config
   stream.groupByKey().count(Materialized.as("some-name"));
   ```
   
   I understand that today the config is not available yet during the DSL's 
StreamsBuilder phase, but only during the build() phase. What I was thinking is 
that during the StreamsBuidler's topology-construction phase, e.g. inside 
`materialize()` function, instead of 
   
   ```
           if (supplier == null) {
               final String name = materialized.storeName();
               supplier = Stores.persistentTimestampedKeyValueStore(name);
           }
   ```
   
   That means, for the third case above, we would need some "placeholder" 
StoreSupplier as it is "to be decided", and then during 
`buildAndOptimizeTopology` we would fill-in the placeholders based on the 
configs. For the other two cases, we would derive the supplier immediately. 
With that we would not need the `StoreImplementation` interface. 
   
   In general I'm leaning towards we, as the developers of KS, eat more 
complexity bullets so that we can leave the public interface as succinct as 
possible to the users. As for passing in the config at the `StreamsBuilder` 
constructor instead of the build phase, I think that would work too, and would 
be much simpler, but again it would change the public API pattern as well which 
makes the `build(properties)` function deprecated, and the other deprecated 
`build()` back to normal? Maybe cc @ableegoldman as well since she's working on 
another project which would benefit if we pass in the config at the constructor 
phase).
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Reply via email to