Damian,

Thanks for the proposal. I have a few comments on the APIs:

1. Printed#withFile does not seem necessary, since users should always specify up front whether the output goes to sysOut or to a file. On second thought, I also think serdes are not useful for printing, since we assume `toString` is provided, except for byte arrays, which we will handle specially.

Another general comment about Printed: it differs from the other options in that it is a required option rather than an optional one, since it includes the toSysOut / toFile specs. What are the pros and cons of including these two in the option, and hence making it required, versus leaving them at the API layer and making Printed optional, for mapper / label only?

2.1 KStream#through / to: should we have an overloaded function without Produced?

2.2 KStream#groupBy / groupByKey: should we have an overloaded function without Serialized?

2.3 KGroupedStream#count / reduce / aggregate: should we have an overloaded function without Materialized?

2.4 KStream#join: should we have an overloaded function without Joined?

2.5 Each of KTable's operators: should we have an overloaded function without Produced / Serialized / Materialized?

3.1 Produced: the static functions overlap, which does not seem necessary. I'd suggest just having the following three static functions, with another three similar member functions:

    public static <K, V> Produced<K, V> withKeySerde(final Serde<K> keySerde)
    public static <K, V> Produced<K, V> withValueSerde(final Serde<V> valueSerde)
    public static <K, V> Produced<K, V> withStreamPartitioner(final StreamPartitioner<K, V> partitioner)

The key idea is that by using the same function name for the static constructors and the member functions, users do not need to remember the differences and can call these functions in any order they want; later calls on the same spec win over earlier calls.
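
To make the "any order, later call wins" idea concrete, here is a minimal sketch. It is not the actual Produced class: `ProducedSketch`, the stand-in `Serde` and `StreamPartitioner` interfaces, and the getter are all hypothetical, defined only so the snippet compiles on its own. One caveat: Java does not allow a static and an instance method with the same name and parameter types in one class, so the sketch assumes shorter names (`keySerde`, `valueSerde`, `streamPartitioner`) for the static entry points.

```java
// Hypothetical stand-ins for the Kafka interfaces, so the sketch is self-contained.
interface Serde<T> {}
interface StreamPartitioner<K, V> {}

// Sketch of an option class with static entry points plus chainable members.
final class ProducedSketch<K, V> {
    private Serde<K> keySerde;
    private Serde<V> valueSerde;
    private StreamPartitioner<K, V> partitioner;

    private ProducedSketch() {}

    // Static entry points (names are an assumption; they cannot share the
    // exact signature of the member functions below).
    static <K, V> ProducedSketch<K, V> keySerde(final Serde<K> keySerde) {
        return new ProducedSketch<K, V>().withKeySerde(keySerde);
    }

    static <K, V> ProducedSketch<K, V> valueSerde(final Serde<V> valueSerde) {
        return new ProducedSketch<K, V>().withValueSerde(valueSerde);
    }

    static <K, V> ProducedSketch<K, V> streamPartitioner(final StreamPartitioner<K, V> partitioner) {
        return new ProducedSketch<K, V>().withStreamPartitioner(partitioner);
    }

    // Member functions: each overwrites its field, so a later call wins.
    ProducedSketch<K, V> withKeySerde(final Serde<K> keySerde) {
        this.keySerde = keySerde;
        return this;
    }

    ProducedSketch<K, V> withValueSerde(final Serde<V> valueSerde) {
        this.valueSerde = valueSerde;
        return this;
    }

    ProducedSketch<K, V> withStreamPartitioner(final StreamPartitioner<K, V> partitioner) {
        this.partitioner = partitioner;
        return this;
    }

    // Getters exist only so the behaviour can be inspected.
    Serde<K> getKeySerde() { return keySerde; }
    Serde<V> getValueSerde() { return valueSerde; }
    StreamPartitioner<K, V> getPartitioner() { return partitioner; }
}
```

With this shape, a chain can start from any of the three static entry points and continue with the member functions in any order; setting the key serde twice simply keeps the second value.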

3.2 Serialized: similarly,

    public static <K, V> Serialized<K, V> withKeySerde(final Serde<K> keySerde)
    public static <K, V> Serialized<K, V> withValueSerde(final Serde<V> valueSerde)
    public Serialized<K, V> withKeySerde(final Serde<K> keySerde)
    public Serialized<K, V> withValueSerde(final Serde<V> valueSerde)

Also, one of its static constructors takes a final Serde<V> otherValueSerde; is that intentional?

3.3 Joined: similarly, keep the static constructor signatures the same as those of the corresponding member functions.

3.4 Materialized: it is a bit special, and I think we can keep its static constructors, with only the two `as` variants as they are today.

4. Are there any modifications to StateStoreSupplier? Is it replaced by BytesStoreSupplier? Some more description seems to be lacking here. Also, in

    public static <K, V, S extends StateStore> Materialized<K, V, S> as(final StateStoreSupplier<S> supplier)

is the parameter supposed to be of type BytesStoreSupplier?

Guozhang

On Thu, Jul 27, 2017 at 5:26 AM, Damian Guy <damian....@gmail.com> wrote:

> Updated link:
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-182%3A+Reduce+Streams+DSL+overloads+and+allow+easier+use+of+custom+storage+engines
>
> Thanks,
> Damian
>
> On Thu, 27 Jul 2017 at 13:09 Damian Guy <damian....@gmail.com> wrote:
>
> > Hi,
> >
> > I've put together a KIP to make some changes to the KafkaStreams DSL that
> > will hopefully allow us to:
> > 1) reduce the explosion of overloads
> > 2) add new features without having to continue adding more overloads
> > 3) provide simpler ways for people to use custom storage engines and wrap
> > them with logging, caching etc if desired
> > 4) enable per-operator caching rather than global caching without having
> > to resort to supplying a StateStoreSupplier when you just want to turn
> > caching off.
> >
> > The KIP is here:
> > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=73631309
> >
> > Thanks,
> > Damian
> >
>

--
-- Guozhang