Thanks to the update Damian!
Couple of comments: KStream: leftJoin and outerJoin for KStream/KTable join should not have `JoinWindows` parameter Nit: TopologyBuilder -> Topology Nit: new class Serialized list static method #with twice WindowedKStream -> for consistency we should either have GroupedKStream or KWindowedStream... (similar argument for SessionWindowedKStream) KGroupedStream -> why do we use a different name for `sessionWindowedBy()` -- seems to be cleaner to call both methods `windowedBy()` StreamsBuilder#stream -> parameter order is confusing... We should have Pattern as second parameter to align both methods. StreamsBuilder#table/globalTable -> move parameter `Consumed` as first parameter to align with `#stream` Produced#with(Serde, Serde) Produced#with(StreamPartitioner, Serde, Serde) -> should StreamPartitioner be the third argument instead of the first? Consumed: Why do we need 3 different names for the 3 static methods? I would all of them just call `with()`. Current names sound clumsy to me. And a plain `with()` also aligns with the naming of static methods of other classes. I guess we are also deprecation a bunch of method for KStream/KTable/KGroupedStream/KGroupedTable and should mention which one? There is just one sentence "Deprecate the existing overloads.", but we don't deprecate all existing once. I personally don't care to much if we spell deprecated method out explicitly, but right now it's not consistent as we only list methods we add. Should we deprecate `StateStoreSupplier`? -Matthias On 8/22/17 6:55 AM, Damian Guy wrote: > I've just updated the KIP with some additional changes targeted at > StreamsBuilder > > Thanks, > Damian > > On Thu, 10 Aug 2017 at 12:59 Damian Guy <damian....@gmail.com> wrote: > >> >>> Got it, thanks. >>> >>> Does it still make sense to have one static constructors for each spec, >>> with one constructor having only one parameter to make it more usable, >>> i.e. >>> as a user I do not need to give all parameters if I only want to override >>> one of them? Maybe we can just name the constructors as `with` but I'm not >>> sure if Java distinguish: >>> >>> public static <K, V> Produced<K, V> with(final Serde<K> keySerde) >>> public static <K, V> Produced<K, V> with(final Serde<V> valueSerde) >>> >>> as two function signatures. >>> >>> >> No that won't work. That is why we have all options, i.e., on Produce >> public static <K, V> Produced<K, V> with(final Serde<K> keySerde, final >> Serde<V> >> valueSerde) >> public static <K, V> Produced<K, V> with(final StreamPartitioner<K, V> >> partitioner, final Serde<K> keySerde, final Serde<V> valueSerde) >> public static <K, V> Produced<K, V> keySerde(final Serde<K> keySerde) >> public static <K, V> Produced<K, V> valueSerde(final Serde<V> valueSerde) >> public static <K, V> Produced<K, V> streamPartitioner(final >> StreamPartitioner<K, >> V> partitioner) >> >> So if you only want to use one you can just use the function that takes >> one argument. >> >>> >>> Guozhang >>> >>> >>> On Wed, Aug 9, 2017 at 6:20 AM, Damian Guy <damian....@gmail.com> wrote: >>> >>>> On Tue, 8 Aug 2017 at 20:11 Guozhang Wang <wangg...@gmail.com> wrote: >>>> >>>>> Damian, >>>>> >>>>> Thanks for the proposal, I had a few comments on the APIs: >>>>> >>>>> 1. Printed#withFile seems not needed, as users should always spec if >>> it >>>> is >>>>> to sysOut or to File at the beginning. In addition as a second >>> thought, I >>>>> think serdes are not useful for prints anyways since we assume >>> `toString` >>>>> is provided except for byte arrays, in which we will special handle >>> it. >>>>> >>>>> >>>> +1 >>>> >>>> >>>>> Another comment about Printed in general is it differs with other >>> options >>>>> that it is a required option than optional one, since it includes >>>> toSysOut >>>>> / toFile specs; what are the pros and cons for including these two in >>> the >>>>> option and hence make it a required option than leaving them at the >>> API >>>>> layer and make Printed as optional for mapper / label only? >>>>> >>>>> >>>> It isn't required as we will still have the no-arg print() which will >>> just >>>> go to sysout as it does now. >>>> >>>> >>>>> >>>>> 2.1 KStream#through / to >>>>> >>>>> We should have an overloaded function without Produced? >>>>> >>>> >>>> Yes - we already have those so they are not part of the KIP, i.e, >>>> through(topic) >>>> >>>> >>>>> >>>>> 2.2 KStream#groupBy / groupByKey >>>>> >>>>> We should have an overloaded function without Serialized? >>>>> >>>> >>>> Yes, as above >>>> >>>>> >>>>> 2.3 KGroupedStream#count / reduce / aggregate >>>>> >>>>> We should have an overloaded function without Materialized? >>>>> >>>> >>>> As above >>>> >>>>> >>>>> 2.4 KStream#join >>>>> >>>>> We should have an overloaded function without Joined? >>>>> >>>> >>>> as above >>>> >>>>> >>>>> >>>>> 2.5 Each of KTable's operators: >>>>> >>>>> We should have an overloaded function without Produced / Serialized / >>>>> Materialized? >>>>> >>>>> >>>> as above >>>> >>>> >>>>> >>>>> >>>>> 3.1 Produced: the static functions have overlaps, which seems not >>>>> necessary. I'd suggest jut having the following three static with >>> another >>>>> three similar member functions: >>>>> >>>>> public static <K, V> Produced<K, V> withKeySerde(final Serde<K> >>> keySerde) >>>>> >>>>> public static <K, V> Produced<K, V> withValueSerde(final Serde<V> >>>>> valueSerde) >>>>> >>>>> public static <K, V> Produced<K, V> withStreamPartitioner(final >>>>> StreamPartitioner<K, V> partitioner) >>>>> >>>>> The key idea is that by using the same function name string for static >>>>> constructor and member functions, users do not need to remember what >>> are >>>>> the differences but can call these functions with any ordering they >>> want, >>>>> and later calls on the same spec will win over early calls. >>>>> >>>>> >>>> That would be great if java supported it, but it doesn't. You can't have >>>> static an member functions with the same signature. >>>> >>>> >>>>> >>>>> 3.2 Serialized: similarly >>>>> >>>>> public static <K, V> Serialized<K, V> withKeySerde(final Serde<K> >>>> keySerde) >>>>> >>>>> public static <K, V> Serialized<K, V> withValueSerde(final Serde<V> >>>>> valueSerde) >>>>> >>>>> public Serialized<K, V> withKeySerde(final Serde<K> keySerde) >>>>> >>>>> public Serialized<K, V> withValueSerde(final Serde valueSerde) >>>>> >>>> >>>> as above >>>> >>>> >>>>> >>>>> Also it has a final Serde<V> otherValueSerde in one of its static >>>>> constructor, it that intentional? >>>>> >>>> >>>> Nope: thanks. >>>> >>>>> >>>>> 3.3. Joined: similarly, keep the static constructor signatures the >>> same >>>> as >>>>> its corresponding member fields. >>>>> >>>>> >>>> As above >>>> >>>> >>>>> 3.4 Materialized: it is a bit special, and I think we can keep its >>> static >>>>> constructors with only two `as` as they are today.K >>>>> >>>>> >>>> 4. Is there any modifications on StateStoreSupplier? Is it replaced by >>>>> BytesStoreSupplier? Seems some more descriptions are lacking here. >>> Also >>>> in >>>>> >>>>> >>>> No modifications to StateStoreSupplier. It is superseceded by >>>> BytesStoreSupplier. >>>> >>>> >>>> >>>>> public static <K, V, S extends StateStore> Materialized<K, V, S> >>>>> as(final StateStoreSupplier<S> >>>>> supplier) >>>>> >>>>> Is the parameter in type of BytesStoreSupplier? >>>>> >>>> >>>> Yep - thanks >>>> >>>> >>>>> >>>>> >>>>> >>>>> >>>>> Guozhang >>>>> >>>>> >>>>> On Thu, Jul 27, 2017 at 5:26 AM, Damian Guy <damian....@gmail.com> >>>> wrote: >>>>> >>>>>> Updated link: >>>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP- >>>>>> 182%3A+Reduce+Streams+DSL+overloads+and+allow+easier+ >>>>>> use+of+custom+storage+engines >>>>>> >>>>>> Thanks, >>>>>> Damian >>>>>> >>>>>> On Thu, 27 Jul 2017 at 13:09 Damian Guy <damian....@gmail.com> >>> wrote: >>>>>> >>>>>>> Hi, >>>>>>> >>>>>>> I've put together a KIP to make some changes to the KafkaStreams >>> DSL >>>>> that >>>>>>> will hopefully allow us to: >>>>>>> 1) reduce the explosion of overloads >>>>>>> 2) add new features without having to continue adding more >>> overloads >>>>>>> 3) provide simpler ways for people to use custom storage engines >>> and >>>>> wrap >>>>>>> them with logging, caching etc if desired >>>>>>> 4) enable per-operator caching rather than global caching without >>>>> having >>>>>>> to resort to supplying a StateStoreSupplier when you just want to >>>> turn >>>>>>> caching off. >>>>>>> >>>>>>> The KIP is here: >>>>>>> https://cwiki.apache.org/confluence/pages/viewpage. >>>>>> action?pageId=73631309 >>>>>>> >>>>>>> Thanks, >>>>>>> Damian >>>>>>> >>>>>> >>>>> >>>>> >>>>> >>>>> -- >>>>> -- Guozhang >>>>> >>>> >>> >>> >>> >>> -- >>> -- Guozhang >>> >> >
signature.asc
Description: OpenPGP digital signature