Just to clarify on 2): currently KIP-307 do not have proposed APIs for `groupBy/groupByKey` naming schemes, and for joins its current proposal is to extend ValueJoiner with Named and hence this part is what I meant to have "overlaps".
Thinking about it a bit more, since Joined is only used for S-S and S-T joins but not T-T joins, having the naming schemes on Joined would not be sufficient, and extending ValueJoiner would indeed be a good choice. As for groupBy, since it is using KeyValueMapper which is supposed to be extended with Named in KIP-307, it does not require extending to processor nodes as well. Given this, I'm fine with limiting the scope to only repartition topics. Guozhang On Wed, Sep 12, 2018 at 10:22 PM, Matthias J. Sax <matth...@confluent.io> wrote: > Follow up comments: > > 1) We should either use `[app-id]-this|other-[join-name]-repartition` or > `app-id]-[join-name]-left|right-repartition` but we should not change > the pattern depending if the user specifies a name of not. I am fine > with both patterns---just want to make sure with stick with one. > > 2) I didn't see why we would need to do this in this KIP. KIP-307 seems > to be orthogonal, and thus KIP-372 should not change any processor > names, but KIP-307 should define a holistic strategy for all processor. > Otherwise, we might up with different strategies or revert what we > decide in this KIP if it's not compatible with KIP-307. > > > -Matthias > > > On 9/12/18 6:28 PM, Guozhang Wang wrote: > > Hello Bill, > > > > I made a pass over your proposal and here are some questions: > > > > 1. For Joined names, the current proposal is to define the repartition > > topic names as > > > > * [app-id]-this-[join-name]-repartition > > > > * [app-id]-other-[join-name]-repartition > > > > > > And if [join-name] not specified, stay the same, which is: > > > > * [previous-processor-name]-repartition for both Stream-Stream (S-S) > join > > and S-T join > > > > I think it is more natural to rename it to > > > > * [app-id]-[join-name]-left-repartition > > > > * [app-id]-[join-name]-right-repartition > > > > > > 2. I'd suggest to use the name to also define the corresponding processor > > names accordingly, in addition to the repartition topic names. Note that > > for joins, this may be overlapping with KIP-307 > > <https://cwiki.apache.org/confluence/display/KAFKA/KIP- > 307%3A+Allow+to+define+custom+processor+names+with+KStreams+DSL> > > as > > it also have proposals for defining processor names for join operators as > > well. > > > > 3. Could you also specify how this would affect the optimization for > > merging multiple repartition topics? > > > > 4. In the "Compatibility, Deprecation, and Migration Plan" section, could > > you also mention the following scenarios, if any of the upgrade path > would > > be changed: > > > > a) changing user DSL code: under which scenarios users can now do a > > rolling bounce instead of resetting applications. > > > > b) upgrading from older version to new version, with all the names > > specified, and with optimization turned on. E.g. say we have the code > > written in 2.1 with all names specified, and now upgrading to 2.2 with > new > > optimizations that may potentially change the repartition topics. Is that > > always safe to do? > > > > > > > > Guozhang > > > > > > On Wed, Sep 12, 2018 at 4:52 PM, Bill Bejeck <bbej...@gmail.com> wrote: > > > >> All I'd like to start a discussion on KIP-372 for the naming of joins > and > >> grouping operations in Kafka Streams. > >> > >> The KIP page can be found here: > >> https://cwiki.apache.org/confluence/display/KAFKA/KIP- > >> 372%3A+Naming+Joins+and+Grouping > >> > >> I look forward to feedback and comments. > >> > >> Thanks, > >> Bill > >> > > > > > > > > -- -- Guozhang