Hi folks,

I know we don't normally have a "Related work" section in KIPs, but sometimes I find it useful to see what others have done in similar cases. Since this will be important for rolling re-deployments, I wonder what other frameworks like Flink (or Samza) have done in these cases. Perhaps they have done nothing, in which case it's fine to do this from first principles, but IMO it would be good to know just to make sure we're heading in the right direction.
Also, I don't have a good feel for how much work this will be for an end user who is doing the rolling deployment; perhaps an end-to-end example would help.

Thanks
Eno

On Thu, Sep 13, 2018 at 6:22 AM, Matthias J. Sax <matth...@confluent.io> wrote:

> Follow up comments:
>
> 1) We should either use `[app-id]-this|other-[join-name]-repartition` or
> `[app-id]-[join-name]-left|right-repartition` but we should not change
> the pattern depending on whether the user specifies a name or not. I am
> fine with both patterns---just want to make sure we stick with one.
>
> 2) I didn't see why we would need to do this in this KIP. KIP-307 seems
> to be orthogonal, and thus KIP-372 should not change any processor
> names, but KIP-307 should define a holistic strategy for all processors.
> Otherwise, we might end up with different strategies or revert what we
> decide in this KIP if it's not compatible with KIP-307.
>
>
> -Matthias
>
>
> On 9/12/18 6:28 PM, Guozhang Wang wrote:
> > Hello Bill,
> >
> > I made a pass over your proposal and here are some questions:
> >
> > 1. For Joined names, the current proposal is to define the repartition
> > topic names as
> >
> > * [app-id]-this-[join-name]-repartition
> >
> > * [app-id]-other-[join-name]-repartition
> >
> >
> > And if [join-name] is not specified, stay the same, which is:
> >
> > * [previous-processor-name]-repartition for both Stream-Stream (S-S)
> > join and S-T join
> >
> > I think it is more natural to rename it to
> >
> > * [app-id]-[join-name]-left-repartition
> >
> > * [app-id]-[join-name]-right-repartition
> >
> >
> > 2. I'd suggest to use the name to also define the corresponding processor
> > names accordingly, in addition to the repartition topic names. Note that
> > for joins, this may be overlapping with KIP-307
> > <https://cwiki.apache.org/confluence/display/KAFKA/KIP-307%3A+Allow+to+define+custom+processor+names+with+KStreams+DSL>
> > as it also has proposals for defining processor names for join operators
> > as well.
> >
> > 3. Could you also specify how this would affect the optimization for
> > merging multiple repartition topics?
> >
> > 4. In the "Compatibility, Deprecation, and Migration Plan" section, could
> > you also mention the following scenarios, if any of the upgrade path
> > would be changed:
> >
> > a) changing user DSL code: under which scenarios users can now do a
> > rolling bounce instead of resetting applications.
> >
> > b) upgrading from older version to new version, with all the names
> > specified, and with optimization turned on. E.g. say we have the code
> > written in 2.1 with all names specified, and now upgrading to 2.2 with
> > new optimizations that may potentially change the repartition topics.
> > Is that always safe to do?
> >
> >
> >
> > Guozhang
> >
> >
> > On Wed, Sep 12, 2018 at 4:52 PM, Bill Bejeck <bbej...@gmail.com> wrote:
> >
> >> All, I'd like to start a discussion on KIP-372 for the naming of joins
> >> and grouping operations in Kafka Streams.
> >>
> >> The KIP page can be found here:
> >> https://cwiki.apache.org/confluence/display/KAFKA/KIP-372%3A+Naming+Joins+and+Grouping
> >>
> >> I look forward to feedback and comments.
> >>
> >> Thanks,
> >> Bill
> >>