Personally I'd prefer to keep the deprecation-related changes as small as possible unless they are really necessary, and hence I'd prefer to just add
List<String> topicList() /* or Set<String> topicSet() */ in addition to topicPattern to Source, in addition to `topicNameExtractor` to Sink, and leaving the current APIs as-is. Guozhang On Thu, Jul 19, 2018 at 10:36 AM, Matthias J. Sax <matth...@confluent.io> wrote: > Thanks for updating the KIP. > > The current `Source` interface has a method `String topics()` atm. Thus, > we cannot just add `Set<String> Source#topics()` because this would > replace the existing method and would be an incompatible change. > > I think, we should deprecate `String topics()` and add a method with > different name: > > `Set<String> Source#topicNames()` > > The method name `topicNames` is more appropriate anyway, as we return a > set of String (ie, names) but no `Topic` objects. This raises one more > thought: we might want to rename `Processor#stores()` to > `Processor#storeNames()` as well ass `Sink#topic()` to > `Sink#topicName()`, too. This would keep the naming in the API consistent. > > > WDYT? Hope that other can chime in as well. > > > -Matthias > > On 7/18/18 7:49 PM, Nishanth Pradeep wrote: > > I have revised the KIP > > <https://cwiki.apache.org/confluence/display/KAFKA/KIP-321%3A+Update+ > TopologyDescription+to+better+represent+Source+and+Sink+Nodes>. > > Here is a summary of the changes. > > > > 1. Changed return type from String to Set<String> for Source#topics(). > > > > Set<String> Source#topics() > > > > 2. Added method in TopologyDescription#Source to return the Pattern > used > > by the Source node. > > > > Pattern Source#topicPattern() > > > > 3. Changed return type of Sink#topicNameExtractor from Class<? extends > > TopicNameExtractor> to just TopicNameExtractor. > > > > TopicNameExtractor Sink#topicNameExtractor() > > > > Best, > > Nishanth Pradeep > > > > On Sun, Jul 15, 2018 at 11:24 PM Guozhang Wang <wangg...@gmail.com> > wrote: > > > >> Hi Nishanth, > >> > >> Since even combining these two the scope is still relatively small I'd > >> suggest just do it in one KIP if you're willing to work on them. If you > do > >> not then pleas feel free to create the JIRA for the second step so that > >> others can pick it up. > >> > >> > >> Guozhang > >> > >> On Sun, Jul 15, 2018 at 6:14 PM, Matthias J. Sax <matth...@confluent.io > > > >> wrote: > >> > >>> There is no general protocol. We can include the changes in the current > >>> KIP or do a second KIP. > >>> > >>> If you don't want to include the change in this KIP, please create a > new > >>> JIRA to track the other changes. You can assign the JIRA to yourself > and > >>> start a second KIP if you want. You can also "link" both JIRAs as > >>> related to each other. > >>> > >>> > >>> -Matthias > >>> > >>> On 7/15/18 12:50 PM, Nishanth Pradeep wrote: > >>>> Thank you for the comments! I agree with these changes. > >>>> > >>>> So is the general protocol to update the same KIP, or is to scrap the > >>>> current KIP and create a new one since it's beyond the scope of the > >>>> original KIP? I am happy to do either. > >>>> > >>>> On Wed, Jul 4, 2018 at 1:48 PM Matthias J. Sax <matth...@confluent.io > > > >>>> wrote: > >>>> > >>>>> Sounds good to me. > >>>>> > >>>>> -Matthias > >>>>> > >>>>> On 7/4/18 10:53 AM, Guozhang Wang wrote: > >>>>>> After looked through the current TopologyDescription I think I'd > want > >>> to > >>>>>> combine the suggestions from John and Matthias on the API proposal. > >> The > >>>>>> motivations is that we have two relatively different functionalities > >>>>>> provided from the APIs today: > >>>>>> > >>>>>> 1. Each interface's public functions, like > >>>>>> SourceNode#topics(), GlobalStore#source(), which returns non-String > >>> typed > >>>>>> data. The hope was to let users programmatically leverage on those > >> APIs > >>>>> for > >>>>>> runtime checking. > >>>>>> 2. Each interface's impl class also have an implicit toString() > >>>>> overridden > >>>>>> to print the necessary information. This was designed for debugging > >>>>>> purposes only during development cycles. > >>>>>> > >>>>>> What we've observed so far, though, is that users leverage 2) much > >> more > >>>>>> than 1) in practice, since it is more convienent to parse strings > >> than > >>>>>> recursively calling the APIs to get non-string fields. On the other > >>> hand, > >>>>>> the discussion controversy is more around 1), not 2). As for 2) > >> people > >>>>> seem > >>>>>> to be on the right page anyways: print the topic lists if it is not > >>>>>> dynamic, or print extractor string format otherwise. For 1) above we > >>>>> should > >>>>>> probably have all three `Set<String> topics()`, `Pattern > >>> topicPattern()` > >>>>>> and `TopicNameExtractor topicExtractor()`; while for 2) I feel > >>>>> comfortable > >>>>>> relying on the TopicNameExtractor#toString() in `Source#toString()` > >>> impl > >>>>>> since even if users do not override this function, the default value > >>>>>> `className@hashcode` still looks fine to me. > >>>>>> > >>>>>> > >>>>>> Guozhang > >>>>>> > >>>>>> > >>>>>> > >>>>>> On Tue, Jul 3, 2018 at 11:22 PM, Matthias J. Sax < > >>> matth...@confluent.io> > >>>>>> wrote: > >>>>>> > >>>>>>> I just double checked the discussion thread of KIP-120 that > >> introduced > >>>>>>> `TopologyDescription`. Back than the argument was, that using the > >>>>>>> simplest option might be sufficient because the description is > >> mostly > >>>>>>> used for debugging. > >>>>>>> > >>>>>>> Not sure if this argument holds. It seem that people built first > >> more > >>>>>>> sophisticated tools using TopologyDescription. > >>>>>>> > >>>>>>> Final note: if we really want to add `topicPattern()` we might want > >> to > >>>>>>> deprecate `topic()` and replace with `Set<String> topics()`, > >> because a > >>>>>>> source node can take a multiple topics, too. > >>>>>>> > >>>>>>> Just adding this for completeness of context to the discussion. > >>>>>>> > >>>>>>> > >>>>>>> -Matthias > >>>>>>> > >>>>>>> On 7/3/18 11:09 PM, Matthias J. Sax wrote: > >>>>>>>> John, > >>>>>>>> > >>>>>>>> I am a little bit on the fence. In retrospective, it might have > >> been > >>>>>>>> better to add `topic()` and `topicPattern()` to source node and > >>> return > >>>>> a > >>>>>>>> proper `Pattern` object instead of the pattern as a String. > >>>>>>>> > >>>>>>>> All other "payload" is just names and thus String naturally. From > >> my > >>>>>>>> point of view `TopologyDescription` should represent the > `Topology` > >>> in > >>>>> a > >>>>>>>> "machine readable" form plus a default "human readable" from via > >>>>>>>> `toString()` -- this does not imply that all return types should > be > >>>>>>> String. > >>>>>>>> > >>>>>>>> Let me know what you think. If you agree, we could even add > >>>>>>>> `Source#topicPattern()` in another KIP. > >>>>>>>> > >>>>>>>> > >>>>>>>> -Matthias > >>>>>>>> > >>>>>>>> On 6/26/18 3:45 PM, John Roesler wrote: > >>>>>>>>> Sorry for the late comment, > >>>>>>>>> > >>>>>>>>> Looking at the other pieces of TopologyDescription, I noticed > that > >>>>>>> pretty > >>>>>>>>> much all of the "payload" of these description nodes are strings. > >>>>>>> Should we > >>>>>>>>> consider returning a string from `topicNameExtractor()` instead? > >>>>>>>>> > >>>>>>>>> In fact, if we did that, we could consider calling `toString()` > on > >>> the > >>>>>>>>> extractor instead of returning the class name. This would allow > >>>>> authors > >>>>>>> of > >>>>>>>>> the extractors to provide more information about the extractor > >> than > >>>>> just > >>>>>>>>> its name. This might be especially useful in the case of > anonymous > >>>>>>>>> implementations. > >>>>>>>>> > >>>>>>>>> Thanks for the KIP, > >>>>>>>>> -John > >>>>>>>>> > >>>>>>>>> On Mon, Jun 25, 2018 at 11:52 PM Ted Yu <yuzhih...@gmail.com> > >>> wrote: > >>>>>>>>> > >>>>>>>>>> My previous response was talking about the new method in > >>>>>>>>>> InternalTopologyBuilder. > >>>>>>>>>> The exception just means there is no uniform extractor for all > >> the > >>>>>>> sinks. > >>>>>>>>>> > >>>>>>>>>> On Mon, Jun 25, 2018 at 8:02 PM, Matthias J. Sax < > >>>>>>> matth...@confluent.io> > >>>>>>>>>> wrote: > >>>>>>>>>> > >>>>>>>>>>> Ted, > >>>>>>>>>>> > >>>>>>>>>>> Why? Each sink can have a different TopicNameExtractor. > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> -Matthias > >>>>>>>>>>> > >>>>>>>>>>> On 6/25/18 5:19 PM, Ted Yu wrote: > >>>>>>>>>>>> If there are different TopicNameExtractor classes from > multiple > >>>>> sink > >>>>>>>>>>> nodes, > >>>>>>>>>>>> the new method should throw exception alerting user of such > >>>>> scenario. > >>>>>>>>>>>> > >>>>>>>>>>>> > >>>>>>>>>>>> On Mon, Jun 25, 2018 at 2:23 PM, Bill Bejeck < > >> bbej...@gmail.com> > >>>>>>>>>> wrote: > >>>>>>>>>>>> > >>>>>>>>>>>>> Thanks for the KIP! > >>>>>>>>>>>>> > >>>>>>>>>>>>> Overall I'm +1 on the KIP. I have one question. > >>>>>>>>>>>>> > >>>>>>>>>>>>> The KIP states that the method "topicNameExtractor()" is > added > >>> to > >>>>>>> the > >>>>>>>>>>>>> InternalTopologyBuilder.java. > >>>>>>>>>>>>> > >>>>>>>>>>>>> It could be that I'm missing something, but wow does this > work > >>> if > >>>>> a > >>>>>>>>>> user > >>>>>>>>>>>>> has provided different TopicNameExtractor instances to > >> multiple > >>>>> sink > >>>>>>>>>>> nodes? > >>>>>>>>>>>>> > >>>>>>>>>>>>> Thanks, > >>>>>>>>>>>>> Bill > >>>>>>>>>>>>> > >>>>>>>>>>>>> > >>>>>>>>>>>>> > >>>>>>>>>>>>> On Mon, Jun 25, 2018 at 1:25 PM Guozhang Wang < > >>> wangg...@gmail.com > >>>>>> > >>>>>>>>>>> wrote: > >>>>>>>>>>>>> > >>>>>>>>>>>>>> Yup I agree, generally speaking the `toString()` output is > >> not > >>>>>>>>>>>>> recommended > >>>>>>>>>>>>>> to be relied on programmatically in user's code, but we've > >>>>> observed > >>>>>>>>>>>>>> convenience-beats-any-other-reasons again and again in > >>>>> development > >>>>>>>>>>>>>> unfortunately. I think we should still not claiming it is > >> part > >>> of > >>>>>>> the > >>>>>>>>>>>>>> public APIs that would not be changed anyhow in the future, > >> but > >>>>>>> just > >>>>>>>>>>>>>> mentioning it in the wiki for people to be aware is fine. > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> Guozhang > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> On Sun, Jun 24, 2018 at 5:01 PM, Matthias J. Sax < > >>>>>>>>>>> matth...@confluent.io> > >>>>>>>>>>>>>> wrote: > >>>>>>>>>>>>>> > >>>>>>>>>>>>>>> Thanks for the KIP! > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> I am don't have any further comments. > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> For Guozhang's comment: if we mention anything about > >>>>> `toString()`, > >>>>>>>>>> we > >>>>>>>>>>>>>>> should make explicit that `toString()` output is still not > >>>>> public > >>>>>>>>>>>>>>> contract and users should not rely on the output. > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> Furhtermore, for the actual uses output, I would replace > >>>>> "topic:" > >>>>>>> by > >>>>>>>>>>>>>>> "extractor class:" to make the difference obvious. > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> I am just hoping that people actually to not rely on > >>>>> `toString()` > >>>>>>>>>> what > >>>>>>>>>>>>>>> defeats the purpose to the `TopologyDescription` class that > >>> was > >>>>>>>>>>>>>>> introduced to avoid the dependency... (Just a side comment, > >>> not > >>>>>>>>>> really > >>>>>>>>>>>>>>> related to this KIP proposal itself). > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> If there are no further comments in the next days, feel > free > >>> to > >>>>>>>>>> start > >>>>>>>>>>>>>>> the VOTE and open a PR. > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> -Matthias > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> On 6/22/18 6:04 PM, Guozhang Wang wrote: > >>>>>>>>>>>>>>>> Thanks for writing the KIP! > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> I'm +1 on the proposed changes over all. One minor > >>> suggestion: > >>>>> we > >>>>>>>>>>>>>> should > >>>>>>>>>>>>>>>> also mention that the `Sink#toString` will also be > updated, > >>> in > >>>>> a > >>>>>>>>>> way > >>>>>>>>>>>>>> that > >>>>>>>>>>>>>>>> if `topic()` returns null, use the other call, etc. This > is > >>>>>>> because > >>>>>>>>>>>>>>>> although we do not explicitly state the following logic as > >>>>> public > >>>>>>>>>>>>>>> protocols: > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> ``` > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> "Sink: " + name + " (topic: " + topic() + ")\n <-- " > + > >>>>>>>>>>>>>>>> nodeNames(predecessors); > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> ``` > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> There are already some users that rely on > >>>>>>>>>>>>>> `topology.describe().toString( > >>>>>>>>>>>>>>> )` > >>>>>>>>>>>>>>>> to have runtime checks on the returned string values, so > >>>>> changing > >>>>>>>>>>>>> this > >>>>>>>>>>>>>>>> means that their app will break and hence need to make > code > >>>>>>>>>> changes. > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> Guozhang > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> On Wed, Jun 20, 2018 at 7:20 PM, Nishanth Pradeep < > >>>>>>>>>>>>>> nishanth...@gmail.com > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> wrote: > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> Hello Everyone, > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> I have created a new KIP to discuss extending > >>>>>>> TopologyDescription. > >>>>>>>>>>>>> You > >>>>>>>>>>>>>>> can > >>>>>>>>>>>>>>>>> find it here: > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP- > >>>>>>>>>>>>>>>>> 321%3A+Add+method+to+get+TopicNameExtractor+in+ > >>>>>>> TopologyDescription > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> Please provide any feedback that you might have. > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> Best, > >>>>>>>>>>>>>>>>> Nishanth Pradeep > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> -- > >>>>>>>>>>>>>> -- Guozhang > >>>>>>>>>>>>>> > >>>>>>>>>>>>> > >>>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>> > >>>>>>>>> > >>>>>>>> > >>>>>>> > >>>>>>> > >>>>>> > >>>>>> > >>>>> > >>>>> > >>>> > >>> > >>> > >> > >> > >> -- > >> -- Guozhang > >> > > > > -- -- Guozhang