Thanks for updating the KIP.

The current `Source` interface has a method `String topics()` atm. Thus,
we cannot just add `Set<String> Source#topics()` because this would
replace the existing method and would be an incompatible change.

I think, we should deprecate `String topics()` and add a method with
different name:

`Set<String> Source#topicNames()`

The method name `topicNames` is more appropriate anyway, as we return a
set of String (ie, names) but no `Topic` objects. This raises one more
thought: we might want to rename `Processor#stores()` to
`Processor#storeNames()` as well ass `Sink#topic()` to
`Sink#topicName()`, too. This would keep the naming in the API consistent.


WDYT? Hope that other can chime in as well.


-Matthias

On 7/18/18 7:49 PM, Nishanth Pradeep wrote:
> I have revised the KIP
> <https://cwiki.apache.org/confluence/display/KAFKA/KIP-321%3A+Update+TopologyDescription+to+better+represent+Source+and+Sink+Nodes>.
> Here is a summary of the changes.
> 
>    1. Changed return type from String to Set<String> for Source#topics().
> 
>    Set<String> Source#topics()
> 
>    2. Added method in TopologyDescription#Source to return the Pattern used
>    by the Source node.
> 
>    Pattern Source#topicPattern()
> 
>    3. Changed return type of Sink#topicNameExtractor from Class<? extends
>    TopicNameExtractor> to just TopicNameExtractor.
> 
>    TopicNameExtractor Sink#topicNameExtractor()
> 
> Best,
> Nishanth Pradeep
> 
> On Sun, Jul 15, 2018 at 11:24 PM Guozhang Wang <wangg...@gmail.com> wrote:
> 
>> Hi Nishanth,
>>
>> Since even combining these two the scope is still relatively small I'd
>> suggest just do it in one KIP if you're willing to work on them. If you do
>> not then pleas feel free to create the JIRA for the second step so that
>> others can pick it up.
>>
>>
>> Guozhang
>>
>> On Sun, Jul 15, 2018 at 6:14 PM, Matthias J. Sax <matth...@confluent.io>
>> wrote:
>>
>>> There is no general protocol. We can include the changes in the current
>>> KIP or do a second KIP.
>>>
>>> If you don't want to include the change in this KIP, please create a new
>>> JIRA to track the other changes. You can assign the JIRA to yourself and
>>> start a second KIP if you want. You can also "link" both JIRAs as
>>> related to each other.
>>>
>>>
>>> -Matthias
>>>
>>> On 7/15/18 12:50 PM, Nishanth Pradeep wrote:
>>>> Thank you for the comments! I agree with these changes.
>>>>
>>>> So is the general protocol to update the same KIP, or is to scrap the
>>>> current KIP and create a new one since it's beyond the scope of the
>>>> original KIP? I am happy to do either.
>>>>
>>>> On Wed, Jul 4, 2018 at 1:48 PM Matthias J. Sax <matth...@confluent.io>
>>>> wrote:
>>>>
>>>>> Sounds good to me.
>>>>>
>>>>> -Matthias
>>>>>
>>>>> On 7/4/18 10:53 AM, Guozhang Wang wrote:
>>>>>> After looked through the current TopologyDescription I think I'd want
>>> to
>>>>>> combine the suggestions from John and Matthias on the API proposal.
>> The
>>>>>> motivations is that we have two relatively different functionalities
>>>>>> provided from the APIs today:
>>>>>>
>>>>>> 1. Each interface's public functions, like
>>>>>> SourceNode#topics(), GlobalStore#source(), which returns non-String
>>> typed
>>>>>> data. The hope was to let users programmatically leverage on those
>> APIs
>>>>> for
>>>>>> runtime checking.
>>>>>> 2. Each interface's impl class also have an implicit toString()
>>>>> overridden
>>>>>> to print the necessary information. This was designed for debugging
>>>>>> purposes only during development cycles.
>>>>>>
>>>>>> What we've observed so far, though, is that users leverage 2) much
>> more
>>>>>> than 1) in practice, since it is more convienent to parse strings
>> than
>>>>>> recursively calling the APIs to get non-string fields. On the other
>>> hand,
>>>>>> the discussion controversy is more around 1), not 2). As for 2)
>> people
>>>>> seem
>>>>>> to be on the right page anyways: print the topic lists if it is not
>>>>>> dynamic, or print extractor string format otherwise. For 1) above we
>>>>> should
>>>>>> probably have all three `Set<String> topics()`, `Pattern
>>> topicPattern()`
>>>>>> and `TopicNameExtractor topicExtractor()`; while for 2) I feel
>>>>> comfortable
>>>>>> relying on the TopicNameExtractor#toString() in `Source#toString()`
>>> impl
>>>>>> since even if users do not override this function, the default value
>>>>>> `className@hashcode` still looks fine to me.
>>>>>>
>>>>>>
>>>>>> Guozhang
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Tue, Jul 3, 2018 at 11:22 PM, Matthias J. Sax <
>>> matth...@confluent.io>
>>>>>> wrote:
>>>>>>
>>>>>>> I just double checked the discussion thread of KIP-120 that
>> introduced
>>>>>>> `TopologyDescription`. Back than the argument was, that using the
>>>>>>> simplest option might be sufficient because the description is
>> mostly
>>>>>>> used for debugging.
>>>>>>>
>>>>>>> Not sure if this argument holds. It seem that people built first
>> more
>>>>>>> sophisticated tools using TopologyDescription.
>>>>>>>
>>>>>>> Final note: if we really want to add `topicPattern()` we might want
>> to
>>>>>>> deprecate `topic()` and replace with `Set<String> topics()`,
>> because a
>>>>>>> source node can take a multiple topics, too.
>>>>>>>
>>>>>>> Just adding this for completeness of context to the discussion.
>>>>>>>
>>>>>>>
>>>>>>> -Matthias
>>>>>>>
>>>>>>> On 7/3/18 11:09 PM, Matthias J. Sax wrote:
>>>>>>>> John,
>>>>>>>>
>>>>>>>> I am a little bit on the fence. In retrospective, it might have
>> been
>>>>>>>> better to add `topic()` and `topicPattern()` to source node and
>>> return
>>>>> a
>>>>>>>> proper `Pattern` object instead of the pattern as a String.
>>>>>>>>
>>>>>>>> All other "payload" is just names and thus String naturally. From
>> my
>>>>>>>> point of view `TopologyDescription` should represent the `Topology`
>>> in
>>>>> a
>>>>>>>> "machine readable" form plus a default "human readable" from via
>>>>>>>> `toString()` -- this does not imply that all return types should be
>>>>>>> String.
>>>>>>>>
>>>>>>>> Let me know what you think. If you agree, we could even add
>>>>>>>> `Source#topicPattern()` in another KIP.
>>>>>>>>
>>>>>>>>
>>>>>>>> -Matthias
>>>>>>>>
>>>>>>>> On 6/26/18 3:45 PM, John Roesler wrote:
>>>>>>>>> Sorry for the late comment,
>>>>>>>>>
>>>>>>>>> Looking at the other pieces of TopologyDescription, I noticed that
>>>>>>> pretty
>>>>>>>>> much all of the "payload" of these description nodes are strings.
>>>>>>> Should we
>>>>>>>>> consider returning a string from `topicNameExtractor()` instead?
>>>>>>>>>
>>>>>>>>> In fact, if we did that, we could consider calling `toString()` on
>>> the
>>>>>>>>> extractor instead of returning the class name. This would allow
>>>>> authors
>>>>>>> of
>>>>>>>>> the extractors to provide more information about the extractor
>> than
>>>>> just
>>>>>>>>> its name. This might be especially useful in the case of anonymous
>>>>>>>>> implementations.
>>>>>>>>>
>>>>>>>>> Thanks for the KIP,
>>>>>>>>> -John
>>>>>>>>>
>>>>>>>>> On Mon, Jun 25, 2018 at 11:52 PM Ted Yu <yuzhih...@gmail.com>
>>> wrote:
>>>>>>>>>
>>>>>>>>>> My previous response was talking about the new method in
>>>>>>>>>> InternalTopologyBuilder.
>>>>>>>>>> The exception just means there is no uniform extractor for all
>> the
>>>>>>> sinks.
>>>>>>>>>>
>>>>>>>>>> On Mon, Jun 25, 2018 at 8:02 PM, Matthias J. Sax <
>>>>>>> matth...@confluent.io>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> Ted,
>>>>>>>>>>>
>>>>>>>>>>> Why? Each sink can have a different TopicNameExtractor.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> -Matthias
>>>>>>>>>>>
>>>>>>>>>>> On 6/25/18 5:19 PM, Ted Yu wrote:
>>>>>>>>>>>> If there are different TopicNameExtractor classes from multiple
>>>>> sink
>>>>>>>>>>> nodes,
>>>>>>>>>>>> the new method should throw exception alerting user of such
>>>>> scenario.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Mon, Jun 25, 2018 at 2:23 PM, Bill Bejeck <
>> bbej...@gmail.com>
>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks for the KIP!
>>>>>>>>>>>>>
>>>>>>>>>>>>> Overall I'm +1 on the KIP.   I have one question.
>>>>>>>>>>>>>
>>>>>>>>>>>>> The KIP states that the method "topicNameExtractor()" is added
>>> to
>>>>>>> the
>>>>>>>>>>>>> InternalTopologyBuilder.java.
>>>>>>>>>>>>>
>>>>>>>>>>>>> It could be that I'm missing something, but wow does this work
>>> if
>>>>> a
>>>>>>>>>> user
>>>>>>>>>>>>> has provided different TopicNameExtractor instances to
>> multiple
>>>>> sink
>>>>>>>>>>> nodes?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> Bill
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Mon, Jun 25, 2018 at 1:25 PM Guozhang Wang <
>>> wangg...@gmail.com
>>>>>>
>>>>>>>>>>> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Yup I agree, generally speaking the `toString()` output is
>> not
>>>>>>>>>>>>> recommended
>>>>>>>>>>>>>> to be relied on programmatically in user's code, but we've
>>>>> observed
>>>>>>>>>>>>>> convenience-beats-any-other-reasons again and again in
>>>>> development
>>>>>>>>>>>>>> unfortunately. I think we should still not claiming it is
>> part
>>> of
>>>>>>> the
>>>>>>>>>>>>>> public APIs that would not be changed anyhow in the future,
>> but
>>>>>>> just
>>>>>>>>>>>>>> mentioning it in the wiki for people to be aware is fine.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Guozhang
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Sun, Jun 24, 2018 at 5:01 PM, Matthias J. Sax <
>>>>>>>>>>> matth...@confluent.io>
>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks for the KIP!
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I am don't have any further comments.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> For Guozhang's comment: if we mention anything about
>>>>> `toString()`,
>>>>>>>>>> we
>>>>>>>>>>>>>>> should make explicit that `toString()` output is still not
>>>>> public
>>>>>>>>>>>>>>> contract and users should not rely on the output.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Furhtermore, for the actual uses output, I would replace
>>>>> "topic:"
>>>>>>> by
>>>>>>>>>>>>>>> "extractor class:" to make the difference obvious.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I am just hoping that people actually to not rely on
>>>>> `toString()`
>>>>>>>>>> what
>>>>>>>>>>>>>>> defeats the purpose to the `TopologyDescription` class that
>>> was
>>>>>>>>>>>>>>> introduced to avoid the dependency... (Just a side comment,
>>> not
>>>>>>>>>> really
>>>>>>>>>>>>>>> related to this KIP proposal itself).
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> If there are no further comments in the next days, feel free
>>> to
>>>>>>>>>> start
>>>>>>>>>>>>>>> the VOTE and open a PR.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> -Matthias
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On 6/22/18 6:04 PM, Guozhang Wang wrote:
>>>>>>>>>>>>>>>> Thanks for writing the KIP!
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I'm +1 on the proposed changes over all. One minor
>>> suggestion:
>>>>> we
>>>>>>>>>>>>>> should
>>>>>>>>>>>>>>>> also mention that the `Sink#toString` will also be updated,
>>> in
>>>>> a
>>>>>>>>>> way
>>>>>>>>>>>>>> that
>>>>>>>>>>>>>>>> if `topic()` returns null, use the other call, etc. This is
>>>>>>> because
>>>>>>>>>>>>>>>> although we do not explicitly state the following logic as
>>>>> public
>>>>>>>>>>>>>>> protocols:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> ```
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> "Sink: " + name + " (topic: " + topic() + ")\n      <-- " +
>>>>>>>>>>>>>>>> nodeNames(predecessors);
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> ```
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> There are already some users that rely on
>>>>>>>>>>>>>> `topology.describe().toString(
>>>>>>>>>>>>>>> )`
>>>>>>>>>>>>>>>> to have runtime checks on the returned string values, so
>>>>> changing
>>>>>>>>>>>>> this
>>>>>>>>>>>>>>>> means that their app will break and hence need to make code
>>>>>>>>>> changes.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Guozhang
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Wed, Jun 20, 2018 at 7:20 PM, Nishanth Pradeep <
>>>>>>>>>>>>>> nishanth...@gmail.com
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Hello Everyone,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I have created a new KIP to discuss extending
>>>>>>> TopologyDescription.
>>>>>>>>>>>>> You
>>>>>>>>>>>>>>> can
>>>>>>>>>>>>>>>>> find it here:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
>>>>>>>>>>>>>>>>> 321%3A+Add+method+to+get+TopicNameExtractor+in+
>>>>>>> TopologyDescription
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Please provide any feedback that you might have.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>> Nishanth Pradeep
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> --
>>>>>>>>>>>>>> -- Guozhang
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>>
>>
>>
>> --
>> -- Guozhang
>>
> 

Attachment: signature.asc
Description: OpenPGP digital signature

Reply via email to