Hi David,

Thanks for the feedback. It is really helpful.

> Can you clarify a bit more what the difference is between regular topics
> and internal topics (excluding  __consumer_offsets and
> __transaction_state)? Reading your last message, if internal topics
> (excluding the two) can be created, deleted, produced to, consumed from,
> added to transactions, I'm failing to see what is different about them. Is
> it simply that they are marked as "internal" so the application can treat
> them differently?

Yes. The user-defined internal topics (all internal topics except
`__consumer_offsets` and `__transaction_state`) will behave like normal topics
with regard to messaging operations and permissions. They are marked as
“internal” so that the broker can identify user-defined internal topics and
provide better metadata services, such as the `listTopics` API. I should have
described this metadata behavior difference in the KIP.
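
As a rough illustration of that metadata difference, here is a minimal,
self-contained Java sketch. It is not the KIP's implementation and not broker
code: `InternalTopicRegistry`, `createTopic`, `isInternal`, and
`listTopics(boolean)` are hypothetical names, and the per-topic "internal"
marker is an assumption. It only models the idea that a static two-name check
cannot see user-defined internal topics, while a per-topic marker lets a
listing filter them, similar in spirit to the `listInternal` option of the
admin client's `listTopics`.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical model for illustration only; not actual Kafka broker code.
public class InternalTopicRegistry {
    // The two built-in internal topic names that are checked statically today.
    static final Set<String> BUILT_IN =
        Set.of("__consumer_offsets", "__transaction_state");

    // Topic name -> whether it was created with the (assumed) "internal" marker.
    private final Map<String, Boolean> topics = new LinkedHashMap<>();

    public void createTopic(String name, boolean markedInternal) {
        topics.put(name, markedInternal);
    }

    // Static check: only the two built-in names count as internal.
    public static boolean isBuiltInInternal(String name) {
        return BUILT_IN.contains(name);
    }

    // Marker-aware check: built-in names plus topics marked internal at creation.
    public boolean isInternal(String name) {
        return isBuiltInInternal(name) || topics.getOrDefault(name, false);
    }

    // Metadata-style listing: internal topics are hidden unless requested.
    public Set<String> listTopics(boolean includeInternal) {
        Set<String> result = new LinkedHashSet<>();
        for (String name : topics.keySet()) {
            if (includeInternal || !isInternal(name)) {
                result.add(name);
            }
        }
        return result;
    }
}
```

With this model, a topic such as `connect-configs` created with the marker is
hidden from a default listing but is still a normal topic in every other
respect.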

> In the "Compatibility, Deprecation, and Migration" section, we should
> detail how users can overcome this incompatibility (i.e., changing the
> config name on their topic and changing their application logic if
> necessary).

Thanks for the suggestion. I updated the section.

> Should we consider adding any configs to constrain the min isr and
> replication factor for internal topics? If a topic is really internal and
> fundamentally required for an application to function, it might need a more
> stringent replication config. Our existing internal topics have their own
> configs in server.properties with a comment saying as much.


I think we should probably give clients the freedom to configure
`min.insync.replicas`, `replication.factor`, and `log.retention` on
user-defined internal topics, just as they do on normal topics, for three
reasons:

1. Users may have performance requirements for user-defined internal topics.
2. New defaults or restrictions could silently change the behavior of
existing user applications, which would be a compatibility issue.
3. Since user-defined internal topics act like normal topics and won’t affect
the messaging functionality (produce, consume, transactions, etc.), suboptimal
log configurations on them won’t harm the cluster.


Please let me know what you think. Thanks.


Best, - Cheng Tan



> On Aug 14, 2020, at 7:44 AM, David Arthur <david.art...@confluent.io> wrote:
> 
> Cheng,
> 
> Can you clarify a bit more what the difference is between regular topics
> and internal topics (excluding  __consumer_offsets and
> __transaction_state)? Reading your last message, if internal topics
> (excluding the two) can be created, deleted, produced to, consumed from,
> added to transactions, I'm failing to see what is different about them. Is
> it simply that they are marked as "internal" so the application can treat
> them differently?
> 
> 
> In the "Compatibility, Deprecation, and Migration" section, we should
> detail how users can overcome this incompatibility (i.e., changing the
> config name on their topic and changing their application logic if
> necessary).
> 
> 
> Should we consider adding any configs to constrain the min isr and
> replication factor for internal topics? If a topic is really internal and
> fundamentally required for an application to function, it might need a more
> stringent replication config. Our existing internal topics have their own
> configs in server.properties with a comment saying as much.
> 
> 
> Thanks!
> David
> 
> 
> 
> On Tue, Jul 7, 2020 at 1:40 PM Cheng Tan <c...@confluent.io> wrote:
> 
>> Hi Colin,
>> 
>> 
>> Thanks for the comments. I’ve modified the KIP accordingly.
>> 
>>> I think we need to understand which of these limitations we will carry
>>> forward and which we will not.  We also have the option of putting
>>> limitations just on consumer offsets, but not on other internal topics.
>> 
>> 
>> In the proposal, I added details about this. I agree that cluster admin
>> should use ACLs to apply the restrictions.
>> * Internal topic creation will be allowed.
>> * Internal topic deletion will be allowed except for `__consumer_offsets`
>>   and `__transaction_state`.
>> * Producing to internal topic partitions other than `__consumer_offsets`
>>   and `__transaction_state` will be allowed.
>> * Adding internal topic partitions to transactions will be allowed.
>>
>>> I think there are a fair number of compatibility concerns.  What's the
>>> result if someone tries to create a topic with the configuration internal =
>>> true right now?  Does it fail?  If not, that seems like a potential problem.
>> 
>> I also added this compatibility issue in the "Compatibility, Deprecation,
>> and Migration Plan" section.
>> 
>> Please feel free to make any suggestions or comments regarding my
>> latest proposal. Thanks.
>> 
>> 
>> Best, - Cheng Tan
>> 
>> 
>> 
>> 
>> 
>> 
>>> On Jun 15, 2020, at 11:18 AM, Colin McCabe <cmcc...@apache.org> wrote:
>>> 
>>> Hi Cheng,
>>> 
>>> The link from the main KIP page is an "edit link" meaning that it drops
>>> you into the editor for the wiki page.  I think the link you meant to use
>>> is a "view link" that will just take you to view the page.
>>>
>>> In general I'm not sure what I'm supposed to take away from the large
>>> UML diagram in the KIP.  This is just a description of the existing code,
>>> right?  Seems like we should remove this.
>>>
>>> I'm not sure why the controller classes are featured here since as far
>>> as I can tell, the controller doesn't need to care if a topic is internal.
>>> 
>>>> Kafka and its upstream applications treat internal topics differently
>>>> from non-internal topics. For example:
>>>> * Kafka handles topic creation response errors differently for internal
>>>> topics
>>>> * Internal topic partitions cannot be added to a transaction
>>>> * Internal topic records cannot be deleted
>>>> * Appending to internal topics might get rejected
>>>
>>> I think we need to understand which of these limitations we will carry
>>> forward and which we will not.  We also have the option of putting
>>> limitations just on consumer offsets, but not on other internal topics.
>>> 
>>> Taking it one by one:
>>> 
>>>> * Kafka handles topic creation response errors differently for internal
>>>> topics.
>>>
>>> Hmm.  Kafka doesn't currently allow you to create internal topics, so
>>> the difference here is that you always fail, right?  Or is there something
>>> else more subtle here?  Like do we specifically prevent you from creating
>>> topics named __consumer_offsets or something?  We need to spell this all
>>> out in the KIP.
>>> 
>>>> * Internal topic partitions cannot be added to a transaction
>>> 
>>> I don't think we should carry this limitation forward, or if we do, we
>>> should only do it for consumer-offsets.  Does anyone know why this
>>> limitation exists?
>>> 
>>>> * Internal topic records cannot be deleted
>>> 
>>> This seems like something that should be handled by ACLs rather than by
>>> treating internal topics specially.
>>>
>>>> * Appending to internal topics might get rejected
>>>
>>> We clearly need to use ACLs here rather than rejecting appends.
>>> Otherwise, how will external systems like KSQL, streams, etc. use this
>>> feature?  This is the kind of information we need to have in the KIP.
>>> 
>>>> Public Interfaces
>>>> 2. KafkaZkClient will have a new method getInternalTopics() which
>>>> returns a set of internal topic name strings.
>>> 
>>> KafkaZkClient isn't a public interface, so it doesn't need to be
>>> described here.
>>> 
>>>> There are no compatibility concerns in this KIP.
>>> 
>>> I think there are a fair number of compatibility concerns.  What's the
>>> result if someone tries to create a topic with the configuration internal =
>>> true right now?  Does it fail?  If not, that seems like a potential problem.
>>>
>>> Are people going to be able to create or delete topics named
>>> __consumer_offsets or __transaction_state using this mechanism?  If so, how
>>> does the security model work for that?
>>> 
>>> best,
>>> Colin
>>> 
>>> On Fri, May 29, 2020, at 01:09, Cheng Tan wrote:
>>>> Hello developers,
>>>> 
>>>> 
>>>> I’m proposing KIP-619 to add internal topic creation support.
>>>> 
>>>> Kafka and its upstream applications treat internal topics differently
>>>> from non-internal topics. For example:
>>>> 
>>>>     • Kafka handles topic creation response errors differently for
>>>>       internal topics
>>>>     • Internal topic partitions cannot be added to a transaction
>>>>     • Internal topic records cannot be deleted
>>>>     • Appending to internal topics might get rejected
>>>>     • ……
>>>> 
>>>> Clients and upstream applications may define their own internal topics.
>>>> For example, Kafka Connect defines `connect-configs`,
>>>> `connect-offsets`, and `connect-statuses`. Clients are fetching the
>>>> internal topics by sending the MetadataRequest (ApiKeys.METADATA).
>>>> 
>>>> However, clients and upstream applications cannot register their own
>>>> internal topics in servers. As a result, servers have no knowledge
>>>> about client-defined internal topics. They can only test if a given
>>>> topic is internal or not simply by checking against a static set of
>>>> internal topic strings, which consists of two internal topic names,
>>>> `__consumer_offsets` and `__transaction_state`. As a result,
>>>> MetadataRequest cannot provide any information about client-created
>>>> internal topics.
>>>> 
>>>> To solve this pain point, I'm proposing support for clients to register
>>>> and query their own internal topics.
>>>> 
>>>> Please feel free to join the discussion. Thanks in advance.
>>>> 
>>>> 
>>>> Best, - Cheng Tan
>> 
>> 
> 
> -- 
> -David
