Re: [DISCUSS] KIP-1111: Enforcing Explicit Naming for Kafka Streams Internal Topics

Lucas Brutschy Thu, 28 Nov 2024 02:51:38 -0800

Hi all,

Thanks for the KIP! Super useful.


No new comments, let me just voice my opinion on the suggestions being made.

A1: When I read `require.auto.generated.topic.names`, it sounds like
explicit naming is not allowed if the config is true. This is not what
we are doing here. So to avoid the negative, I'd use
`require.explicit.topic.names`, default false.

BC+1: I think `invariant` is quite indirect, because it depends on
what changes are being made for the naming to change or not. Also,
it's not only about changelog topics but also repartition topics, so
`persistance` seems misleading as well. I agree, IQ can benefit, but
it seems that it is more of a side effect of the feature?

BC+2: Big +1 on this.

Cheers,
Lucas




On Tue, Nov 26, 2024 at 9:37 PM Sebastien Viale
<sebastien.vi...@michelin.com> wrote:
>
> Thanks Bruno for your comments:
>
> A1 .
> BC+1.
> I let other people give their advice, and each of them makes sense.
>
> BC+2.
> I believe you're right. I think I can manage to check whether names are set 
> for repartition or changelog topics across different DSL operators in 
> KStreamImpl. The complexity arises because repartition topic names can depend 
> on the operator, such as selectKey before a join.
> If I detect unnamed topics, I could introduce a property like 
> unNamedInternalTopics in  InternalStreamBuilder to maintain a list of unnamed 
> topics that I can check when topology is built. This approach would allow me 
> to focus not only on sequence numbers but also on ensuring explicit topic 
> naming for improved stability. I hope this is what you imagine.
> Let me know if you need further refinements!
>
> Sébastien,
>
>
>
> ________________________________
> De : Bruno Cadonna <cado...@apache.org>
> Envoyé : mardi 26 novembre 2024 19:12
> À : dev@kafka.apache.org <dev@kafka.apache.org>
> Objet : [EXT] Re: [DISCUSS] KIP-1111: Enforcing Explicit Naming for Kafka 
> Streams Internal Topics
>
> Warning External sender Do not click on any links or open any attachments 
> unless you trust the sender and know the content is safe.
>
> Hi,
>
> Thanks for the KIP!
>
> A1.
> I find the "auto.generated.topic.names" part a bit misleading, because
> actually the topic names are always auto-generated no matter if the
> processors and state stores are explicitly named or not.
>
> which leads me to
>
> BC+1.
> IMO this is not only about topic names, explicitly naming state stores
> would also benefit existing IQ queries. So the config should not
> necessarily be only focused on internal topics.
> Maybe a config name like invariant.persistence.naming or
> ensure.invariant.persistence.naming or
> enabled.invariant.persistence.naming. I am not too happy with the names
> but I hope you get the idea.
>
> BC+2.
> Can we not keep track of explicit naming within the DSL instead of
> relying on the 10-digit sequence number?
>
> Best,
> Bruno
>
> This email was screened for spam and malicious content but exercise caution 
> anyway.
>
>
>
> On 22.11.24 22:04, Sebastien Viale wrote:
> > hi,
> > A1. Personally, I prefer the suggestion: require.auto.generated.topic.names.
> > If everyone agrees, I will update the KIP accordingly.
> > MJS-1. Sophie, I reviewed the ticket and the sub-tasks, and everything 
> > seems clear to me. I must say that I sometimes felt confused with the 
> > constructors, but now everything is much clearer. Based on my experience 
> > and discussions with my teammates, I believe this is a good opportunity to 
> > simplify and clarify the implementation.
> > The question is whether this should be part of a separate KIP.
> > I don’t see any issue with integrating it into the current one, but if 
> > necessary, I’m happy to volunteer to open a new KIP.
> > In both cases, would it be okay for you if I validate it with you before 
> > publishing?
> > cheers !
> >
> >
> >
> > ________________________________
> > De : Sophie Blee-Goldman <sop...@responsive.dev>
> > Envoyé : vendredi 22 novembre 2024 01:33
> > À : dev@kafka.apache.org <dev@kafka.apache.org>
> > Objet : [EXT] Re: [DISCUSS] KIP-1111: Enforcing Explicit Naming for Kafka 
> > Streams Internal Topics
> >
> > Warning External sender Do not click on any links or open any attachments 
> > unless you trust the sender and know the content is safe.
> >
> > First off, thanks for the KIP! I think this is a great idea as it's super
> > easy to miss naming one thing and end up with a topology that isn't
> > upgradeable.
> >
> > A1. I actually had the same reaction as Almog to the name, I feel it's
> > slightly clearer as a positive instead of a negative, though I think the
> > rest of it is fine -- what about "require.auto.generated.topic.names"
> > instead?
> >
> > MJS-1. While I can see the advantage of having this stuff be
> > programmatically configured, I personally would prefer to keep this and
> > other topology-related configurations part of StreamsConfig. For one thing,
> > I think they would be much more discoverable as true configs. For another,
> > this still wouldn't solve the problem for configs that need to be passed in
> > up front, that is, to the Topology/StreamsBuilder constructor rather than
> > to StreamsBuilder#build. And imo whatever we do should work for all such
> > configs that are required for building a topology.
> >
> > Sebastien -- have you read through that ticket and its subtasks yet? I'd be
> > interested in your take on what I'm proposing there, as I think it could be
> > mutually beneficial. For context, I'm currently implementing a KIP that
> > introduces another such config in this category which needs to be passed
> > into the topology.
> >
> > This email was screened for spam and malicious content but exercise caution 
> > anyway.
> >
> >
> >
> > On Thu, Nov 21, 2024 at 12:42 PM Matthias J. Sax <mj...@apache.org> wrote:
> >
> >> Yes, the ticket is related. Also just saw it today.
> >>
> >> Did leave a comment on the ticket for Sophie to hopefully chime in on
> >> this KIP discussion.
> >>
> >>
> >> -Matthias
> >>
> >> On 11/21/24 9:06 AM, Sebastien Viale wrote:
> >>> Thank you very much for your reviews!
> >>>
> >>> A1 I will keep the disallow.auto.generated.topic.names configuration in
> >> the KIP for now while waiting for other suggestions.
> >>>
> >>> A2 I thought about making the implementation typesafe; it would, of
> >> course, complicate the implementation, but for me,
> >>> this feature is intended for new Kafka Streams applications or those
> >> after a reset. I agree with Matthias.
> >>> If a user wants to enable the check, they will simply need to avoid
> >> naming topics like the auto-generated ones.
> >>>
> >>> MJS-1 It might be a good idea to add your proposal to the KIP. I just
> >> wonder how to distinguish configurations that must be set
> >> 'programmatically' from others (e.g., topology optimization and this one)
> >>> I am open to any suggestions.
> >>>
> >>> Is this ticket related to what you are proposing
> >>> https://lists.apache.org/thread/dfgd2vcco7d1omjptfqp92kdocnlf3cq<https://lists.apache.org/thread/dfgd2vcco7d1omjptfqp92kdocnlf3cq><https://lists.apache.org/thread/dfgd2vcco7d1omjptfqp92kdocnlf3cq<https://lists.apache.org/thread/dfgd2vcco7d1omjptfqp92kdocnlf3cq>>
> >>>
> >>>
> >>> cheers,
> >>> Sébastien
> >>>
> >>>
> >>>
> >>> ________________________________
> >>> De : Matthias J. Sax <mj...@apache.org>
> >>> Envoyé : jeudi 21 novembre 2024 02:48
> >>> À : dev@kafka.apache.org <dev@kafka.apache.org>
> >>> Objet : [EXT] Re: [DISCUSS] KIP-1111: Enforcing Explicit Naming for
> >> Kafka Streams Internal Topics
> >>>
> >>> Warning External sender Do not click on any links or open any
> >> attachments unless you trust the sender and know the content is safe.
> >>>
> >>> Thanks for the KIP. Overall this does sound useful.
> >>>
> >>> About Almog's comments
> >>>
> >>> (A1) -- I don't think that `topics.internal.require.explicit.naming`
> >>> would be a good name, as we use `topic.` prefix for actual topic
> >>> configs. Thus, KS would pickup `internal.require.explicit.naming` and
> >>> try to apply is as topic config what would either crash or log a
> >>> spurious WARN what would be annoying?
> >>>
> >>> (A2) -- We would need to check the code if such a strategy would work as
> >>> expected? If users pass in a name for a previously un-named topic, we
> >>> might get an cascading index shift which might be undesired (and could
> >>> "break" processor level metrics)?
> >>>
> >>> But I am also not sure if we actually need a migration path? If might be
> >>> ok if this new feature only work for new deployments? (Or use can re-set
> >>> their application.)
> >>>
> >>>
> >>> MJS-1: I have another question though about the risk such a config
> >>> implies? Given that configs are often managed from "outside", one could
> >>> easily break an exiting application which does not have this config
> >>> enable, by enable the config. Of course, we already have similar config
> >>> which are equally dangerous; however, most people don't like that one
> >>> need to pass in the config into `StreamsBuilder.build(configs)` method
> >>> anyway, so maybe we could make a first step to get rid of this via this
> >> KIP?
> >>>
> >>> Thus, I am wondering if a config is actually the right way to go? Should
> >>> we instead make it feature of `StreamsBuilder` that one can enable
> >>> programmatically? Not 100% how we would do this, but maybe we could use
> >>> a builder pattern (to allow us to add similar thing in the future), and
> >>> deprecate the current constructor of `StreamsBuilder`?
> >>>
> >>> // new way to build a `StreamsBuilder`
> >>> StreamsBuilder builder = StreamsBuilder.build();
> >>>
> >>> // if we don't extend the scope of this KIP, we might also need:
> >>> StreamsBuilder builder = StreamsBuilder.build(Properties);
> >>>
> >>> Or we do extend this KIP and deprecate existing config (like topology
> >>> optimization) if favor of the new builder pattern:
> >>>
> >>>
> >>> // enable the new feature
> >>> StreamsBuilder builder = StreamsBuilder.requireExplicitNaming().build();
> >>>
> >>> // or
> >>> StreamsBuilder builder = StreamsBuilder.disableNameGeneration().build();
> >>>
> >>>
> >>> Thoughts? I am not 100% sure if this is a good idea or not, but thought
> >>> it cannot hurt to throw it out.
> >>>
> >>>
> >>> -Matthias
> >>>
> >>> This email was screened for spam and malicious content but exercise
> >> caution anyway.
> >>>
> >>>
> >>>
> >>>
> >>>
> >>> On 11/18/24 12:57 PM, Almog Gavra wrote:
> >>>> Hi Sebastien,
> >>>>
> >>>> Thanks for the KIP! In general, I'm a fan of giving users the tools they
> >>>> need to protect their organization so I'm supportive of this proposal. A
> >>>> few nits and comments:
> >>>>
> >>>> A1. [nit] consider 'topics.internal.require.explicit.naming' so that
> >> (a) we
> >>>> can group anything else we introduce for "topics.internal" with the same
> >>>> prefix and (b) it's not a double negative (don't disallow is the
> >> default,
> >>>> instead of don't require).
> >>>> A2. I think we can improve on the implementation by making it typesafe
> >>>> instead of checking whether the topic matches some pattern. I think a
> >>>> migration path for users that want to turn this flag on, but already
> >> have
> >>>> some auto-generated names, is to manually specify the auto-generated
> >> names
> >>>> for preexisting topics. This would enforce future topics naming, but not
> >>>> penalize them for having used auto generated names in the past. This
> >> makes
> >>>> the implementation a little more challenging, but I think it's
> >> worthwhile.
> >>>>
> >>>> Cheers,
> >>>> Almog
> >>>>
> >>>> On Fri, Nov 15, 2024 at 12:20 AM Sebastien Viale <
> >>>> sebastien.vi...@michelin.com> wrote:
> >>>>
> >>>>> Hi Everyone,
> >>>>>
> >>>>> I would like to start a discussion on KIP-1111: Enforcing Explicit
> >> Naming
> >>>>> for Kafka Streams Internal Topics<
> >>>>>
> >> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1111%3A+Enforcing+Explicit+Naming+for+Kafka+Streams+Internal+Topics<https://cwiki.apache.org/confluence/display/KAFKA/KIP-1111%3A+Enforcing+Explicit+Naming+for+Kafka+Streams+Internal+Topics><https://cwiki.apache.org/confluence/display/KAFKA/KIP-1111%3A+Enforcing+Explicit+Naming+for+Kafka+Streams+Internal+Topics<https://cwiki.apache.org/confluence/display/KAFKA/KIP-1111%3A+Enforcing+Explicit+Naming+for+Kafka+Streams+Internal+Topics>>
> >> <
> >> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1111%3A+Enforcing+Explicit+Naming+for+Kafka+Streams+Internal+Topics<https://cwiki.apache.org/confluence/display/KAFKA/KIP-1111%3A+Enforcing+Explicit+Naming+for+Kafka+Streams+Internal+Topics><https://cwiki.apache.org/confluence/display/KAFKA/KIP-1111%3A+Enforcing+Explicit+Naming+for+Kafka+Streams+Internal+Topics<https://cwiki.apache.org/confluence/display/KAFKA/KIP-1111%3A+Enforcing+Explicit+Naming+for+Kafka+Streams+Internal+Topics>>
> >>>
> >>>>>>
> >>>>> This proposal aims to add a configuration that prevents a Kafka Streams
> >>>>> application from starting if any of its internal topics have
> >> auto-generated
> >>>>> names.
> >>>>> Regards,
> >>>>>
> >>>>> Sébastien
> >>>>>
> >>>>
> >>

Re: [DISCUSS] KIP-1111: Enforcing Explicit Naming for Kafka Streams Internal Topics

Reply via email to