Thanks for the KIP, Bruno. (1) To add my 2ct to the previous discussion: While I am a big advocate for simplicity (what what always a strength of Kafka Streams), I think that making `init()` mandatory would be ok, because for people that get started, it is just one more line of code. In contrast to the example from Sophie about input topics: it's more work to actually create input topic and thus, I don't think it would be too much of a problem.
Nevertheless, after 4 years of having apps in production with Kafka Streams, it also seems not to be too big of a risk for many use cases and thus using auto-topic creating seems to be a working solution for many use cases. Thus, I tend to think we should keep both long term. (2) About `MissingInternalTopicException`: should we add a getter method to expose which topics are missing and/or have incorrect number of partitions (based on Guozhang's comment on the voting thread)? (3) Should we allow people to change configs of those topic if the already exist using `init()`? Cf: - https://issues.apache.org/jira/browse/KAFKA-7803 - https://issues.apache.org/jira/browse/KAFKA-7591 This might be out-of-scope for the KIP, but I thought I bring it up anyway. Thoughts? (4) I agree that adding an overload that takes a timeout is a good idea. (5) Should `init()` throw an exception if _any_ internal topic exists already? You propose that it would not throw if _all_ internal topic exist but I think we should throw for this case, too? Otherwise, user could write code to calls `init()` unconditionally, and thus the guard we try to establish would be void (as it could happen that _all_ internal topics got deleted, especially if there is only a single one; and this case would not be detected). Nit: can we document the exception type in the public API change code snippet? (5b) The comment on `initParameters()` is confusing: > // specifies to disable all setup of internal topics It sound like if internal topic are not created at all. (5c) Why do we need a getter on `InitParameters`? (6) Migration: > Migrating from the current behavior to INTERNAL_TOPIC_SETUP set to > MANUAL_SETUP can be done without any specific migration plan. Users need to > set INTERNAL_TOPIC_SETUP to MANUAL_SETUP. Add: "and need to change their code to call `init()` accordingly." -Matthias On 12/11/20 6:51 AM, John Roesler wrote: > Thanks Bruno, > > It’s your decision. I was thinking that it’s appropriate to mention such > discussed and rejected/deferred items under “Rejected Alternatives” so that > future generations will know whether we even thought of it and why we didn’t > do it right now. > > Thanks, > John > > On Fri, Dec 11, 2020, at 04:00, Bruno Cadonna wrote: >> Thanks Sophie for the feedback! >> >> I agree with you on not logging a warning for a decision, we haven't >> taken and on logging a warning for possible data loss with automatic >> initialization. I would say this is a implementation detail that we can >> further discuss on the PR. >> >> Regarding the parameter class, I followed the grammar here: >> >> https://cwiki.apache.org/confluence/x/Lw6dC >> >> We've already used it for other KIPs and I thought we agreed to use it >> for new parameter classes. >> >> >> John, >> >> I am in favor of not mentioning a possible future decision about >> dismissing automatic initialization in this KIP. I think, we should >> discuss and decide on the dismissal with a separate KIP. >> >> Best, >> Bruno >> >> >> On 10.12.20 20:48, Sophie Blee-Goldman wrote: >>> Hey John, >>> >>> I think we should avoid logging a warning that implies we've committed >>> to changing a default unless we've absolutely committed to it, which it >>> sounds like we have not (fwiw I'm also on the fence, but leaning towards >>> leaving it automatic -- just think of how many people already forget to >>> create their source topics before startup and struggle with that). This is >>> probably part of a larger discussion on whether to default to OOTB-friendly >>> or production-ready settings, which should probably be considered >>> holistically >>> rather than on a case-by-case basis. >>> >>> That said, I'm totally down with logging a warning that data loss is >>> possible >>> when using automatic initialization, if that's what you meant. >>> >>> Bruno, >>> >>> Thanks for the KIP, it looks good in general but I'm wondering if we can >>> make >>> the InitParameters API a bit more aligned to the config/parameter classes >>> used >>> throughout Streams (eg Materialized). >>> >>> For example something like >>> >>> public class Initialized { >>> >>> public static withSetupInternalTopicsIfIncompleteEnabled(); >>> public static withSetupInternalTopicsIfIncompleteDisabled(); >>> // we also don't tend to have getters for these kind of classes, >>> but maybe we should start :) >>> } >>> >>> >>> On Thu, Dec 10, 2020 at 9:33 AM John Roesler <vvcep...@apache.org> wrote: >>> >>>> Thanks, Bruno, >>>> >>>> I think my feelings are the same as yours. It seems like >>>> either call is just a matter of some forward looking >>>> statement in the KIP and maybe a warning log if we're >>>> leaning toward changing the default in the future. >>>> >>>> I'm happy with whatever you prefer. >>>> >>>> Thanks again, >>>> -John >>>> >>>> On Thu, 2020-12-10 at 16:37 +0100, Bruno Cadonna wrote: >>>>> Hi John, >>>>> >>>>> Thank you for the feedback! >>>>> >>>>> I am undecided, because while manual init only makes Kafka Streams safer >>>>> regarding data loss, it makes first toy apps with Kafka Streams a little >>>>> bit more complicated. I am a bit more inclined to manual init only, >>>> though. >>>>> >>>>> Best, >>>>> Bruno >>>>> >>>>> >>>>> On 10.12.20 15:20, John Roesler wrote: >>>>>> Hi Bruno, >>>>>> >>>>>> Thanks for the KIP! >>>>>> >>>>>> This seems like a nice data integrity improvement, and the KIP looks >>>> good to me. >>>>>> >>>>>> I’m wondering if we should plan to transition to manual init only in >>>> the future. I.e. maybe we log a warning, then later on we switch the >>>> default config to manual, and then ultimately drop the config completely. >>>> What do you think? >>>>>> >>>>>> Thanks, >>>>>> John >>>>>> >>>>>> On Thu, Dec 10, 2020, at 04:36, Bruno Cadonna wrote: >>>>>>> Hi, >>>>>>> >>>>>>> I'd like to start the discussion on KIP-698 that proposes an explicit >>>>>>> user initialization of broker-side state for Kafka Streams instead of >>>>>>> letting Kafka Streams setting up the broker-side state automatically >>>>>>> during rebalance. Such an explicit initialization avoids possible >>>> data >>>>>>> loss issues due to automatic initialization. >>>>>>> >>>>>>> https://cwiki.apache.org/confluence/x/7CnZCQ >>>>>>> >>>>>>> Best, >>>>>>> Bruno >>>>>>> >>>> >>>> >>>> >>> >>