Re: [VOTE} KIP-733: change Kafka Streams default replication factor config

Sophie Blee-Goldman Fri, 23 Apr 2021 19:12:52 -0700

That's a good summary of the situation in my mind. Definitely +1 to
highlighting each of these
specifically in the KIP


On Fri, Apr 23, 2021 at 6:19 PM Ismael Juma <ism...@juma.me.uk> wrote:

> Thanks for the comments. Is the following fair?
>
> 1. It's good that upgrades are not affected under normal circumstances and
> it mitigates the compatibility impact greatly.
> 2. We're saying that the previous behavior of defaulting to 1 is dangerous
> and we don't want to fallback to that.
> 3. The main motivation for the previous default was to make the quickstart
> easy and we can achieve that with the new default.
> 4. This change raises the minimum Kafka version required by Kafka Streams
> with the default configuration. We should highlight that and update the
> docs accordingly.
> 5. Kafka 2.3.x clusters are not that uncommon, but we're ok with users
> having to set this config explicitly when deploying new applications in
> such clusters.
>
> If so, I'm fine with the KIP, but I think we should update it to cover the
> points above (some may already be covered).
>
> Ismael
>
> On Tue, Apr 20, 2021 at 12:59 PM Sophie Blee-Goldman
> <sop...@confluent.io.invalid> wrote:
>
> > FWIW if someone is deploying a new Streams application to production and
> > they don't
> > set this config, either because they are not reading the docs or they
> just
> > forgot, then imo
> > it's actually better for them to hit this exception and fail-fast if the
> > brokers don't support it.
> >
> > Currently, the default value is 1, which means if they forget to set this
> > config then they're
> > running a fragile application in production with potential for data loss.
> > Setting it to -1 will
> > mean in most cases the appropriate value for that environment is chosen.
> > And in the subset
> > of applications where the user has forgotten to set this config and are
> > running against older
> > brokers, better to fail-fast and alert the user to set this config then
> to
> > silently downgrade and
> > pick a value that might be completely wrong for that environment.
> >
> > I don't necessarily agree that 2.3 is so old that we can assume most
> > clusters will understand
> > this value, my experience has very much been otherwise. However I do
> think
> > it's likely to be
> > quite rare for this to impact any users on upgrade. For that to happen,
> > they would need to be
> > on older brokers and running an application with the default replication
> > value in their production
> > environment, AND they would need to have deleted their internal topics
> for
> > some reason prior
> > to the upgrade. Even if all of those things are true, then (1) if the
> > internal topics were deleted
> > then they have essentially reset their app, which means this case is
> really
> > more like starting up
> > a new app than upgrading a live one, and (2) they probably do not mean to
> > run with this setting
> > in production in the first place, and they have an opportunity to fix it
> > when they recreate the topics
> >
> > Just my 2 cents
> >
> >
> > On Mon, Apr 19, 2021 at 8:54 PM Matthias J. Sax <mj...@apache.org>
> wrote:
> >
> > > Thanks Ismael,
> > >
> > > just to clarify, any app that is running and is regularly upgraded is
> > > not affected: As pointed out in the KIP, if there is a running app that
> > > does not overwrite the default, and the app is upgraded, it won't be
> > > affected because the repartition/changelog topics already exist.
> > >
> > > Second, as the default value is currently 1, for production application
> > > I would assume that most people change the default anyway and thus
> won't
> > > be affected either. Thus, if one deploys a new production app, or
> resets
> > > a production app with reasonable config overwrites, they won't be
> > affected.
> > >
> > > Hence, the impact on production deployment should basically be
> > > non-existent, and for those rare cases were user would be affected it
> > > should be just a small hick-up because they would either deploy a new
> > > application (maybe annoying but no harm done; and actually, they should
> > > detect in staging) or would have reset an existing app for data
> > > reprocessing, ie, they did some manual cleanup and might hit this
> corner
> > > case only when re-deploying an previously stopped app, ie, an app that
> > > is currently offline anyway. Again, no harm done, just some delay to
> > > redeploy the app.
> > >
> > > Also, we point the change out in the upgrade notes. Obviously, not
> > > everybody might read them, but in the end there are multiple guards and
> > > if a user really has issues, it sounds like a very rare corner case.
> > >
> > > Last, while it might be possible to do so, the question is really,
> which
> > > value should we pick?
> > >
> > > Thus, overall I personally don't see any need to cover this case
> > > automatically. It might not be too hard to write code and handle this
> > > case, but I would rather keep it simple as it should really not impact
> > > users in any critical way.
> > >
> > >
> > > Thoughts?
> > >
> > >
> > > -Matthias
> > >
> > > On 4/18/21 8:00 PM, Ismael Juma wrote:
> > > > Is the following accurate? Do we have data that not many users would
> be
> > > > affected?
> > > >
> > > > "We also believe that 2.3.0 broker a sufficiently old such that not
> too
> > > > many user may be affected"
> > > >
> > > > I wonder if we should fallback to an actual value if the brokers are
> > > 2.3.x
> > > > or older.
> > > >
> > > > Ismael
> > > >
> > >
> >
>

Re: [VOTE} KIP-733: change Kafka Streams default replication factor config

Reply via email to