Thanks for the comments. Is the following fair?

1. It's good that upgrades are not affected under normal circumstances and
it mitigates the compatibility impact greatly.
2. We're saying that the previous behavior of defaulting to 1 is dangerous
and we don't want to fallback to that.
3. The main motivation for the previous default was to make the quickstart
easy and we can achieve that with the new default.
4. This change raises the minimum Kafka version required by Kafka Streams
with the default configuration. We should highlight that and update the
docs accordingly.
5. Kafka 2.3.x clusters are not that uncommon, but we're ok with users
having to set this config explicitly when deploying new applications in
such clusters.

If so, I'm fine with the KIP, but I think we should update it to cover the
points above (some may already be covered).

Ismael

On Tue, Apr 20, 2021 at 12:59 PM Sophie Blee-Goldman
<sop...@confluent.io.invalid> wrote:

> FWIW if someone is deploying a new Streams application to production and
> they don't
> set this config, either because they are not reading the docs or they just
> forgot, then imo
> it's actually better for them to hit this exception and fail-fast if the
> brokers don't support it.
>
> Currently, the default value is 1, which means if they forget to set this
> config then they're
> running a fragile application in production with potential for data loss.
> Setting it to -1 will
> mean in most cases the appropriate value for that environment is chosen.
> And in the subset
> of applications where the user has forgotten to set this config and are
> running against older
> brokers, better to fail-fast and alert the user to set this config then to
> silently downgrade and
> pick a value that might be completely wrong for that environment.
>
> I don't necessarily agree that 2.3 is so old that we can assume most
> clusters will understand
> this value, my experience has very much been otherwise. However I do think
> it's likely to be
> quite rare for this to impact any users on upgrade. For that to happen,
> they would need to be
> on older brokers and running an application with the default replication
> value in their production
> environment, AND they would need to have deleted their internal topics for
> some reason prior
> to the upgrade. Even if all of those things are true, then (1) if the
> internal topics were deleted
> then they have essentially reset their app, which means this case is really
> more like starting up
> a new app than upgrading a live one, and (2) they probably do not mean to
> run with this setting
> in production in the first place, and they have an opportunity to fix it
> when they recreate the topics
>
> Just my 2 cents
>
>
> On Mon, Apr 19, 2021 at 8:54 PM Matthias J. Sax <mj...@apache.org> wrote:
>
> > Thanks Ismael,
> >
> > just to clarify, any app that is running and is regularly upgraded is
> > not affected: As pointed out in the KIP, if there is a running app that
> > does not overwrite the default, and the app is upgraded, it won't be
> > affected because the repartition/changelog topics already exist.
> >
> > Second, as the default value is currently 1, for production application
> > I would assume that most people change the default anyway and thus won't
> > be affected either. Thus, if one deploys a new production app, or resets
> > a production app with reasonable config overwrites, they won't be
> affected.
> >
> > Hence, the impact on production deployment should basically be
> > non-existent, and for those rare cases were user would be affected it
> > should be just a small hick-up because they would either deploy a new
> > application (maybe annoying but no harm done; and actually, they should
> > detect in staging) or would have reset an existing app for data
> > reprocessing, ie, they did some manual cleanup and might hit this corner
> > case only when re-deploying an previously stopped app, ie, an app that
> > is currently offline anyway. Again, no harm done, just some delay to
> > redeploy the app.
> >
> > Also, we point the change out in the upgrade notes. Obviously, not
> > everybody might read them, but in the end there are multiple guards and
> > if a user really has issues, it sounds like a very rare corner case.
> >
> > Last, while it might be possible to do so, the question is really, which
> > value should we pick?
> >
> > Thus, overall I personally don't see any need to cover this case
> > automatically. It might not be too hard to write code and handle this
> > case, but I would rather keep it simple as it should really not impact
> > users in any critical way.
> >
> >
> > Thoughts?
> >
> >
> > -Matthias
> >
> > On 4/18/21 8:00 PM, Ismael Juma wrote:
> > > Is the following accurate? Do we have data that not many users would be
> > > affected?
> > >
> > > "We also believe that 2.3.0 broker a sufficiently old such that not too
> > > many user may be affected"
> > >
> > > I wonder if we should fallback to an actual value if the brokers are
> > 2.3.x
> > > or older.
> > >
> > > Ismael
> > >
> >
>

Reply via email to