Thank you everyone for your feedback.

The new revision of KIP is ready for review. Notable changes since last
revision:
1. Removed change to log.message.timestamp.before.max.ms
2. Modified default for num.recovery.threads.per.data.dir
3. Added new proposals for num.io.threads and num.network.threads
4. Improved justification for changes and made motivation more explicit


*Federico*

Good point. I have updated the compatibility section to include information
on how the new values will be handled during the rolling upgrade to 4.0.

*Jun*

Thank you for pointing out the use case. It correlated with my experience
where most of the problems arise from users accidentally setting a
timestamp in ns instead of ms, and hence, leading to a future time. I have
reverted changes to log.message.timestamp.after.max.ms

*Tommi*

You have a valid point about the friction this could cause during upgrade.
This is a risk we are accepting with this KIP. The risk is acceptable
because we expect due diligence from the users while transitioning to a new
major version and also because, there aren't legitimate use cases to set
these values to a threshold where they will cause system instability.

I have added the failure mode for invalid values in the compatibility
section.

About your suggestion to apply new changes but not fail upgrade, I am a bit
wary about silently modifying configuration that user has explicitly set.
The strategy also won't work for cases where the user uses a 3P tool to
reset configurations. Failing fast on invalid configuration is the best
option. To improve user experience I have added a suggestion in the KIP to
add a "upgrade validation tool" which a user can run to verify
compatibility before actual upgrade. It won't be part of this KIP but
something we might want to do to ease migration to 4.0. I will float the
idea around in the community and see if we have volunteers for this.

*Greg*

Thanks for raising this concern. We indeed intentionally use small values
to "force" a segment roll sometimes. However, I believe that there are
alternate ways to accomplish that in a test environment. As an example, if
I wanted to create small segments for a test scenario, I would either write
an integration test where I can call roll explicitly OR send a message with
create time in future (roll is based on timestamp of last appended message)
OR just write more data (1MB minimum today).

But to your larger point, I agree that there is value in providing two
different guardrails for the users. First, a "user-friendly" guardrails
which will be an opinionated guardrail meant to protect the cluster from
inadvertent sub-optimal and incorrect configuration. Users who do not wish
to spend time in learning Kafka intricacies and the effect of myriad
configurations can choose this option. Second, a "advanced user" guardrail
which will allow users to push the cluster to it's boundaries with full
knowledge of the repercussions and fallout that may ensue.

As a project, which is trusted by users across this broad spectrum of
expertise, I do believe that we need to provide user-friendly mechanisms
for different segments for users, but that is something I am unwilling to
take up in this KIP. If you feel passionate about addressing this feature
gap in Apache Kafka, please feel free to start a discussion on this topic
and I will promise to be an active participant in it. The outcome could be
as simple as a documentation where we provide ideal configuration for
different workloads.

*Luke*

I agree with your suggestion to set the number of threads based on machine
cores. In fact, I would promote a similar solution for network threads and
request handler threads as well (as an example, I have seen users setting
large values for network threads compared to request handler threads,
resulting in accepting more requests that brokers can handle, leading to
degradation. A constraint that network threads <= request handler threads
could prevent such scenario).

Earlier, I didn't add these since these are not dead-obvious changes
compared to the rest of the changes in the KIP, but I have now added them.
It's worth discussing them.

--
Divij Vaidya



On Wed, Nov 20, 2024 at 11:01 AM Luke Chen <show...@gmail.com> wrote:

> Hi Divij,
>
> One comment:
> For the default value of "num.recovery.threads.per.data.dir", we change
> from 1 to 2.
> Could we change the value based on the CPU core number?
> Ex: default value = CPU core num / num of log dirs.
>
> WDYT?
>
> Luke
>
>
> On Wed, Nov 20, 2024 at 5:53 PM Tommi Vainikainen
> <tvain...@aiven.io.invalid>
> wrote:
>
> > Hi,
> >
> > This proposal looks good to me with new defaults and constraints.
> >
> > About breaking backward compatibility I've question on how you see it
> best
> > handled in case someone upgrades Kafka cluster from 3.9 to 4.0 with let's
> > say one topic having segment.bytes=500 kb (new min is 1 MB). Does it mean
> > that the Kafka process fails to start with the ConfigException, or when
> is
> > the ConfigException thrown? Would it be better to simply start behaving
> > like it is at least a new minimum regardless of topic setting, and then
> on
> > the fly fix the config instead? That would be much more convenient for
> most
> > Kafka users in my opinion.
> >
> > On Tue, Nov 19, 2024 at 12:55 PM Divij Vaidya <divijvaidy...@gmail.com>
> > wrote:
> >
> > > KT1 - That is right. We will throw a ConfigException. That is why this
> > > change is considered backward incompatible. To be honest, given the
> > nature
> > > of suggested changes, I don't see any valid use case on why a user may
> > have
> > > a value which will be invalid after the new constraints.
> > >
> > >
> > > --
> > > Divij Vaidya
> > >
> > >
> > >
> > > On Tue, Nov 19, 2024 at 2:21 AM Kirk True <k...@kirktrue.pro> wrote:
> > >
> > > > Hi Divij,
> > > >
> > > > Thanks for the KIP!
> > > >
> > > > My only question:
> > > >
> > > > KT1. In the case where we change the constraints so that a user's
> > > > previously valid configuration is now invalid, do we do anything
> other
> > > than
> > > > throw a ConfigException?
> > > >
> > > > Thanks,
> > > > Kirk
> > > >
> > > > On Mon, Nov 18, 2024, at 2:13 AM, Divij Vaidya wrote:
> > > > > Hey folks
> > > > >
> > > > > With 4.0, we have an opportunity to reset the default values and
> add
> > > > > constraints in the configurations based on our learnings since 3.0.
> > > > >
> > > > > Here's a KIP which modifies defaults for some properties and
> modifies
> > > the
> > > > > constraints for a few others.
> > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1030%3A+Change+constraints+and+default+values+for+various+configurations
> > > > >
> > > > >
> > > > > Looking forward for your feedback.
> > > > >
> > > > > (Previous discussion thread on this topic -
> > > > > https://lists.apache.org/thread/3dx9mdmsqf8pko9xdmhks80k96g650zp )
> > > > >
> > > >
> > >
> >
>

Reply via email to