Hi Divij, I've updated the KIP to change the default value for configs #11 and #12.
On Thu, Nov 21, 2024 at 6:25 AM Jun Rao <j...@confluent.io.invalid> wrote:

> Hi, Divij,
>
> Thanks for the updated KIP.
>
> Regarding num.recovery.threads.per.data.dir/num.network.threads/num.io.threads,
> I have a few concerns.
> 1. It seems that we can't use floor since this can result in a value of 0,
> which is invalid.
> 2. I am not sure that the new default value is actually better. For example,
> if the recovery is I/O bound, we actually want more threads than the number
> of cores.
> 3. It's now hard for a user to know the actual default value.
> 4. It may also be a surprise that the default value changes when a user adds
> a listener or a dir.
>
> Thanks,
>
> Jun
>
> On Wed, Nov 20, 2024 at 7:36 AM Kamal Chandraprakash
> <kamal.chandraprak...@gmail.com> wrote:
>
> > Hi Divij,
> >
> > We would like to change the default value for the below remote log
> > configurations:
> >
> > 1. remote.log.manager.copier.thread.pool.size -- default value from -1 to 10
> > 2. remote.log.manager.expiration.thread.pool.size -- default value from -1 to 10
> >
> > In KIP-950
> > <https://cwiki.apache.org/confluence/display/KAFKA/KIP-950%3A++Tiered+Storage+Disablement#KIP950:TieredStorageDisablement-Disablement-ZookeeperBackedCluster>,
> > it was proposed to split the thread pool that was used by the leader and
> > the follower into copy and expiration thread pools. With KIP-956
> > <https://cwiki.apache.org/confluence/display/KAFKA/KIP-956+Tiered+Storage+Quotas>,
> > we made this change in v3.9.
> > The default value for the copier and expiration thread pools was set to -1
> > to derive the value from the previous thread pool for backward compatibility.
> > See PR#17499 <https://github.com/apache/kafka/pull/17499/#discussion_r1804119567>.
> >
> > Now, we have three thread pools:
> >
> > 1. remote.log.manager.thread.pool.size -- used by the replica to fetch the
> > highest-uploaded remote offset for the follower partitions; it won't be
> > deprecated as proposed.
> > 2. remote.log.manager.copier.thread.pool.size -- used by the leader to
> > upload log segments to remote storage.
> > 3. remote.log.manager.expiration.thread.pool.size -- used by the leader to
> > delete expired log segments from remote storage.
> >
> > A broker acts as the leader for some partitions and as the follower for
> > others, so deriving the value from `remote.log.manager.thread.pool.size`
> > for the copier and expiration thread pools looks incorrect to me. The
> > follower's task is lightweight -- it only fetches the highest-uploaded
> > remote offset -- whereas the leader's task of copying and deleting segments
> > is heavy. Moreover, the change didn't ensure backward compatibility: if a
> > user was running with 10 threads in v3.8, then in v3.9 the server will
> > start 30 threads.
> >
> > So, I propose to change the default value of the copier and expiration
> > thread pools from -1 to 10. This will also ensure smooth validation when
> > making those configs dynamic.
> >
> > Thanks,
> > Kamal
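For illustration only, the proposed fixed defaults could be declared roughly as
below. This is a sketch, not the actual broker ConfigDef: the atLeast(1) lower
bound, importance levels, and doc strings are placeholder assumptions.

    import org.apache.kafka.common.config.ConfigDef;

    public class RemoteLogThreadPoolDefaults {
        public static void main(String[] args) {
            // Sketch: fixed default of 10 for both pools, replacing the -1
            // "derive from remote.log.manager.thread.pool.size" behaviour.
            ConfigDef def = new ConfigDef()
                .define("remote.log.manager.copier.thread.pool.size",
                        ConfigDef.Type.INT, 10, ConfigDef.Range.atLeast(1),
                        ConfigDef.Importance.MEDIUM,
                        "Threads used by the leader to copy log segments to remote storage.")
                .define("remote.log.manager.expiration.thread.pool.size",
                        ConfigDef.Type.INT, 10, ConfigDef.Range.atLeast(1),
                        ConfigDef.Importance.MEDIUM,
                        "Threads used by the leader to delete expired log segments from remote storage.");
            System.out.println(def.names());
        }
    }

With fixed defaults, a broker that ran 10 legacy threads keeps 10 copier and 10
expiration threads regardless of remote.log.manager.thread.pool.size, instead of
having the two new pools silently inherit the legacy value.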
> > On Wed, Nov 20, 2024 at 5:41 PM Divij Vaidya <divijvaidy...@gmail.com> wrote:
> >
> > > Thank you everyone for your feedback.
> > >
> > > The new revision of the KIP is ready for review. Notable changes since the
> > > last revision:
> > > 1. Removed the change to log.message.timestamp.before.max.ms
> > > 2. Modified the default for num.recovery.threads.per.data.dir
> > > 3. Added new proposals for num.io.threads and num.network.threads
> > > 4. Improved the justification for the changes and made the motivation more
> > > explicit
> > >
> > > *Federico*
> > >
> > > Good point. I have updated the compatibility section to include information
> > > on how the new values will be handled during the rolling upgrade to 4.0.
> > >
> > > *Jun*
> > >
> > > Thank you for pointing out the use case. It correlates with my experience
> > > that most of the problems arise from users accidentally setting a timestamp
> > > in ns instead of ms and hence ending up with a future time. I have reverted
> > > the changes to log.message.timestamp.after.max.ms.
> > >
> > > *Tommi*
> > >
> > > You have a valid point about the friction this could cause during an
> > > upgrade. This is a risk we are accepting with this KIP. The risk is
> > > acceptable because we expect due diligence from users while transitioning
> > > to a new major version, and also because there aren't legitimate use cases
> > > for setting these values to a threshold where they will cause system
> > > instability.
> > >
> > > I have added the failure mode for invalid values to the compatibility
> > > section.
> > >
> > > About your suggestion to apply the new changes but not fail the upgrade:
> > > I am a bit wary about silently modifying configuration that a user has
> > > explicitly set. The strategy also won't work for cases where the user uses
> > > a third-party tool to reset configurations. Failing fast on invalid
> > > configuration is the best option. To improve the user experience, I have
> > > added a suggestion in the KIP to add an "upgrade validation tool" which a
> > > user can run to verify compatibility before the actual upgrade. It won't be
> > > part of this KIP, but it is something we might want to do to ease migration
> > > to 4.0. I will float the idea around in the community and see if we have
> > > volunteers for it.
> > >
> > > *Greg*
> > >
> > > Thanks for raising this concern. We do indeed intentionally use small
> > > values to "force" a segment roll sometimes. However, I believe there are
> > > alternate ways to accomplish that in a test environment. As an example, if
> > > I wanted to create small segments for a test scenario, I would either write
> > > an integration test where I can call roll explicitly, OR send a message
> > > with a create time in the future (the roll is based on the timestamp of the
> > > last appended message), OR just write more data (1 MB minimum today).
> > >
> > > But to your larger point, I agree that there is value in providing two
> > > different guardrails for users. First, a "user-friendly" guardrail: an
> > > opinionated guardrail meant to protect the cluster from inadvertent
> > > sub-optimal and incorrect configuration. Users who do not wish to spend
> > > time learning Kafka's intricacies and the effect of its myriad
> > > configurations can choose this option. Second, an "advanced user"
> > > guardrail, which allows users to push the cluster to its boundaries with
> > > full knowledge of the repercussions and fallout that may ensue.
> > >
> > > As a project trusted by users across this broad spectrum of expertise, I do
> > > believe we need to provide user-friendly mechanisms for different segments
> > > of users, but that is something I am unwilling to take up in this KIP. If
> > > you feel passionate about addressing this feature gap in Apache Kafka,
> > > please feel free to start a discussion on the topic and I promise to be an
> > > active participant in it. The outcome could be as simple as documentation
> > > where we provide ideal configurations for different workloads.
> > >
> > > *Luke*
> > >
> > > I agree with your suggestion to set the number of threads based on the
> > > machine's cores. In fact, I would promote a similar solution for network
> > > threads and request handler threads as well. (As an example, I have seen
> > > users setting large values for network threads compared to request handler
> > > threads, resulting in accepting more requests than the brokers can handle
> > > and leading to degradation. A constraint that network threads <= request
> > > handler threads could prevent such a scenario.)
> > >
> > > Earlier, I didn't add these since they are not dead-obvious changes
> > > compared to the rest of the changes in the KIP, but I have now added them.
> > > They are worth discussing.
> > >
> > > --
> > > Divij Vaidya
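As a sketch of the kind of cross-config constraint mentioned for network and
request handler threads (illustrative only, not the actual KafkaConfig
validation code):

    import org.apache.kafka.common.config.ConfigException;

    public class ThreadCountConstraint {

        // Illustrative startup-time check: reject configurations where network
        // threads outnumber request handler (I/O) threads.
        static void validate(int numNetworkThreads, int numIoThreads) {
            if (numNetworkThreads > numIoThreads) {
                throw new ConfigException("num.network.threads (" + numNetworkThreads
                        + ") must not exceed num.io.threads (" + numIoThreads + ")");
            }
        }

        public static void main(String[] args) {
            validate(3, 8); // accepted: fewer network threads than I/O threads
            try {
                validate(16, 8); // more network threads than I/O threads
            } catch (ConfigException e) {
                System.out.println(e.getMessage()); // rejected, fail fast at startup
            }
        }
    }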
> > > On Wed, Nov 20, 2024 at 11:01 AM Luke Chen <show...@gmail.com> wrote:
> > >
> > > > Hi Divij,
> > > >
> > > > One comment:
> > > > For the default value of "num.recovery.threads.per.data.dir", we change
> > > > from 1 to 2.
> > > > Could we change the value based on the CPU core number?
> > > > Ex: default value = CPU core num / num of log dirs.
> > > >
> > > > WDYT?
> > > >
> > > > Luke
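For concreteness, a core-count-derived default along these lines would have to
be clamped to at least 1, otherwise integer division can produce the invalid
value 0 that Jun pointed out. A rough sketch with a hypothetical log-dir count:

    public class RecoveryThreadDefault {
        public static void main(String[] args) {
            int cores = Runtime.getRuntime().availableProcessors();
            int numLogDirs = 12; // hypothetical broker with more log dirs than cores

            int naive = cores / numLogDirs;                // can be 0, which is invalid
            int clamped = Math.max(1, cores / numLogDirs); // always at least 1
            System.out.println("naive=" + naive + ", clamped=" + clamped);
        }
    }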
> > > > On Wed, Nov 20, 2024 at 5:53 PM Tommi Vainikainen
> > > > <tvain...@aiven.io.invalid> wrote:
> > > >
> > > > > Hi,
> > > > >
> > > > > This proposal looks good to me with the new defaults and constraints.
> > > > >
> > > > > About breaking backward compatibility, I have a question on how you see
> > > > > it best handled if someone upgrades a Kafka cluster from 3.9 to 4.0 with,
> > > > > let's say, one topic having segment.bytes=500 kB (the new minimum is 1 MB).
> > > > > Does it mean that the Kafka process fails to start with the
> > > > > ConfigException, or when is the ConfigException thrown? Would it be
> > > > > better to simply start behaving as if the value were at least the new
> > > > > minimum regardless of the topic setting, and fix the config on the fly
> > > > > instead? That would be much more convenient for most Kafka users in my
> > > > > opinion.
> > > > >
> > > > > On Tue, Nov 19, 2024 at 12:55 PM Divij Vaidya <divijvaidy...@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > KT1 - That is right. We will throw a ConfigException. That is why this
> > > > > > change is considered backward incompatible. To be honest, given the
> > > > > > nature of the suggested changes, I don't see any valid use case where a
> > > > > > user may have a value which will be invalid after the new constraints.
> > > > > >
> > > > > > --
> > > > > > Divij Vaidya
> > > > > >
> > > > > > On Tue, Nov 19, 2024 at 2:21 AM Kirk True <k...@kirktrue.pro> wrote:
> > > > > >
> > > > > > > Hi Divij,
> > > > > > >
> > > > > > > Thanks for the KIP!
> > > > > > >
> > > > > > > My only question:
> > > > > > >
> > > > > > > KT1. In the case where we change the constraints so that a user's
> > > > > > > previously valid configuration is now invalid, do we do anything
> > > > > > > other than throw a ConfigException?
> > > > > > >
> > > > > > > Thanks,
> > > > > > > Kirk
> > > > > > >
> > > > > > > On Mon, Nov 18, 2024, at 2:13 AM, Divij Vaidya wrote:
> > > > > > > > Hey folks,
> > > > > > > >
> > > > > > > > With 4.0, we have an opportunity to reset the default values and
> > > > > > > > add constraints in the configurations based on our learnings
> > > > > > > > since 3.0.
> > > > > > > >
> > > > > > > > Here's a KIP which modifies the defaults for some properties and
> > > > > > > > modifies the constraints for a few others.
> > > > > > > >
> > > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-1030%3A+Change+constraints+and+default+values+for+various+configurations
> > > > > > > >
> > > > > > > > Looking forward to your feedback.
> > > > > > > >
> > > > > > > > (Previous discussion thread on this topic -
> > > > > > > > https://lists.apache.org/thread/3dx9mdmsqf8pko9xdmhks80k96g650zp)
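To make the fail-fast behaviour discussed in this thread concrete (Tommi's
segment.bytes example and KT1), here is a rough sketch of how a minimum
constraint rejects a previously valid value at config-parse time. The ConfigDef
declaration below is illustrative rather than the actual broker/topic config
definition; the 1 MiB lower bound comes from the proposal, while the importance
and doc string are placeholders.

    import java.util.Map;
    import org.apache.kafka.common.config.ConfigDef;
    import org.apache.kafka.common.config.ConfigException;

    public class SegmentBytesMinimum {
        public static void main(String[] args) {
            ConfigDef def = new ConfigDef()
                .define("segment.bytes", ConfigDef.Type.INT,
                        1024 * 1024 * 1024,                   // current 1 GiB default
                        ConfigDef.Range.atLeast(1024 * 1024), // proposed 1 MiB minimum
                        ConfigDef.Importance.HIGH,
                        "The maximum size of a single log segment file.");
            try {
                def.parse(Map.of("segment.bytes", "500000")); // 500 kB, below the minimum
            } catch (ConfigException e) {
                System.out.println(e.getMessage()); // fails fast rather than silently adjusting
            }
        }
    }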