Hi Divij, I've updated the KIP to change the default value for configs #11 and #12.
On Thu, Nov 21, 2024 at 6:25 AM Jun Rao <j...@confluent.io.invalid> wrote:

> Hi, Divij,
>
> Thanks for the updated KIP.
>
> Regarding num.recovery.threads.per.data.dir/num.network.threads/num.io.threads,
> I have a few concerns.
> 1. It seems that we can't use floor since this can result in a value of 0,
> which is invalid.
> 2. I am not sure that the new default value is actually better. For example,
> if the recovery is I/O bound, we actually want more threads than the number
> of cores.
> 3. It's now hard for a user to know the actual default value.
> 4. It may also be a surprise that the default value changes when a user adds
> a listener or a dir.
>
> Thanks,
>
> Jun
>
> On Wed, Nov 20, 2024 at 7:36 AM Kamal Chandraprakash
> <kamal.chandraprak...@gmail.com> wrote:
>
> > Hi Divij,
> >
> > We would like to change the default value for the below remote log
> > configurations:
> >
> > 1. remote.log.manager.copier.thread.pool.size -- default value from -1 to 10
> > 2. remote.log.manager.expiration.thread.pool.size -- default value from -1 to 10
> >
> > In KIP-950
> > <https://cwiki.apache.org/confluence/display/KAFKA/KIP-950%3A++Tiered+Storage+Disablement#KIP950:TieredStorageDisablement-Disablement-ZookeeperBackedCluster>,
> > it was proposed to split the thread pool that was used by the leader and
> > the follower into copy and expiration thread pools. With KIP-956
> > <https://cwiki.apache.org/confluence/display/KAFKA/KIP-956+Tiered+Storage+Quotas>,
> > we made this change in v3.9.
> > The default value for the copier and expiration thread pools was set to -1
> > to derive the value from the previous thread pool for backward compatibility.
> > See PR#17499 <https://github.com/apache/kafka/pull/17499/#discussion_r1804119567>.
> >
> > Now, we have three thread pools:
> >
> > 1. remote.log.manager.thread.pool.size -- used by the replica to fetch the
> > highest-uploaded remote offset for the follower partitions; it won't be
> > deprecated as proposed.
> > 2. remote.log.manager.copier.thread.pool.size -- used by the leader to
> > upload log segments to remote storage.
> > 3. remote.log.manager.expiration.thread.pool.size -- used by the leader to
> > delete expired log segments from remote storage.
> >
> > A broker acts as the leader for some partitions and as the follower for
> > others, so deriving the value from `remote.log.manager.thread.pool.size`
> > for the copier and expiration thread pools looks incorrect to me. The
> > follower's task is lightweight -- it only fetches the highest-uploaded
> > remote offset -- whereas the leader's task of copying and deleting segments
> > is heavy. Moreover, the change didn't ensure backward compatibility: if a
> > user was running with 10 threads in v3.8, then in v3.9 the server will
> > start 30 threads.
> >
> > So, I propose to change the default value of the copier and expiration
> > thread pools from -1 to 10. This will also ensure smooth validation when
> > making those configs dynamic.
> >
> > Thanks,
> > Kamal
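For illustration only, the proposed fixed defaults could be declared roughly as
below. This is a sketch, not the actual broker ConfigDef: the atLeast(1) lower
bound, importance levels, and doc strings are placeholder assumptions.

    import org.apache.kafka.common.config.ConfigDef;

    public class RemoteLogThreadPoolDefaults {
        public static void main(String[] args) {
            // Sketch: fixed default of 10 for both pools, replacing the -1
            // "derive from remote.log.manager.thread.pool.size" behaviour.
            ConfigDef def = new ConfigDef()
                .define("remote.log.manager.copier.thread.pool.size",
                        ConfigDef.Type.INT, 10, ConfigDef.Range.atLeast(1),
                        ConfigDef.Importance.MEDIUM,
                        "Threads used by the leader to copy log segments to remote storage.")
                .define("remote.log.manager.expiration.thread.pool.size",
                        ConfigDef.Type.INT, 10, ConfigDef.Range.atLeast(1),
                        ConfigDef.Importance.MEDIUM,
                        "Threads used by the leader to delete expired log segments from remote storage.");
            System.out.println(def.names());
        }
    }

With fixed defaults, a broker that ran 10 legacy threads keeps 10 copier and 10
expiration threads regardless of remote.log.manager.thread.pool.size, instead of
having the two new pools silently inherit the legacy value.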
> > On Wed, Nov 20, 2024 at 5:41 PM Divij Vaidya <divijvaidy...@gmail.com> wrote:
> >
> > > Thank you everyone for your feedback.
> > >
> > > The new revision of the KIP is ready for review. Notable changes since the
> > > last revision:
> > > 1. Removed the change to log.message.timestamp.before.max.ms
> > > 2. Modified the default for num.recovery.threads.per.data.dir
> > > 3. Added new proposals for num.io.threads and num.network.threads
> > > 4. Improved the justification for the changes and made the motivation more
> > > explicit
> > >
> > > *Federico*
> > >
> > > Good point. I have updated the compatibility section to include information
> > > on how the new values will be handled during the rolling upgrade to 4.0.
> > >
> > > *Jun*
> > >
> > > Thank you for pointing out the use case. It correlates with my experience
> > > that most of the problems arise from users accidentally setting a timestamp
> > > in ns instead of ms and hence ending up with a future time. I have reverted
> > > the changes to log.message.timestamp.after.max.ms.
> > >
> > > *Tommi*
> > >
> > > You have a valid point about the friction this could cause during an
> > > upgrade. This is a risk we are accepting with this KIP. The risk is
> > > acceptable because we expect due diligence from users while transitioning
> > > to a new major version, and also because there aren't legitimate use cases
> > > for setting these values to a threshold where they will cause system
> > > instability.
> > >
> > > I have added the failure mode for invalid values to the compatibility
> > > section.
> > >
> > > About your suggestion to apply the new changes but not fail the upgrade:
> > > I am a bit wary about silently modifying configuration that a user has
> > > explicitly set. The strategy also won't work for cases where the user uses
> > > a third-party tool to reset configurations. Failing fast on invalid
> > > configuration is the best option. To improve the user experience, I have
> > > added a suggestion in the KIP to add an "upgrade validation tool" which a
> > > user can run to verify compatibility before the actual upgrade. It won't be
> > > part of this KIP, but it is something we might want to do to ease migration
> > > to 4.0. I will float the idea around in the community and see if we have
> > > volunteers for it.
> > >
> > > *Greg*
> > >
> > > Thanks for raising this concern. We do indeed intentionally use small
> > > values to "force" a segment roll sometimes. However, I believe there are
> > > alternate ways to accomplish that in a test environment. As an example, if
> > > I wanted to create small segments for a test scenario, I would either write
> > > an integration test where I can call roll explicitly, OR send a message
> > > with a create time in the future (the roll is based on the timestamp of the
> > > last appended message), OR just write more data (1 MB minimum today).
> > >
> > > But to your larger point, I agree that there is value in providing two
> > > different guardrails for users. First, a "user-friendly" guardrail: an
> > > opinionated guardrail meant to protect the cluster from inadvertent
> > > sub-optimal and incorrect configuration. Users who do not wish to spend
> > > time learning Kafka's intricacies and the effect of its myriad
> > > configurations can choose this option. Second, an "advanced user"
> > > guardrail, which allows users to push the cluster to its boundaries with
> > > full knowledge of the repercussions and fallout that may ensue.
> > >
> > > As a project trusted by users across this broad spectrum of expertise, I do
> > > believe we need to provide user-friendly mechanisms for different segments
> > > of users, but that is something I am unwilling to take up in this KIP. If
> > > you feel passionate about addressing this feature gap in Apache Kafka,
> > > please feel free to start a discussion on the topic and I promise to be an
> > > active participant in it. The outcome could be as simple as documentation
> > > where we provide ideal configurations for different workloads.
> > >
> > > *Luke*
> > >
> > > I agree with your suggestion to set the number of threads based on the
> > > machine's cores. In fact, I would promote a similar solution for network
> > > threads and request handler threads as well. (As an example, I have seen
> > > users setting large values for network threads compared to request handler
> > > threads, resulting in accepting more requests than the brokers can handle
> > > and leading to degradation. A constraint that network threads <= request
> > > handler threads could prevent such a scenario.)
> > >
> > > Earlier, I didn't add these since they are not dead-obvious changes
> > > compared to the rest of the changes in the KIP, but I have now added them.
> > > They are worth discussing.
> > >
> > > --
> > > Divij Vaidya
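As a sketch of the kind of cross-config constraint mentioned for network and
request handler threads (illustrative only, not the actual KafkaConfig
validation code):

    import org.apache.kafka.common.config.ConfigException;

    public class ThreadCountConstraint {

        // Illustrative startup-time check: reject configurations where network
        // threads outnumber request handler (I/O) threads.
        static void validate(int numNetworkThreads, int numIoThreads) {
            if (numNetworkThreads > numIoThreads) {
                throw new ConfigException("num.network.threads (" + numNetworkThreads
                        + ") must not exceed num.io.threads (" + numIoThreads + ")");
            }
        }

        public static void main(String[] args) {
            validate(3, 8); // accepted: fewer network threads than I/O threads
            try {
                validate(16, 8); // more network threads than I/O threads
            } catch (ConfigException e) {
                System.out.println(e.getMessage()); // rejected, fail fast at startup
            }
        }
    }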
> > > On Wed, Nov 20, 2024 at 11:01 AM Luke Chen <show...@gmail.com> wrote:
> > >
> > > > Hi Divij,
> > > >
> > > > One comment:
> > > > For the default value of "num.recovery.threads.per.data.dir", we change
> > > > from 1 to 2.
> > > > Could we change the value based on the CPU core number?
> > > > Ex: default value = CPU core num / num of log dirs.
> > > >
> > > > WDYT?
> > > >
> > > > Luke
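For concreteness, a core-count-derived default along these lines would have to
be clamped to at least 1, otherwise integer division can produce the invalid
value 0 that Jun pointed out. A rough sketch with a hypothetical log-dir count:

    public class RecoveryThreadDefault {
        public static void main(String[] args) {
            int cores = Runtime.getRuntime().availableProcessors();
            int numLogDirs = 12; // hypothetical broker with more log dirs than cores

            int naive = cores / numLogDirs;                // can be 0, which is invalid
            int clamped = Math.max(1, cores / numLogDirs); // always at least 1
            System.out.println("naive=" + naive + ", clamped=" + clamped);
        }
    }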
> > > > On Wed, Nov 20, 2024 at 5:53 PM Tommi Vainikainen
> > > > <tvain...@aiven.io.invalid> wrote:
> > > >
> > > > > Hi,
> > > > >
> > > > > This proposal looks good to me with the new defaults and constraints.
> > > > >
> > > > > About breaking backward compatibility, I have a question on how you see
> > > > > it best handled if someone upgrades a Kafka cluster from 3.9 to 4.0 with,
> > > > > let's say, one topic having segment.bytes=500 kB (the new minimum is 1 MB).
> > > > > Does it mean that the Kafka process fails to start with the
> > > > > ConfigException, or when is the ConfigException thrown? Would it be
> > > > > better to simply start behaving as if the value were at least the new
> > > > > minimum regardless of the topic setting, and fix the config on the fly
> > > > > instead? That would be much more convenient for most Kafka users in my
> > > > > opinion.
> > > > >
> > > > > On Tue, Nov 19, 2024 at 12:55 PM Divij Vaidya <divijvaidy...@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > KT1 - That is right. We will throw a ConfigException. That is why this
> > > > > > change is considered backward incompatible. To be honest, given the
> > > > > > nature of the suggested changes, I don't see any valid use case where a
> > > > > > user may have a value which will be invalid after the new constraints.
> > > > > >
> > > > > > --
> > > > > > Divij Vaidya
> > > > > >
> > > > > > On Tue, Nov 19, 2024 at 2:21 AM Kirk True <k...@kirktrue.pro> wrote:
> > > > > >
> > > > > > > Hi Divij,
> > > > > > >
> > > > > > > Thanks for the KIP!
> > > > > > >
> > > > > > > My only question:
> > > > > > >
> > > > > > > KT1. In the case where we change the constraints so that a user's
> > > > > > > previously valid configuration is now invalid, do we do anything
> > > > > > > other than throw a ConfigException?
> > > > > > >
> > > > > > > Thanks,
> > > > > > > Kirk
> > > > > > >
> > > > > > > On Mon, Nov 18, 2024, at 2:13 AM, Divij Vaidya wrote:
> > > > > > > > Hey folks,
> > > > > > > >
> > > > > > > > With 4.0, we have an opportunity to reset the default values and
> > > > > > > > add constraints in the configurations based on our learnings
> > > > > > > > since 3.0.
> > > > > > > >
> > > > > > > > Here's a KIP which modifies the defaults for some properties and
> > > > > > > > modifies the constraints for a few others.
> > > > > > > >
> > > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-1030%3A+Change+constraints+and+default+values+for+various+configurations
> > > > > > > >
> > > > > > > > Looking forward to your feedback.
> > > > > > > >
> > > > > > > > (Previous discussion thread on this topic -
> > > > > > > > https://lists.apache.org/thread/3dx9mdmsqf8pko9xdmhks80k96g650zp)
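To make the fail-fast behaviour discussed in this thread concrete (Tommi's
segment.bytes example and KT1), here is a rough sketch of how a minimum
constraint rejects a previously valid value at config-parse time. The ConfigDef
declaration below is illustrative rather than the actual broker/topic config
definition; the 1 MiB lower bound comes from the proposal, while the importance
and doc string are placeholders.

    import java.util.Map;
    import org.apache.kafka.common.config.ConfigDef;
    import org.apache.kafka.common.config.ConfigException;

    public class SegmentBytesMinimum {
        public static void main(String[] args) {
            ConfigDef def = new ConfigDef()
                .define("segment.bytes", ConfigDef.Type.INT,
                        1024 * 1024 * 1024,                   // current 1 GiB default
                        ConfigDef.Range.atLeast(1024 * 1024), // proposed 1 MiB minimum
                        ConfigDef.Importance.HIGH,
                        "The maximum size of a single log segment file.");
            try {
                def.parse(Map.of("segment.bytes", "500000")); // 500 kB, below the minimum
            } catch (ConfigException e) {
                System.out.println(e.getMessage()); // fails fast rather than silently adjusting
            }
        }
    }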