Hi Divij,

We would like to change the default values for the following remote log
configurations:
1. remote.log.manager.copier.thread.pool.size -- default value from -1 to 10
2. remote.log.manager.expiration.thread.pool.size -- default value from -1 to 10

In KIP-950
<https://cwiki.apache.org/confluence/display/KAFKA/KIP-950%3A++Tiered+Storage+Disablement#KIP950:TieredStorageDisablement-Disablement-ZookeeperBackedCluster>,
it was proposed to split the thread pool that was shared by the leader and
follower into separate copy and expiration thread pools. With KIP-956
<https://cwiki.apache.org/confluence/display/KAFKA/KIP-956+Tiered+Storage+Quotas>,
we made this change in v3.9. The default value for the copier and expiration
thread pools was set to -1 to derive the value from the previous thread pool
for backward compatibility. See PR#17499
<https://github.com/apache/kafka/pull/17499/#discussion_r1804119567>.

Now we have three thread pools:

1. remote.log.manager.thread.pool.size -- used by the replica to fetch the
highest-uploaded remote offset for the follower partitions; it won't be
deprecated as proposed.
2. remote.log.manager.copier.thread.pool.size -- used by the leader to upload
log segments to remote storage.
3. remote.log.manager.expiration.thread.pool.size -- used by the leader to
delete expired log segments from remote storage.

Since a broker acts as the leader for some partitions and as the follower for
others, deriving the copier and expiration thread pool sizes from
`remote.log.manager.thread.pool.size` looks incorrect to me. The follower's
task is lightweight (it only fetches the highest-uploaded remote offset),
whereas the leader's tasks of copying and deleting segments are heavy.
Moreover, the change didn't actually ensure backward compatibility: if a user
was running with 10 threads in v3.8, the server will start 30 threads in v3.9.

So, I propose to change the default value of the copier and expiration thread
pools from -1 to 10. This will also ensure smooth validation when making
those configs dynamic. See the sketch below.
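To illustrate the current derivation and the proposed change (a sketch of my
understanding per PR#17499; the concrete numbers are only an example):

    # Kafka 3.8: a single remote-log thread pool, shared by leader and
    # follower work
    remote.log.manager.thread.pool.size=10

    # Kafka 3.9 with the current -1 defaults: both new pools derive their
    # size from the old config, so the same broker now starts
    # 10 + 10 + 10 = 30 threads
    remote.log.manager.copier.thread.pool.size=-1      # resolves to 10
    remote.log.manager.expiration.thread.pool.size=-1  # resolves to 10

    # With the proposed defaults, the new pools are sized independently of
    # remote.log.manager.thread.pool.size
    remote.log.manager.copier.thread.pool.size=10
    remote.log.manager.expiration.thread.pool.size=10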
Thanks,
Kamal

On Wed, Nov 20, 2024 at 5:41 PM Divij Vaidya <divijvaidy...@gmail.com> wrote:

> Thank you everyone for your feedback.
>
> The new revision of the KIP is ready for review. Notable changes since the
> last revision:
> 1. Removed change to log.message.timestamp.before.max.ms
> 2. Modified default for num.recovery.threads.per.data.dir
> 3. Added new proposals for num.io.threads and num.network.threads
> 4. Improved justification for changes and made the motivation more explicit
>
> *Federico*
>
> Good point. I have updated the compatibility section to include information
> on how the new values will be handled during the rolling upgrade to 4.0.
>
> *Jun*
>
> Thank you for pointing out the use case. It correlates with my experience,
> where most of the problems arise from users accidentally setting a
> timestamp in ns instead of ms and hence ending up with a future time. I
> have reverted the changes to log.message.timestamp.after.max.ms.
>
> *Tommi*
>
> You have a valid point about the friction this could cause during upgrade.
> This is a risk we are accepting with this KIP. The risk is acceptable
> because we expect due diligence from users while transitioning to a new
> major version, and also because there aren't legitimate use cases for
> setting these values to a threshold where they will cause system
> instability.
>
> I have added the failure mode for invalid values in the compatibility
> section.
>
> About your suggestion to apply the new changes but not fail the upgrade: I
> am a bit wary about silently modifying configuration that the user has
> explicitly set. The strategy also won't work for cases where the user uses
> a 3P tool to reset configurations. Failing fast on invalid configuration
> is the best option. To improve the user experience, I have added a
> suggestion in the KIP to add an "upgrade validation tool" which a user can
> run to verify compatibility before the actual upgrade. It won't be part of
> this KIP, but it is something we might want to do to ease migration to
> 4.0. I will float the idea around in the community and see if we have
> volunteers for this.
>
> *Greg*
>
> Thanks for raising this concern. We do indeed intentionally use small
> values to "force" a segment roll sometimes. However, I believe there are
> alternate ways to accomplish that in a test environment. For example, if I
> wanted to create small segments for a test scenario, I would either write
> an integration test where I can call roll explicitly, OR send a message
> with a create time in the future (the roll is based on the timestamp of
> the last appended message), OR just write more data (1 MB minimum today).
>
> But to your larger point, I agree that there is value in providing two
> different guardrails for users. First, a "user-friendly" guardrail: an
> opinionated guardrail meant to protect the cluster from inadvertent
> sub-optimal and incorrect configuration. Users who do not wish to spend
> time learning Kafka's intricacies and the effects of myriad configurations
> can choose this option. Second, an "advanced user" guardrail, which will
> allow users to push the cluster to its boundaries with full knowledge of
> the repercussions and fallout that may ensue.
>
> As a project trusted by users across this broad spectrum of expertise, I
> do believe we need to provide user-friendly mechanisms for different
> segments of users, but that is something I am unwilling to take up in this
> KIP. If you feel passionate about addressing this feature gap in Apache
> Kafka, please feel free to start a discussion on the topic, and I promise
> to be an active participant in it. The outcome could be as simple as
> documentation where we provide ideal configurations for different
> workloads.
>
> *Luke*
>
> I agree with your suggestion to set the number of threads based on the
> machine's cores. In fact, I would promote a similar solution for network
> threads and request handler threads as well. (As an example, I have seen
> users setting large values for network threads compared to request handler
> threads, resulting in accepting more requests than the brokers can handle,
> leading to degradation. A constraint that network threads <= request
> handler threads could prevent such a scenario.)
>
> Earlier, I didn't add these since they are not dead-obvious changes
> compared to the rest of the changes in the KIP, but I have now added them.
> They're worth discussing.
>
> --
> Divij Vaidya
>
>
> On Wed, Nov 20, 2024 at 11:01 AM Luke Chen <show...@gmail.com> wrote:
>
> > Hi Divij,
> >
> > One comment:
> > For the default value of "num.recovery.threads.per.data.dir", we change
> > from 1 to 2.
> > Could we change the value based on the CPU core number?
> > E.g., default value = CPU core num / num of log dirs.
> >
> > WDYT?
> >
> > Luke
> >
> > On Wed, Nov 20, 2024 at 5:53 PM Tommi Vainikainen
> > <tvain...@aiven.io.invalid> wrote:
> >
> > > Hi,
> > >
> > > This proposal looks good to me with the new defaults and constraints.
> > >
> > > About breaking backward compatibility, I have a question on how you
> > > see it best handled in case someone upgrades a Kafka cluster from 3.9
> > > to 4.0 with, let's say, one topic having segment.bytes=500 KB (the
> > > new min is 1 MB). Does it mean that the Kafka process fails to start
> > > with the ConfigException, or when is the ConfigException thrown?
> > > Would it be better to simply start behaving as if the value were at
> > > least the new minimum regardless of the topic setting, and fix the
> > > config on the fly instead? That would be much more convenient for
> > > most Kafka users in my opinion.
> > >
> > > On Tue, Nov 19, 2024 at 12:55 PM Divij Vaidya <divijvaidy...@gmail.com>
> > > wrote:
> > >
> > > > KT1 - That is right. We will throw a ConfigException. That is why
> > > > this change is considered backward incompatible. To be honest,
> > > > given the nature of the suggested changes, I don't see any valid
> > > > use case where a user may have a value that will be invalid under
> > > > the new constraints.
> > > >
> > > > --
> > > > Divij Vaidya
> > > >
> > > >
> > > > On Tue, Nov 19, 2024 at 2:21 AM Kirk True <k...@kirktrue.pro> wrote:
> > > >
> > > > > Hi Divij,
> > > > >
> > > > > Thanks for the KIP!
> > > > >
> > > > > My only question:
> > > > >
> > > > > KT1. In the case where we change the constraints so that a user's
> > > > > previously valid configuration is now invalid, do we do anything
> > > > > other than throw a ConfigException?
> > > > >
> > > > > Thanks,
> > > > > Kirk
> > > > >
> > > > > On Mon, Nov 18, 2024, at 2:13 AM, Divij Vaidya wrote:
> > > > > > Hey folks
> > > > > >
> > > > > > With 4.0, we have an opportunity to reset the default values
> > > > > > and add constraints in the configurations based on our
> > > > > > learnings since 3.0.
> > > > > >
> > > > > > Here's a KIP which modifies defaults for some properties and
> > > > > > modifies the constraints for a few others.
> > > > > >
> > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-1030%3A+Change+constraints+and+default+values+for+various+configurations
> > > > > >
> > > > > > Looking forward to your feedback.
> > > > > >
> > > > > > (Previous discussion thread on this topic -
> > > > > > https://lists.apache.org/thread/3dx9mdmsqf8pko9xdmhks80k96g650zp)