Jun

Do you have any final thoughts here? The vote thread has sufficient votes
to accept but I wanted to ensure that we don’t have any pending items to
discuss here.

—
Divij Vaidya



On Thu 28. Nov 2024 at 14:53, Divij Vaidya <divijvaidy...@gmail.com> wrote:

> Agreed about the validation tool. I have already added it in the
> "compatibility" section of the KIP. Will create a separate JIRA for it as
> soon as KIP is approved.
>
> --
> Divij Vaidya
>
>
>
> On Thu, Nov 28, 2024 at 5:35 AM Satish Duggana <satish.dugg...@gmail.com>
> wrote:
>
>> Hi Kamal, Thanks for updating the KIP with the offline discussion we
>> had, especially on remotelog.manager.* thread pools with the new
>> behaviour.
>>
>> Hi Divij, We should target having a validation tool in 4.0 so that
>> users can run it before they run their upgrade and take necessary
>> actions on it. I am fine with updating the KIP with those details
>> later when PRs are raised.
>>
>> Thanks,
>> Satish.
>>
>> On Thu, 28 Nov 2024 at 00:31, Divij Vaidya <divijvaidy...@gmail.com>
>> wrote:
>> >
>> > Yes, Kamal, we will fix the tests that rely on low values of segment.ms
>> as
>> > part of the KIP implementation.
>> >
>> > --
>> > Divij Vaidya
>> >
>> >
>> >
>> > On Wed, Nov 27, 2024 at 7:40 PM Jun Rao <j...@confluent.io.invalid>
>> wrote:
>> >
>> > > Hi, Kamal,
>> > >
>> > > Thanks for the explanation. This change sounds good to me then.
>> > >
>> > > Jun
>> > >
>> > > On Wed, Nov 27, 2024 at 10:31 AM Kamal Chandraprakash <
>> > > kamal.chandraprak...@gmail.com> wrote:
>> > >
>> > > > Hi Jun,
>> > > >
>> > > > Thanks for the review! I've made the addendum to this KIP about the
>> > > remote
>> > > > storage configs:
>> > > >
>> > > > > 13. Why are the defaults for
>> remote.log.manager.copier.thread.pool.size
>> > > > and
>> > > > remote.log.manager.thread.pool.size different?
>> > > >
>> > > > We have 3 thread pools in remote log manager:
>> > > >
>> > > > a. remote.log.manager.thread.pool.size: handles the RLMFollowerTask
>> > > > <
>> > > >
>> > >
>> https://sourcegraph.com/github.com/apache/kafka/-/blob/core/src/main/java/kafka/log/remote/RemoteLogManager.java?L1454
>> > > > >
>> > > > to read the highest-uploaded remote offset for follower partitions
>> > > > b. remote.log.manager.copier.thread.pool.size: handles the
>> RLMCopyTask
>> > > > <
>> > > >
>> > >
>> https://sourcegraph.com/github.com/apache/kafka/-/blob/core/src/main/java/kafka/log/remote/RemoteLogManager.java?L830
>> > > > >
>> > > > to copy the segments to remote storage
>> > > > c. remote.log.manager.expiration.thread.pool.size: handles the
>> > > > RLMExpirationTask
>> > > > <
>> > > >
>> > >
>> https://sourcegraph.com/github.com/apache/kafka/-/blob/core/src/main/java/kafka/log/remote/RemoteLogManager.java?L1095
>> > > > >
>> > > > to delete the expired remote segments.
>> > > >
>> > > > The plan was to deprecate the remote.log.manager.thread.pool.size
>> > > > thread-pool after KIP-950 but it was not done
>> > > > and used for the follower partition tasks. Compared to
>> copier/expiration
>> > > > tasks, the follower tasks are light-weight
>> > > > so proposing to reduce that thread-pool size to 2.
>> > > >
>> > > > --
>> > > > Kamal
>> > > >
>> > > > On Wed, Nov 27, 2024 at 11:48 PM Kamal Chandraprakash <
>> > > > kamal.chandraprak...@gmail.com> wrote:
>> > > >
>> > > > > Hi Divij,
>> > > > >
>> > > > > I also share the same concern as Greg pointed out for
>> segment.bytes /
>> > > > > segment.index.bytes.
>> > > > > All the tiered storage tests
>> > > > > <
>> > > >
>> > >
>> https://sourcegraph.com/github.com/apache/kafka/-/blob/storage/src/test/java/org/apache/kafka/tiered/storage/utils/TieredStorageTestUtils.java?L174
>> > > > >
>> > > > > might start to fail with the new defaults. I'm fine with the
>> proposal
>> > > > > provided we plan to fix those tests as part of this KIP.
>> > > > >
>> > > > > --
>> > > > > Kamal
>> > > > >
>> > > > >
>> > > > > On Wed, Nov 27, 2024 at 9:55 PM Divij Vaidya <
>> divijvaidy...@gmail.com>
>> > > > > wrote:
>> > > > >
>> > > > >> Hello everyone
>> > > > >>
>> > > > >> Since, I believe I have addressed all the concerns that were
>> raised
>> > > > here,
>> > > > >> I
>> > > > >> have started a vote thread for this KIP at
>> > > > >> https://lists.apache.org/thread/39dmfkd7ktb0oo44yqrkndcn4kcqt5hc
>> > > > >>
>> > > > >> Please participate in the vote.
>> > > > >>
>> > > > >> --
>> > > > >> Divij Vaidya
>> > > > >>
>> > > > >>
>> > > > >>
>> > > > >> On Wed, Nov 27, 2024 at 12:33 PM Divij Vaidya <
>> > > divijvaidy...@gmail.com>
>> > > > >> wrote:
>> > > > >>
>> > > > >> > Thanks for the feedback folks.
>> > > > >> >
>> > > > >> > *Jun*
>> > > > >> >
>> > > > >> > 10. Fixed it. It now says 2 as the new default for recovery
>> threads.
>> > > > >> >
>> > > > >> > 11. I have added a sentence that we will apply the defaults
>> for both
>> > > > >> > broker level and equivalent topic level configs. I have further
>> > > added
>> > > > >> both
>> > > > >> > the broker level and the topic level config to the table. For
>> > > example,
>> > > > >> you
>> > > > >> > may notice (message.timestamp.after.max.ms /
>> > > > >> > log.message.timestamp.after.max.ms). Furthermore, the
>> constraints
>> > > > will
>> > > > >> > apply (similar to constraints today) when validating
>> dynamically
>> > > > changed
>> > > > >> > configuration and also when validating static configuration
>> (such as
>> > > > >> > server.properties). Please let me know if I have missed
>> anything.
>> > > > >> >
>> > > > >> > 12. In the interest of time, I have removed the constraint
>> proposal
>> > > > for
>> > > > >> rf
>> > > > >> > >= min.insync.replicas. We will circle back on it in a
>> separate KIP.
>> > > > >> >
>> > > > >> > *Luke*
>> > > > >> >
>> > > > >> > 10 and 12 above should align with what you suggested.
>> > > > >> >
>> > > > >> > --
>> > > > >> > Divij Vaidya
>> > > > >> >
>> > > > >> >
>> > > > >> >
>> > > > >> > On Wed, Nov 27, 2024 at 4:00 AM Luke Chen <show...@gmail.com>
>> > > wrote:
>> > > > >> >
>> > > > >> >> Hi Divij and Jun,
>> > > > >> >>
>> > > > >> >> Thanks for your comments.
>> > > > >> >> I'm good we put the default value of
>> > > > num.recovery.threads.per.data.dir
>> > > > >> to
>> > > > >> >> 2
>> > > > >> >> since there are many factors that need to be considered.
>> > > > >> >>
>> > > > >> >> And James, good point of min.insync.replicas validation. If
>> it's
>> > > > >> >> complicated or will confuse users, I'd propose we leave it
>> out of
>> > > > >> v4.0.0.
>> > > > >> >>
>> > > > >> >> Thanks.
>> > > > >> >> Luke
>> > > > >> >>
>> > > > >> >> On Tue, Nov 26, 2024 at 2:31 AM Jun Rao
>> <j...@confluent.io.invalid>
>> > > > >> wrote:
>> > > > >> >>
>> > > > >> >> > Hi, Divij,
>> > > > >> >> >
>> > > > >> >> > Thanks for the reply. A few more comments.
>> > > > >> >> >
>> > > > >> >> > 10. num.recovery.threads.per.data.dir still seems to depend
>> on
>> > > the
>> > > > >> >> number
>> > > > >> >> > of cores.
>> > > > >> >> >
>> > > > >> >> > 11. Some of the configs on the server side exist at
>> different
>> > > > levels
>> > > > >> >> > (static, broker, topic, etc) with slightly different names.
>> It
>> > > > would
>> > > > >> be
>> > > > >> >> > useful to be clear at what level the new default and the
>> > > constraint
>> > > > >> >> apply.
>> > > > >> >> >
>> > > > >> >> > 12. James had a good point on min.insync.replicas. It would
>> be
>> > > > >> useful to
>> > > > >> >> > define when the constraint applies (topic creation, config
>> > > changes,
>> > > > >> >> etc).
>> > > > >> >> > For example, if the broker-level min.insync.replicas value
>> is
>> > > > changed
>> > > > >> >> to 2,
>> > > > >> >> > what happens to existing topics with replication factor 1?
>> If a
>> > > > topic
>> > > > >> >> has
>> > > > >> >> > min.insync.replicas of 2, what happens to an
>> > > > >> AlterPartitionReassignments
>> > > > >> >> > request that wants to reduce the replication factor to 1?
>> > > > >> >> >
>> > > > >> >> > Thanks,
>> > > > >> >> >
>> > > > >> >> > Jun
>> > > > >> >> >
>> > > > >> >> > On Sat, Nov 23, 2024 at 9:57 AM Divij Vaidya <
>> > > > >> divijvaidy...@gmail.com>
>> > > > >> >> > wrote:
>> > > > >> >> >
>> > > > >> >> > > Jun
>> > > > >> >> > >
>> > > > >> >> > > Thank you for the feedback. I have removed the
>> configuration
>> > > > >> changes
>> > > > >> >> > where
>> > > > >> >> > > we are relying on num cores. The only change I have kept
>> is
>> > > > >> increasing
>> > > > >> >> > > recovery threads to 2 (from 1 as default).
>> > > > >> >> > >
>> > > > >> >> > > James
>> > > > >> >> > >
>> > > > >> >> > > Thank you for bringing the JIRA to my attention. I haven't
>> > > looked
>> > > > >> >> deeply
>> > > > >> >> > > into the implementation but based on my understanding of
>> the
>> > > > Kafka
>> > > > >> >> code
>> > > > >> >> > > base, I do believe that there is a path to implement this
>> > > > >> constraint.
>> > > > >> >> We
>> > > > >> >> > > will cross that bridge during the implementation phase
>> and I
>> > > will
>> > > > >> >> ensure
>> > > > >> >> > > that I look at the historical context you provided in the
>> JIRA.
>> > > > >> >> > >
>> > > > >> >> > > --
>> > > > >> >> > > Divij Vaidya
>> > > > >> >> > >
>> > > > >> >> > >
>> > > > >> >> > >
>> > > > >> >> > > On Sat, Nov 23, 2024 at 6:48 AM James Cheng <
>> > > > wushuja...@gmail.com>
>> > > > >> >> > wrote:
>> > > > >> >> > >
>> > > > >> >> > > > About replication.factor >= min.insync.replicas change,
>> you
>> > > > >> should
>> > > > >> >> look
>> > > > >> >> > > at
>> > > > >> >> > > > https://issues.apache.org/jira/browse/KAFKA-4680 .
>> That JIRA
>> > > > >> talks
>> > > > >> >> > about
>> > > > >> >> > > > some of the complexities of detecting it. For example,
>> what
>> > > if
>> > > > a
>> > > > >> >> topic
>> > > > >> >> > > has
>> > > > >> >> > > > replication factor 1, but someone changes the
>> broker-level
>> > > > >> >> > > > min.insync.replicas value to 2? How would that be
>> detected?
>> > > > >> >> > > >
>> > > > >> >> > > > That JIRA has an associated PR. The PR has some
>> comments that
>> > > > >> link
>> > > > >> >> to
>> > > > >> >> > > > discussions on this mailing list.
>> > > > >> >> > > >
>> > > > >> >> > > > That PR, btw, was just closed due to being stale.
>> > > > >> >> > > >
>> > > > >> >> > > > -James
>> > > > >> >> > > >
>> > > > >> >> > > > Sent from my iPhone
>> > > > >> >> > > >
>> > > > >> >> > > > > On Nov 18, 2024, at 2:15 AM, Divij Vaidya <
>> > > > >> >> divijvaidy...@gmail.com>
>> > > > >> >> > > > wrote:
>> > > > >> >> > > > >
>> > > > >> >> > > > > Hey folks
>> > > > >> >> > > > >
>> > > > >> >> > > > > With 4.0, we have an opportunity to reset the default
>> > > values
>> > > > >> and
>> > > > >> >> add
>> > > > >> >> > > > > constraints in the configurations based on our
>> learnings
>> > > > since
>> > > > >> >> 3.0.
>> > > > >> >> > > > >
>> > > > >> >> > > > > Here's a KIP which modifies defaults for some
>> properties
>> > > and
>> > > > >> >> modifies
>> > > > >> >> > > the
>> > > > >> >> > > > > constraints for a few others.
>> > > > >> >> > > > >
>> > > > >> >> > > > >
>> > > > >> >> > > >
>> > > > >> >> > >
>> > > > >> >> >
>> > > > >> >>
>> > > > >>
>> > > >
>> > >
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1030%3A+Change+constraints+and+default+values+for+various+configurations
>> > > > >> >> > > > >
>> > > > >> >> > > > >
>> > > > >> >> > > > > Looking forward for your feedback.
>> > > > >> >> > > > >
>> > > > >> >> > > > > (Previous discussion thread on this topic -
>> > > > >> >> > > > >
>> > > > >> https://lists.apache.org/thread/3dx9mdmsqf8pko9xdmhks80k96g650zp
>> > > > >> >> )
>> > > > >> >> > > >
>> > > > >> >> > >
>> > > > >> >> >
>> > > > >> >>
>> > > > >> >
>> > > > >>
>> > > > >
>> > > >
>> > >
>>
>

Reply via email to