Hi Mickael,

It's a nice proposal. It's appealing to have a pluggable way to override
default Kafka placement decisions, and the motivation section lists some
good reasons for it. Here are a few comments:

* The motivation section says "When adding brokers to a cluster, Kafka
currently does not necessarily place new partitions on new brokers". I am
not sure how valuable doing this would be. A newly created Kafka topic takes
time to reach the same usage level as existing topics, say because the
topic is created by a new workload that is still being onboarded, or
because the expansion was done to relieve disk pressure on existing nodes.
While new topics catch up to the existing workload, the new brokers are not
sharing equal load in the cluster, which probably defeats the purpose of
adding them. In addition, clustering new topics on new brokers like this
has implications from a fault-domain perspective. A more reasonable
approach is to use Cruise Control to move things around so that the newly
added nodes become immediately involved and share the cluster load.
* Regarding "When administrators want to remove brokers from a cluster,
there is no way to prevent Kafka from placing partitions on them", this is
indeed an issue. I would argue that this is needed by everyone and should
be part of Kafka, instead of being implemented as part of a plugin
interface by multiple teams.
* For "When some brokers are near their storage/throughput limit, Kafka
could avoid putting new partitions on them": while this can help relieve
short-term overload, I think again the correct solution here is something
like Cruise Control, where the system is monitored and partitions are moved
around to maintain a balanced cluster. A new topic does not take any disk
space, so placing it anywhere normally isn't going to add to the storage
overload. Similar to the previous case, maybe a mechanism in Kafka to put
nodes in a quarantine state is a better way to approach this.

In terms of the proposed API, I have a couple of comments:

* It is not clear if the proposal applies to partitions of new topics, or
also to the addition of partitions to an existing topic. Explicitly stating
that would be helpful.
* Regarding part "To address the use cases identified in the motivation
section, some knowledge about the current state of the cluster is
necessary. Details whether a new broker has just been added or is being
decommissioned are not part of the cluster metadata. Therefore such
knowledge has to be provided via an external means to the ReplicaPlacer,
for example via the configuration": it's not clear how this will be done.
If I had to implement this interface, it would be helpful to have clear
guidance/examples here that tie back to the use cases in the motivation
section. That would also let us figure out whether the proposed interface
is complete, and would help future implementers of the interface.
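To make the request concrete, here is a rough sketch of what I mean by
providing decommissioning knowledge via configuration. All names here (the
`replica.placer.excluded.brokers` key, the class and method names) are made
up for illustration and are not from the KIP; the actual interface may look
quite different:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical sketch: a placer helper that excludes brokers listed in a
// (made-up) config property from placement decisions. Not the KIP-660 API.
public class ExcludedBrokersPlacer {
    private final Set<Integer> excluded;

    // excludedConfig would come from e.g. "replica.placer.excluded.brokers=3,4"
    public ExcludedBrokersPlacer(String excludedConfig) {
        excluded = Arrays.stream(excludedConfig.split(","))
                .map(String::trim)
                .filter(s -> !s.isEmpty())
                .map(Integer::parseInt)
                .collect(Collectors.toSet());
    }

    // Filter out decommissioning brokers before running normal placement logic.
    public List<Integer> eligibleBrokers(List<Integer> registeredBrokers) {
        return registeredBrokers.stream()
                .filter(id -> !excluded.contains(id))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        ExcludedBrokersPlacer placer = new ExcludedBrokersPlacer("3,4");
        // prints [0, 1, 2]
        System.out.println(placer.eligibleBrokers(Arrays.asList(0, 1, 2, 3, 4)));
    }
}
```

If the KIP spelled out even this much (where the config is read, how it is
refreshed when a broker's state changes), it would be much easier to judge
whether the interface is sufficient.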

A couple of minor comments:
* The KIP is not listed in the main KIP page (
https://cwiki-test.apache.org/confluence/display/KAFKA/Kafka+Improvement+Proposals).
Can you please add it there?
* The page has "This is especially true for the 4 scenarios listed in the
Motivation section", but there are only 3 scenarios listed.

Regards,
Vikas


On Tue, May 3, 2022 at 5:51 PM Colin McCabe <cmcc...@apache.org> wrote:

> Hi Mickael,
>
> We did discuss this earlier, and I remember not being too enthusiastic
> about a pluggable policy here :)
>
> There have been several changes to the placement code in the last few
> weeks. (These are examples of the kind of changes that are impossible to do
> once an API is established, by the way.) Can you please revise the KIP to
> take these into account?
>
> I'd also like to understand a little bit better why we need this API when
> we have the explicit placement API for createTopics and createPartitions.
> Can you give me a few scenarios where the manual placement API would be
> insufficient?
>
> best,
> Colin
>
>
> On Mon, May 2, 2022, at 09:28, Mickael Maison wrote:
> > Hi,
> >
> > If there are no further comments, I'll start a vote in the next few days.
> >
> > Thanks,
> > Mickael
> >
> > On Wed, Mar 30, 2022 at 3:51 AM Luke Chen <show...@gmail.com> wrote:
> >>
> >> Hi Mickael,
> >>
> >> Thanks for the update.
> >> It answered my questions!
> >>
> >> Thank you.
> >> Luke
> >>
> >> On Wed, Mar 30, 2022 at 12:09 AM Mickael Maison <
> mickael.mai...@gmail.com>
> >> wrote:
> >>
> >> > Hi Luke,
> >> >
> >> > Thanks for the feedback.
> >> >
> >> > 1. Thanks, fixed!
> >> > 2. Yes that's right. It's the same behavior for topic policies
> >> > 3. I've added details about how the mentioned scenarios could be
> >> > addressed. The information required to make such decisions is not part
> >> > of the Kafka cluster metadata so an external input is necessary. This
> >> > KIP does not propose a specific mechanism for doing it.
> >> >
> >> > I hope this answers your questions.
> >> >
> >> > Thanks,
> >> > Mickael
> >> >
> >> >
> >> > On Tue, Mar 29, 2022 at 5:42 PM Mickael Maison <
> mickael.mai...@gmail.com>
> >> > wrote:
> >> > >
> >> > > Hi Ryanne,
> >> > >
> >> > > That's a good point!
> >> > >
> >> > > There's no value in having all implementations perform the same
> sanity
> >> > > checks. If the replication factor is < 1 or larger than the current
> >> > > number of registered brokers, the controller should directly throw
> >> > > InvalidReplicationFactorException and not call the ReplicaPlacer.
> I've
> >> > > updated the KIP so the place() method now only throws
> >> > > ReplicaPlacementException.
> >> > >
> >> > > Thanks,
> >> > > Mickael
> >> > >
> >> > >
> >> > >
> >> > > On Mon, Mar 28, 2022 at 6:20 PM Ryanne Dolan <ryannedo...@gmail.com
> >
> >> > wrote:
> >> > > >
> >> > > > Wondering about InvalidReplicationFactorException. Why would an
> >> > > > implementation throw this? Given the information passed to the
> method,
> >> > > > seems like this could only be thrown if there were obviously
> invalid
> >> > > > arguments, like a negative number or zero. Can we just guarantee
> such
> >> > > > invalid arguments aren't passed in?
> >> > > >
> >> > > > Ryanne
> >> > > >
> >> > > > On Sat, Mar 26, 2022, 8:51 AM Luke Chen <show...@gmail.com>
> wrote:
> >> > > >
> >> > > > > Hi Mickael,
> >> > > > >
> >> > > > > Thanks for the KIP!
> >> > > > > It's indeed a pain point for the Kafka admins.
> >> > > > >
> >> > > > > I have some comments:
> >> > > > > 1. Typo in motivation section: When administrators [when to]
> remove
> >> > brokers
> >> > > > > from a cluster,....
> >> > > > > 2. If different `replica.placer.class.name` configs are set in
> all
> >> > > > > controllers, I think only the config for  "active controller"
> will
> >> > take
> >> > > > > effect, right?
> >> > > > > 3. Could you explain more about how the proposal fixes some
> >> > scenarios you
> >> > > > > listed, ex: the new added broker case. How could we know the
> broker
> >> > is new
> >> > > > > added? I guess it's by checking the broker load via some metrics
> >> > > > > dynamically, right?
> >> > > > >
> >> > > > >
> >> > > > > Thank you.
> >> > > > > Luke
> >> > > > >
> >> > > > > On Fri, Mar 18, 2022 at 10:30 AM Ryanne Dolan <
> ryannedo...@gmail.com
> >> > >
> >> > > > > wrote:
> >> > > > >
> >> > > > > > Thanks Mickael, this makes sense to me! I've been wanting
> >> > something like
> >> > > > > > this in order to decommission a broker without new partitions
> >> > getting
> >> > > > > > accidentally assigned to it.
> >> > > > > >
> >> > > > > > Ryanne
> >> > > > > >
> >> > > > > > On Thu, Mar 17, 2022, 5:56 AM Mickael Maison <
> >> > mickael.mai...@gmail.com>
> >> > > > > > wrote:
> >> > > > > >
> >> > > > > > > Hi,
> >> > > > > > >
> >> > > > > > > I'd like to start a new discussion on KIP-660. I originally
> >> > wrote this
> >> > > > > > > KIP in 2020 and the initial discussion
> >> > > > > > > (
> >> > https://lists.apache.org/thread/xn7xyb74nyt281brto4x28r9rzxm4lp9)
> >> > > > > > > raised some concerns especially around KRaft (which did not
> >> > exist at
> >> > > > > > > that time) and scalability.
> >> > > > > > >
> >> > > > > > > Since then, we got a new KRaft controller so I've been able
> to
> >> > revisit
> >> > > > > > > this KIP. I kept the KIP number as it's essentially the same
> >> > idea, but
> >> > > > > > > the proposal is significantly different:
> >> > > > > > >
> >> > > > > > >
> >> > > > > >
> >> > > > >
> >> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-660%3A+Pluggable+ReplicaPlacer
> >> > > > > > >
> >> > > > > > > Please take a look and let me know if you have any feedback.
> >> > > > > > >
> >> > > > > > > Thanks,
> >> > > > > > > Mickael
> >> > > > > > >
> >> > > > > >
> >> > > > >
> >> >
>