>
> I will update the KIP on how we can optimize the placement of the
> controller (pinning it to a preferred broker id, potentially enabled via
> config) if that sounds reasonable.


The point I (and I think Jay too) was making is that pinning the controller
to a broker through config is something we should stay away from. This should
be handled by whatever tool you use to bounce the cluster in a rolling-restart
fashion, where it detects the current controller and restarts that broker at
the very end.
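
For example, here is a minimal sketch (not any particular tool) of how such
tooling could order a rolling restart so the controller is bounced last. It
assumes the kazoo Python ZooKeeper client, the standard /controller znode
layout, and a hypothetical restart_broker() helper that bounces one broker:

import json
from kazoo.client import KazooClient

def rolling_restart(zk_hosts, broker_ids, restart_broker):
    zk = KazooClient(hosts=zk_hosts)
    zk.start()
    try:
        # /controller holds JSON such as {"version":1,"brokerid":3,...}
        data, _ = zk.get("/controller")
        controller_id = json.loads(data)["brokerid"]
    finally:
        zk.stop()
    # Bounce every non-controller broker first, then the controller once,
    # so the controller fails over at most one time during the bounce.
    ordered = [b for b in broker_ids if b != controller_id] + [controller_id]
    for broker_id in ordered:
        restart_broker(broker_id)  # hypothetical helper that restarts one broker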


On Tue, Oct 20, 2015 at 5:35 PM, Abhishek Nigam <ani...@linkedin.com.invalid
> wrote:

> Hi Jay/Neha,
> I just subscribed to the mailing list, so I read your response but did not
> receive your email; I am adding the context to this email thread.
>
> "
>
> Agree with Jay on staying away from pinning roles to brokers. This is
> actually harder to operate and monitor.
>
> Regarding the problems you mentioned:
> 1. Reducing the controller moves during a rolling bounce is useful, but it
> is really something that should be handled by the tooling. The root cause
> is that the controller move is currently expensive. I think we'd be better
> off investing time and effort in thinning out the controller. Just moving
> to the batch write APIs in ZooKeeper will make a huge difference (see the
> sketch below these two points).
> 2. I'm not sure I understood the motivation behind moving partitions out of
> the controller broker. That seems like a proposal for a solution, but can
> you describe the problems you saw that affected controller functionality?
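>
> To make point 1 concrete, here is a rough illustration of the batching
> primitive, assuming the kazoo Python ZooKeeper client and made-up demo
> paths; it is a sketch of the ZooKeeper API, not of Kafka's actual
> controller code:
>
> from kazoo.client import KazooClient
>
> zk = KazooClient(hosts="zk1:2181")   # assumed ZooKeeper address
> zk.start()
> zk.ensure_path("/demo")
> tx = zk.transaction()                # one batched (multi) request
> for path, state in [("/demo/a", b"state-a"), ("/demo/b", b"state-b")]:
>     tx.create(path, state)
> results = tx.commit()                # single round trip, all-or-nothing
> zk.stop()
>
> Writing controller state in batches like this, instead of one znode at a
> time, is what would make a controller move much cheaper.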
>
> Regarding the location of the controller, it seems there are 2 things you
> are suggesting:
> 1. Optimizing the strategy of picking a broker as the controller (e.g.
> least loaded node)
> 2. Moving the controller if a broker soft fails.
>
> I don't think #1 is worth the effort involved. The better way of addressing
> it is to make the controller thinner and faster. #2 is interesting since
> the problem is that while a broker is failing, all state changes fail or
> are queued up, which impacts the whole cluster. There are 2 alternatives -
> have a tool that allows you to move the controller, or just kill the broker
> so the controller moves. I prefer the latter since it is simple, and also
> because a misbehaving broker is better off shut down anyway.
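>
> For reference, a sketch of what a "move the controller" tool could amount
> to today, assuming the standard ephemeral /controller znode: deleting the
> znode forces the live brokers to re-elect a controller without taking any
> broker down.
>
> from kazoo.client import KazooClient
>
> zk = KazooClient(hosts="zk1:2181")   # assumed ZooKeeper address
> zk.start()
> if zk.exists("/controller"):
>     # Removing the ephemeral znode triggers a new controller election
>     # among the live brokers; which broker wins is not controlled.
>     zk.delete("/controller")
> zk.stop()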
>
> Having said that, it will be helpful to know details of the problems you
> saw while operating the controller. I think understanding those will help
> guide the solution better.
>
> On Tue, Oct 20, 2015 at 12:49 PM, Jay Kreps <j...@confluent.io> wrote:
>
> > This seems like a step backwards--we really don't want people to manually
> > manage the location of the controller and try to manually balance
> > partitions off that broker.
> >
> > I think it might make sense to consider directly fixing the things you
> > actually want to fix:
> > 1. Too many controller moves--we could either just make this cheaper or
> > make the controller location more deterministic, e.g. having the election
> > prefer the node with the smallest node id so there were fewer failovers
> > in rolling bounces.
> > 2. You seem to think having the controller on a normal node is a problem.
> > Can you elaborate on the negative consequences you've observed? Let's
> > focus on fixing those.
> >
> > In general we've worked very hard to avoid having a bunch of dedicated
> > roles for different nodes, and I would be very, very loath to see us move
> > away from that philosophy. I have a fair amount of experience with both
> > homogeneous systems that have a single role and systems with many
> > differentiated roles, and I really think that the differentiated approach
> > causes more problems than it solves for most deployments due to the added
> > complexity.
> >
> > I think we could also fix up this KIP a bit. For example, it says there
> > are no public interfaces involved, but surely there are new admin
> > commands to control the location? There are also some minor things, like
> > listing it as released in 0.8.3, that seem wrong.
> >
> > -Jay
> >
> > On Tue, Oct 20, 2015 at 12:18 PM, Abhishek Nigam <
> > ani...@linkedin.com.invalid> wrote:
> >
> > > Hi,
> > > Can we please discuss this KIP? The background for this is that it
> > > allows us to pin the controller to a broker. This is useful in a couple
> > > of scenarios:
> > > a) If we want to do a rolling bounce, we can reduce the number of
> > > controller moves down to 1.
> > > b) Pick a designated broker, reduce the number of partitions on it
> > > through admin reassign partitions, and designate it as the controller.
> > > c) Dynamically move the controller if we see any problems on the broker
> > > on which it is running.
> > >
> > > Here is the wiki page:
> > >
> > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-39+Pinning+controller+to+broker
> > >
> > > -Abhishek
> > >
> >
>
> "
>
> I think based on the feedback we can limit the discussion to the rolling
> upgrade scenario and how best to address it. The only scenario I have heard
> of where we wanted to move the controller off a broker was caused by a bug
> (since fixed) that resulted in multiple controllers.
>
> I will update the KIP on how we can optimize the placement of the
> controller (pinning it to a preferred broker id, potentially enabled via
> config) if that sounds reasonable.
> Many of the ideas from the original KIP can still apply in this limited
> scope.
>
> -Abhishek
>



-- 
Thanks,
Neha
