Hi Abhishek,

Perhaps it would help if you explained the motivation behind your proposal. I know there was a bunch of discussion on KAFKA-1778; can you summarize? Currently, I'd agree with Neha and Jay that there isn't really a strong reason to pin the controller to a given broker or restrict it to a set of brokers.
For rolling upgrades, it should be simpler to bounce the existing controller last. As for choosing a relatively lightly loaded broker, I think we should ideally eliminate load hotspots by distributing partitions (and data rate) as evenly as possible. If for some reason a broker cannot become the controller (by virtue of load or something else), arguably that is a separate problem that needs addressing.

Thanks,
Aditya

On Tue, Oct 20, 2015 at 9:27 PM, Neha Narkhede <n...@confluent.io> wrote:

> > I will update the KIP on how we can optimize the placement of controller
> > (pinning it to a preferred broker id (potentially config enabled)) if
> > that sounds reasonable.
>
> The point I (and I think Jay too) was making is that pinning a controller
> to a broker through config is what we should stay away from. This should
> be handled by whatever tool you may be using to bounce the cluster in a
> rolling restart fashion (where it detects the current controller and
> restarts it at the very end).
>
> On Tue, Oct 20, 2015 at 5:35 PM, Abhishek Nigam
> <ani...@linkedin.com.invalid> wrote:
>
> > Hi Jay/Neha,
> > I just subscribed to the mailing list, so I read your response but did
> > not receive your email; I am adding the context into this email thread.
> >
> > "
> > Agree with Jay on staying away from pinning roles to brokers. This is
> > actually harder to operate and monitor.
> >
> > Regarding the problems you mentioned:
> > 1. Reducing the controller moves during a rolling bounce is useful, but
> > really something that should be handled by the tooling. The root cause
> > is that currently the controller move is expensive. I think we'd be
> > better off investing time and effort in thinning out the controller.
> > Just moving to the batch write APIs in ZooKeeper will make a huge
> > difference.
> > 2. I'm not sure I understood the motivation behind moving partitions out
> > of the controller broker.
> > That seems like a proposal for a solution, but can you describe the
> > problems you saw that affected controller functionality?
> >
> > Regarding the location of the controller, it seems there are 2 things
> > you are suggesting:
> > 1. Optimizing the strategy of picking a broker as the controller (e.g.
> > the least loaded node).
> > 2. Moving the controller if a broker soft-fails.
> >
> > I don't think #1 is worth the effort involved. The better way of
> > addressing it is to make the controller thinner and faster. #2 is
> > interesting since the problem is that while a broker soft-fails, all
> > state changes fail or are queued up, which globally impacts the cluster.
> > There are 2 alternatives: have a tool that allows you to move the
> > controller, or just kill the broker so the controller moves. I prefer
> > the latter since it is simple and also because a misbehaving broker is
> > better off shut down anyway.
> >
> > Having said that, it will be helpful to know the details of the problems
> > you saw while operating the controller. I think understanding those will
> > help guide the solution better.
> >
> > On Tue, Oct 20, 2015 at 12:49 PM, Jay Kreps <j...@confluent.io> wrote:
> >
> > > This seems like a step backwards--we really don't want people to
> > > manually manage the location of the controller and try to manually
> > > balance partitions off that broker.
> > >
> > > I think it might make sense to consider directly fixing the things you
> > > actually want to fix:
> > > 1. Too many controller moves--we could either just make this cheaper
> > > or make the controller location more deterministic, e.g. having the
> > > election prefer the node with the smallest node id so there were fewer
> > > failovers in rolling bounces.
> > > 2. You seem to think having the controller on a normal node is a
> > > problem. Can you elaborate on the negative consequences you've
> > > observed? Let's focus on fixing those.
> > > In general we've worked very hard to avoid having a bunch of dedicated
> > > roles for different nodes, and I would be very, very loath to see us
> > > move away from that philosophy. I have a fair amount of experience
> > > with both homogeneous systems that have a single role and systems with
> > > many differentiated roles, and I really think that the differentiated
> > > approach causes more problems than it solves for most deployments due
> > > to the added complexity.
> > >
> > > I think we could also fix up this KIP a bit. For example, it says
> > > there are no public interfaces involved, but surely there are new
> > > admin commands to control the location? There are also some minor
> > > things, like listing it as released in 0.8.3, that seem wrong.
> > >
> > > -Jay
> > >
> > > On Tue, Oct 20, 2015 at 12:18 PM, Abhishek Nigam
> > > <ani...@linkedin.com.invalid> wrote:
> > >
> > > > Hi,
> > > > Can we please discuss this KIP? The background for this is that it
> > > > allows us to pin the controller to a broker. This is useful in a
> > > > couple of scenarios:
> > > > a) If we want to do a rolling bounce, we can reduce the number of
> > > > controller moves down to 1.
> > > > b) Pick a designated broker, reduce the number of partitions on it
> > > > through admin reassign-partitions, and designate it as the
> > > > controller.
> > > > c) Dynamically move the controller if we see any problems on the
> > > > broker on which it is running.
> > > >
> > > > Here is the wiki page:
> > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-39+Pinning+controller+to+broker
> > > >
> > > > -Abhishek
> >
> > "
> >
> > I think based on the feedback we can limit the discussion to the rolling
> > upgrade scenario and how best to address it.
> > I think the only scenario I have heard of where we wanted to move the
> > controller off a broker was a bug that gave us multiple controllers,
> > which has since been fixed.
> >
> > I will update the KIP on how we can optimize the placement of the
> > controller (pinning it to a preferred broker id (potentially config
> > enabled)) if that sounds reasonable. Many of the ideas of the original
> > KIP can still apply in the limited scope.
> >
> > -Abhishek
>
> --
> Thanks,
> Neha
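The controller-last rolling restart that Neha describes can be sketched as a small helper: given the cluster's broker ids and the JSON stored in ZooKeeper's /controller znode (which records the current controller's broker id), restart every other broker first and the current controller last, so the controller moves only once per bounce cycle. This is an illustrative sketch, not part of any existing Kafka tooling; the function name is hypothetical.

```python
import json

def restart_order(broker_ids, controller_znode_json):
    """Return broker ids in rolling-restart order, current controller last.

    broker_ids: all broker ids in the cluster.
    controller_znode_json: raw JSON read from the /controller znode,
    e.g. '{"version":1,"brokerid":2}' (illustrative payload).
    """
    controller_id = json.loads(controller_znode_json)["brokerid"]
    # Bounce all non-controller brokers first, then the controller,
    # so leadership of the controller role moves exactly once.
    others = [b for b in broker_ids if b != controller_id]
    return others + [controller_id]

print(restart_order([0, 1, 2, 3], '{"version":1,"brokerid":2}'))
# -> [0, 1, 3, 2]
```

A real bounce script would re-read /controller before the final step, since the controller can move if the current one fails mid-rollout.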