Thanks Colin, that helps. Can we add some of this to the KIP?

Ryanne

On Fri, Aug 2, 2019 at 12:23 PM Colin McCabe <cmcc...@apache.org> wrote:

> On Fri, Aug 2, 2019, at 07:50, Ryanne Dolan wrote:
> > Thanks Colin, interesting KIP.
> >
> > I'm concerned that the KIP does not actually address its stated
> > motivations. In particular, "Simpler Deployment and Configuration" are not
> > really achieved, given that: 1) the proposal still requires quorums (now of
> > controllers, instead of ZK nodes), with the same restrictions as ZK, e.g.
> > at least three controllers and only an odd number of controllers, neither
> > of which is easy to manage; 2) the proposal still requires separate
> > processes with separate configuration (and indeed, no less configuration
> > than ZK requires, namely a port to listen on); 3) configuration of brokers
> > is not simplified, as they still require a list of servers to contact (now
> > coordinators instead of ZK nodes). Is there any improvement to
> > configuration and deployment I'm overlooking?
>
> Hi Ryanne,
>
> Thanks for taking a look.
>
> The difficulty in configuring and deploying ZooKeeper is not really in
> configuring a port number, or even really in running a second JVM. If that
> were the main difficulty, then running ZK would definitely be pretty simple.
>
> The difficulty is that ZooKeeper is an entirely separate distributed
> system with entirely separate configuration for things like security,
> network setup, data directories, etc. You also have separate systems for
> management, metrics, and so on. Learning how to configure security or
> metrics in Kafka doesn't really help you with setting up the corresponding
> features in ZK. You have to start from scratch. That is what we are
> trying to avoid here.
>
> > Second, single-broker clusters are mentioned as a motivation, but it is
> > unclear how this change would move in that direction. Seems Raft requires
> > three nodes, so perhaps the minimum number of hosts would be three?
>
> Just like with ZooKeeper, you can run Raft on a single node. Needless to
> say, you don't have any tolerance against single-node failures when running
> with a single node.
>
> > Third, "discrepancies between the controller state and the zookeeper state"
> > are mentioned as a problem, and I understand that controllers coordinate
> > amongst themselves rather than via zookeeper, but I'm not sure there is a
> > functional difference? It seems controllers can still disagree amongst
> > themselves for periods of time, with the same consequences as disagreeing
> > with ZK.
>
> Members of a Raft quorum cannot disagree with each other. This is similar
> to how ZooKeeper's "ZAB" protocol works. There's more information in the
> Raft paper: https://raft.github.io/raft.pdf
>
> > Finally, you say "there is no generic way for the controller to follow the
> > ZooKeeper event log." I'm unsure this is a problem. Having a log is
> > certainly powerful for consumers, but how would a controller use this log
> > to do anything it can't without it? It seems only the latest compacted
> > state is ever used, and there is nothing to undo or replay from the log.
> > What future capabilities are you envisioning we would gain from carrying
> > around log history?
>
> There are many advantages to treating metadata as a log. Because the
> controllers will now all track the latest state, controller failover will
> not require a lengthy reloading period where we transfer all the state to
> the new controller. Because we always send deltas over the wire and not
> full states, brokers can catch up with the latest state faster, and use
> less bandwidth to do so. It will even be possible for the brokers to cache
> this state locally in a file on disk, so that broker startup can be much
> faster. All of these are important to scaling Kafka in the future.
> Treating metadata as a log avoids a lot of the complex failure corner
> cases we have seen where a broker misses a single update sent from the
> controller, but gets subsequent updates.
>
> best,
> Colin
>
> > Ryanne
> >
> > On Thu, Aug 1, 2019, 4:05 PM Colin McCabe <cmcc...@apache.org> wrote:
> >
> > > Hi all,
> > >
> > > I've written a KIP about removing ZooKeeper from Kafka. Please take a
> > > look and let me know what you think:
> > >
> > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-500%3A+Replace+ZooKeeper+with+a+Self-Managed+Metadata+Quorum
> > >
> > > cheers,
> > > Colin
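
For readers following the thread, the "metadata as a log" argument in Colin's reply can be made concrete with a small sketch. It is only an illustration under stated assumptions: the class and method names (`MetadataLogSketch`, `MetadataLog`, `Follower`, `catchUp`) are hypothetical and do not come from KIP-500 or the Kafka codebase, and a real quorum would replicate the log via Raft rather than keep it in one in-memory list. The point it tries to show is that a follower which remembers its next offset fetches only the deltas it is missing, so it cannot silently skip one update while applying later ones.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: none of these names come from KIP-500 or Kafka. */
public class MetadataLogSketch {

    /** One metadata change; a null value means the key was removed. */
    static final class Delta {
        final String key;
        final String value;

        Delta(String key, String value) {
            this.key = key;
            this.value = value;
        }
    }

    /** Append-only log; a record's offset is its index in the list. */
    static final class MetadataLog {
        private final List<Delta> records = new ArrayList<>();

        /** Appends a delta and returns the offset it was written at. */
        long append(String key, String value) {
            records.add(new Delta(key, value));
            return records.size() - 1;
        }

        /** Returns every record at or after fromOffset, in log order. */
        List<Delta> readFrom(long fromOffset) {
            return new ArrayList<>(records.subList((int) fromOffset, records.size()));
        }
    }

    /** A follower (broker or standby controller) that replays deltas in order. */
    static final class Follower {
        private final Map<String, String> state = new HashMap<>();
        private long nextOffset = 0;

        /** Fetches and applies only the missing deltas; returns how many were applied. */
        int catchUp(MetadataLog log) {
            List<Delta> missing = log.readFrom(nextOffset);
            for (Delta d : missing) {
                if (d.value == null) {
                    state.remove(d.key);
                } else {
                    state.put(d.key, d.value);
                }
                nextOffset++;
            }
            return missing.size();
        }

        Map<String, String> state() {
            return state;
        }

        long nextOffset() {
            return nextOffset;
        }
    }

    public static void main(String[] args) {
        MetadataLog log = new MetadataLog();
        Follower broker = new Follower();

        log.append("topic-a.leader", "broker-1");
        log.append("topic-b.leader", "broker-2");
        System.out.println("applied " + broker.catchUp(log) + " deltas");

        // Only this one new delta is transferred on the next catch-up.
        log.append("topic-a.leader", "broker-3");
        System.out.println("applied " + broker.catchUp(log) + " deltas");
        System.out.println("state: " + broker.state());
    }
}
```

Because the follower asks for everything from its own `nextOffset`, a standby that is already caught up needs no bulk state reload on failover, which is the failover benefit described in the reply. Persisting `state` and `nextOffset` to local disk would correspond to the broker-side cache mentioned there, though this sketch does not implement that.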