Thanks for driving this KIP, Colin. The KIP is really well written. This is so exciting!
+1 (binding) Best, David On Wed, Dec 16, 2020 at 11:51 PM Colin McCabe <cmcc...@apache.org> wrote: > On Wed, Dec 16, 2020, at 13:08, Ismael Juma wrote: > > Thanks for all the work on the KIP. Given the magnitude of the KIP, I > > expect that some tweaks will be made as the code is implemented, reviewed > > and tested. I'm overall +1 (binding). > > > > Thanks, Ismael. > > > A few comments below: > > 1. It's a bit weird for kafka-storage to output a random uuid. Would it > be > > better to have a dedicated command for that? > > I'm not sure. The nice thing about putting it in kafka-storage.sh is that > it's there when you need it. I also think that having subcommands, like we > do here, really reduces the "clutter" that we have in some other > command-line tools. When you get help about the "info" subcommand, you > don't see flags for any other subcommand, for example. I guess we can move > this later if it seems more intuitive though. > > > Also, since we use base64 > > encoded uuids nearly everywhere (including cluster and topic ids), it > would > > be good to follow that pattern instead of the less compact > > "51380268-1036-410d-a8fc-fb3b55f48033". > > Good idea. I have updated this to use base64 encoded UUIDs. > > > 2. This is a nit, but I think it would be better to talk about built-in > > quorum mode instead of KIP-500 mode. It's more self descriptive than a > KIP > > reference. > > I do like the sound of "quorum mode." I guess the main question is, if we > later implement raft quorums for regular topics, would that nomenclature be > confusing? I guess we could talk about "metadata quorum mode" to avoid > confusion. Hmm. > > > 3. Did we consider using `session` (like the group coordinator) instead > of > > `regsitration` in `broker.registration.timeout.ms`? > > Hmm, broker.session.timeout.ms does sound better. I changed it to that. > > > 4. The flat id space for the controller and broker while requiring a > > different id in embedded mode seems a bit unintuitive. Are there any > other > > systems that do this? I know we covered some of the reasons in the > "Shared > > IDs between Multiple Nodes" rejected alternatives section, but it didn't > > seem totally convincing to me. > > One of my concerns here is that using separate ID spaces for controllers > versus brokers would potentially lead to metrics or logging collisions. We > can take a look at that again once the implementation is further along, I > guess, to see how often that is a problem in practice. > > > 5. With regards to the controller process listening on a separate port, > it > > may be worth adding a sentence about the forwarding KIP as that is a main > > reason why the controller port doesn't need to be accessible. > > Good idea... I added a short reference to KIP-590 in the "Networking" > section. > > > 6. The internal topic seems to be called @metadata. I'm personally not > > convinced about the usage of @ in this way. I think I would go with the > > same convention we have used for other existing internal topics. > > I knew this one would be controversial :) > > I guess the main argument here is that using @ avoids collisions with any > existing topic. Leading underscores, even double underscores, can be used > by users to create new topics, but an "at sign" cannot It would be nice to > have a namespace for system topics that we knew nobody else could break > into. > > > 7. We talk about the metadata.format feature flag. Is this intended to > > allow for single roll upgrades? > > 8. Could the incarnation id be called registration id? Or is there a > reason > > why this would be a bad name? > > I liked "incarnation id" because it expresses the idea that each new > incarnation of the broker gets a different one. I think "registration id" > might be confused with "the broker id is the ID we're registering." > > > 9. Could `CurMetadataOffset` be called `CurrentMetadataOffset` for > > `BrokerRegistrationRequest`? The abbreviation here doesn't seem to help > > much and makes things slightly less readable. It would also make it > > consistent with `BrokerHeartbeatRequest`. > > Yeah, the abbreviated name is inconsistent. I will change it to > CurrentMetadataOffset. > > > 10. Should `UnregisterBrokerRecord` be `DeregisterBrokerRecord`? > > Hmm, "Register/Unregister" is more consistent with "Fence/Unfence" which > is why I went with Unregister. It looks like they're both in the > dictionary, so I'm not sure if "deregister" has an advantage... > > > 11. Broker metrics typically have a PerSec suffix, should we stick with > > that for the `MetadataCommitRate`? > > Added. > > > 12. For the lag metrics, would it be clearer if we included "Offset" in > the > > name? In theory, we could have time based lag metrics too. Having said > > that, existing offset lag metrics do seem to just have `Lag` in their > name > > without further qualification. > > > > Yeah, I think including Offset does make it a bit clearer. Added. > > best, > Colin > > > > Ismael > > > > On Fri, Dec 11, 2020 at 1:41 PM Colin McCabe <cmcc...@apache.org> wrote: > > > > > Hi all, > > > > > > I'd like to restart the vote on KIP-631: the quorum-based Kafka > > > Controller. The KIP is here: > > > > > > https://cwiki.apache.org/confluence/x/4RV4CQ > > > > > > The original DISCUSS thread is here: > > > > > > > > > > https://lists.apache.org/thread.html/r1ed098a88c489780016d963b065e8cb450a9080a4736457cd25f323c%40%3Cdev.kafka.apache.org%3E > > > > > > There is also a second email DISCUSS thread, which is here: > > > > > > > > > > https://lists.apache.org/thread.html/r1ed098a88c489780016d963b065e8cb450a9080a4736457cd25f323c%40%3Cdev.kafka.apache.org%3E > > > > > > Please take a look and vote if you can. > > > > > > best, > > > Colin > > > > > >