Re: [DISCUSS] KIP-631: The Quorum-based Kafka Controller

2021-02-08 Thread Colin McCabe
Hi Gwen, Thanks for taking a look. Yes, I do think unregisterBroker would be a better name than decommissionBroker. Let's do that, and also rename the associated RPC. cheers, Colin On Sat, Feb 6, 2021, at 21:13, Gwen Shapira wrote: > I realize that I'm very very late to a pretty big party,

Re: [DISCUSS] KIP-631: The Quorum-based Kafka Controller

2021-02-06 Thread Gwen Shapira
I realize that I'm very very late to a pretty big party, but I was just looking at the new code (super readable, btw), and noticed that the inverse API for registerBroker() is called decomissionBroker(). It would be way clearer that we are undoing a registration is we rename it to unregisterBroker(

Re: [DISCUSS] KIP-631: The Quorum-based Kafka Controller

2020-12-18 Thread Jun Rao
Hi, Colin, Thanks for the reply. The KIP looks good to me now. Thanks for your diligence. Jun On Thu, Dec 17, 2020 at 5:15 PM Colin McCabe wrote: > On Thu, Dec 17, 2020, at 17:02, Jun Rao wrote: > > Hi, Colin, > > > > Thanks for the reply. Sounds good. One more comment. > > > > 231. Currently,

Re: [DISCUSS] KIP-631: The Quorum-based Kafka Controller

2020-12-17 Thread Colin McCabe
On Thu, Dec 17, 2020, at 17:02, Jun Rao wrote: > Hi, Colin, > > Thanks for the reply. Sounds good. One more comment. > > 231. Currently, we have sasl.mechanism.inter.broker.protocol for inter > broker communication. It seems that we need a similar config for specifying > the sasl mechanism for th

Re: [DISCUSS] KIP-631: The Quorum-based Kafka Controller

2020-12-17 Thread Jun Rao
Hi, Colin, Thanks for the reply. Sounds good. One more comment. 231. Currently, we have sasl.mechanism.inter.broker.protocol for inter broker communication. It seems that we need a similar config for specifying the sasl mechanism for the communication between the broker and the controller. Jun

Re: [DISCUSS] KIP-631: The Quorum-based Kafka Controller

2020-12-17 Thread Colin McCabe
On Thu, Dec 17, 2020, at 10:19, Jun Rao wrote: > Hi, Colin, > > Thanks for the reply. > > 211. Hmm, I still don't see a clear benefit of registering a broker before > recovery. It's possible for the recovery to take time. However, during the > recovery mode, it seems the broker will still be in t

Re: [DISCUSS] KIP-631: The Quorum-based Kafka Controller

2020-12-17 Thread Jun Rao
Hi, Colin, Thanks for the reply. 211. Hmm, I still don't see a clear benefit of registering a broker before recovery. It's possible for the recovery to take time. However, during the recovery mode, it seems the broker will still be in the fenced mode and won't be able to do work for the clients.

Re: [DISCUSS] KIP-631: The Quorum-based Kafka Controller

2020-12-16 Thread Colin McCabe
On Wed, Dec 16, 2020, at 18:13, Jun Rao wrote: > Hi, Colin, > > Thanks for the reply. Just a couple of more comments. > > 211. Currently, the broker only registers itself in ZK after log recovery. > Is there any benefit to change that? As you mentioned, the broker can't do > much before completin

Re: [DISCUSS] KIP-631: The Quorum-based Kafka Controller

2020-12-16 Thread Jun Rao
Hi, Colin, Thanks for the reply. Just a couple of more comments. 211. Currently, the broker only registers itself in ZK after log recovery. Is there any benefit to change that? As you mentioned, the broker can't do much before completing log recovery. 230. Regarding MetadataResponse, there is a

Re: [DISCUSS] KIP-631: The Quorum-based Kafka Controller

2020-12-16 Thread Colin McCabe
On Wed, Dec 16, 2020, at 13:40, Jun Rao wrote: > Hi, Colin, > > Thanks for the reply. A few follow up comments. > > 211. When does the broker send the BrokerRegistration request to the > controller? Is it after the recovery phase? If so, at that point, the > broker has already caught up on the me

Re: [DISCUSS] KIP-631: The Quorum-based Kafka Controller

2020-12-16 Thread Jun Rao
Hi, Colin, Thanks for the reply. A few follow up comments. 211. When does the broker send the BrokerRegistration request to the controller? Is it after the recovery phase? If so, at that point, the broker has already caught up on the metadata (in order to clean up deleted partitions). Then, it se

Re: [DISCUSS] KIP-631: The Quorum-based Kafka Controller

2020-12-16 Thread Colin McCabe
On Wed, Dec 16, 2020, at 09:59, Jun Rao wrote: > Hi, Colin, > > Thanks for the reply. A few more comments below. > Hi Jun, Thanks for the comments. > 206. "RemoveTopic is the last step, that scrubs all metadata about the > topic. In order to get to that last step, the topic data needs to remo

Re: [DISCUSS] KIP-631: The Quorum-based Kafka Controller

2020-12-16 Thread Jun Rao
Hi, Colin, Thanks for the reply. A few more comments below. 206. "RemoveTopic is the last step, that scrubs all metadata about the topic. In order to get to that last step, the topic data needs to removed from all brokers (after each broker notices that the topic is being deleted)." Currently, t

Re: [DISCUSS] KIP-631: The Quorum-based Kafka Controller

2020-12-16 Thread Colin McCabe
On Tue, Dec 15, 2020, at 13:08, Jun Rao wrote: > Hi, Colin, > > Thanks for the reply. A few more follow up comments. > > 210. initial.broker.registration.timeout.ms: The default value is 90sec, > which seems long. If a broker fails the registration because of incorrect > configs, we want to fail

Re: [DISCUSS] KIP-631: The Quorum-based Kafka Controller

2020-12-15 Thread Jun Rao
Hi, Colin, Thanks for the reply. A few more follow up comments. 210. initial.broker.registration.timeout.ms: The default value is 90sec, which seems long. If a broker fails the registration because of incorrect configs, we want to fail faster. In comparison, the defaults for zookeeper.connection.

Re: [DISCUSS] KIP-631: The Quorum-based Kafka Controller

2020-12-15 Thread Colin McCabe
On Tue, Dec 15, 2020, at 04:13, Tom Bentley wrote: > Hi Colin, > > The KIP says that "brokers which are fenced will not appear in > MetadataResponses. So clients that have up-to-date metadata will not try > to contact fenced brokers.", which is fine, but it doesn't seem to > elaborate on what hap

Re: [DISCUSS] KIP-631: The Quorum-based Kafka Controller

2020-12-15 Thread Tom Bentley
Hi Colin, The KIP says that "brokers which are fenced will not appear in MetadataResponses. So clients that have up-to-date metadata will not try to contact fenced brokers.", which is fine, but it doesn't seem to elaborate on what happens for clients which try to contact a broker (using stale met

Re: [DISCUSS] KIP-631: The Quorum-based Kafka Controller

2020-12-14 Thread Colin McCabe
On Fri, Dec 11, 2020, at 17:07, Jun Rao wrote: > Hi, Colin, > > Thanks for the reply. Just a couple of more comments below. > > 210. Since we are deprecating zookeeper.connection.timeout.ms, should we > add a new config to bound the time for a broker to connect to the > controller during starting

Re: [DISCUSS] KIP-631: The Quorum-based Kafka Controller

2020-12-11 Thread Jun Rao
Hi, Colin, Thanks for the reply. Just a couple of more comments below. 210. Since we are deprecating zookeeper.connection.timeout.ms, should we add a new config to bound the time for a broker to connect to the controller during starting up? 211. BrokerHeartbeat no longer has the state field in t

Re: [DISCUSS] KIP-631: The Quorum-based Kafka Controller

2020-12-11 Thread Colin McCabe
On Wed, Dec 9, 2020, at 10:10, Jun Rao wrote: > Hi, Colin, > > Thanks for the update. A few more follow up comments. > Hi Jun, Thanks again for the review. > 100. FailedReplicaRecord: Since this is reported by each broker > independently, perhaps we could use a more concise representation that

Re: [DISCUSS] KIP-631: The Quorum-based Kafka Controller

2020-12-09 Thread Jun Rao
Hi, Colin, Thanks for the update. A few more follow up comments. 100. FailedReplicaRecord: Since this is reported by each broker independently, perhaps we could use a more concise representation that has a top level broker field, an array of topics, which has an array of partitions. 200. Sounds

Re: [DISCUSS] KIP-631: The Quorum-based Kafka Controller

2020-12-08 Thread Colin McCabe
On Thu, Dec 3, 2020, at 16:37, Jun Rao wrote: > Hi, Colin, > > Thanks for the updated KIP. A few more comments below. > Hi Jun, Thanks again for the reviews. > 80.2 For deprecated configs, we need to include zookeeper.* and > broker.id.generation.enable. > Added. > 83.1 If a broker is down,

Re: [DISCUSS] KIP-631: The Quorum-based Kafka Controller

2020-12-03 Thread Jun Rao
Hi, Colin, Thanks for the updated KIP. A few more comments below. 80.2 For deprecated configs, we need to include zookeeper.* and broker.id.generation.enable. 83.1 If a broker is down, does the controller keep the previously registered broker epoch forever? If not, how long does the controller k

Re: [DISCUSS] KIP-631: The Quorum-based Kafka Controller

2020-12-02 Thread Colin McCabe
On Wed, Dec 2, 2020, at 14:07, Ron Dagostino wrote: > Hi Colin. Thanks for the updates. It's now clear to me that brokers > keep their broker epoch for the life of their JVM -- they register > once, get their broker epoch in the response, and then never > re-register again. Brokers may get fence

Re: [DISCUSS] KIP-631: The Quorum-based Kafka Controller

2020-12-02 Thread Ron Dagostino
Hi Colin. Thanks for the updates. It's now clear to me that brokers keep their broker epoch for the life of their JVM -- they register once, get their broker epoch in the response, and then never re-register again. Brokers may get fenced, but they keep the same broker epoch for the life of their

Re: [DISCUSS] KIP-631: The Quorum-based Kafka Controller

2020-11-30 Thread Colin McCabe
On Fri, Oct 23, 2020, at 16:10, Jun Rao wrote: > Hi, Colin, > > Thanks for the reply. A few more comments. Hi Jun, Thanks again for the reply. Sorry for the long hiatus. I was on vacation for a while. > > 55. There is still text that favors new broker registration. "When a broker > first st

Re: [DISCUSS] KIP-631: The Quorum-based Kafka Controller

2020-11-10 Thread Ron Dagostino
Hi Coln. Ignore my previous question about ConfigRecord.ResourceType having to be a string -- I now see that org.apache.kafka.common.config.ConfigResource defines the types of configs for an int8. I do have a question about how the broker will connect to the controller. The KIP says that control

Re: [DISCUSS] KIP-631: The Quorum-based Kafka Controller

2020-10-28 Thread Ron Dagostino
HI again, Colin. I just noticed that both ConfigRecord and AccessControlRecord have a ResourceType of type int8. I thought that config resources are in the set {topics, clients, users, brokers} and ACL resource types are a different set as defined by the org.apache.kafka.common.resource.ResourceT

Re: [DISCUSS] KIP-631: The Quorum-based Kafka Controller

2020-10-25 Thread Ron Dagostino
Hi Colin and Jun. Regarding these issues: 83.1 It seems that the broker can transition from FENCED to RUNNING without registering for a new broker epoch. I am not sure how this works. Once the controller fences a broker, there is no need for the controller to keep the boker epoch around. So, if t

Re: [DISCUSS] KIP-631: The Quorum-based Kafka Controller

2020-10-24 Thread Tom Bentley
Hi Colin, Which error code in particular though? Because so far as I'm aware there's no existing error code which really captures this situation and creating a new one would not be backward compatible. Cheers, Tom On Sat, Oct 24, 2020 at 12:20 AM Jun Rao wrote: > Hi, Colin, > > Thanks for the

Re: [DISCUSS] KIP-631: The Quorum-based Kafka Controller

2020-10-23 Thread Jun Rao
Hi, Colin, Thanks for the reply. A few more comments. 55. There is still text that favors new broker registration. "When a broker first starts up, when it is in the INITIAL state, it will always "win" broker ID conflicts. However, once it is granted a lease, it transitions out of the INITIAL sta

Re: [DISCUSS] KIP-631: The Quorum-based Kafka Controller

2020-10-23 Thread Colin McCabe
On Wed, Oct 21, 2020, at 05:51, Tom Bentley wrote: > Hi Colin, > > On Mon, Oct 19, 2020, at 08:59, Ron Dagostino wrote: > > > Hi Colin. Thanks for the hard work on this KIP. > > > > > > I have some questions about what happens to a broker when it becomes > > > fenced (e.g. because it can't send a

Re: [DISCUSS] KIP-631: The Quorum-based Kafka Controller

2020-10-23 Thread Colin McCabe
Hi Ron, Thanks for the questions. Talking about "broker state" can get confusing because there's lots of different concepts that we refer to as "broker state." For example, there is a JMX metric that we expose, whose values are given in BrokerStates.scala. However, for the purpose of this dis

Re: [DISCUSS] KIP-631: The Quorum-based Kafka Controller

2020-10-22 Thread Ron Dagostino
Hi again, Colin. Related to the issue of how to communicate fenced vs. not fenced (and how to communicate or imply Offline vs not Offline), do we need to communicate in the log that a broker has gracefully shut down? We do distinguish between a broker being unavailable due to a controlled shutdow

Re: [DISCUSS] KIP-631: The Quorum-based Kafka Controller

2020-10-22 Thread Ron Dagostino
HI Colin. A FencedBrokerRecord appears in the metadata log when a broker is fenced. What appears in the metadata log to indicate that a broker is no longer fenced? Does a BrokerRecord appear? That seems to communicate a bunch of unnecessary data in this context (endpoints, features, rack). If

Re: [DISCUSS] KIP-631: The Quorum-based Kafka Controller

2020-10-21 Thread Colin McCabe
On Tue, Oct 13, 2020, at 18:30, Jun Rao wrote: > Hi, Colin, > > Thanks for the reply. A few more comments below. > > 80.1 controller.listener.names only defines the name of the listener. The > actual listener including host/port/security_protocol is typically defined > in advertised_listners. Doe

Re: [DISCUSS] KIP-631: The Quorum-based Kafka Controller

2020-10-21 Thread Tom Bentley
Hi Colin, On Mon, Oct 19, 2020, at 08:59, Ron Dagostino wrote: > > Hi Colin. Thanks for the hard work on this KIP. > > > > I have some questions about what happens to a broker when it becomes > > fenced (e.g. because it can't send a heartbeat request to keep its > > lease). The KIP says "When a

Re: [DISCUSS] KIP-631: The Quorum-based Kafka Controller

2020-10-21 Thread Colin McCabe
On Mon, Oct 19, 2020, at 08:59, Ron Dagostino wrote: > Hi Colin. Thanks for the hard work on this KIP. > > I have some questions about what happens to a broker when it becomes > fenced (e.g. because it can't send a heartbeat request to keep its > lease). The KIP says "When a broker is fenced, it

Re: [DISCUSS] KIP-631: The Quorum-based Kafka Controller

2020-10-19 Thread Ron Dagostino
Hi Colin. Thanks for the hard work on this KIP. I have some questions about what happens to a broker when it becomes fenced (e.g. because it can't send a heartbeat request to keep its lease). The KIP says "When a broker is fenced, it cannot process any client requests. This prevents brokers whi

Re: [DISCUSS] KIP-631: The Quorum-based Kafka Controller

2020-10-13 Thread Jun Rao
Hi, Colin, Thanks for the reply. A few more comments below. 80.1 controller.listener.names only defines the name of the listener. The actual listener including host/port/security_protocol is typically defined in advertised_listners. Does that mean advertised_listners is a required config now? 83

Re: [DISCUSS] KIP-631: The Quorum-based Kafka Controller

2020-10-12 Thread Colin McCabe
On Tue, Oct 6, 2020, at 16:09, Jun Rao wrote: > Hi, Colin, > > Thanks for the reply. Made another pass of the KIP. A few more comments > below. > Hi Jun, Thanks for the review. > 55. We discussed earlier why the current behavior where we favor the > current broker registration is better. Have

Re: [DISCUSS] KIP-631: The Quorum-based Kafka Controller

2020-10-06 Thread Jun Rao
Hi, Colin, Thanks for the reply. Made another pass of the KIP. A few more comments below. 55. We discussed earlier why the current behavior where we favor the current broker registration is better. Have you given this more thought? 80. Config related. 80.1 Currently, each broker only has the fol

Re: [DISCUSS] KIP-631: The Quorum-based Kafka Controller

2020-10-05 Thread Colin McCabe
On Mon, Sep 28, 2020, at 11:41, Jun Rao wrote: > Hi, Colin, > > Thanks for the reply. A few more comments below. > > 62. > 62.1 controller.listener.names: So, is this used for the controller or the > broker trying to connect to the controller? > Hi Jun, It's used by both. The broker tries to c

Re: [DISCUSS] KIP-631: The Quorum-based Kafka Controller

2020-09-30 Thread Colin McCabe
On Tue, Sep 29, 2020, at 17:43, Jason Gustafson wrote: > Hey Colin, > > Thanks for the hard work on this proposal. > > I'm gradually coming over to the idea of the controllers having separate > IDs. One of the benefits is that it allows us to separate the notion of > controller liveness from brok

Re: [DISCUSS] KIP-631: The Quorum-based Kafka Controller

2020-09-29 Thread Jason Gustafson
Hey Colin, Thanks for the hard work on this proposal. I'm gradually coming over to the idea of the controllers having separate IDs. One of the benefits is that it allows us to separate the notion of controller liveness from broker liveness, which has always been a tricky detail. I think it's fair

Re: [DISCUSS] KIP-631: The Quorum-based Kafka Controller

2020-09-29 Thread Jose Garcia Sancio
Hi Jun and Colin, Some comments below. > 62.3 We added some configs in KIP-595 prefixed with "quorum" and we plan to > add some controller specific configs prefixed with "controller". KIP-630 > plans to add some other controller specific configs with no prefix. Should > we standardize all control

Re: [DISCUSS] KIP-631: The Quorum-based Kafka Controller

2020-09-28 Thread Jun Rao
Hi, Colin, Thanks for the reply. A few more comments below. 62. 62.1 controller.listener.names: So, is this used for the controller or the broker trying to connect to the controller? 62.2 If we want to take the approach to share the configs that are common between the broker and the controller, s

Re: [DISCUSS] KIP-631: The Quorum-based Kafka Controller

2020-09-28 Thread Colin McCabe
On Fri, Sep 25, 2020, at 17:35, Jun Rao wrote: > Hi, Colin, > > Thanks for the reply. > > 62. Thinking about this more, I am wondering what's our overall strategy > for configs shared between the broker and the controller. For example, both > the broker and the controller have to define listeners

Re: [DISCUSS] KIP-631: The Quorum-based Kafka Controller

2020-09-25 Thread Jun Rao
Hi, Colin, Thanks for the reply. 62. Thinking about this more, I am wondering what's our overall strategy for configs shared between the broker and the controller. For example, both the broker and the controller have to define listeners. So both need configs like listeners/advertised.listeners. B

Re: [DISCUSS] KIP-631: The Quorum-based Kafka Controller

2020-09-25 Thread Colin McCabe
On Fri, Sep 25, 2020, at 10:17, Jun Rao wrote: > Hi, Colin, > > Thanks for the reply. > > 60. Yes, I think you are right. We probably need the controller id when a > broker starts up. A broker only stores the Raft leader id in the metadata > file. To do the initial fetch to the Raft leader, it ne

Re: [DISCUSS] KIP-631: The Quorum-based Kafka Controller

2020-09-25 Thread Jun Rao
Hi, Colin, Thanks for the reply. 60. Yes, I think you are right. We probably need the controller id when a broker starts up. A broker only stores the Raft leader id in the metadata file. To do the initial fetch to the Raft leader, it needs to know the host/port associated with the leader id. 62.

Re: [DISCUSS] KIP-631: The Quorum-based Kafka Controller

2020-09-24 Thread Colin McCabe
On Thu, Sep 24, 2020, at 16:24, Jun Rao wrote: > Hi, Colin, > > Thanks for the reply and the updated KIP. A few more comments below. > Hi Jun, > > 53. It seems that you already incorporated the changes in KIP-516. With > topic ids, we don't need to wait for the topic's data to be deleted before

Re: [DISCUSS] KIP-631: The Quorum-based Kafka Controller

2020-09-24 Thread Jun Rao
Hi, Colin, Thanks for the reply and the updated KIP. A few more comments below. 53. It seems that you already incorporated the changes in KIP-516. With topic ids, we don't need to wait for the topic's data to be deleted before removing the topic metadata. If the topic is recreated, we can still d

Re: [DISCUSS] KIP-631: The Quorum-based Kafka Controller

2020-09-24 Thread Colin McCabe
On Mon, Sep 21, 2020, at 18:13, Jun Rao wrote: > Hi, Colin, > > Sorry for the late reply. A few more comments below. > Hi Jun, Thanks for taking another look. > > 50. Configurations > 50.1 controller.listeners: It seems that a controller just needs one > listener. Why do we need to have a list

Re: [DISCUSS] KIP-631: The Quorum-based Kafka Controller

2020-09-21 Thread Jun Rao
Hi, Colin, Sorry for the late reply. A few more comments below. 50. Configurations 50.1 controller.listeners: It seems that a controller just needs one listener. Why do we need to have a list here? Also, could you provide an example of how this is set and what's its relationship with existing con

Re: [DISCUSS] KIP-631: The Quorum-based Kafka Controller

2020-09-17 Thread Colin McCabe
Hi Unmesh, That's a fair point. I have moved the lease duration to the broker heartbeat response. That way lease durations can be changed just be reconfiguring the controllers. best, Colin On Wed, Sep 16, 2020, at 07:40, Unmesh Joshi wrote: > Thanks Colin, the changes look good to me. One sm

Re: [DISCUSS] KIP-631: The Quorum-based Kafka Controller

2020-09-16 Thread Unmesh Joshi
Thanks Colin, the changes look good to me. One small thing. registration.lease.timeout.ms is the configuration on the controller side. It will be good to comment on how brokers know about it, to be able to send LeaseDurationMs in the heartbeat request, or else it can be added in the heartbeat respo

Re: [DISCUSS] KIP-631: The Quorum-based Kafka Controller

2020-09-11 Thread Colin McCabe
Hi Unmesh, I think you're right that we should use a duration here rather than a time. As you said, the clock on the controller will probably not match the one on the broker. I have updated the KIP. > > It's important to keep in mind that messages may be delayed in the > > network, or arrive

Re: [DISCUSS] KIP-631: The Quorum-based Kafka Controller

2020-09-04 Thread Unmesh Joshi
Hi Colin, Thanks for the response. >>If the start time is not specified in the request, then the network time is excluded from the heartbeat time. The most common implementation pattern I see (looking at Google Chubby sessions, Zookeeper sessions, etcd lease implementation, LogCabin session and Doc

Re: [DISCUSS] KIP-631: The Quorum-based Kafka Controller

2020-09-04 Thread Colin McCabe
> Colin wrote: > > The reason for including LeaseStartTimeMs in the request is to ensure > > that the time required to communicate with the controller gets included in > > the lease time. Since requests can potentially be delayed in the network > > for a long time, this is important. On Mon, Aug

Re: [DISCUSS] KIP-631: The Quorum-based Kafka Controller

2020-09-01 Thread Unmesh Joshi
>>Yes, I agree that the lease timeout on the controller side should be reset.The alternative would be to track the lease as hard state rather than soft state, but I think that is not really >> >> needed, and would result in more log entries. On the related note, I think it will be good to add some

Re: [DISCUSS] KIP-631: The Quorum-based Kafka Controller

2020-08-31 Thread Unmesh Joshi
>>The reason for including LeaseStartTimeMs in the request is to ensure that the time required to communicate with the controller gets included in >>the lease time. Since requests can potentially be delayed in the network for a long time, this is important. The network time will be added anyway, b

Re: [DISCUSS] KIP-631: The Quorum-based Kafka Controller

2020-08-31 Thread Colin McCabe
On Sat, Aug 29, 2020, at 01:12, Unmesh Joshi wrote: > >>>Can you repeat your questions about broker leases? > > The LeaseStartTimeMs is expected to be the broker's > 'System.currentTimeMillis()' at the point of the request. The active > controller will add its lease period to this in order >>>

Re: [DISCUSS] KIP-631: The Quorum-based Kafka Controller

2020-08-29 Thread Unmesh Joshi
>>>Can you repeat your questions about broker leases? The LeaseStartTimeMs is expected to be the broker's 'System.currentTimeMillis()' at the point of the request. The active controller will add its lease period to this in order to compute the LeaseEndTimeMs. I think the use of LeaseStart

Re: [DISCUSS] KIP-631: The Quorum-based Kafka Controller

2020-08-28 Thread Colin McCabe
On Fri, Aug 28, 2020, at 19:36, Unmesh Joshi wrote: > Hi Colin, > > There were a few of questions I had.. Hi Unmesh, Thanks for the response. > > 1. Were my comments on the broker lease implementation (and corresponding > prototype) appropriate and do we need to change the KIP > description acc

Re: [DISCUSS] KIP-631: The Quorum-based Kafka Controller

2020-08-28 Thread Unmesh Joshi
Hi Colin, There were a few of questions I had.. 1. Were my comments on the broker lease implementation (and corresponding prototype) appropriate and do we need to change the KIP description accordingly?. 2. How will broker epochs be generated? I am assuming it can be the committed log offset (like

Re: [DISCUSS] KIP-631: The Quorum-based Kafka Controller

2020-08-28 Thread Colin McCabe
Hi all, I'm thinking of calling a vote on KIP-631 on Monday. Let me know if there's any more comments I should address before I start the vote. cheers, Colin On Tue, Aug 11, 2020, at 05:39, Unmesh Joshi wrote: > >>Hi Unmesh, > >>Thanks, I'll take a look. > Thanks. I will be adding more to the

Re: [DISCUSS] KIP-631: The Quorum-based Kafka Controller

2020-08-11 Thread Unmesh Joshi
>>Hi Unmesh, >>Thanks, I'll take a look. Thanks. I will be adding more to the prototype and will be happy to help and collaborate. Thanks, Unmesh On Tue, Aug 11, 2020 at 12:28 AM Colin McCabe wrote: > Hi Jose, > > That'a s good point that I hadn't considered. It's probably worth having > a sep

Re: [DISCUSS] KIP-631: The Quorum-based Kafka Controller

2020-08-10 Thread Colin McCabe
Hi Jose, That'a s good point that I hadn't considered. It's probably worth having a separate leader change message, as you mentioned. Hi Unmesh, Thanks, I'll take a look. best, Colin On Fri, Aug 7, 2020, at 11:56, Jose Garcia Sancio wrote: > Hi Unmesh, > > Very cool prototype! > > Hi Coli

Re: [DISCUSS] KIP-631: The Quorum-based Kafka Controller

2020-08-07 Thread Jose Garcia Sancio
Hi Unmesh, Very cool prototype! Hi Colin, The KIP proposes a record called IsrChange which includes the partition, topic, isr, leader and leader epoch. During normal operation ISR changes do not result in leader changes. Similarly, leader changes do not necessarily involve ISR changes. The contr

Re: [DISCUSS] KIP-631: The Quorum-based Kafka Controller

2020-08-05 Thread Unmesh Joshi
Hi, I have built a small prototype of how the controller backed by consensus implementation will look like. https://github.com/unmeshjoshi/distrib-broker/blob/master/src/main/scala/com/dist/simplekafka/kip500/Kip631Controller.scala This is a miniature Kafka like implementation I use to teach distri

Re: [DISCUSS] KIP-631: The Quorum-based Kafka Controller

2020-08-03 Thread Unmesh Joshi
Hi, The LeaseStartTimeMs is expected to be the broker's 'System.currentTimeMillis()' at the point of the request. The active controller will add its lease period to this in order to compute the LeaseEndTimeMs. I think this is a bit confusing. Monotonic clock on the active controller shoul

Re: [DISCUSS] KIP-631: The Quorum-based Kafka Controller

2020-08-03 Thread Colin McCabe
On Mon, Aug 3, 2020, at 15:51, Jose Garcia Sancio wrote: > Thanks for the KIP Colin, > > Here is a partial review: > > > 1. Even when a broker and a controller are co-located in the same JVM, they > > must > > have different node IDs > > Why? What problem are you trying to solve? > Hi Jose,

Re: [DISCUSS] KIP-631: The Quorum-based Kafka Controller

2020-08-03 Thread Jose Garcia Sancio
Thanks for the KIP Colin, Here is a partial review: 1. > Even when a broker and a controller are co-located in the same JVM, they must > have different node IDs Why? What problem are you trying to solve? 2. > Node IDs must be set in the configuration file for brokers and controllers. I unders

Re: [DISCUSS] KIP-631: The Quorum-based Kafka Controller

2020-08-03 Thread Colin McCabe
On Mon, Aug 3, 2020, at 01:48, Unmesh Joshi wrote: > Just a quick comment. The description of the heartbeat in this KIP is more > about maintaining a 'time bound lease' with broker details, for group > membership and failure detection of the brokers. That way they are more > than just heartbeats, a

Re: [DISCUSS] KIP-631: The Quorum-based Kafka Controller

2020-08-03 Thread Colin McCabe
On Mon, Jul 27, 2020, at 09:20, Jun Rao wrote: > Hi, Colin, > > Thanks for the KIP. A few comments below. > Hi Jun, Thanks for the review. Sorry that it took me a while to respond. I wanted to carefully re-read the latest version of KIP-595 first as well as the other discussions. > > 10. S

Re: [DISCUSS] KIP-631: The Quorum-based Kafka Controller

2020-08-03 Thread Unmesh Joshi
We could combine the heartbeat with the fetch request. It would basically mean moving all the heartbeat fields into the fetch request. As the KIP says, this would be pretty messy. Another reason why it would be messy is because of the timing. Fetch requests can get delayed when they're

Re: [DISCUSS] KIP-631: The Quorum-based Kafka Controller

2020-07-29 Thread Colin McCabe
On Thu, Jul 23, 2020, at 23:02, Boyang Chen wrote: > Hey Colin, > > some more questions I have about the proposal: > > 1. We mentioned in the networking section that "The only time when clients > should contact a controller node directly is when they are debugging system > issues". But later we d

Re: [DISCUSS] KIP-631: The Quorum-based Kafka Controller

2020-07-29 Thread Unmesh Joshi
Hi, In the BrokerHeartbeat request and response, what is the reason to have LeaseStartTimeMs and LeaseEndTimeMs respectively? There are two points I was thinking of 1. The time used to track lease expiry will be monotonic clock on the active controller. So it won't be useful to use that value o

Re: [DISCUSS] KIP-631: The Quorum-based Kafka Controller

2020-07-27 Thread Jun Rao
Hi, Colin, Thanks for the KIP. A few comments below. 10. Some of the choices in this KIP are not consistent with KIP-595. It would be useful to make consistent choices between the two KIPs. 10.1 KIP-595 doesn't use a separate Heartbeat request and heartbeat is piggybacked through the Fetch reques

Re: [DISCUSS] KIP-631: The Quorum-based Kafka Controller

2020-07-23 Thread Boyang Chen
Hey Colin, some more questions I have about the proposal: 1. We mentioned in the networking section that "The only time when clients should contact a controller node directly is when they are debugging system issues". But later we didn't talk about how to enable this debug mode, could you conside

Re: [DISCUSS] KIP-631: The Quorum-based Kafka Controller

2020-07-15 Thread Colin McCabe
On Mon, Jul 13, 2020, at 11:08, Boyang Chen wrote: > Hey Colin, some quick questions, > > 1. I looked around and didn't find a config for broker heartbeat interval, > are we piggy-back on some existing configs? > Good point. I meant to add this, but I forgot. I added registration.heartbeat.int

Re: [DISCUSS] KIP-631: The Quorum-based Kafka Controller

2020-07-15 Thread Colin McCabe
On Mon, Jul 13, 2020, at 12:48, Jason Gustafson wrote: > Let me add a little more color to my question about the controller.id. I > have been assuming that each process would run a single Raft replication > thread for the metadata quorum, just that its role might be different. Some > would be voter

Re: [DISCUSS] KIP-631: The Quorum-based Kafka Controller

2020-07-13 Thread Jason Gustafson
Let me add a little more color to my question about the controller.id. I have been assuming that each process would run a single Raft replication thread for the metadata quorum, just that its role might be different. Some would be voters and some would be observers. To me, that suggests that broker

Re: [DISCUSS] KIP-631: The Quorum-based Kafka Controller

2020-07-13 Thread Boyang Chen
Hey Colin, some quick questions, 1. I looked around and didn't find a config for broker heartbeat interval, are we piggy-back on some existing configs? 2. We only mentioned that the lease time is 10X of the heartbeat interval, could we also include why we chose this value? On Mon, Jul 13, 2020 at

Re: [DISCUSS] KIP-631: The Quorum-based Kafka Controller

2020-07-13 Thread Jason Gustafson
Hi Colin, Thanks for the proposal. A few initial comments comments/questions below: 1. I don't follow why we need a separate configuration for `controller.listeners`. The current listener configuration already allows users to specify multiple listeners, which allows them to define internal endpoi

Re: [DISCUSS] KIP-631: The Quorum-based Kafka Controller

2020-07-12 Thread Colin McCabe
Hi Unmesh, That's an interesting idea, but I think it would be best to strive for single metadata events that are complete in themselves, rather than trying to do something transactional or EOS-like. For example, we could have a create event that contains all the partitions to be created. bes

Re: [DISCUSS] KIP-631: The Quorum-based Kafka Controller

2020-07-12 Thread Colin McCabe
On Fri, Jul 10, 2020, at 18:15, Guozhang Wang wrote: > Hello Colin, > > Thanks for the nice written KIP. A few meta comments: > > 1) We need to talk a bit about broker failure detection: is that piggy > backed with fencing? i.e. should the controller immediately migrate leader > partitions from t

Re: [DISCUSS] KIP-631: The Quorum-based Kafka Controller

2020-07-10 Thread Guozhang Wang
Hello Colin, Thanks for the nice written KIP. A few meta comments: 1) We need to talk a bit about broker failure detection: is that piggy backed with fencing? i.e. should the controller immediately migrate leader partitions from the fenced brokers? On one side, when a broker is fenced it cannot t

Re: [DISCUSS] KIP-631: The Quorum-based Kafka Controller

2020-07-10 Thread Unmesh Joshi
I was thinking that we might need something like multi-operation record in zookeeper to atomically create topic and partition records when this multi record is committed. This way metadata will have both the TopicRecord and PartitionRecord toge

Re: [DISCUSS] KIP-631: The Quorum-based Kafka Controller

2020-07-09 Thread Colin McCabe
Hi Unmesh, Yes, once the last stable offset advanced, we would consider the topic creation to be done, and then we could return success to the client. best, Colin On Thu, Jul 9, 2020, at 19:44, Unmesh Joshi wrote: > It still needs HighWaterMark / LastStableOffset to be advanced by two > records

Re: [DISCUSS] KIP-631: The Quorum-based Kafka Controller

2020-07-09 Thread Unmesh Joshi
It still needs HighWaterMark / LastStableOffset to be advanced by two records? Something like following? || <--|| HighWaterMark Response|PartitionRecord | ||

Re: [DISCUSS] KIP-631: The Quorum-based Kafka Controller

2020-07-09 Thread Colin McCabe
On Thu, Jul 9, 2020, at 10:40, Tom Bentley wrote: > Hi Colin, > > Thanks for the detailed KIP. > > > "As described in KIP-500, the controller will store its data in the > > internal __kafka_metadata topic." > > The KIP doesn't really explain the extent to which this is a normal topic. > I know y

Re: [DISCUSS] KIP-631: The Quorum-based Kafka Controller

2020-07-09 Thread Colin McCabe
On Thu, Jul 9, 2020, at 04:37, Unmesh Joshi wrote: > I see that, when a new topic is created, two metadata records, a > TopicRecord (just the name and id of the topic) and a PartitionRecord (more > like LeaderAndIsr, with leader id and replica ids for the partition) are > created. > While creating

Re: [DISCUSS] KIP-631: The Quorum-based Kafka Controller

2020-07-09 Thread Tom Bentley
Hi Colin, Thanks for the detailed KIP. >> "As described in KIP-500, the controller will store its data in the internal __kafka_metadata topic." The KIP doesn't really explain the extent to which this is a normal topic. I know you say in the Networking section that the only time a client should c

Re: [DISCUSS] KIP-631: The Quorum-based Kafka Controller

2020-07-09 Thread Unmesh Joshi
I see that, when a new topic is created, two metadata records, a TopicRecord (just the name and id of the topic) and a PartitionRecord (more like LeaderAndIsr, with leader id and replica ids for the partition) are created. While creating the topic, log entries for both the records need to be commit

Re: [DISCUSS] KIP-631: The Quorum-based Kafka Controller

2020-07-08 Thread Colin McCabe
On Tue, Jul 7, 2020, at 18:27, Ron Dagostino wrote: > HI Colin. Thanks for the KIP. Here is some feedback and various questions. > Hi Ron, Thanks for taking a look! > "*Controller processes will listen on a separate port from brokers. This > will be true even when the broker and controller a

Re: [DISCUSS] KIP-631: The Quorum-based Kafka Controller

2020-07-07 Thread Ron Dagostino
HI Colin. Thanks for the KIP. Here is some feedback and various questions. "*Controller processes will listen on a separate port from brokers. This will be true even when the broker and controller are co-located in the same JVM*". I assume it is possible that the port numbers could be the same