On Fri, Dec 11, 2020, at 17:07, Jun Rao wrote: > Hi, Colin, > > Thanks for the reply. Just a couple more comments below. > > 210. Since we are deprecating zookeeper.connection.timeout.ms, should we > add a new config to bound the time for a broker to connect to the > controller during startup? >
Good idea. I added initial.broker.registration.timeout.ms for this. > 211. BrokerHeartbeat no longer has the state field in the request/response. > However, (a) the controller shutdown section still has "In its periodic > heartbeats, the broker asks the controller if it can transition into the > SHUTDOWN state. This motivates the controller to move all of the leaders > off of that broker. Once they are all moved, the controller responds to > the heartbeat with a nextState of SHUTDOWN."; (b) the description of > BrokerHeartbeat still references currentState and targetState. > Thanks. I've made these sections clearer and removed the obsolete references to sending states. best, Colin > Jun > > On Fri, Dec 11, 2020 at 1:33 PM Colin McCabe <cmcc...@apache.org> wrote: > > > On Wed, Dec 9, 2020, at 10:10, Jun Rao wrote: > > > Hi, Colin, > > > > > > Thanks for the update. A few more follow-up comments. > > > > > > > Hi Jun, > > > > Thanks again for the review. > > > > > 100. FailedReplicaRecord: Since this is reported by each broker > > > independently, perhaps we could use a more concise representation that > > has > > > a top level broker field, an array of topics, which has an array of > > > partitions. > > > > > > > The issue is that there is a size limit on each record. Putting all > > of the partitions of a log directory into a single record would probably > > break that in many cases. Still, we can optimize a bit by having an array > > of partition IDs, since nearly all the time, we have more than one from the > > same topic. > > > > > 200. Sounds good. If we remove the broker-side fencing logic, do we plan > > to > > > still keep FENCED in broker state? Do we plan to expose the new states > > > through the existing BrokerState metric and if so, what are the values > > for > > > the new states? > > > > > > > No, we don't need FENCED any more. I have removed it from the KIP. > > > > The new states are very similar to the current ones, actually. There are > > no new states or removed ones. The main change in the broker state machine > > is that the RECOVERING_FROM_UNCLEAN_SHUTDOWN state has been renamed to > > RECOVERY. Also, unlike previously, the broker will always pass through > > RECOVERY (although it may only stay in this state for a very short amount > > of time). > > > > > 201. This may be fine too. Could we document what happens when the > > > broker.id/controller.id in meta.properties don't match the broker > > > config when the broker starts up? > > > > > > > I added some documentation about this. > > > > > 204. There is still "The highest metadata offset which the broker has not > > > reached" referenced under BrokerRegistration. > > > > > > > It should be CurrentMetadataOffset. Fixed. > > > > > 206. Is that separate step needed given KIP-516? With KIP-516 ( > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-516%3A+Topic+Identifiers#KIP516:TopicIdentifiers-LeaderAndIsr > > ), > > > we don't need to wait for the topic data to be removed from all brokers > > > before removing the topic metadata. The combination of a mismatched > > > topicId > > > or a missing topicId from the metadata is enough for the broker to > > > clean > > > up deleted topics asynchronously. > > > > It won't be needed once KIP-516 is adopted, but this hasn't been > > implemented yet. 
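To make the topic ID comparison in 206 concrete, here is a minimal sketch of the broker-side check being described. This is only an illustration under assumptions: the class and method names below are made up for this example and are not the actual KIP-516 implementation.

    import java.util.Optional;
    import org.apache.kafka.common.Uuid;

    final class TopicIdReconciler {
        // A locally stored partition belongs to a deleted (or re-created) topic if the
        // latest metadata has no topic ID for it, or carries a different topic ID.
        // Such partitions can be cleaned up asynchronously, so the controller does not
        // need to wait for every broker before removing the topic's metadata.
        static boolean shouldDeleteLocalPartition(Uuid localTopicId, Optional<Uuid> topicIdInMetadata) {
            return topicIdInMetadata.isEmpty()
                || !topicIdInMetadata.get().equals(localTopicId);
        }
    }
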
> > > > best, > > Colin > > > > > > > > Jun > > > > > > > > > > > > > > > On Tue, Dec 8, 2020 at 5:27 PM Colin McCabe <cmcc...@apache.org> wrote: > > > > > > > On Thu, Dec 3, 2020, at 16:37, Jun Rao wrote: > > > > > Hi, Colin, > > > > > > > > > > Thanks for the updated KIP. A few more comments below. > > > > > > > > > > > > > Hi Jun, > > > > > > > > Thanks again for the reviews. > > > > > > > > > 80.2 For deprecated configs, we need to include zookeeper.* and > > > > > broker.id.generation.enable. > > > > > > > > > > > > > Added. > > > > > > > > > 83.1 If a broker is down, does the controller keep the previously > > > > > registered broker epoch forever? If not, how long does the controller > > > > keep > > > > > it? What does the controller do when receiving a broker heartbeat > > request > > > > > with an unknown broker epoch? > > > > > > > > > > > > > Yes, the controller keeps the previous registration forever. > > > > > > > > Broker heartbeat requests with an incorrect broker epoch will be > > rejected > > > > with STALE_BROKER_EPOCH. > > > > > > > > > 100. Have you figured out if we need to add a new record type for > > > > reporting > > > > > partitions on failed disks? > > > > > > > > > > > > > I added FailedReplicaRecord to reflect the case where a JBOD directory > > has > > > > failed, leading to failed replicas. > > > > > > > > > 102. For debugging purposes, sometimes it's useful to read the > > metadata > > > > > topic using tools like console-consumer. Should we support that and > > if > > > > so, > > > > > how? > > > > > > > > > > > > > For now, we have the ability to read the metadata logs with the > > dump-logs > > > > tool. I think we will come up with some other tools in the future as > > we > > > > get experience. > > > > > > > > > 200. "brokers which are fenced will not appear in MetadataResponses. > > The > > > > > broker will not respond to these requests -- instead, it will simply > > > > > disconnect." If the controller is partitioned off from the brokers, > > this > > > > > design will cause every broker to stop accepting new client > > requests. In > > > > > contrast, if ZK is partitioned off, the existing behavior is that the > > > > > brokers can continue to work based on the last known metadata. So, I > > am > > > > not > > > > > sure if we should change the existing behavior because of the bigger > > > > impact > > > > > in the new one. Another option is to keep the existing behavior and > > > > expose > > > > > a metric for fenced brokers so that the operator could be alerted. > > > > > > > > > > > > > I'm skeptical about how well running without ZK currently works. > > However, > > > > I will move the broker-side fencing into a follow-up KIP. This KIP is > > > > already pretty large and there is no hard dependency on this. There > > may > > > > also be other ways of accomplishing the positive effects of > > > > broker-side fencing, so more discussion is needed. > > > > > > > > > 201. I read Ron's comment, but I am still not sure about the benefit of > > keeping > > > > > broker.id and controller.id in meta.properties. It seems that we are > > > > just > > > > > duplicating the same info in two places and have the additional > > burden of > > > > > making sure the values in the two places are consistent. > > > > > > > > > > > > > I think the reasoning is that having broker.id protects us against > > > > accidentally bringing up a broker with a disk from a different > > broker. I > > > > don't feel strongly about this but it seemed simpler to keep it. 
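To make 201's sanity check concrete, here is a rough sketch of the kind of startup validation being described. It is illustrative only; the class, method, and exception names are assumptions for this example rather than actual broker code.

    import java.io.FileInputStream;
    import java.io.IOException;
    import java.util.Properties;

    final class MetaPropertiesCheck {
        // Refuse to start if the broker.id recorded in the log directory's meta.properties
        // does not match the configured broker.id.  This is the protection against
        // accidentally starting a broker with a disk that belongs to a different broker.
        static void verifyBrokerId(String metaPropertiesPath, int configuredBrokerId) throws IOException {
            Properties props = new Properties();
            try (FileInputStream in = new FileInputStream(metaPropertiesPath)) {
                props.load(in);
            }
            String storedId = props.getProperty("broker.id");
            if (storedId != null && Integer.parseInt(storedId) != configuredBrokerId) {
                throw new IllegalStateException("Configured broker.id " + configuredBrokerId +
                    " does not match broker.id " + storedId + " in " + metaPropertiesPath);
            }
        }
    }
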
> > > > > > > > > 202. controller.connect.security.protocol: Is this needed since > > > > > controller.listener.names and listener.security.protocol.map imply > > the > > > > > security protocol already? > > > > > > > > > > > > > You're right, this isn't needed. I'll remove it. > > > > > > > > > 203. registration.heartbeat.interval.ms: It defaults to 2k. ZK uses > > 1/3 > > > > of > > > > > the session timeout for heartbeat. So, given the default 18k for > > > > > registration.lease.timeout.ms, should we default > > > > > registration.heartbeat.interval.ms to 6k? > > > > > > > > > > > > > 6 seconds seems like a pretty long time between heartbeats. It might > > be > > > > useful to know when a broker is missing heartbeats, with less time than > > > > that. I provisionally set it to 3 seconds (we can always change > > later...) > > > > > > > > I also changed the name of these configurations to " > > > > broker.heartbeat.interval.ms" and "broker.registration.timeout.ms" to > > try > > > > to clarify them a bit. > > > > > > > > > 204. "The highest metadata offset which the broker has not reached." > > It > > > > > seems this should be "has reached". > > > > > > > > > > > > > I changed this to "one more than the highest metadata offset which the > > > > broker has reached." > > > > > > > > > 205. UnfenceBrokerRecord and UnregisterBrokerRecord: To me, they > > seem to > > > > be > > > > > the same. Do we need both? > > > > > > > > > > > > > Unregistration means that the broker has been removed from the cluster. > > > > That is different from unfencing, which marks the broker as active. > > > > > > > > > 206. TopicRecord: The Deleting field is used to indicate that the > > topic > > > > is > > > > > being deleted. I am wondering if this is really needed since > > RemoveTopic > > > > > already indicates the same thing. > > > > > > > > > > > > > RemoveTopic is the last step, which scrubs all metadata about the topic. > > > > In order to get to that last step, the topic data needs to be removed > > from all > > > > brokers (after each broker notices that the topic is being deleted). > > > > > > > > best, > > > > Colin > > > > > > > > > Jun > > > > > > > > > > On Wed, Dec 2, 2020 at 2:50 PM Colin McCabe <cmcc...@apache.org> > > wrote: > > > > > > > > > > > On Wed, Dec 2, 2020, at 14:07, Ron Dagostino wrote: > > > > > > > Hi Colin. Thanks for the updates. It's now clear to me that > > brokers > > > > > > > keep their broker epoch for the life of their JVM -- they > > register > > > > > > > once, get their broker epoch in the response, and then never > > > > > > > re-register again. Brokers may get fenced, but they keep the > > same > > > > > > > broker epoch for the life of their JVM. The incarnation ID is > > also > > > > > > > kept for the life of the JVM but is generated by the broker > > itself > > > > > > > upon startup, and the combination of the two allows the > > Controller to > > > > > > > act idempotently if any previously-sent registration response > > gets > > > > > > > lost. Makes sense. > > > > > > > > > > > > > > > > > > > Thanks, Ron. That's a good summary. > > > > > > > > > > > > > One thing I wonder about is if it might be helpful for the > > broker to > > > > > > > send the Cluster ID as determined from its meta.properties file > > in > > > > its > > > > > > > registration request. Does it even make sense for the broker to > > > > > > > successfully register and enter the Fenced state if it has the > > wrong > > > > > > > Cluster ID? > > > > > > > > > > > > Yeah, that's a good idea. 
Let's have the broker pass its cluster > > ID in > > > > > > the registration RPC, and then registration can fail if the broker > > is > > > > > > configured for the wrong cluster. > > > > > > > > > > > > > The nextMetadataOffset value that the broker communicates > > > > > > > in its registration request only has meaning within the correct > > > > > > > cluster, so it feels to me that the Controller should have some > > way > > > > to > > > > > > > perform this sanity check. There is currently (pre-KIP 500) a > > check > > > > > > > in the broker to make sure its configured cluster ID matches the > > one > > > > > > > stored in ZooKeeper, and we will have to perform this validation > > > > > > > somewhere in the KIP-500 world. If the Controller doesn't do it > > > > > > > within the registration request then the broker will have to > > make a > > > > > > > metadata request to the Controller, retrieve the Cluster ID, and > > > > > > > perform the check itself. It feels to me that it might be > > better for > > > > > > > the Controller to just do it, and then the broker doesn't have to > > > > > > > worry about it anymore once it successfully registers. > > > > > > > > > > > > > > I also have a question about the broker.id value and > > > > meta.properties. > > > > > > > The KIP now says "In version 0 of meta.properties, there is a > > > > > > > broker.id field. Version 1 does not have this field. It is no > > > > longer > > > > > > > needed because we no longer support dynamic broker id > > assignment." > > > > > > > But then there is an example version 1 meta.properties file that > > > > shows > > > > > > > the broker.id value. I actually wonder if maybe the broker.id > > value > > > > > > > would be good to keep in the version 1 meta.properties file > > because > > > > it > > > > > > > currently (pre-KIP 500, version 0) acts as a sanity check to make > > > > sure > > > > > > > the broker is using the correct log directory. Similarly with > > the > > > > > > > controller.id value on controllers -- it would allow the same > > type > > > > of > > > > > > > sanity check for quorum controllers. > > > > > > > > > > > > > > > > > > > That's a good point. I will add broker.id back, and also add > > > > > > controller.id as a possibility. > > > > > > > > > > cheers, > > > > > > Colin > > > > > > > > > > > > > > > > > > > > On Mon, Nov 30, 2020 at 7:41 PM Colin McCabe <cmcc...@apache.org > > > > > > > wrote: > > > > > > > > > > > > > > > > On Fri, Oct 23, 2020, at 16:10, Jun Rao wrote: > > > > > > > > > Hi, Colin, > > > > > > > > > > > > > > > > > > Thanks for the reply. A few more comments. > > > > > > > > > > > > > > > > Hi Jun, > > > > > > > > > > > > > > > > Thanks again for the reply. Sorry for the long hiatus. I was > > on > > > > > > vacation for a while. > > > > > > > > > > > > > > > > > > > > > > > > > > 55. There is still text that favors new broker registration. > > > > "When a > > > > > > broker > > > > > > > > > first starts up, when it is in the INITIAL state, it will > > always > > > > > > "win" > > > > > > > > > broker ID conflicts. However, once it is granted a lease, it > > > > > > transitions > > > > > > > > > out of the INITIAL state. Thereafter, it may lose subsequent > > > > > > conflicts if > > > > > > > > > its broker epoch is stale. (See KIP-380 for some background > > on > > > > > > broker > > > > > > > > > epoch.) 
The reason for favoring new processes is to > > accommodate > > > > the > > > > > > common > > > > > > > > > case where a process is killed with kill -9 and then > > restarted. > > > > We > > > > > > want it > > > > > > > > > to be able to reclaim its old ID quickly in this case." > > > > > > > > > > > > > > > > > > > > > > > > > Thanks for the reminder. I have clarified the language here. > > > > > > Hopefully now it is clear that we don't allow quick re-use of > > broker > > > > IDs. > > > > > > > > > > > > > > > > > 80.1 Sounds good. Could you document that listeners is a > > required > > > > > > config > > > > > > > > > now? It would also be useful to annotate other required > > configs. > > > > For > > > > > > > > > example, controller.connect should be required. > > > > > > > > > > > > > > > > > > > > > > > > > I added a note specifying that these are required. > > > > > > > > > > > > > > > > > 80.2 Could you list all deprecated existing configs? Another > > one > > > > is > > > > > > > > > control.plane.listener.name since the controller no longer > > sends > > > > > > > > > LeaderAndIsr, UpdateMetadata and StopReplica requests. > > > > > > > > > > > > > > > > > > > > > > > > > I added a section specifying some deprecated configs. > > > > > > > > > > > > > > > > > 83.1 It seems that the broker can transition from FENCED to > > > > RUNNING > > > > > > without > > > > > > > > > registering for a new broker epoch. I am not sure how this > > works. > > > > > > Once the > > > > > > > > > controller fences a broker, there is no need for the > > controller > > > > to > > > > > > keep the > > > > > > > > > broker epoch around. So, the fenced broker's heartbeat > > request > > > > > > with the > > > > > > > > > existing broker epoch will be rejected, leading the broker > > back > > > > to > > > > > > the > > > > > > > > > FENCED state again. > > > > > > > > > > > > > > > > > > > > > > > > > The broker epoch refers to the broker registration. So we DO > > keep > > > > the > > > > > > broker epoch around even while the broker is fenced. > > > > > > > > > > > > > > > > The broker epoch changes only when there is a new broker > > > > > > registration. Fencing or unfencing the broker doesn't change the > > > > broker > > > > > > epoch. > > > > > > > > > > > > > > > > > 83.5 Good point on KIP-590. Then should we expose the > > controller > > > > for > > > > > > > > > debugging purposes? If not, should we deprecate the > > controllerID > > > > > > field in > > > > > > > > > MetadataResponse? > > > > > > > > > > > > > > > > > > > > > > > > > I think it's OK to expose it for now, with the proviso that it > > > > won't > > > > > > be reachable by clients. > > > > > > > > > > > > > > > > > 90. We rejected the shared ID with just one reason "This is > > not a > > > > > > good idea > > > > > > > > > because NetworkClient assumes a single ID space. So if > > there is > > > > > > both a > > > > > > > > > controller 1 and a broker 1, we don't have a way of picking > > the > > > > > > "right" > > > > > > > > > one." This doesn't seem to be a strong reason. For example, > > we > > > > could > > > > > > > > > address the NetworkClient issue with the node type as you > > pointed > > > > > > out or > > > > > > > > > using the negative value of a broker ID as the controller ID. > > > > > > > > > > > > > > > > > > > > > > > > > It would require a lot of code changes to support multiple > > types of > > > > > > node IDs. 
It's not clear to me that the end result would be > > better -- > > > > I > > > > > > tend to think it would be worse, since it would be more complex. > > In a > > > > > > similar vein, using negative numbers seems dangerous, since we use > > > > > > negatives or -1 as "special values" in many places. For example, > > -1 > > > > often > > > > > > represents "no such node." > > > > > > > > > > > > > > > > One important thing to keep in mind is that we want to be able > > to > > > > > > transition from a broker and a controller being co-located to them > > no > > > > > > longer being co-located. This is much easier to do when they have > > > > separate > > > > > > IDs. > > > > > > > > > > > > > > > > > 100. In KIP-589 > > > > > > > > > < > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-589+Add+API+to+update+Replica+state+in+Controller > > > > > > >, > > > > > > > > > the broker reports all offline replicas due to a disk > > failure to > > > > the > > > > > > > > > controller. It seems this information needs to be persisted > > to > > > > the > > > > > > > > > metadata > > > > > > > > > log. Do we have a corresponding record for that? > > > > > > > > > > > > > > > > > > > > > > > > > Hmm, I have to look into this a little bit more. We may need > > a new > > > > > > record type. > > > > > > > > > > > > > > > > > 101. Currently, StopReplica request has 2 modes, without > > deletion > > > > > > and with > > > > > > > > > deletion. The former is used for controlled shutdown and > > handling > > > > > > disk > > > > > > > > > failure, and causes the follower to stop. The latter is for > > topic > > > > > > deletion > > > > > > > > > and partition reassignment, and causes the replica to be > > deleted. > > > > > > Since we > > > > > > > > > are deprecating StopReplica, could we document what triggers > > the > > > > > > stopping > > > > > > > > > of a follower and the deleting of a replica now? > > > > > > > > > > > > > > > > > > > > > > > > > RemoveTopic triggers deletion. In general the functionality of > > > > > > StopReplica is subsumed by the metadata records. > > > > > > > > > > > > > > > > > 102. Should we include the metadata topic in the > > > > MetadataResponse? > > > > > > If so, > > > > > > > > > when it will be included and what will the metadata response > > look > > > > > > like? > > > > > > > > > > > > > > > > > > > > > > > > > No, it won't be included in the metadata response sent back > > from > > > > the > > > > > > brokers. > > > > > > > > > > > > > > > > > 103. "The active controller assigns the broker a new broker > > > > epoch, > > > > > > based on > > > > > > > > > the latest committed offset in the log." This seems > > inaccurate > > > > since > > > > > > the > > > > > > > > > latest committed offset doesn't always advance on every log > > > > append. > > > > > > > > > > > > > > > > > > > > > > > > > Given that the new broker epoch won't be visible until the > > commit > > > > has > > > > > > happened, I have changed this to "the next available offset in the > > log" > > > > > > > > > > > > > > > > > 104. REGISTERING(1) : It says "Otherwise, the broker moves > > into > > > > the > > > > > > FENCED > > > > > > > > > state.". It seems this should be RUNNING? > > > > > > > > > > > > > > > > > > 105. RUNNING: Should we require the broker to catch up to the > > > > > > metadata log > > > > > > > > > to get into this state? > > > > > > > > > > > > > > > > For 104 and 105, these sections have been reworked. 
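Pulling the broker-epoch points above together (83.1, 103, and the STALE_BROKER_EPOCH behavior mentioned earlier), a minimal controller-side sketch is shown below. The names and shapes are assumptions made only for illustration, not the real controller code.

    import java.util.HashMap;
    import java.util.Map;

    final class BrokerEpochs {
        private final Map<Integer, Long> epochs = new HashMap<>();

        // On registration, the broker epoch is taken from the next available offset in the
        // metadata log, so it only becomes visible once the registration record is committed.
        long register(int brokerId, long nextAvailableMetadataOffset) {
            epochs.put(brokerId, nextAvailableMetadataOffset);
            return nextAvailableMetadataOffset;
        }

        // Heartbeats carry the epoch handed out at registration.  Fencing and unfencing do
        // not change it; only a new registration does.  A mismatched (or unknown) epoch means
        // the heartbeat is rejected with STALE_BROKER_EPOCH.
        boolean isStaleEpoch(int brokerId, long heartbeatEpoch) {
            Long registered = epochs.get(brokerId);
            return registered == null || registered != heartbeatEpoch;
        }
    }
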
> > > > > > > > > > > > > > > > best, > > > > > > > > Colin > > > > > > > > > > > > > > > > > > > > > > > > > > Thanks, > > > > > > > > > > > > > > > > > > Jun > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Fri, Oct 23, 2020 at 1:20 PM Colin McCabe < > > cmcc...@apache.org > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > On Wed, Oct 21, 2020, at 05:51, Tom Bentley wrote: > > > > > > > > > > > Hi Colin, > > > > > > > > > > > > > > > > > > > > > > On Mon, Oct 19, 2020, at 08:59, Ron Dagostino wrote: > > > > > > > > > > > > > Hi Colin. Thanks for the hard work on this KIP. > > > > > > > > > > > > > > > > > > > > > > > > > > I have some questions about what happens to a broker > > > > when it > > > > > > becomes > > > > > > > > > > > > > fenced (e.g. because it can't send a heartbeat > > request to > > > > > > keep its > > > > > > > > > > > > > lease). The KIP says "When a broker is fenced, it > > cannot > > > > > > process any > > > > > > > > > > > > > client requests. This prevents brokers which are not > > > > > > receiving > > > > > > > > > > > > > metadata updates or that are not receiving and > > processing > > > > > > them fast > > > > > > > > > > > > > enough from causing issues to clients." And in the > > > > > > description of the > > > > > > > > > > > > > FENCED(4) state it likewise says "While in this > > state, > > > > the > > > > > > broker > > > > > > > > > > does > > > > > > > > > > > > > not respond to client requests." It makes sense > > that a > > > > > > fenced broker > > > > > > > > > > > > > should not accept producer requests -- I assume any > > such > > > > > > requests > > > > > > > > > > > > > would result in NotLeaderOrFollowerException. But > > what > > > > > > about KIP-392 > > > > > > > > > > > > > (fetch from follower) consumer requests? It is > > > > conceivable > > > > > > that > > > > > > > > > > these > > > > > > > > > > > > > could continue. Related to that, would a fenced > > broker > > > > > > continue to > > > > > > > > > > > > > fetch data for partitions where it thinks it is a > > > > follower? > > > > > > Even if > > > > > > > > > > > > > it rejects consumer requests it might still continue > > to > > > > > > fetch as a > > > > > > > > > > > > > follower. Might it be helpful to clarify both > > decisions > > > > > > here? > > > > > > > > > > > > > > > > > > > > > > > > Hi Ron, > > > > > > > > > > > > > > > > > > > > > > > > Good question. I think a fenced broker should > > continue to > > > > > > fetch on > > > > > > > > > > > > partitions it was already fetching before it was > > fenced, > > > > > > unless it > > > > > > > > > > hits a > > > > > > > > > > > > problem. At that point it won't be able to continue, > > > > since it > > > > > > doesn't > > > > > > > > > > have > > > > > > > > > > > > the new metadata. For example, it won't know about > > > > leadership > > > > > > changes > > > > > > > > > > in > > > > > > > > > > > > the partitions it's fetching. The rationale for > > > > continuing to > > > > > > fetch > > > > > > > > > > is to > > > > > > > > > > > > try to avoid disruptions as much as possible. > > > > > > > > > > > > > > > > > > > > > > > > I don't think fenced brokers should accept client > > requests. > > > > > > The issue > > > > > > > > > > is > > > > > > > > > > > > that the fenced broker may or may not have any data it > > is > > > > > > supposed to > > > > > > > > > > > > have. It may or may not have applied any configuration > > > > > > changes, etc. 
> > > > > > > > > > that > > > > > > > > > > > > it is supposed to have applied. So it could get pretty > > > > > > confusing, and > > > > > > > > > > also > > > > > > > > > > > > potentially waste the client's time. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > When fenced, how would the broker reply to a client > > which did > > > > > > make a > > > > > > > > > > > request? > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Hi Tom, > > > > > > > > > > > > > > > > > > > > The broker will respond with a retryable error in that > > case. > > > > Once > > > > > > the > > > > > > > > > > client has re-fetched its metadata, it will no longer see > > the > > > > > > fenced broker > > > > > > > > > > as part of the cluster. I added a note to the KIP. > > > > > > > > > > > > > > > > > > > > best, > > > > > > > > > > Colin > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Thanks, > > > > > > > > > > > > > > > > > > > > > > Tom > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >