Hi Colin,

The KIP says that "brokers which are fenced will not appear in
MetadataResponses.  So clients that have up-to-date metadata will not try
to contact fenced brokers.", which is fine, but it doesn't seem to
elaborate on what happens to clients that try to contact a broker (using
stale metadata, for example) and find it in the registered-but-fenced
state. You said previously that the broker will respond with a retriable
error, which is again fine, but you didn't answer my question about which
error code you've chosen for this. I realise that this doesn't really
affect the correctness of the behaviour, but I'm not aware of any existing
error code which is a good fit, so it would be good to understand how
you're making this backward compatible for clients.
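
To illustrate the compatibility concern, here is a rough sketch of what an
older Java client does with an error code it doesn't recognise (this is my
reading of Errors.forCode(); the code 999 below is made up, standing in for
whatever new code you might choose):

    import org.apache.kafka.common.errors.RetriableException;
    import org.apache.kafka.common.protocol.Errors;

    public class UnknownErrorCodeSketch {
        public static void main(String[] args) {
            // 999 is a made-up code standing in for a hypothetical new
            // "broker is fenced" error code.
            short hypotheticalFencedCode = (short) 999;

            // Errors.forCode() falls back to UNKNOWN_SERVER_ERROR for codes the
            // client does not know, and that error is not retriable, so an older
            // client would fail the request rather than refresh its metadata and
            // retry against an unfenced broker.
            Errors error = Errors.forCode(hypotheticalFencedCode);
            boolean retriable = error.exception() instanceof RetriableException;
            System.out.println(error + " retriable=" + retriable);
        }
    }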

Many thanks,

Tom

On Tue, Dec 15, 2020 at 1:42 AM Colin McCabe <cmcc...@apache.org> wrote:

> On Fri, Dec 11, 2020, at 17:07, Jun Rao wrote:
> > Hi, Colin,
> >
> > Thanks for the reply. Just a couple of more comments below.
> >
> > 210. Since we are deprecating zookeeper.connection.timeout.ms, should we
> > add a new config to bound the time for a broker to connect to the
> > controller during startup?
> >
>
> Good idea.  I added initial.broker.registration.timeout.ms for this.
>
> > 211. BrokerHeartbeat no longer has the state field in the
> > request/response. However, (a) the controller shutdown section still has
> > "In its periodic heartbeats, the broker asks the controller if it can
> > transition into the SHUTDOWN state.  This motivates the controller to move
> > all of the leaders off of that broker.  Once they are all moved, the
> > controller responds to the heartbeat with a nextState of SHUTDOWN."; and
> > (b) the description of BrokerHeartbeat still references currentState and
> > targetState.
> >
>
> Thanks.  I've made these sections clearer and removed the obsolete
> references to sending states.
>
> best,
> Colin
>
> > Jun
> >
> > On Fri, Dec 11, 2020 at 1:33 PM Colin McCabe <cmcc...@apache.org> wrote:
> >
> > > On Wed, Dec 9, 2020, at 10:10, Jun Rao wrote:
> > > > Hi, Colin,
> > > >
> > > > Thanks for the update. A few more follow up comments.
> > > >
> > >
> > > Hi Jun,
> > >
> > > Thanks again for the review.
> > >
> > > > 100. FailedReplicaRecord: Since this is reported by each broker
> > > > independently, perhaps we could use a more concise representation
> that
> > > has
> > > > a top level broker field, an array of topics, which has an array of
> > > > partitions.
> > > >
> > >
> > > The issue is that there is a size limit on each record.  Putting all
> > > of the partitions of a log directory into a single record would
> > > probably break that in many cases.  Still, we can optimize a bit by
> > > having an array of partition IDs, since nearly all the time, we have
> > > more than one from the same topic.
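
As an aside, here is a hypothetical sketch of the per-topic record shape that
optimization suggests (the class and field names below are made up for
illustration; they are not taken from the KIP):

    // Hypothetical shape of a per-topic "failed replicas" record.  One record
    // per topic keeps each record comfortably under the record size limit
    // while still batching all failed partitions of that topic together.
    public class FailedReplicasRecordSketch {
        private final int brokerId;        // broker reporting the failed log directory
        private final String topicName;    // one record per topic
        private final int[] partitionIds;  // all failed partitions of this topic

        public FailedReplicasRecordSketch(int brokerId, String topicName,
                                          int[] partitionIds) {
            this.brokerId = brokerId;
            this.topicName = topicName;
            this.partitionIds = partitionIds;
        }
    }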
> > >
> > > > 200. Sounds good. If we remove the broker-side fencing logic, do we
> plan
> > > to
> > > > still keep FENCED in broker state? Do we plan to expose the new
> states
> > > > through the existing BrokerState metric and if so, what are the
> values
> > > for
> > > > the new states?
> > > >
> > >
> > > No, we don't need FENCED any more.  I have removed it from the KIP.
> > >
> > > The new states are very similar to the current ones, actually.  There
> are
> > > no new states or removed ones.  The main change in the broker state
> machine
> > > is that the RECOVERING_FROM_UNCLEAN_SHUTDOWN state has been renamed to
> > > RECOVERY.  Also, unlike previously, the broker will always pass through
> > > RECOVERY (although it may only stay in this state for a very short
> amount
> > > of time).
> > >
> > > > 201. This may be fine too. Could we document what happens when the
> > > > broker.id/controller.id in meta.properties don't match the broker
> > > > config when the broker starts up?
> > > >
> > >
> > > I added some documentation about this.
> > >
> > > > 204. There is still "The highest metadata offset which the broker
> has not
> > > > reached" referenced under BrokerRegistration.
> > > >
> > >
> > > It should be CurrentMetadataOffset.  Fixed.
> > >
> > > > 206. Is that separate step needed given KIP-516? With KIP-516
> > > > (https://cwiki.apache.org/confluence/display/KAFKA/KIP-516%3A+Topic+Identifiers#KIP516:TopicIdentifiers-LeaderAndIsr),
> > > > we don't need to wait for the topic data to be removed from all brokers
> > > > before removing the topic metadata. A mismatched topicId, or a topicId
> > > > missing from the metadata, is enough for the broker to clean up deleted
> > > > topics asynchronously.
> > >
> > > It won't be needed once KIP-516 is adopted, but this hasn't been
> > > implemented yet.
> > >
> > > best,
> > > Colin
> > >
> > > >
> > > > Jun
> > > >
> > > >
> > > >
> > > >
> > > > On Tue, Dec 8, 2020 at 5:27 PM Colin McCabe <cmcc...@apache.org>
> wrote:
> > > >
> > > > > On Thu, Dec 3, 2020, at 16:37, Jun Rao wrote:
> > > > > > Hi, Colin,
> > > > > >
> > > > > > Thanks for the updated KIP. A few more comments below.
> > > > > >
> > > > >
> > > > > Hi Jun,
> > > > >
> > > > > Thanks again for the reviews.
> > > > >
> > > > > > 80.2 For deprecated configs, we need to include zookeeper.* and
> > > > > > broker.id.generation.enable.
> > > > > >
> > > > >
> > > > > Added.
> > > > >
> > > > > > 83.1 If a broker is down, does the controller keep the previously
> > > > > > registered broker epoch forever? If not, how long does the
> controller
> > > > > keep
> > > > > > it? What does the controller do when receiving a broker heartbeat
> > > request
> > > > > > with an unfound broker epoch?
> > > > > >
> > > > >
> > > > > Yes, the controller keeps the previous registration forever.
> > > > >
> > > > > Broker heartbeat requests with an incorrect broker epoch will be
> > > > > rejected with STALE_BROKER_EPOCH.
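
If I follow, the controller-side check would look roughly like the sketch
below (the class and method names are mine, purely for illustration, and not
the actual KIP-631 code):

    import java.util.HashMap;
    import java.util.Map;
    import org.apache.kafka.common.protocol.Errors;

    public class HeartbeatCheckSketch {
        // Registrations are kept forever, even while a broker is fenced, so the
        // epoch only changes when the broker re-registers.
        private final Map<Integer, Long> registeredEpochs = new HashMap<>();

        Errors handleHeartbeat(int brokerId, long brokerEpoch) {
            Long registeredEpoch = registeredEpochs.get(brokerId);
            if (registeredEpoch == null || registeredEpoch != brokerEpoch) {
                // The heartbeat carries an epoch from an older registration.
                return Errors.STALE_BROKER_EPOCH;
            }
            // Otherwise record the heartbeat and possibly unfence the broker.
            return Errors.NONE;
        }
    }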
> > > > >
> > > > > > 100. Have you figured out if we need to add a new record type for
> > > > > reporting
> > > > > > partitions on failed disks?
> > > > > >
> > > > >
> > > > > I added FailedReplicaRecord to reflect the case where a JBOD
> directory
> > > has
> > > > > failed, leading to failed replicas.
> > > > >
> > > > > > 102. For debugging purposes, sometimes it's useful to read the
> > > metadata
> > > > > > topic using tools like console-consumer. Should we support that
> and
> > > if
> > > > > so,
> > > > > > how?
> > > > > >
> > > > >
> > > > > For now, we have the ability to read the metadata logs with the
> > > dump-logs
> > > > > tool.  I think we will come up with some other tools in the future
> as
> > > we
> > > > > get experience.
> > > > >
> > > > > > 200. "brokers which are fenced will not appear in
> MetadataResponses.
> > > The
> > > > > > broker will not respond to these requests-- instead, it will
> simply
> > > > > > disconnect." If the controller is partitioned off from the
> brokers,
> > > this
> > > > > > design will cause every broker to stop accepting new client
> > > requests. In
> > > > > > contrast, if ZK is partitioned off, the existing behavior is
> that the
> > > > > > brokers can continue to work based on the last known metadata.
> So, I
> > > am
> > > > > not
> > > > > > sure if we should change the existing behavior because of the
> bigger
> > > > > impact
> > > > > > in the new one. Another option is to keep the existing behavior
> and
> > > > > expose
> > > > > > a metric for fenced brokers so that the operator could be
> alerted.
> > > > > >
> > > > >
> > > > > I'm skeptical about how well running without ZK currently works.
> > > > > However, I will move the broker-side fencing into a follow-up KIP.
> > > > > This KIP is already pretty large and there is no hard dependency on
> > > > > this.  There may also be other ways of accomplishing the positive
> > > > > effects of broker-side fencing, so more discussion is needed.
> > > > >
> > > > > > 201. I read Ron's comment, but I am still not sure the benefit of
> > > keeping
> > > > > > broker.id and controller.id in meta.properties. It seems that
> we are
> > > > > just
> > > > > > duplicating the same info in two places and have the additional
> > > burden of
> > > > > > making sure the values in the two places are consistent.
> > > > > >
> > > > >
> > > > > I think the reasoning is that having broker.id protects us against
> > > > > accidentally bringing up a broker with a disk from a different
> > > broker.  I
> > > > > don't feel strongly about this but it seemed simpler to keep it.
> > > > >
> > > > > > 202. controller.connect.security.protocol: Is this needed since
> > > > > > controller.listener.names and listener.security.protocol.map
> imply
> > > the
> > > > > > security protocol already?
> > > > > >
> > > > >
> > > > > You're right, this isn't needed.  I'll remove it.
> > > > >
> > > > > > 203. registration.heartbeat.interval.ms: It defaults to 2k. ZK
> uses
> > > 1/3
> > > > > of
> > > > > > the session timeout for heartbeat. So, given the default 18k for
> > > > > > registration.lease.timeout.ms, should we default
> > > > > > registration.heartbeat.interval.ms to 6k?
> > > > > >
> > > > >
> > > > > 6 seconds seems like a pretty long time between heartbeats.  It
> might
> > > be
> > > > > useful to know when a broker is missing heartbeats, with less time
> than
> > > > > that.  I provisionally set it to 3 seconds (we can always change
> > > later...)
> > > > >
> > > > > I also changed the name of these configurations to
> > > > > "broker.heartbeat.interval.ms" and "broker.registration.timeout.ms"
> > > > > to try to clarify them a bit.
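
For what it's worth, a quick worked example of how those two settings
interact, using the values discussed above (3 s heartbeat interval, 18 s
timeout; the numbers are illustrative, not a spec):

    public class HeartbeatBudgetSketch {
        public static void main(String[] args) {
            long heartbeatIntervalMs = 3_000;     // broker.heartbeat.interval.ms
            long registrationTimeoutMs = 18_000;  // broker.registration.timeout.ms

            // Roughly how many consecutive heartbeats a broker can miss before
            // its registration times out and the controller fences it.
            long missableHeartbeats = registrationTimeoutMs / heartbeatIntervalMs;
            System.out.println("A broker can miss about " + missableHeartbeats
                    + " heartbeats before being fenced.");
        }
    }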
> > > > >
> > > > > > 204. "The highest metadata offset which the broker has not
> reached."
> > > It
> > > > > > seems this should be "has reached".
> > > > > >
> > > > >
> > > > > I changed this to "one more than the highest metadata offset which
> the
> > > > > broker has reached."
> > > > >
> > > > > > 205. UnfenceBrokerRecord and UnregisterBrokerRecord: To me, they
> > > seem to
> > > > > be
> > > > > > the same. Do we need both?
> > > > > >
> > > > >
> > > > > Unregistration means that the broker has been removed from the
> cluster.
> > > > > That is different than unfencing, which marks the broker as active.
> > > > >
> > > > > > 206. TopicRecord: The Deleting field is used to indicate that the
> > > topic
> > > > > is
> > > > > > being deleted. I am wondering if this is really needed since
> > > RemoveTopic
> > > > > > already indicates the same thing.
> > > > > >
> > > > >
> > > > > RemoveTopic is the last step, which scrubs all metadata about the
> > > > > topic.  In order to get to that last step, the topic data needs to be
> > > > > removed from all brokers (after each broker notices that the topic is
> > > > > being deleted).
> > > > >
> > > > > best,
> > > > > Colin
> > > > >
> > > > > > Jun
> > > > > >
> > > > > > On Wed, Dec 2, 2020 at 2:50 PM Colin McCabe <cmcc...@apache.org>
> > > wrote:
> > > > > >
> > > > > > > On Wed, Dec 2, 2020, at 14:07, Ron Dagostino wrote:
> > > > > > > > Hi Colin.  Thanks for the updates.  It's now clear to me that
> > > brokers
> > > > > > > > keep their broker epoch for the life of their JVM -- they
> > > register
> > > > > > > > once, get their broker epoch in the response, and then never
> > > > > > > > re-register again.  Brokers may get fenced, but they keep the
> > > same
> > > > > > > > broker epoch for the life of their JVM.  The incarnation ID
> is
> > > also
> > > > > > > > kept for the life of the JVM but is generated by the broker
> > > itself
> > > > > > > > upon startup, and the combination of the two allows the
> > > Controller to
> > > > > > > > act idempotently if any previously-sent registration response
> > > gets
> > > > > > > > lost.  Makes sense.
> > > > > > > >
> > > > > > >
> > > > > > > Thanks, Ron.  That's a good summary.
> > > > > > >
> > > > > > > > One thing I wonder about is if it might be helpful for the
> > > broker to
> > > > > > > > send the Cluster ID as determined from its meta.properties
> file
> > > in
> > > > > its
> > > > > > > > registration request.  Does it even make sense for the
> broker to
> > > > > > > > successfully register and enter the Fenced state if it has
> the
> > > wrong
> > > > > > > > Cluster ID?
> > > > > > >
> > > > > > > Yeah, that's a good idea.  Let's have the broker pass its cluster
> > > > > > > ID in the registration RPC, and then registration can fail if the
> > > > > > > broker is configured for the wrong cluster.
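
So, if I understand correctly, the registration-time check would be something
like this sketch (names made up for illustration; which error code is returned
on a mismatch is presumably still to be decided):

    // Hypothetical sketch of the registration-time cluster ID check.
    public class RegistrationClusterIdCheckSketch {
        private final String controllerClusterId;

        public RegistrationClusterIdCheckSketch(String controllerClusterId) {
            this.controllerClusterId = controllerClusterId;
        }

        boolean acceptRegistration(String brokerClusterId) {
            // The broker reads its cluster ID from meta.properties and sends it
            // in the registration request; a mismatch means the broker is pointed
            // at the wrong cluster, so registration is rejected up front.
            return controllerClusterId.equals(brokerClusterId);
        }
    }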
> > > > > > >
> > > > > > > >  The nextMetadataOffset value that the broker communicates
> > > > > > > > in its registration request only has meaning within the
> correct
> > > > > > > > cluster, so it feels to me that the Controller should have
> some
> > > way
> > > > > to
> > > > > > > > perform this sanity check.  There is currently (pre-KIP 500)
> a
> > > check
> > > > > > > > in the broker to make sure its configured cluster ID matches
> the
> > > one
> > > > > > > > stored in ZooKeeper, and we will have to perform this
> validation
> > > > > > > > somewhere in the KIP-500 world.  If the Controller doesn't
> do it
> > > > > > > > within the registration request then the broker will have to
> > > make a
> > > > > > > > metadata request to the Controller, retrieve the Cluster ID,
> and
> > > > > > > > perform the check itself.  It feels to me that it might be
> > > better for
> > > > > > > > the Controller to just do it, and then the broker doesn't
> have to
> > > > > > > > worry about it anymore once it successfully registers.
> > > > > > > >
> > > > > > > > I also have a question about the broker.id value and
> > > > > meta.properties.
> > > > > > > > The KIP now says "In version 0 of meta.properties, there is a
> > > > > > > > broker.id field.  Version 1 does not have this field.  It
> is no
> > > > > longer
> > > > > > > > needed because we no longer support dynamic broker id
> > > assignment."
> > > > > > > > But then there is an example version 1 meta.properties file
> that
> > > > > shows
> > > > > > > > the broker.id value.  I actually wonder if maybe the
> broker.id
> > > value
> > > > > > > > would be good to keep in the version 1 meta.properties file
> > > because
> > > > > it
> > > > > > > > currently (pre-KIP 500, version 0) acts as a sanity check to
> make
> > > > > sure
> > > > > > > > the broker is using the correct log directory.  Similarly
> with
> > > the
> > > > > > > > controller.id value on controllers -- it would allow the
> same
> > > type
> > > > > of
> > > > > > > > sanity check for quorum controllers.
> > > > > > > >
> > > > > > >
> > > > > > > That's a good point.  I will add broker.id back, and also add
> > > > > > > controller.id as a possibility.
> > > > > > >
> > > > > > > cheers,
> > > > > > > Colin
> > > > > > >
> > > > > > > >
> > > > > > > > On Mon, Nov 30, 2020 at 7:41 PM Colin McCabe <cmcc...@apache.org> wrote:
> > > > > > > > >
> > > > > > > > > On Fri, Oct 23, 2020, at 16:10, Jun Rao wrote:
> > > > > > > > > > Hi, Colin,
> > > > > > > > > >
> > > > > > > > > > Thanks for the reply. A few more comments.
> > > > > > > > >
> > > > > > > > > Hi Jun,
> > > > > > > > >
> > > > > > > > > Thanks again for the reply.  Sorry for the long hiatus.  I
> was
> > > on
> > > > > > > vacation for a while.
> > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > 55. There is still text that favors new broker
> registration.
> > > > > "When a
> > > > > > > broker
> > > > > > > > > > first starts up, when it is in the INITIAL state, it will
> > > always
> > > > > > > "win"
> > > > > > > > > > broker ID conflicts.  However, once it is granted a
> lease, it
> > > > > > > transitions
> > > > > > > > > > out of the INITIAL state.  Thereafter, it may lose
> subsequent
> > > > > > > conflicts if
> > > > > > > > > > its broker epoch is stale.  (See KIP-380 for some
> background
> > > on
> > > > > > > broker
> > > > > > > > > > epoch.)  The reason for favoring new processes is to
> > > accommodate
> > > > > the
> > > > > > > common
> > > > > > > > > > case where a process is killed with kill -9 and then
> > > restarted.
> > > > > We
> > > > > > > want it
> > > > > > > > > > to be able to reclaim its old ID quickly in this case."
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > Thanks for the reminder.  I have clarified the language
> here.
> > > > > > > Hopefully now it is clear that we don't allow quick re-use of
> > > broker
> > > > > IDs.
> > > > > > > > >
> > > > > > > > > > 80.1 Sounds good. Could you document that listeners is a
> > > required
> > > > > > > config
> > > > > > > > > > now? It would also be useful to annotate other required
> > > configs.
> > > > > For
> > > > > > > > > > example, controller.connect should be required.
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > I added a note specifying that these are required.
> > > > > > > > >
> > > > > > > > > > 80.2 Could you list all deprecated existing configs?
> Another
> > > one
> > > > > is
> > > > > > > > > > control.plane.listener.name since the controller no
> longer
> > > sends
> > > > > > > > > > LeaderAndIsr, UpdateMetadata and StopReplica requests.
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > I added a section specifying some deprecated configs.
> > > > > > > > >
> > > > > > > > > > 83.1 It seems that the broker can transition from FENCED to
> > > > > > > > > > RUNNING without registering for a new broker epoch. I am not sure
> > > > > > > > > > how this works. Once the controller fences a broker, there is no
> > > > > > > > > > need for the controller to keep the broker epoch around. So the
> > > > > > > > > > fenced broker's heartbeat request with the existing broker epoch
> > > > > > > > > > will be rejected, sending the broker back to the FENCED state
> > > > > > > > > > again.
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > The broker epoch refers to the broker registration.  So we
> DO
> > > keep
> > > > > the
> > > > > > > broker epoch around even while the broker is fenced.
> > > > > > > > >
> > > > > > > > > The broker epoch changes only when there is a new broker
> > > > > > > registration.  Fencing or unfencing the broker doesn't change
> the
> > > > > broker
> > > > > > > epoch.
> > > > > > > > >
> > > > > > > > > > 83.5 Good point on KIP-590. Then should we expose the
> > > controller
> > > > > for
> > > > > > > > > > debugging purposes? If not, we should deprecate the
> > > controllerID
> > > > > > > field in
> > > > > > > > > > MetadataResponse?
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > I think it's OK to expose it for now, with the proviso
> that it
> > > > > won't
> > > > > > > be reachable by clients.
> > > > > > > > >
> > > > > > > > > > 90. We rejected the shared ID with just one reason "This
> is
> > > not a
> > > > > > > good idea
> > > > > > > > > > because NetworkClient assumes a single ID space.  So if
> > > there is
> > > > > > > both a
> > > > > > > > > > controller 1 and a broker 1, we don't have a way of
> picking
> > > the
> > > > > > > "right"
> > > > > > > > > > one." This doesn't seem to be a strong reason. For
> example,
> > > we
> > > > > could
> > > > > > > > > > address the NetworkClient issue with the node type as you
> > > pointed
> > > > > > > out or
> > > > > > > > > > using the negative value of a broker ID as the
> controller ID.
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > It would require a lot of code changes to support multiple
> > > types of
> > > > > > > node IDs.  It's not clear to me that the end result would be
> > > better --
> > > > > I
> > > > > > > tend to think it would be worse, since it would be more
> complex.
> > > In a
> > > > > > > similar vein, using negative numbers seems dangerous, since we
> use
> > > > > > > negatives or -1 as "special values" in many places.  For
> example,
> > > -1
> > > > > often
> > > > > > > represents "no such node."
> > > > > > > > >
> > > > > > > > > One important thing to keep in mind is that we want to be
> able
> > > to
> > > > > > > transition from a broker and a controller being co-located to
> them
> > > no
> > > > > > > longer being co-located.  This is much easier to do when they
> have
> > > > > separate
> > > > > > > IDs.
> > > > > > > > >
> > > > > > > > > > 100. In KIP-589
> > > > > > > > > > <https://cwiki.apache.org/confluence/display/KAFKA/KIP-589+Add+API+to+update+Replica+state+in+Controller>,
> > > > > > > > > > the broker reports all offline replicas due to a disk
> > > failure to
> > > > > the
> > > > > > > > > > controller. It seems this information needs to be
> persisted
> > > to
> > > > > the
> > > > > > > > > > metadata
> > > > > > > > > > log. Do we have a corresponding record for that?
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > Hmm, I have to look into this a little bit more.  We may
> need
> > > a new
> > > > > > > record type.
> > > > > > > > >
> > > > > > > > > > 101. Currently, StopReplica request has 2 modes, without
> > > deletion
> > > > > > > and with
> > > > > > > > > > deletion. The former is used for controlled shutdown and
> > > handling
> > > > > > > disk
> > > > > > > > > > failure, and causes the follower to stop. The latter is
> for
> > > topic
> > > > > > > deletion
> > > > > > > > > > and partition reassignment, and causes the replica to be
> > > deleted.
> > > > > > > Since we
> > > > > > > > > > are deprecating StopReplica, could we document what
> triggers
> > > the
> > > > > > > stopping
> > > > > > > > > > of a follower and the deleting of a replica now?
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > RemoveTopic triggers deletion.  In general the
> functionality of
> > > > > > > StopReplica is subsumed by the metadata records.
> > > > > > > > >
> > > > > > > > > > 102. Should we include the metadata topic in the
> > > > > MetadataResponse?
> > > > > > > If so,
> > > > > > > > > > when it will be included and what will the metadata
> response
> > > look
> > > > > > > like?
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > No, it won't be included in the metadata response sent back
> > > from
> > > > > the
> > > > > > > brokers.
> > > > > > > > >
> > > > > > > > > > 103. "The active controller assigns the broker a new
> broker
> > > > > epoch,
> > > > > > > based on
> > > > > > > > > > the latest committed offset in the log." This seems
> > > inaccurate
> > > > > since
> > > > > > > the
> > > > > > > > > > latest committed offset doesn't always advance on every
> log
> > > > > append.
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > Given that the new broker epoch won't be visible until the
> > > commit
> > > > > has
> > > > > > > happened, I have changed this to "the next available offset in
> the
> > > log"
> > > > > > > > >
> > > > > > > > > > 104. REGISTERING(1) : It says "Otherwise, the broker
> moves
> > > into
> > > > > the
> > > > > > > FENCED
> > > > > > > > > > state.". It seems this should be RUNNING?
> > > > > > > > > >
> > > > > > > > > > 105. RUNNING: Should we require the broker to catch up
> to the
> > > > > > > metadata log
> > > > > > > > > > to get into this state?
> > > > > > > > >
> > > > > > > > > For 104 and 105, these sections have been reworked.
> > > > > > > > >
> > > > > > > > > best,
> > > > > > > > > Colin
> > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > Thanks,
> > > > > > > > > >
> > > > > > > > > > Jun
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > On Fri, Oct 23, 2020 at 1:20 PM Colin McCabe <cmcc...@apache.org> wrote:
> > > > > > > > > >
> > > > > > > > > > > On Wed, Oct 21, 2020, at 05:51, Tom Bentley wrote:
> > > > > > > > > > > > Hi Colin,
> > > > > > > > > > > >
> > > > > > > > > > > > On Mon, Oct 19, 2020, at 08:59, Ron Dagostino wrote:
> > > > > > > > > > > > > > Hi Colin.  Thanks for the hard work on this KIP.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > I have some questions about what happens to a
> broker
> > > > > when it
> > > > > > > becomes
> > > > > > > > > > > > > > fenced (e.g. because it can't send a heartbeat
> > > request to
> > > > > > > keep its
> > > > > > > > > > > > > > lease).  The KIP says "When a broker is fenced,
> it
> > > cannot
> > > > > > > process any
> > > > > > > > > > > > > > client requests.  This prevents brokers which
> are not
> > > > > > > receiving
> > > > > > > > > > > > > > metadata updates or that are not receiving and
> > > processing
> > > > > > > them fast
> > > > > > > > > > > > > > enough from causing issues to clients." And in
> the
> > > > > > > description of the
> > > > > > > > > > > > > > FENCED(4) state it likewise says "While in this
> > > state,
> > > > > the
> > > > > > > broker
> > > > > > > > > > > does
> > > > > > > > > > > > > > not respond to client requests."  It makes sense
> > > that a
> > > > > > > fenced broker
> > > > > > > > > > > > > > should not accept producer requests -- I assume
> any
> > > such
> > > > > > > requests
> > > > > > > > > > > > > > would result in NotLeaderOrFollowerException.
> But
> > > what
> > > > > > > about KIP-392
> > > > > > > > > > > > > > (fetch from follower) consumer requests?  It is
> > > > > conceivable
> > > > > > > that
> > > > > > > > > > > these
> > > > > > > > > > > > > > could continue.  Related to that, would a fenced
> > > broker
> > > > > > > continue to
> > > > > > > > > > > > > > fetch data for partitions where it thinks it is a
> > > > > follower?
> > > > > > > Even if
> > > > > > > > > > > > > > it rejects consumer requests it might still
> continue
> > > to
> > > > > > > fetch as a
> > > > > > > > > > > > > > follower.  Might it be helpful to clarify both
> > > decisions
> > > > > > > here?
> > > > > > > > > > > > >
> > > > > > > > > > > > > Hi Ron,
> > > > > > > > > > > > >
> > > > > > > > > > > > > Good question.  I think a fenced broker should
> > > continue to
> > > > > > > fetch on
> > > > > > > > > > > > > partitions it was already fetching before it was
> > > fenced,
> > > > > > > unless it
> > > > > > > > > > > hits a
> > > > > > > > > > > > > problem.  At that point it won't be able to
> continue,
> > > > > since it
> > > > > > > doesn't
> > > > > > > > > > > have
> > > > > > > > > > > > > the new metadata.  For example, it won't know about
> > > > > leadership
> > > > > > > changes
> > > > > > > > > > > in
> > > > > > > > > > > > > the partitions it's fetching.  The rationale for
> > > > > continuing to
> > > > > > > fetch
> > > > > > > > > > > is to
> > > > > > > > > > > > > try to avoid disruptions as much as possible.
> > > > > > > > > > > > >
> > > > > > > > > > > > > I don't think fenced brokers should accept client
> > > requests.
> > > > > > > The issue
> > > > > > > > > > > is
> > > > > > > > > > > > > that the fenced broker may or may not have any
> data it
> > > is
> > > > > > > supposed to
> > > > > > > > > > > > > have.  It may or may not have applied any
> configuration
> > > > > > > changes, etc.
> > > > > > > > > > > that
> > > > > > > > > > > > > it is supposed to have applied.  So it could get
> pretty
> > > > > > > confusing, and
> > > > > > > > > > > also
> > > > > > > > > > > > > potentially waste the client's time.
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > When fenced, how would the broker reply to a client
> > > which did
> > > > > > > make a
> > > > > > > > > > > > request?
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > Hi Tom,
> > > > > > > > > > >
> > > > > > > > > > > The broker will respond with a retryable error in that
> > > case.
> > > > > Once
> > > > > > > the
> > > > > > > > > > > client has re-fetched its metadata, it will no longer
> see
> > > the
> > > > > > > fenced broker
> > > > > > > > > > > as part of the cluster.  I added a note to the KIP.
> > > > > > > > > > >
> > > > > > > > > > > best,
> > > > > > > > > > > Colin
> > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > Thanks,
> > > > > > > > > > > >
> > > > > > > > > > > > Tom
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>
>
