Hi, Colin,

Thanks for the reply. A few follow-up comments.

211. When does the broker send the BrokerRegistration request to the
controller? Is it after the recovery phase? If so, at that point, the
broker has already caught up on the metadata (in order to clean up deleted
partitions). Then, it seems we don't need the ShouldFence field
in BrokerHeartbeatRequest?

213. kafka-cluster.sh
213.1 For the decommission example, should the command take a broker id?
213.2 Existing tools use the "--" command line option (e.g. kafka-topics
--list --topic test). Should we follow the same convention
for kafka-cluster.sh (and kafka-storage.sh)?
213.3 Should we add an admin API for broker decommission so that this can
be done programmatically? (A rough sketch of what I have in mind is below.)
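
To make 213.1 and 213.2 a bit more concrete, the kind of invocation I have
in mind is something like the following (the flag names here are just my
guess, since the KIP doesn't spell them out yet):

  bin/kafka-cluster.sh --bootstrap-server localhost:9092 --decommission --broker-id 5

And for 213.3, a rough sketch of the programmatic equivalent. Note that
decommissionBroker() does not exist in the Admin client today; it is only
meant to illustrate the shape of the API I am asking about:

  import java.util.Properties;
  import org.apache.kafka.clients.admin.Admin;
  import org.apache.kafka.clients.admin.AdminClientConfig;

  Properties props = new Properties();
  props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
  try (Admin admin = Admin.create(props)) {
      // Hypothetical method: ask the active controller to permanently remove
      // broker 5 so it is no longer considered for future replica assignments.
      admin.decommissionBroker(5);
  }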

220. DecommissionBroker: "NOT_CONTROLLER if the node that the request was
sent to is not the controller". I thought that clients now never send requests
directly to the controller, and that the receiving broker forwards them to the
controller instead?

221. Could we add the required ACLs for the new requests?

Jun

On Wed, Dec 16, 2020 at 11:38 AM Colin McCabe <cmcc...@apache.org> wrote:

> On Wed, Dec 16, 2020, at 09:59, Jun Rao wrote:
> > Hi, Colin,
> >
> > Thanks for the reply. A few more comments below.
> >
>
> Hi Jun,
>
> Thanks for the comments.
>
> > 206. "RemoveTopic is the last step, that scrubs all metadata about the
> > topic.  In order to get to that last step, the topic data needs to be
> removed
> > from all brokers (after each broker notices that the topic is being
> > deleted)." Currently, this is done based on the response of
> > StopReplicaRequest. Since the controller no longer sends this request,
> how
> > does the controller know that the data for the deleted topic has
> > been removed in the brokers?
> >
>
> That's a good point... definitely an oversight.
>
> It seems complex to force the controller to track when log directories
> have been deleted.  Let's just assume that KIP-516 has been implemented,
> and track them by UUID.  Then we can just have a single topic deletion
> record.
>
> I added a section on "topic identifiers" describing this.
>
> > 210. Thanks for the explanation. Sounds good to me.
> >
> > 211. I still don't see an example when ShouldFence is set to true in
> > BrokerHeartbeatRequest. Could we add one?
> >
>
> It is set to true when the broker is first starting up and doesn't yet
> want to be unfenced.  I added a longer explanation of this in the "Broker
> Leases" section.
>
> > 213. The KIP now allows replicas to be assigned on brokers that are
> fenced,
> > which is an improvement. How do we permanently remove a broker (e.g.
> > cluster shrinking) to prevent it from being used for future replica
> > assignments?
> >
>
> This is a fair point.  I will create a kafka-cluster.sh script which can
> do this, plus a DecommissionBrokerRequest.
>
> As a bonus the kafka-cluster.sh script can help people find the cluster ID
> of brokers, something that people have wanted a tool for in the past.
>
> > 214. "Currently, when a node is down, all of its ZK registration
> > information is gone.  But  we need this information in order to
> understand
> > things like whether the replicas of a particular partition are
> > well-balanced across racks." This is not quite true. Currently, even when
> > ZK registration is gone, the existing replica assignment is still
> available
> > in the metadata response.
> >
>
> I agree that the existing replica assignment is still available.  But that
> just tells you partition X is on nodes A, B, and C.  If you don't have the
> ZK registration for one or more of A, B, or C then you don't know whether
> we are following the policy of "two replicas on one rack, one replica on
> another."  Or any other more complex rack placement policy that you might
> have.
>
> best,
> Colin
>
> > Jun
> >
> > On Wed, Dec 16, 2020 at 8:48 AM Colin McCabe <cmcc...@apache.org> wrote:
> >
> > > On Tue, Dec 15, 2020, at 13:08, Jun Rao wrote:
> > > > Hi, Colin,
> > > >
> > > > Thanks for the reply. A few more follow up comments.
> > > >
> > > > 210. initial.broker.registration.timeout.ms: The default value is
> 90sec,
> > > > which seems long. If a broker fails the registration because of
> incorrect
> > > > configs, we want to fail faster. In comparison, the default for
> > > > zookeeper.connection.timeout.ms is 18 secs.
> > > >
> > >
> > > Hi Jun,
> > >
> > > I agree that the initial connection timeout here is longer than what we
> > > had with ZK.  The main reason I selected a slightly longer timeout
> here is
> > > to handle the case where the controllers and the brokers are
> co-located.
> > > For example, if you have a 3 node cluster and all three nodes are
> > > controllers+brokers, when you first bring up the cluster, we will have
> to
> > > stand up the controller quorum and then handle broker registrations.
> > > Although we believe Raft will start up relatively quickly, it's good to
> > > leave some extra margin here.
> > >
> > > I don't think there's a big disadvantage to having a slightly longer
> > > timeout here.  After all, starting up the brokers with no controllers
> is an
> > > admin mistake, which we don't expect to see very often.  Maybe let's
> set it
> > > to 60 seconds for now and maybe see if we can tweak it in the future if
> > > that turns out to be too long or short?
> > >
> > > > 211. "Once they are all moved, the controller responds to the
> heartbeat
> > > > with a nextState of SHUTDOWN." It seems that nextState is no longer
> > > > relevant. Also, could you add a bit more explanation on when
> ShouldFence
> > > is
> > > > set to true in BrokerHeartbeatRequest?
> > > >
> > >
> > > Good catch.  I removed the obsolete section referring to nextState and
> > > added a reference to the new boolean.  I also added some more
> information
> > > about ShouldFence and about the rationale for separating fencing from
> > > registration.
> > >
> > > > 212. Related to the above, what does the broker do if IsFenced is
> true in
> > > > the BrokerHeartbeatResponse? Will the broker do the same thing on
> > > receiving
> > > > a FenceBrokerRecord from the metadata log?
> > > >
> > >
> > > The broker only checks this boolean during the startup process.  After
> the
> > > startup process is finished, it ignores this boolean.
> > >
> > > The broker uses fence / unfence records in the metadata log to
> determine
> > > which brokers should appear in its MetadataResponse.
> > >
> > > best,
> > > Colin
> > >
> > > > Jun
> > > >
> > > > On Tue, Dec 15, 2020 at 8:51 AM Colin McCabe <cmcc...@apache.org>
> wrote:
> > > >
> > > > > On Tue, Dec 15, 2020, at 04:13, Tom Bentley wrote:
> > > > > > Hi Colin,
> > > > > >
> > > > > > The KIP says that "brokers which are fenced will not appear in
> > > > > > MetadataResponses.  So clients that have up-to-date metadata will
> > > not try
> > > > > > to contact fenced brokers.", which is fine, but it doesn't seem
> to
> > > > > > elaborate on what happens for clients which try to contact a
> broker
> > > > > (using
> > > > > > stale metadata, for example) and find it in the
> registered-but-fenced
> > > > > > state.
> > > > >
> > > > > Hi Tom,
> > > > >
> > > > > I have dropped the broker-side fencing from this proposal.  So now
> the
> > > > > fencing is basically the same as today's fencing: it's
> controller-side
> > > > > only.  That means that clients using stale metadata can contact
> fenced
> > > > > brokers and communicate with them.  The only case where the broker
> > > will be
> > > > > inaccessible is when it hasn't finished starting yet (i.e., has
> not yet
> > > > > attained RUNNING state.)
> > > > >
> > > > > Just like today, we expect the safeguards built into the
> replication
> > > > > protocol to prevent the worst corner cases that could result from
> > > this.  I
> > > > > do think we will probably take up this issue later and find a
> better
> > > > > solution for client-side metadata, but this KIP is big enough
> as-is.
> > > > >
> > > > > best,
> > > > > Colin
> > > > >
> > > > >
> > > > > > You said previously that the broker will respond with a retriable
> > > > > > error, which is again fine, but you didn't answer my question
> about
> > > which
> > > > > > error code you've chosen for this. I realise that this doesn't
> really
> > > > > > affect the correctness of the behaviour, but I'm not aware of any
> > > > > existing
> > > > > > error code which is a good fit. So it would be good to understand
> > > about
> > > > > how
> > > > > > you're making this backward compatible for clients.
> > > > > >
> > > > > > Many thanks,
> > > > > >
> > > > > > Tom
> > > > > >
> > > > > > On Tue, Dec 15, 2020 at 1:42 AM Colin McCabe <cmcc...@apache.org
> >
> > > wrote:
> > > > > >
> > > > > > > On Fri, Dec 11, 2020, at 17:07, Jun Rao wrote:
> > > > > > > > Hi, Colin,
> > > > > > > >
> > > > > > > > Thanks for the reply. Just a couple of more comments below.
> > > > > > > >
> > > > > > > > 210. Since we are deprecating
> zookeeper.connection.timeout.ms,
> > > > > should we
> > > > > > > > add a new config to bound the time for a broker to connect
> to the
> > > > > > > > controller during starting up?
> > > > > > > >
> > > > > > >
> > > > > > > Good idea.  I added initial.broker.registration.timeout.ms for
> > > this.
> > > > > > >
> > > > > > > > 211. BrokerHeartbeat no longer has the state field in the
> > > > > > > request/response.
> > > > > > > > However, (a) the controller shutdown section still has "In
> its
> > > > > periodic
> > > > > > > > heartbeats, the broker asks the controller if it can
> transition
> > > into
> > > > > the
> > > > > > > > SHUTDOWN state.  This motivates the controller to move all
> of the
> > > > > leaders
> > > > > > > > off of that broker.  Once they are all moved, the controller
> > > > > responds to
> > > > > > > > the heartbeat with a nextState of SHUTDOWN."; (b) the
> > > description of
> > > > > > > > BrokerHeartbeat still references currentState and
> targetState.
> > > > > > > >
> > > > > > >
> > > > > > > Thanks.  I've made these sections clearer and removed the
> obsolete
> > > > > > > references to sending states.
> > > > > > >
> > > > > > > best,
> > > > > > > Colin
> > > > > > >
> > > > > > > > Jun
> > > > > > > >
> > > > > > > > On Fri, Dec 11, 2020 at 1:33 PM Colin McCabe <
> cmcc...@apache.org
> > > >
> > > > > wrote:
> > > > > > > >
> > > > > > > > > On Wed, Dec 9, 2020, at 10:10, Jun Rao wrote:
> > > > > > > > > > Hi, Colin,
> > > > > > > > > >
> > > > > > > > > > Thanks for the update. A few more follow up comments.
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > Hi Jun,
> > > > > > > > >
> > > > > > > > > Thanks again for the review.
> > > > > > > > >
> > > > > > > > > > 100. FailedReplicaRecord: Since this is reported by each
> > > broker
> > > > > > > > > > independently, perhaps we could use a more concise
> > > representation
> > > > > > > that
> > > > > > > > > has
> > > > > > > > > > a top level broker field, an array of topics, which has
> an
> > > array
> > > > > of
> > > > > > > > > > partitions.
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > The issue is that there is a size limit on each record.
> > > > > Putting
> > > > > > > all
> > > > > > > > > of the partitions of a log directory into a single record
> would
> > > > > > > probably
> > > > > > > > > break that in many cases.  Still, we can optimize a bit by
> > > having
> > > > > an
> > > > > > > array
> > > > > > > > > of partition IDs, since nearly all the time, we have more
> than
> > > one
> > > > > > > from the
> > > > > > > > > same topic.
> > > > > > > > >
> > > > > > > > > > 200. Sounds good. If we remove the broker-side fencing
> > > logic, do
> > > > > we
> > > > > > > plan
> > > > > > > > > to
> > > > > > > > > > still keep FENCED in broker state? Do we plan to expose
> the
> > > new
> > > > > > > states
> > > > > > > > > > through the existing BrokerState metric and if so, what
> are
> > > the
> > > > > > > values
> > > > > > > > > for
> > > > > > > > > > the new states?
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > No, we don't need FENCED any more.  I have removed it from
> the
> > > KIP.
> > > > > > > > >
> > > > > > > > > The new states are very similar to the current ones,
> actually.
> > > > > There
> > > > > > > are
> > > > > > > > > no new states or removed ones.  The main change in the
> broker
> > > state
> > > > > > > machine
> > > > > > > > > is that the RECOVERING_FROM_UNCLEAN_SHUTDOWN state has been
> > > > > renamed to
> > > > > > > > > RECOVERY.  Also, unlike previously, the broker will always
> pass
> > > > > through
> > > > > > > > > RECOVERY (although it may only stay in this state for a
> very
> > > short
> > > > > > > amount
> > > > > > > > > of time).
> > > > > > > > >
> > > > > > > > > > 201. This may be fine too. Could we document what happens
> > > when
> > > > > the
> > > > > > > > > > broker.id/controller.id in meta.properties don't
> match
> > > the
> > > > > > > broker
> > > > > > > > > > config when the broker starts up?
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > I added some documentation about this.
> > > > > > > > >
> > > > > > > > > > 204. There is still "The highest metadata offset which
> the
> > > broker
> > > > > > > has not
> > > > > > > > > > reached" referenced under BrokerRegistration.
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > It should be CurrentMetadataOffset.  Fixed.
> > > > > > > > >
> > > > > > > > > > 206. Is that separate step needed given KIP-516? With
> > > KIP-516 (
> > > > > > > > > >
> > > > > > > > >
> > > > > > >
> > > > >
> > >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-516%3A+Topic+Identifiers#KIP516:TopicIdentifiers-LeaderAndIsr
> > > > > > > > > ),
> > > > > > > > > > we don't need to wait for the topic data to be removed
> from
> > > all
> > > > > > > brokers
> > > > > > > > > > before removing the topic metadata. The combination of
> > > unmatching
> > > > > > > > > > topicId
> > > > > > > > > > or the missing topicId from the metadata is enough for
> the
> > > > > broker to
> > > > > > > > > > clean
> > > > > > > > > > up deleted topics asynchronously.
> > > > > > > > >
> > > > > > > > > It won't be needed once KIP-516 is adopted, but this hasn't
> > > been
> > > > > > > > > implemented yet.
> > > > > > > > >
> > > > > > > > > best,
> > > > > > > > > Colin
> > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > Jun
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > On Tue, Dec 8, 2020 at 5:27 PM Colin McCabe <
> > > cmcc...@apache.org>
> > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > On Thu, Dec 3, 2020, at 16:37, Jun Rao wrote:
> > > > > > > > > > > > Hi, Colin,
> > > > > > > > > > > >
> > > > > > > > > > > > Thanks for the updated KIP. A few more comments
> below.
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > Hi Jun,
> > > > > > > > > > >
> > > > > > > > > > > Thanks again for the reviews.
> > > > > > > > > > >
> > > > > > > > > > > > 80.2 For deprecated configs, we need to include
> > > zookeeper.*
> > > > > and
> > > > > > > > > > > > broker.id.generation.enable.
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > Added.
> > > > > > > > > > >
> > > > > > > > > > > > 83.1 If a broker is down, does the controller keep
> the
> > > > > previously
> > > > > > > > > > > > registered broker epoch forever? If not, how long
> does
> > > the
> > > > > > > controller
> > > > > > > > > > > keep
> > > > > > > > > > > > it? What does the controller do when receiving a
> broker
> > > > > heartbeat
> > > > > > > > > request
> > > > > > > > > > > > with an unfound broker epoch?
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > Yes, the controller keeps the previous registration
> > > forever.
> > > > > > > > > > >
> > > > > > > > > > > Broker heartbeat requests with an incorrect broker
> epoch
> > > will
> > > > > be
> > > > > > > > > rejected
> > > > > > > > > > > with STALE_BROKER_EPOCH.
> > > > > > > > > > >
> > > > > > > > > > > > 100. Have you figured out if we need to add a new
> record
> > > > > type for
> > > > > > > > > > > reporting
> > > > > > > > > > > > partitions on failed disks?
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > I added FailedReplicaRecord to reflect the case where a
> > > JBOD
> > > > > > > directory
> > > > > > > > > has
> > > > > > > > > > > failed, leading to failed replicas.
> > > > > > > > > > >
> > > > > > > > > > > > 102. For debugging purposes, sometimes it's useful to
> > > read
> > > > > the
> > > > > > > > > metadata
> > > > > > > > > > > > topic using tools like console-consumer. Should we
> > > support
> > > > > that
> > > > > > > and
> > > > > > > > > if
> > > > > > > > > > > so,
> > > > > > > > > > > > how?
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > For now, we have the ability to read the metadata logs
> > > with the
> > > > > > > > > dump-logs
> > > > > > > > > > > tool.  I think we will come up with some other tools
> in the
> > > > > future
> > > > > > > as
> > > > > > > > > we
> > > > > > > > > > > get experience.
> > > > > > > > > > >
> > > > > > > > > > > > 200. "brokers which are fenced will not appear in
> > > > > > > MetadataResponses.
> > > > > > > > > The
> > > > > > > > > > > > broker will not respond to these requests-- instead,
> it
> > > will
> > > > > > > simply
> > > > > > > > > > > > disconnect." If the controller is partitioned off
> from
> > > the
> > > > > > > brokers,
> > > > > > > > > this
> > > > > > > > > > > > design will cause every broker to stop accepting new
> > > client
> > > > > > > > > requests. In
> > > > > > > > > > > > contrast, if ZK is partitioned off, the existing
> > > behavior is
> > > > > > > that the
> > > > > > > > > > > > brokers can continue to work based on the last known
> > > > > metadata.
> > > > > > > So, I
> > > > > > > > > am
> > > > > > > > > > > not
> > > > > > > > > > > > sure if we should change the existing behavior
> because
> > > of the
> > > > > > > bigger
> > > > > > > > > > > impact
> > > > > > > > > > > > in the new one. Another option is to keep the
> existing
> > > > > behavior
> > > > > > > and
> > > > > > > > > > > expose
> > > > > > > > > > > > a metric for fenced brokers so that the operator
> could be
> > > > > > > alerted.
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > I'm skeptical about how well running without ZK
> currently
> > > > > works.
> > > > > > > > > However,
> > > > > > > > > > > I will move the broker-side fencing into a follow-up
> KIP.
> > > This
> > > > > > > KIP is
> > > > > > > > > > > already pretty large and there is no hard dependency on
> > > this.
> > > > > > > There
> > > > > > > > > may
> > > > > > > > > > > also be other ways of accomplishing the positive
> effects of
> > > > > > > > > > > broker-side fencing, so more discussion is needed.
> > > > > > > > > > >
> > > > > > > > > > > > 201. I read Ron's comment, but I am still not sure
> the
> > > > > benefit of
> > > > > > > > > keeping
> > > > > > > > > > > > broker.id and controller.id in meta.properties. It
> seems
> > > > > that
> > > > > > > we are
> > > > > > > > > > > just
> > > > > > > > > > > > duplicating the same info in two places and have the
> > > > > additional
> > > > > > > > > burden of
> > > > > > > > > > > > making sure the values in the two places are
> consistent.
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > I think the reasoning is that having broker.id
> protects us
> > > > > against
> > > > > > > > > > > accidentally bringing up a broker with a disk from a
> > > different
> > > > > > > > > broker.  I
> > > > > > > > > > > don't feel strongly about this but it seemed simpler to
> > > keep
> > > > > it.
> > > > > > > > > > >
> > > > > > > > > > > > 202. controller.connect.security.protocol: Is this
> needed
> > > > > since
> > > > > > > > > > > > controller.listener.names and
> > > listener.security.protocol.map
> > > > > > > imply
> > > > > > > > > the
> > > > > > > > > > > > security protocol already?
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > You're right, this isn't needed.  I'll remove it.
> > > > > > > > > > >
> > > > > > > > > > > > 203. registration.heartbeat.interval.ms: It
> defaults to
> > > 2k.
> > > > > ZK
> > > > > > > uses
> > > > > > > > > 1/3
> > > > > > > > > > > of
> > > > > > > > > > > > the session timeout for heartbeat. So, given the
> default
> > > 18k
> > > > > for
> > > > > > > > > > > > registration.lease.timeout.ms, should we default
> > > > > > > > > > > > registration.heartbeat.interval.ms to 6k?
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > 6 seconds seems like a pretty long time between
> > > heartbeats.  It
> > > > > > > might
> > > > > > > > > be
> > > > > > > > > > > useful to know when a broker is missing heartbeats,
> with
> > > less
> > > > > time
> > > > > > > than
> > > > > > > > > > > that.  I provisionally set it to 3 seconds (we can
> always
> > > > > change
> > > > > > > > > later...)
> > > > > > > > > > >
> > > > > > > > > > > I also changed the name of these configurations to "
> > > > > > > > > > > broker.heartbeat.interval.ms" and "
> > > > > broker.registration.timeout.ms"
> > > > > > > to
> > > > > > > > > try
> > > > > > > > > > > to clarify them a bit.
> > > > > > > > > > >
> > > > > > > > > > > > 204. "The highest metadata offset which the broker
> has
> > > not
> > > > > > > reached."
> > > > > > > > > It
> > > > > > > > > > > > seems this should be "has reached".
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > I changed this to "one more than the highest metadata
> > > offset
> > > > > which
> > > > > > > the
> > > > > > > > > > > broker has reached."
> > > > > > > > > > >
> > > > > > > > > > > > 205. UnfenceBrokerRecord and UnregisterBrokerRecord:
> To
> > > me,
> > > > > they
> > > > > > > > > seem to
> > > > > > > > > > > be
> > > > > > > > > > > > the same. Do we need both?
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > Unregistration means that the broker has been removed
> from
> > > the
> > > > > > > cluster.
> > > > > > > > > > > That is different than unfencing, which marks the
> broker as
> > > > > active.
> > > > > > > > > > >
> > > > > > > > > > > > 206. TopicRecord: The Deleting field is used to
> indicate
> > > > > that the
> > > > > > > > > topic
> > > > > > > > > > > is
> > > > > > > > > > > > being deleted. I am wondering if this is really
> needed
> > > since
> > > > > > > > > RemoveTopic
> > > > > > > > > > > > already indicates the same thing.
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > RemoveTopic is the last step, that scrubs all metadata
> > > about
> > > > > the
> > > > > > > topic.
> > > > > > > > > > > In order to get to that last step, the topic data
> needs to
> > > > > removed
> > > > > > > > > from all
> > > > > > > > > > > brokers (after each broker notices that the topic is
> being
> > > > > > > deleted).
> > > > > > > > > > >
> > > > > > > > > > > best,
> > > > > > > > > > > Colin
> > > > > > > > > > >
> > > > > > > > > > > > Jun
> > > > > > > > > > > >
> > > > > > > > > > > > On Wed, Dec 2, 2020 at 2:50 PM Colin McCabe <
> > > > > cmcc...@apache.org>
> > > > > > > > > wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > On Wed, Dec 2, 2020, at 14:07, Ron Dagostino wrote:
> > > > > > > > > > > > > > Hi Colin.  Thanks for the updates.  It's now
> clear
> > > to me
> > > > > that
> > > > > > > > > brokers
> > > > > > > > > > > > > > keep their broker epoch for the life of their
> JVM --
> > > they
> > > > > > > > > register
> > > > > > > > > > > > > > once, get their broker epoch in the response, and
> > > then
> > > > > never
> > > > > > > > > > > > > > re-register again.  Brokers may get fenced, but
> they
> > > > > keep the
> > > > > > > > > same
> > > > > > > > > > > > > > broker epoch for the life of their JVM.  The
> > > incarnation
> > > > > ID
> > > > > > > is
> > > > > > > > > also
> > > > > > > > > > > > > > kept for the life of the JVM but is generated by
> the
> > > > > broker
> > > > > > > > > itself
> > > > > > > > > > > > > > upon startup, and the combination of the two
> allows
> > > the
> > > > > > > > > Controller to
> > > > > > > > > > > > > > act idempotently if any previously-sent
> registration
> > > > > response
> > > > > > > > > gets
> > > > > > > > > > > > > > lost.  Makes sense.
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > Thanks, Ron.  That's a good summary.
> > > > > > > > > > > > >
> > > > > > > > > > > > > > One thing I wonder about is if it might be
> helpful
> > > for
> > > > > the
> > > > > > > > > broker to
> > > > > > > > > > > > > > send the Cluster ID as determined from its
> > > > > meta.properties
> > > > > > > file
> > > > > > > > > in
> > > > > > > > > > > its
> > > > > > > > > > > > > > registration request.  Does it even make sense
> for
> > > the
> > > > > > > broker to
> > > > > > > > > > > > > > successfully register and enter the Fenced state
> if
> > > it
> > > > > has
> > > > > > > the
> > > > > > > > > wrong
> > > > > > > > > > > > > > Cluster ID?
> > > > > > > > > > > > >
> > > > > > > > > > > > > Yeah, that's a good idea.  Let's have the broker
> pass
> > > its
> > > > > > > cluster
> > > > > > > > > ID in
> > > > > > > > > > > > > the registration RPC, and then registration can
> fail
> > > if the
> > > > > > > broker
> > > > > > > > > is
> > > > > > > > > > > > > configured for the wrong cluster.
> > > > > > > > > > > > >
> > > > > > > > > > > > > >  The nextMetadatOffset value that the broker
> > > communicates
> > > > > > > > > > > > > > in its registration request only has meaning
> within
> > > the
> > > > > > > correct
> > > > > > > > > > > > > > cluster, so it feels to me that the Controller
> should
> > > > > have
> > > > > > > some
> > > > > > > > > way
> > > > > > > > > > > to
> > > > > > > > > > > > > > perform this sanity check.  There is currently
> > > (pre-KIP
> > > > > 500)
> > > > > > > a
> > > > > > > > > check
> > > > > > > > > > > > > > in the broker to make sure its configured
> cluster ID
> > > > > matches
> > > > > > > the
> > > > > > > > > one
> > > > > > > > > > > > > > stored in ZooKeeper, and we will have to perform
> this
> > > > > > > validation
> > > > > > > > > > > > > > somewhere in the KIP-500 world.  If the
> Controller
> > > > > doesn't
> > > > > > > do it
> > > > > > > > > > > > > > within the registration request then the broker
> will
> > > > > have to
> > > > > > > > > make a
> > > > > > > > > > > > > > metadata request to the Controller, retrieve the
> > > Cluster
> > > > > ID,
> > > > > > > and
> > > > > > > > > > > > > > perform the check itself.  It feels to me that it
> > > might
> > > > > be
> > > > > > > > > better for
> > > > > > > > > > > > > > the Controller to just do it, and then the broker
> > > doesn't
> > > > > > > have to
> > > > > > > > > > > > > > worry about it anymore once it successfully
> > > registers.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > I also have a question about the broker.id
> value and
> > > > > > > > > > > meta.properties.
> > > > > > > > > > > > > > The KIP now says "In version 0 of
> meta.properties,
> > > there
> > > > > is a
> > > > > > > > > > > > > > broker.id field.  Version 1 does not have this
> > > field.
> > > > > It
> > > > > > > is no
> > > > > > > > > > > longer
> > > > > > > > > > > > > > needed because we no longer support dynamic
> broker id
> > > > > > > > > assignment."
> > > > > > > > > > > > > > But then there is an example version 1
> > > meta.properties
> > > > > file
> > > > > > > that
> > > > > > > > > > > shows
> > > > > > > > > > > > > > the broker.id value.  I actually wonder if
> maybe the
> > > > > > > broker.id
> > > > > > > > > value
> > > > > > > > > > > > > > would be good to keep in the version 1
> > > meta.properties
> > > > > file
> > > > > > > > > because
> > > > > > > > > > > it
> > > > > > > > > > > > > > currently (pre-KIP 500, version 0) acts as a
> sanity
> > > > > check to
> > > > > > > make
> > > > > > > > > > > sure
> > > > > > > > > > > > > > the broker is using the correct log directory.
> > > Similarly
> > > > > > > with
> > > > > > > > > the
> > > > > > > > > > > > > > controller.id value on controllers -- it would
> > > allow the
> > > > > > > same
> > > > > > > > > type
> > > > > > > > > > > of
> > > > > > > > > > > > > > sanity check for quorum controllers.
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > That's a good point.  I will add broker.id back,
> and
> > > also
> > > > > add
> > > > > > > > > > > > > controller.id as a possibility.
> > > > > > > > > > > > >
> > > > > > > > > > > > > cheers,
> > > > > > > > > > > > > Colin
> > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On Mon, Nov 30, 2020 at 7:41 PM Colin McCabe <
> > > > > > > cmcc...@apache.org
> > > > > > > > > >
> > > > > > > > > > > wrote:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > On Fri, Oct 23, 2020, at 16:10, Jun Rao wrote:
> > > > > > > > > > > > > > > > Hi, Colin,
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Thanks for the reply. A few more comments.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Hi Jun,
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Thanks again for the reply.  Sorry for the long
> > > > > hiatus.  I
> > > > > > > was
> > > > > > > > > on
> > > > > > > > > > > > > vacation for a while.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > 55. There is still text that favors new
> broker
> > > > > > > registration.
> > > > > > > > > > > "When a
> > > > > > > > > > > > > broker
> > > > > > > > > > > > > > > > first starts up, when it is in the INITIAL
> > > state, it
> > > > > will
> > > > > > > > > always
> > > > > > > > > > > > > "win"
> > > > > > > > > > > > > > > > broker ID conflicts.  However, once it is
> > > granted a
> > > > > > > lease, it
> > > > > > > > > > > > > transitions
> > > > > > > > > > > > > > > > out of the INITIAL state.  Thereafter, it may
> > > lose
> > > > > > > subsequent
> > > > > > > > > > > > > conflicts if
> > > > > > > > > > > > > > > > its broker epoch is stale.  (See KIP-380 for
> some
> > > > > > > background
> > > > > > > > > on
> > > > > > > > > > > > > broker
> > > > > > > > > > > > > > > > epoch.)  The reason for favoring new
> processes
> > > is to
> > > > > > > > > accommodate
> > > > > > > > > > > the
> > > > > > > > > > > > > common
> > > > > > > > > > > > > > > > case where a process is killed with kill -9
> and
> > > then
> > > > > > > > > restarted.
> > > > > > > > > > > We
> > > > > > > > > > > > > want it
> > > > > > > > > > > > > > > > to be able to reclaim its old ID quickly in
> this
> > > > > case."
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Thanks for the reminder.  I have clarified the
> > > language
> > > > > > > here.
> > > > > > > > > > > > > Hopefully now it is clear that we don't allow quick
> > > re-use
> > > > > of
> > > > > > > > > broker
> > > > > > > > > > > IDs.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > 80.1 Sounds good. Could you document that
> > > listeners
> > > > > is a
> > > > > > > > > required
> > > > > > > > > > > > > config
> > > > > > > > > > > > > > > > now? It would also be useful to annotate
> other
> > > > > required
> > > > > > > > > configs.
> > > > > > > > > > > For
> > > > > > > > > > > > > > > > example, controller.connect should be
> required.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > I added a note specifying that these are
> required.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > 80.2 Could you list all deprecated existing
> > > configs?
> > > > > > > Another
> > > > > > > > > one
> > > > > > > > > > > is
> > > > > > > > > > > > > > > > control.plane.listener.name since the
> > > controller no
> > > > > > > longer
> > > > > > > > > sends
> > > > > > > > > > > > > > > > LeaderAndIsr, UpdateMetadata and StopReplica
> > > > > requests.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > I added a section specifying some deprecated
> > > configs.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > 83.1 It seems that the broker can transition
> from
> > > > > FENCED
> > > > > > > to
> > > > > > > > > > > RUNNING
> > > > > > > > > > > > > without
> > > > > > > > > > > > > > > > registering for a new broker epoch. I am not
> > > sure how
> > > > > > > this
> > > > > > > > > works.
> > > > > > > > > > > > > Once the
> > > > > > > > > > > > > > > > controller fences a broker, there is no need
> for
> > > the
> > > > > > > > > controller
> > > > > > > > > > > to
> > > > > > > > > > > > > keep the
> > > > > > > > > > > > > > > > broker epoch around. So, the fenced
> broker's
> > > > > heartbeat
> > > > > > > > > request
> > > > > > > > > > > > > with the
> > > > > > > > > > > > > > > > existing broker epoch will be rejected,
> leading
> > > the
> > > > > > > broker
> > > > > > > > > back
> > > > > > > > > > > to
> > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > FENCED state again.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > The broker epoch refers to the broker
> registration.
> > > > > So we
> > > > > > > DO
> > > > > > > > > keep
> > > > > > > > > > > the
> > > > > > > > > > > > > broker epoch around even while the broker is
> fenced.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > The broker epoch changes only when there is a
> new
> > > > > broker
> > > > > > > > > > > > > registration.  Fencing or unfencing the broker
> doesn't
> > > > > change
> > > > > > > the
> > > > > > > > > > > broker
> > > > > > > > > > > > > epoch.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > 83.5 Good point on KIP-590. Then should we
> > > expose the
> > > > > > > > > controller
> > > > > > > > > > > for
> > > > > > > > > > > > > > > > debugging purposes? If not, we should
> deprecate
> > > the
> > > > > > > > > controllerID
> > > > > > > > > > > > > field in
> > > > > > > > > > > > > > > > MetadataResponse?
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > I think it's OK to expose it for now, with the
> > > proviso
> > > > > > > that it
> > > > > > > > > > > won't
> > > > > > > > > > > > > be reachable by clients.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > 90. We rejected the shared ID with just one
> > > reason
> > > > > "This
> > > > > > > is
> > > > > > > > > not a
> > > > > > > > > > > > > good idea
> > > > > > > > > > > > > > > > because NetworkClient assumes a single ID
> > > space.  So
> > > > > if
> > > > > > > > > there is
> > > > > > > > > > > > > both a
> > > > > > > > > > > > > > > > controller 1 and a broker 1, we don't have a
> way
> > > of
> > > > > > > picking
> > > > > > > > > the
> > > > > > > > > > > > > "right"
> > > > > > > > > > > > > > > > one." This doesn't seem to be a strong
> reason.
> > > For
> > > > > > > example,
> > > > > > > > > we
> > > > > > > > > > > could
> > > > > > > > > > > > > > > > address the NetworkClient issue with the node
> > > type
> > > > > as you
> > > > > > > > > pointed
> > > > > > > > > > > > > out or
> > > > > > > > > > > > > > > > using the negative value of a broker ID as
> the
> > > > > > > controller ID.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > It would require a lot of code changes to
> support
> > > > > multiple
> > > > > > > > > types of
> > > > > > > > > > > > > node IDs.  It's not clear to me that the end result
> > > would
> > > > > be
> > > > > > > > > better --
> > > > > > > > > > > I
> > > > > > > > > > > > > tend to think it would be worse, since it would be
> more
> > > > > > > complex.
> > > > > > > > > In a
> > > > > > > > > > > > > similar vein, using negative numbers seems
> dangerous,
> > > > > since we
> > > > > > > use
> > > > > > > > > > > > > negatives or -1 as "special values" in many places.
> > > For
> > > > > > > example,
> > > > > > > > > -1
> > > > > > > > > > > often
> > > > > > > > > > > > > represents "no such node."
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > One important thing to keep in mind is that we
> > > want to
> > > > > be
> > > > > > > able
> > > > > > > > > to
> > > > > > > > > > > > > transition from a broker and a controller being
> > > co-located
> > > > > to
> > > > > > > them
> > > > > > > > > no
> > > > > > > > > > > > > longer being co-located.  This is much easier to do
> > > when
> > > > > they
> > > > > > > have
> > > > > > > > > > > separate
> > > > > > > > > > > > > IDs.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > 100. In KIP-589
> > > > > > > > > > > > > > > > <
> > > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > >
> > > > > > >
> > > > >
> > >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-589+Add+API+to+update+Replica+state+in+Controller
> > > > > > > > > > > > > >,
> > > > > > > > > > > > > > > > the broker reports all offline replicas due
> to a
> > > disk
> > > > > > > > > failure to
> > > > > > > > > > > the
> > > > > > > > > > > > > > > > controller. It seems this information needs
> to be
> > > > > > > persisted
> > > > > > > > > to
> > > > > > > > > > > the
> > > > > > > > > > > > > > > > metadata
> > > > > > > > > > > > > > > > log. Do we have a corresponding record for
> that?
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Hmm, I have to look into this a little bit
> more.
> > > We
> > > > > may
> > > > > > > need
> > > > > > > > > a new
> > > > > > > > > > > > > record type.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > 101. Currently, StopReplica request has 2
> modes,
> > > > > without
> > > > > > > > > deletion
> > > > > > > > > > > > > and with
> > > > > > > > > > > > > > > > deletion. The former is used for controlled
> > > shutdown
> > > > > and
> > > > > > > > > handling
> > > > > > > > > > > > > disk
> > > > > > > > > > > > > > > > failure, and causes the follower to stop. The
> > > latter
> > > > > is
> > > > > > > for
> > > > > > > > > topic
> > > > > > > > > > > > > deletion
> > > > > > > > > > > > > > > > and partition reassignment, and causes the
> > > replica
> > > > > to be
> > > > > > > > > deleted.
> > > > > > > > > > > > > Since we
> > > > > > > > > > > > > > > > are deprecating StopReplica, could we
> document
> > > what
> > > > > > > triggers
> > > > > > > > > the
> > > > > > > > > > > > > stopping
> > > > > > > > > > > > > > > > of a follower and the deleting of a replica
> now?
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > RemoveTopic triggers deletion.  In general the
> > > > > > > functionality of
> > > > > > > > > > > > > StopReplica is subsumed by the metadata records.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > 102. Should we include the metadata topic in
> the
> > > > > > > > > > > MetadataResponse?
> > > > > > > > > > > > > If so,
> > > > > > > > > > > > > > > > when it will be included and what will the
> > > metadata
> > > > > > > response
> > > > > > > > > look
> > > > > > > > > > > > > like?
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > No, it won't be included in the metadata
> response
> > > sent
> > > > > back
> > > > > > > > > from
> > > > > > > > > > > the
> > > > > > > > > > > > > brokers.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > 103. "The active controller assigns the
> broker a
> > > new
> > > > > > > broker
> > > > > > > > > > > epoch,
> > > > > > > > > > > > > based on
> > > > > > > > > > > > > > > > the latest committed offset in the log." This
> > > seems
> > > > > > > > > inaccurate
> > > > > > > > > > > since
> > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > latest committed offset doesn't always
> advance on
> > > > > every
> > > > > > > log
> > > > > > > > > > > append.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Given that the new broker epoch won't be
> visible
> > > until
> > > > > the
> > > > > > > > > commit
> > > > > > > > > > > has
> > > > > > > > > > > > > happened, I have changed this to "the next
> available
> > > > > offset in
> > > > > > > the
> > > > > > > > > log"
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > 104. REGISTERING(1) : It says "Otherwise, the
> > > broker
> > > > > > > moves
> > > > > > > > > into
> > > > > > > > > > > the
> > > > > > > > > > > > > FENCED
> > > > > > > > > > > > > > > > state.". It seems this should be RUNNING?
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > 105. RUNNING: Should we require the broker to
> > > catch
> > > > > up
> > > > > > > to the
> > > > > > > > > > > > > metadata log
> > > > > > > > > > > > > > > > to get into this state?
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > For 104 and 105, these sections have been
> reworked.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > best,
> > > > > > > > > > > > > > > Colin
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Jun
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > On Fri, Oct 23, 2020 at 1:20 PM Colin McCabe
> <
> > > > > > > > > cmcc...@apache.org
> > > > > > > > > > > >
> > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > On Wed, Oct 21, 2020, at 05:51, Tom Bentley
> > > wrote:
> > > > > > > > > > > > > > > > > > Hi Colin,
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > On Mon, Oct 19, 2020, at 08:59, Ron
> Dagostino
> > > > > wrote:
> > > > > > > > > > > > > > > > > > > > Hi Colin.  Thanks for the hard work
> on
> > > this
> > > > > KIP.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > I have some questions about what
> happens
> > > to a
> > > > > > > broker
> > > > > > > > > > > when it
> > > > > > > > > > > > > becomes
> > > > > > > > > > > > > > > > > > > > fenced (e.g. because it can't send a
> > > > > heartbeat
> > > > > > > > > request to
> > > > > > > > > > > > > keep its
> > > > > > > > > > > > > > > > > > > > lease).  The KIP says "When a broker
> is
> > > > > fenced,
> > > > > > > it
> > > > > > > > > cannot
> > > > > > > > > > > > > process any
> > > > > > > > > > > > > > > > > > > > client requests.  This prevents
> brokers
> > > which
> > > > > > > are not
> > > > > > > > > > > > > receiving
> > > > > > > > > > > > > > > > > > > > metadata updates or that are not
> > > receiving
> > > > > and
> > > > > > > > > processing
> > > > > > > > > > > > > them fast
> > > > > > > > > > > > > > > > > > > > enough from causing issues to
> clients."
> > > And
> > > > > in
> > > > > > > the
> > > > > > > > > > > > > description of the
> > > > > > > > > > > > > > > > > > > > FENCED(4) state it likewise says
> "While
> > > in
> > > > > this
> > > > > > > > > state,
> > > > > > > > > > > the
> > > > > > > > > > > > > broker
> > > > > > > > > > > > > > > > > does
> > > > > > > > > > > > > > > > > > > > not respond to client requests."  It
> > > makes
> > > > > sense
> > > > > > > > > that a
> > > > > > > > > > > > > fenced broker
> > > > > > > > > > > > > > > > > > > > should not accept producer requests
> -- I
> > > > > assume
> > > > > > > any
> > > > > > > > > such
> > > > > > > > > > > > > requests
> > > > > > > > > > > > > > > > > > > > would result in
> > > NotLeaderOrFollowerException.
> > > > > > > But
> > > > > > > > > what
> > > > > > > > > > > > > about KIP-392
> > > > > > > > > > > > > > > > > > > > (fetch from follower) consumer
> > > requests?  It
> > > > > is
> > > > > > > > > > > conceivable
> > > > > > > > > > > > > that
> > > > > > > > > > > > > > > > > these
> > > > > > > > > > > > > > > > > > > > could continue.  Related to that,
> would a
> > > > > fenced
> > > > > > > > > broker
> > > > > > > > > > > > > continue to
> > > > > > > > > > > > > > > > > > > > fetch data for partitions where it
> > > thinks it
> > > > > is a
> > > > > > > > > > > follower?
> > > > > > > > > > > > > Even if
> > > > > > > > > > > > > > > > > > > > it rejects consumer requests it might
> > > still
> > > > > > > continue
> > > > > > > > > to
> > > > > > > > > > > > > fetch as a
> > > > > > > > > > > > > > > > > > > > follower.  Might it be helpful to
> clarify
> > > > > both
> > > > > > > > > decisions
> > > > > > > > > > > > > here?
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Hi Ron,
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Good question.  I think a fenced broker
> > > should
> > > > > > > > > continue to
> > > > > > > > > > > > > fetch on
> > > > > > > > > > > > > > > > > > > partitions it was already fetching
> before
> > > it
> > > > > was
> > > > > > > > > fenced,
> > > > > > > > > > > > > unless it
> > > > > > > > > > > > > > > > > hits a
> > > > > > > > > > > > > > > > > > > problem.  At that point it won't be
> able to
> > > > > > > continue,
> > > > > > > > > > > since it
> > > > > > > > > > > > > doesn't
> > > > > > > > > > > > > > > > > have
> > > > > > > > > > > > > > > > > > > the new metadata.  For example, it
> won't
> > > know
> > > > > about
> > > > > > > > > > > leadership
> > > > > > > > > > > > > changes
> > > > > > > > > > > > > > > > > in
> > > > > > > > > > > > > > > > > > > the partitions it's fetching.  The
> > > rationale
> > > > > for
> > > > > > > > > > > continuing to
> > > > > > > > > > > > > fetch
> > > > > > > > > > > > > > > > > is to
> > > > > > > > > > > > > > > > > > > try to avoid disruptions as much as
> > > possible.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > I don't think fenced brokers should
> accept
> > > > > client
> > > > > > > > > requests.
> > > > > > > > > > > > > The issue
> > > > > > > > > > > > > > > > > is
> > > > > > > > > > > > > > > > > > > that the fenced broker may or may not
> have
> > > any
> > > > > > > data it
> > > > > > > > > is
> > > > > > > > > > > > > supposed to
> > > > > > > > > > > > > > > > > > > have.  It may or may not have applied
> any
> > > > > > > configuration
> > > > > > > > > > > > > changes, etc.
> > > > > > > > > > > > > > > > > that
> > > > > > > > > > > > > > > > > > > it is supposed to have applied.  So it
> > > could
> > > > > get
> > > > > > > pretty
> > > > > > > > > > > > > confusing, and
> > > > > > > > > > > > > > > > > also
> > > > > > > > > > > > > > > > > > > potentially waste the client's time.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > When fenced, how would the broker reply
> to a
> > > > > client
> > > > > > > > > which did
> > > > > > > > > > > > > make a
> > > > > > > > > > > > > > > > > > request?
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Hi Tom,
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > The broker will respond with a retryable
> error
> > > in
> > > > > that
> > > > > > > > > case.
> > > > > > > > > > > Once
> > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > client has re-fetched its metadata, it
> will no
> > > > > longer
> > > > > > > see
> > > > > > > > > the
> > > > > > > > > > > > > fenced broker
> > > > > > > > > > > > > > > > > as part of the cluster.  I added a note to
> the
> > > KIP.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > best,
> > > > > > > > > > > > > > > > > Colin
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Tom
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>
