Re: [DISCUSS] KIP-631: The Quorum-based Kafka Controller

Ron Dagostino Tue, 10 Nov 2020 13:34:59 -0800

Hi Colin.  Ignore my previous question about ConfigRecord.ResourceType
having to be a string -- I now see that
org.apache.kafka.common.config.ConfigResource defines the config
resource types as int8 values.


I do have a question about how the broker will connect to the
controller.  The KIP says that controller.listener.names is "required
if this process is a KIP-500 controller" and "Despite the similar
name, note that this is different from the "control plane listener"
introduced by KIP-291.  The "control plane listener" is used on
brokers, not on controllers."  This leads me to believe that
controller.listener.names is not used on brokers.  But I believe it
will be required, otherwise the only information that the broker will
have is a list of hosts and ports (from controller.connect).  I
believe the broker will require a value in controller.listener.names
and then the broker will take that value (the first in the list?) and
convert that to a security protocol via
listener.security.protocol.map.  Also, if the security protocol is
SASL_{PLAINTEXT,SSL}, what config will define the SASL mechanism that
the broker should use?
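
To make the lookup I have in mind concrete, here is a rough sketch.  It
is not taken from the KIP or from the Kafka code base: the class and
method names are hypothetical, and the "use the first name in the list"
rule is only my guess from above.  It parses a
listener.security.protocol.map value and resolves the first entry of
controller.listener.names to a security protocol name.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper, for illustration only; not part of Kafka.
final class ControllerListenerSketch {

    // Parses "NAME:PROTOCOL,NAME:PROTOCOL" into an insertion-ordered map.
    static Map<String, String> parseProtocolMap(String raw) {
        Map<String, String> result = new LinkedHashMap<>();
        for (String entry : raw.split(",")) {
            String[] parts = entry.trim().split(":", 2);
            if (parts.length != 2) {
                throw new IllegalArgumentException("Bad map entry: " + entry);
            }
            result.put(parts[0].trim(), parts[1].trim());
        }
        return result;
    }

    // Assumption: the broker uses the FIRST name in controller.listener.names.
    static String resolveSecurityProtocol(String controllerListenerNames,
                                          String protocolMap) {
        List<String> names = Arrays.stream(controllerListenerNames.split(","))
                .map(String::trim)
                .filter(s -> !s.isEmpty())
                .collect(Collectors.toList());
        if (names.isEmpty()) {
            throw new IllegalArgumentException("controller.listener.names is empty");
        }
        String first = names.get(0);
        String protocol = parseProtocolMap(protocolMap).get(first);
        if (protocol == null) {
            throw new IllegalArgumentException("No security protocol mapped for " + first);
        }
        return protocol;
    }
}
```

With controller.listener.names=CONTROLLER and
listener.security.protocol.map=CONTROLLER:SASL_SSL,PLAINTEXT:PLAINTEXT,
this resolves to SASL_SSL, which still leaves the SASL mechanism
question open.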

Ron

On Wed, Oct 28, 2020 at 1:29 PM Ron Dagostino <[email protected]> wrote:
>
> Hi again, Colin.  I just noticed that both ConfigRecord and
> AccessControlRecord have a ResourceType of type int8.  I thought that
> config resources are in the set {topics, clients, users, brokers} and
> ACL resource types are a different set as defined by the
> org.apache.kafka.common.resource.ResourceType enum.  Does
> ConfigRecord.ResourceType need to be a String?
>
> Ron
>
> On Sun, Oct 25, 2020 at 6:04 AM Ron Dagostino <[email protected]> wrote:
> >
> > Hi Colin and Jun.
> >
> > Regarding these issues:
> >
> > 83.1 It seems that the broker can transition from FENCED to RUNNING
> > without registering for a new broker epoch. I am not sure how this
> > works. Once the controller fences a broker, there is no need for the
> > controller to keep the broker epoch around. So the fenced broker's
> > heartbeat request with the existing broker epoch will be rejected,
> > leading the broker back to the FENCED state again.
> >
> > 104. REGISTERING(1) : It says "Otherwise, the broker moves into the FENCED
> > state.". It seems this should be RUNNING?
> >
> > When would/could a broker re-register -- i.e. send
> > BrokerRegistrationRequest again once it receives a
> > BrokerRegistrationResponse containing no error and its broker epoch?
> > The text states that "Once the period has elapsed, if the broker has
> > not renewed its registration via a heartbeat, it must re-register."
> > But the broker state machine only mentions any type of
> > registration-related event in the REGISTERING state ("While in this
> > state, the broker tries to register with the active controller");
> > there is no other broker state in the text that mentions the
> > possibility of re-registering, and the broker state machine has no
> > transition back to the REGISTERING state.
> >
> > Also, the text now states that there are "three broker registration
> > states: unregistered, registered but fenced, and registered and
> > active." It would be good to map these onto the formal broker state
> > machine so we know which "registration states" a broker can be in for
> > each state within its broker state machine.  It is not clear if there
> > is a way for a broker to go backwards into the "unregistered" broker
> > registration state.  I suspect it can only flip-flop between
> > registered but fenced/registered and active as the broker flip-flops
> > between ACTIVE and FENCED, and this would imply that a broker is never
> > strictly required to re-register -- though the option isn't precluded.
> >
> > Does a broker JVM keep its assigned broker epoch throughout the life
> > of the JVM?  The BrokerRegistrationRequest includes a place for the
> > broker to specify its current broker epoch, but that would only be
> > useful if the broker is re-registering.  If a broker were to
> > re-register, the data in the request might seem to imply that it could
> > do so to specify dynamic changes to its features or endpoints, but
> > those dynamic changes happen centrally, so that doesn't seem to be a
> > valid reason to re-register.  So I do not yet see a reason for
> > re-registering despite the text "if the broker has not renewed its
> > registration via a heartbeat, it must re-register."
> >
> > It feels to me that a broker would keep its epoch throughout the life
> > of its JVM and it would never re-register, and the controller would
> > remember/maintain the broker epoch when it fences a broker; the broker
> > would continue to try sending heartbeat requests while it is fenced,
> > and it would continue to do so until the process is killed via an
> > external signal.  If the controller eventually does respond with the
> > broker's next state then that next state will either be ACTIVE
> > (meaning communication has been restored; the returned broker epoch will
> > be the same one that the broker JVM has had throughout its lifetime
> > and that it provided in the heartbeat request); or the next state will
> > be PENDING_CONTROLLED_SHUTDOWN if some other JVM process has since
> > started with the same broker ID.
> >
> > I hope that helps the discussion.  Thanks for the great questions,
> > Jun, and your hard work and responses, Colin.
> >
> > Ron
> >
> >
> >
> >
> >
> > On Sat, Oct 24, 2020 at 4:08 AM Tom Bentley <[email protected]> wrote:
> > >
> > > Hi Colin,
> > >
> > > Which error code in particular though? Because so far as I'm aware there's
> > > no existing error code which really captures this situation and creating a
> > > new one would not be backward compatible.
> > >
> > > Cheers,
> > >
> > > Tom
> > >
> > > On Sat, Oct 24, 2020 at 12:20 AM Jun Rao <[email protected]> wrote:
> > >
> > > > Hi, Colin,
> > > >
> > > > Thanks for the reply. A few more comments.
> > > >
> > > > 55. There is still text that favors new broker registration. "When a
> > > > broker first starts up, when it is in the INITIAL state, it will always
> > > > "win" broker ID conflicts.  However, once it is granted a lease, it
> > > > transitions out of the INITIAL state.  Thereafter, it may lose
> > > > subsequent conflicts if its broker epoch is stale.  (See KIP-380 for
> > > > some background on broker epoch.)  The reason for favoring new
> > > > processes is to accommodate the common case where a process is killed
> > > > with kill -9 and then restarted.  We want it to be able to reclaim its
> > > > old ID quickly in this case."
> > > >
> > > > 80.1 Sounds good. Could you document that listeners is a required config
> > > > now? It would also be useful to annotate other required configs. For
> > > > example, controller.connect should be required.
> > > >
> > > > 80.2 Could you list all deprecated existing configs? Another one is
> > > > control.plane.listener.name since the controller no longer sends
> > > > LeaderAndIsr, UpdateMetadata and StopReplica requests.
> > > >
> > > > 83.1 It seems that the broker can transition from FENCED to RUNNING
> > > > without registering for a new broker epoch. I am not sure how this
> > > > works. Once the controller fences a broker, there is no need for the
> > > > controller to keep the broker epoch around. So the fenced broker's
> > > > heartbeat request with the existing broker epoch will be rejected,
> > > > leading the broker back to the FENCED state again.
> > > >
> > > > 83.5 Good point on KIP-590. Then should we expose the controller for
> > > > debugging purposes? If not, should we deprecate the controllerID field
> > > > in MetadataResponse?
> > > >
> > > > 90. We rejected the shared ID with just one reason "This is not a good
> > > > idea because NetworkClient assumes a single ID space.  So if there is
> > > > both a controller 1 and a broker 1, we don't have a way of picking the
> > > > "right" one." This doesn't seem to be a strong reason. For example, we
> > > > could address the NetworkClient issue with the node type as you pointed
> > > > out or using the negative value of a broker ID as the controller ID.
> > > >
> > > > 100. In KIP-589
> > > > <https://cwiki.apache.org/confluence/display/KAFKA/KIP-589+Add+API+to+update+Replica+state+in+Controller>,
> > > > the broker reports all offline replicas due to a disk failure to the
> > > > controller. It seems this information needs to be persisted to the
> > > > metadata log. Do we have a corresponding record for that?
> > > >
> > > > 101. Currently, StopReplica request has 2 modes, without deletion and
> > > > with deletion. The former is used for controlled shutdown and handling
> > > > disk failure, and causes the follower to stop. The latter is for topic
> > > > deletion and partition reassignment, and causes the replica to be
> > > > deleted. Since we are deprecating StopReplica, could we document what
> > > > triggers the stopping of a follower and the deleting of a replica now?
> > > >
> > > > 102. Should we include the metadata topic in the MetadataResponse? If
> > > > so, when will it be included and what will the metadata response look
> > > > like?
> > > >
> > > > 103. "The active controller assigns the broker a new broker epoch,
> > > > based on the latest committed offset in the log." This seems inaccurate
> > > > since the latest committed offset doesn't always advance on every log
> > > > append.
> > > >
> > > > 104. REGISTERING(1) : It says "Otherwise, the broker moves into the
> > > > FENCED state.". It seems this should be RUNNING?
> > > >
> > > > 105. RUNNING: Should we require the broker to catch up to the metadata
> > > > log to get into this state?
> > > >
> > > > Thanks,
> > > >
> > > > Jun
> > > >
> > > >
> > > >
> > > > On Fri, Oct 23, 2020 at 1:20 PM Colin McCabe <[email protected]> wrote:
> > > >
> > > > > On Wed, Oct 21, 2020, at 05:51, Tom Bentley wrote:
> > > > > > Hi Colin,
> > > > > >
> > > > > > On Mon, Oct 19, 2020, at 08:59, Ron Dagostino wrote:
> > > > > > > > > Hi Colin.  Thanks for the hard work on this KIP.
> > > > > > > > >
> > > > > > > > > I have some questions about what happens to a broker when it
> > > > > > > > > becomes fenced (e.g. because it can't send a heartbeat request to
> > > > > > > > > keep its lease).  The KIP says "When a broker is fenced, it cannot
> > > > > > > > > process any client requests.  This prevents brokers which are not
> > > > > > > > > receiving metadata updates or that are not receiving and
> > > > > > > > > processing them fast enough from causing issues to clients." And
> > > > > > > > > in the description of the FENCED(4) state it likewise says "While
> > > > > > > > > in this state, the broker does not respond to client requests."  It
> > > > > > > > > makes sense that a fenced broker should not accept producer
> > > > > > > > > requests -- I assume any such requests would result in
> > > > > > > > > NotLeaderOrFollowerException.  But what about KIP-392 (fetch from
> > > > > > > > > follower) consumer requests?  It is conceivable that these could
> > > > > > > > > continue.  Related to that, would a fenced broker continue to
> > > > > > > > > fetch data for partitions where it thinks it is a follower?  Even
> > > > > > > > > if it rejects consumer requests it might still continue to fetch
> > > > > > > > > as a follower.  Might it be helpful to clarify both decisions
> > > > > > > > > here?
> > > > > > >
> > > > > > > Hi Ron,
> > > > > > >
> > > > > > > > Good question.  I think a fenced broker should continue to fetch on
> > > > > > > > partitions it was already fetching before it was fenced, unless it
> > > > > > > > hits a problem.  At that point it won't be able to continue, since
> > > > > > > > it doesn't have the new metadata.  For example, it won't know about
> > > > > > > > leadership changes in the partitions it's fetching.  The rationale
> > > > > > > > for continuing to fetch is to try to avoid disruptions as much as
> > > > > > > > possible.
> > > > > > > >
> > > > > > > > I don't think fenced brokers should accept client requests.  The
> > > > > > > > issue is that the fenced broker may or may not have any data it is
> > > > > > > > supposed to have.  It may or may not have applied any configuration
> > > > > > > > changes, etc. that it is supposed to have applied.  So it could get
> > > > > > > > pretty confusing, and also potentially waste the client's time.
> > > > > > >
> > > > > > >
> > > > > > When fenced, how would the broker reply to a client which did make a
> > > > > > request?
> > > > > >
> > > > >
> > > > > Hi Tom,
> > > > >
> > > > > The broker will respond with a retryable error in that case.  Once the
> > > > > client has re-fetched its metadata, it will no longer see the fenced
> > > > > broker as part of the cluster.  I added a note to the KIP.
> > > > >
> > > > > best,
> > > > > Colin
> > > > >
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > Tom
> > > > > >
> > > > >
> > > >
