[ https://issues.apache.org/jira/browse/KAFKA-19130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Colin McCabe updated KAFKA-19130:
---------------------------------
    Description:
When the controller starts up (or becomes active after being inactive), we add all of the registered brokers to BrokerRegistrationTracker so that they will not be accidentally fenced the next time we are looking for a broker to fence. We do this because the state in BrokerRegistrationTracker is "soft state" (it doesn't appear in the metadata log), and the newly active controller starts off with no soft state. (Its soft state will be populated by the brokers sending heartbeat requests to it over time.)

In the case of fenced brokers, we are not worried about accidentally fencing the broker due to it being missing from BrokerRegistrationTracker for a while (it's already fenced). Therefore, it should be reasonable to just not add fenced brokers to the tracker initially.

One case where this change will have a positive impact is for people running single-node demonstration clusters in combined KRaft mode. In that case, when the single-node cluster is taken down and restarted, it currently will have to wait about 9 seconds for the broker to come up and re-register. With this change, the broker should be able to re-register immediately.

One possible negative impact is that if there is a controller failover, it will open a small window where a broker with the same ID as a fenced broker could re-register. However, our detection of duplicate broker IDs is best-effort (and duplicate broker IDs are an administrative mistake), so this downside seems acceptable.
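The proposed behavior can be illustrated with a minimal, self-contained sketch. This is not Kafka's actual controller code: the `Registration` and `Tracker` classes, the `seedOnActivation` method, and the timestamps below are hypothetical stand-ins invented for illustration. The sketch only assumes that the tracker is an in-memory map from broker ID to last contact time, and that a newly active controller seeds it from the registrations it already knows about.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

public class FencedBrokerSeedingSketch {
    /** Hypothetical stand-in for a broker registration known from the metadata log. */
    static final class Registration {
        final int brokerId;
        final boolean fenced;

        Registration(int brokerId, boolean fenced) {
            this.brokerId = brokerId;
            this.fenced = fenced;
        }
    }

    /** Hypothetical soft-state tracker: broker ID -> last contact time in nanoseconds. */
    static final class Tracker {
        private final Map<Integer, Long> lastContactNs = new LinkedHashMap<>();

        void touch(int brokerId, long nowNs) {
            lastContactNs.put(brokerId, nowNs);
        }

        boolean contains(int brokerId) {
            return lastContactNs.containsKey(brokerId);
        }

        /** Returns some tracked broker whose last contact is older than timeoutNs, if any. */
        Optional<Integer> findStale(long nowNs, long timeoutNs) {
            return lastContactNs.entrySet().stream()
                .filter(e -> nowNs - e.getValue() > timeoutNs)
                .map(Map.Entry::getKey)
                .findFirst();
        }
    }

    /**
     * Seeds the soft state when a controller becomes active. Only unfenced
     * brokers are added: a fenced broker cannot be fenced again, so it does
     * not need protection from an early timeout.
     */
    static Tracker seedOnActivation(List<Registration> registrations, long nowNs) {
        Tracker tracker = new Tracker();
        for (Registration r : registrations) {
            if (!r.fenced) {
                tracker.touch(r.brokerId, nowNs);
            }
        }
        return tracker;
    }

    public static void main(String[] args) {
        // Arbitrary illustrative data: broker 0 is unfenced, broker 1 is fenced.
        List<Registration> registrations =
            List.of(new Registration(0, false), new Registration(1, true));
        Tracker tracker = seedOnActivation(registrations, 0L);
        System.out.println("tracks broker 0: " + tracker.contains(0));
        System.out.println("tracks broker 1: " + tracker.contains(1));
    }
}
```

In this sketch the fenced broker has no tracker entry after activation, so nothing in the soft state refers to it until it contacts the controller again.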
> Do not add fenced brokers to BrokerRegistrationTracker on startup
> -----------------------------------------------------------------
>
>                 Key: KAFKA-19130
>                 URL: https://issues.apache.org/jira/browse/KAFKA-19130
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: Colin McCabe
>            Assignee: Colin McCabe
>            Priority: Minor