Thanks Colin for the detailed KIP. I have a few comments and questions.

In the KIP's Motivation and Overview you mentioned the LeaderAndIsr and
UpdateMetadata RPC. For example, "updates which the controller pushes, such
as LeaderAndIsr and UpdateMetadata messages". Is your thinking that we will
use MetadataFetch as a replacement for UpdateMetadata only, and add topic
configuration to this state?

In the section "Broker Metadata Management", you mention "Just like with a
fetch request, the broker will track the offset of the last updates it
fetched". To keep the log consistent Raft requires that the followers keep
all of the log entries (term/epoch and offset) that are after the
highwatermark; any log entry before the highwatermark can be
compacted/snapshotted. Do we expect the MetadataFetch API to only return
log entries up to the highwatermark, unlike the Raft replication API, which
will replicate/fetch log entries past the highwatermark for consensus?
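
To make the distinction I'm asking about concrete, here is a rough sketch
(all names are illustrative, not from the KIP) of the two kinds of fetches
against a single Raft-style log:

```python
# Hypothetical sketch: brokers doing a MetadataFetch only ever see entries
# below the high watermark, while quorum followers doing a Raft fetch also
# replicate uncommitted entries past it.

class MetadataLog:
    def __init__(self):
        self.entries = []        # list of (epoch, offset, record)
        self.high_watermark = 0  # offsets below this are committed

    def append(self, epoch, record):
        offset = len(self.entries)
        self.entries.append((epoch, offset, record))
        return offset

    def commit_up_to(self, offset):
        # advanced by the leader once a majority has replicated
        self.high_watermark = max(self.high_watermark, offset)

    def metadata_fetch(self, from_offset):
        # brokers only ever see committed entries
        return [e for e in self.entries
                if from_offset <= e[1] < self.high_watermark]

    def raft_fetch(self, from_offset):
        # quorum followers also fetch uncommitted entries, for consensus
        return [e for e in self.entries if e[1] >= from_offset]
```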

In section "Broker Metadata Management", you mention "the controller will
send a full metadata image rather than a series of deltas". This KIP
doesn't go into the set of operations that need to be supported on top of
Raft, but it would be interesting if this "full metadata image" could also
be expressed as deltas. For example, assuming we are replicating a map,
this "full metadata image" could be a sequence of "put" operations (znode
creates, to borrow ZK semantics).
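
Something along these lines is what I have in mind (a toy sketch under the
assumption that the replicated state is a flat map; the function names are
mine, not the KIP's):

```python
# Illustrative sketch: a "full metadata image" of a replicated map encoded
# as an ordered sequence of "put" deltas, so the snapshot and incremental
# updates share one representation.

def image_as_deltas(image):
    # flatten the full state into put operations, like znode creates
    return [("put", key, value) for key, value in sorted(image.items())]

def apply_deltas(state, deltas):
    for op, key, value in deltas:
        if op == "put":
            state[key] = value
    return state

full_image = {"/brokers/1": "rack=a", "/topics/foo": "partitions=3"}
rebuilt = apply_deltas({}, image_as_deltas(full_image))
assert rebuilt == full_image
```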

In section "Broker Metadata Management", you mention "This request will
double as a heartbeat, letting the controller know that the broker is
alive". In section "Broker State Machine", you mention "The MetadataFetch
API serves as this registration mechanism". Does this mean that the
MetadataFetch Request will optionally include broker configuration
information? Does this also mean that a MetadataFetch request will result
in a "write"/AppendEntries through the Raft replication protocol before you
can send the associated MetadataFetch Response?
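
To spell out the two possibilities I'm asking about, here is a hypothetical
sketch (every name here is mine, nothing is from the KIP) of a controller
handling a MetadataFetch that doubles as a heartbeat/registration:

```python
# Option A: registration data is real state, so it must be appended to the
# Raft log and committed before the response. Option B: a plain heartbeat
# might only touch soft state on the active controller.

import time

class FakeRaftLog:
    def __init__(self):
        self.records = []

    def append(self, record):
        self.records.append(record)   # pretend the commit is synchronous

    def committed_since(self, offset):
        return self.records[offset:]

class Controller:
    def __init__(self, raft_log):
        self.raft_log = raft_log
        self.last_heartbeat = {}      # broker_id -> timestamp (soft state)

    def handle_metadata_fetch(self, broker_id, from_offset, registration=None):
        if registration is not None:
            # Option A: a registration is a durable write through Raft
            self.raft_log.append(("register", broker_id, registration))
        # Option B: the heartbeat itself only updates soft state
        self.last_heartbeat[broker_id] = time.time()
        return self.raft_log.committed_since(from_offset)
```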

In section "Broker State", you mention that a broker can transition to
online after it is caught up with the metadata. What do you mean by this?
Metadata is always changing. How does the broker know that it is caught up
since it doesn't participate in the consensus or the advancement of the
highwatermark?
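
One definition the KIP might intend (this is my guess, a heuristic the KIP
does not actually specify) is something like:

```python
# Assumed heuristic: since metadata keeps changing, "caught up" can only
# mean "within some lag of the high watermark as reported in the most
# recent MetadataFetch response".

def is_caught_up(last_fetched_offset, reported_high_watermark, max_lag=0):
    return reported_high_watermark - last_fetched_offset <= max_lag
```

But even that requires the controller to report its high watermark in the
response, which is part of what I'm asking.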

In section "Start the controller quorum nodes", you mention "Once it has
taken over the /controller node, the active controller will proceed to load
the full state of ZooKeeper.  It will write out this information to the
quorum's metadata storage.  After this point, the metadata quorum will be
the metadata store of record, rather than the data in ZooKeeper." During
this migration, should we expect a small period of controller
unavailability while the controller replicates this state to all of the
Raft nodes in the controller quorum, and will new controller API requests
be buffered in the meantime?

Thanks!
-Jose
