Hi Jason,

Very cool KIP! A couple of questions:

- I'm guessing the selector will be invoked after each rebalance, so every
  time the consumer is assigned a partition it will be able to select a
  replica for it. Is that true?
- From the selector API, I'm not sure how the consumer will be able to
  address some of the choices mentioned in "Finding the preferred follower",
  especially the available bandwidth and the load balancing. By only having
  the list of Nodes, a consumer can pick the nearest replica (assuming the
  rack field means anything to users) or balance its own bandwidth, but that
  might not necessarily mean improved performance or a balanced load on the
  brokers.

Thanks

On Mon, Dec 3, 2018 at 11:35 AM Stanislav Kozlovski <stanis...@confluent.io>
wrote:

> Hey Jason,
>
> This is certainly a very exciting KIP.
> I assume that no changes will be made to the offset commits and they will
> continue to be sent to the group coordinator?
>
> I also wanted to address metrics - have we considered any changes there? I
> imagine that it would be valuable for users to be able to differentiate
> between which consumers' partitions are fetched from replicas and which
> aren't. I guess that would need to be addressed both in the server's
> fetcher lag metrics and in the consumers.
>
> Thanks,
> Stanislav
>
> On Wed, Nov 28, 2018 at 10:08 PM Jun Rao <j...@confluent.io> wrote:
>
> > Hi, Jason,
> >
> > Thanks for the KIP. Looks good overall. A few minor comments below.
> >
> > 1. The section on handling the FETCH_OFFSET_TOO_LARGE error says "Use
> > the OffsetForLeaderEpoch API to verify the current position with the
> > leader". The OffsetForLeaderEpoch request returns the log end offset if
> > the requested leader epoch is the latest, so we won't learn the true
> > high watermark from that request. It seems that the consumer still
> > needs to send a ListOffset request to the leader to obtain the high
> > watermark?
> >
> > 2. If a non in-sync replica receives a fetch request from a consumer,
> > should it return a new type of error like ReplicaNotInSync?
> >
> > 3. Could ReplicaSelector be Closeable?
> >
> > 4. Currently, the ISR propagation from the leader to the controller can
> > be delayed by up to 60 seconds through
> > ReplicaManager.IsrChangePropagationInterval. In that window, the
> > consumer could still be consuming from a non in-sync replica. The
> > relatively large delay is mostly for reducing the ZK writes and the
> > watcher overhead. Not sure what's the best way to address this. We
> > could potentially make this configurable.
> >
> > 5. It may be worth mentioning that, to take advantage of affinity, one
> > may also want to have a customized PartitionAssignor for affinity-aware
> > assignment, in addition to a customized ReplicaSelector.
> >
> > Thanks,
> >
> > Jun
> >
> > On Wed, Nov 21, 2018 at 12:54 PM Jason Gustafson <ja...@confluent.io>
> > wrote:
> >
> > > Hi All,
> > >
> > > I've posted a KIP to add the often-requested support for fetching from
> > > followers:
> > >
> > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-392%3A+Allow+consumers+to+fetch+from+closest+replica
> > >
> > > Please take a look and let me know what you think.
> > >
> > > Thanks,
> > > Jason
> > >
>
> --
> Best,
> Stanislav
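
[Editor's note: to make the selector questions above concrete, here is a
minimal sketch of what a pluggable replica selector could look like. The
interface name, method signature, and the idea that it runs on assignment or
metadata changes are all assumptions for illustration, not taken from the
KIP. It also extends Closeable, per Jun's point 3, so a selector holding
resources (say, a latency prober) could release them.]

```java
import java.io.Closeable;
import java.util.List;
import org.apache.kafka.common.Node;
import org.apache.kafka.common.TopicPartition;

// Hypothetical plugin interface; nothing here is verbatim from the KIP.
interface ReplicaSelector extends Closeable {
    // Presumably called whenever the consumer (re)resolves where to fetch a
    // partition from, e.g. after a rebalance or a metadata update.
    Node select(TopicPartition partition, Node leader, List<Node> replicas);

    @Override
    default void close() {} // most selectors would hold no resources
}

// Rack-based policy: with only Node metadata available, rack matching is
// about the only affinity signal a selector has; it cannot observe broker
// load or available bandwidth, which is exactly the concern raised above.
class RackAwareSelector implements ReplicaSelector {
    private final String clientRack;

    RackAwareSelector(String clientRack) {
        this.clientRack = clientRack;
    }

    @Override
    public Node select(TopicPartition partition, Node leader, List<Node> replicas) {
        for (Node replica : replicas) {
            if (clientRack.equals(replica.rack()))
                return replica; // nearest replica by rack; its load is unknown
        }
        return leader; // no replica in our rack, fall back to the leader
    }
}
```

Under this shape, the selector can only encode static preferences like rack;
balancing actual broker load would need extra inputs (e.g. lag or throughput
hints) passed into the callback.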
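[Editor's note: on Stanislav's metrics question, as a reference point, this
is how one can already enumerate the consumer's fetch-level metrics; a
per-partition "fetching from replica X" indicator would presumably slot into
the same group. The group name below is the one current consumers register;
nothing KIP-specific is assumed.]

```java
import java.util.Map;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.Metric;
import org.apache.kafka.common.MetricName;

final class FetchMetricsDump {
    // Print the consumer's fetch-manager metrics (lag, fetch rate, etc.).
    // If follower fetching added a replica-id tag or a new metric, it would
    // show up in this listing alongside the existing lag metrics.
    static void dump(KafkaConsumer<?, ?> consumer) {
        for (Map.Entry<MetricName, ? extends Metric> entry : consumer.metrics().entrySet()) {
            MetricName name = entry.getKey();
            if ("consumer-fetch-manager-metrics".equals(name.group())) {
                System.out.println(name.name() + " " + name.tags()
                        + " = " + entry.getValue().metricValue());
            }
        }
    }
}
```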
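[Editor's note: to illustrate Jun's point 5, an affinity-aware assignment
would try to hand each partition to a consumer that shares a rack with one
of the partition's replicas, so that the selector then has a local replica
to pick. This is a toy, standalone greedy sketch of the idea only; it does
not use Kafka's PartitionAssignor API, and every name in it is made up.]

```java
import java.util.*;

public class RackAffinityAssignmentSketch {

    // Greedily map each partition to a consumer whose rack hosts one of the
    // partition's replicas; fall back to round-robin when no rack matches.
    // Load balancing across matching consumers is deliberately ignored here.
    public static Map<Integer, String> assign(
            Map<Integer, Set<String>> replicaRacksByPartition,
            Map<String, String> rackByConsumer) {
        Map<Integer, String> assignment = new HashMap<>();
        List<String> consumers = new ArrayList<>(rackByConsumer.keySet());
        int next = 0; // round-robin cursor for the fallback path
        for (Map.Entry<Integer, Set<String>> e : replicaRacksByPartition.entrySet()) {
            String chosen = null;
            for (String consumer : consumers) {
                if (e.getValue().contains(rackByConsumer.get(consumer))) {
                    chosen = consumer; // shares a rack with a replica
                    break;
                }
            }
            if (chosen == null)
                chosen = consumers.get(next++ % consumers.size());
            assignment.put(e.getKey(), chosen);
        }
        return assignment;
    }

    public static void main(String[] args) {
        Map<Integer, Set<String>> partitions = new HashMap<>();
        partitions.put(0, new HashSet<>(Arrays.asList("rack-a", "rack-b")));
        partitions.put(1, new HashSet<>(Arrays.asList("rack-b", "rack-c")));
        Map<String, String> consumers = new HashMap<>();
        consumers.put("consumer-1", "rack-a");
        consumers.put("consumer-2", "rack-c");
        System.out.println(assign(partitions, consumers));
    }
}
```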