This email is to kick-off some discussion around the changes we want to make on the new consumer APIs as well as their semantics. Here are a not-comprehensive list of items in my mind:
1. Poll(timeout): current definition of timeout states "The time, in milliseconds, spent waiting in poll if data is not available. If 0, waits indefinitely." While in the current implementation, we have different semantics as stated, for example: a) poll(timeout) can return before "timeout" elapsed with empty consumed data. b) poll(timeout) can return after more than "timeout" elapsed due to blocking event like join-group, coordinator discovery, etc. We should think a bit more on what semantics we really want to provide and how to provide it in implementation. 2. Thread safeness: currently we have a coarsen-grained locking mechanism that provides thread safeness but blocks commit / position / etc calls while poll() is in process. We are considering to remove the coarsen-grained locking with an additional Consumer.wakeup() call to break the polling, and instead suggest users to have one consumer client per thread, which aligns with the design of a single-threaded consumer (KAFKA-2123). 3. Commit(): we want to improve the async commit calls to add a callback handler upon commit completes, and guarantee ordering of commit calls with retry policies (KAFKA-2168). In addition, we want to extend the API to expose attaching / fetching offset metadata stored in the Kafka offset manager. 4. OffsetFetchRequest: currently for handling OffsetCommitRequest we check the generation id and the assigned partitions before accepting the request if the group is using Kafka for partition management, but for OffsetFetchRequest we cannot do this checking since it does not include groupId / consumerId information. Do people think this is OK or we should add this as we did in OffsetCommitRequest? 5. New APIs: there are some other requests to add: a) offsetsBeforeTime(timestamp): or would seekToEnd and seekToBeginning sufficient? b) listTopics(): or should we just enforce users to use AdminUtils for such operations? There may be other issues that I have missed here, so folks just bring it up if you thought about anything else. -- Guozhang