[ https://issues.apache.org/jira/browse/KAFKA-1655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14289744#comment-14289744 ]
Jay Kreps edited comment on KAFKA-1655 at 1/23/15 7:03 PM:
-----------------------------------------------------------

I believe this is handled in the new consumer API. Can you take a look at the APIs and see what you think? Essentially you would do something like:

{code}
consumer.subscribe(topic, partition)
consumer.seek(topic, partition, offset)
records = consumer.poll(timeout)
consumer.unsubscribe(topic, partition)
{code}

The subscribe/unsubscribe and seek calls are just in-memory modifications.

was (Author: jkreps):
I believe this is handled in the new consumer API. Can you take a look at the APIs and see what you think? Essentially you would do something like:

{code}
consumer.subscribe(topic, partition)
consumer.seek(topic, partition, offset)
records = consumer.poll(timeout)
consumer.unsubscribe(topic, partition)
{code}

The subscribe/unsubscribe calls are just in-memory modifications.

> Allow high performance SimpleConsumer use cases to still work with new Kafka 0.9 consumer APIs
> ----------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-1655
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1655
>             Project: Kafka
>          Issue Type: New Feature
>          Components: consumer
>    Affects Versions: 0.9.0
>            Reporter: Valentin
>
> Hi guys,
>
> currently Kafka allows consumers to choose either the low-level or the high-level API, depending on the specific requirements of the consumer implementation. However, I was told that the current low-level API (SimpleConsumer) will be deprecated once the new Kafka 0.9 consumer APIs are available.
>
> In this case it would be good if we can ensure that the new API offers some way to get similar performance for use cases which fit the old SimpleConsumer API approach perfectly.
>
> Example use case:
> A high-throughput HTTP API wrapper for consumer requests, which receives HTTP REST calls to retrieve data for a specific set of topic partitions and offsets.
> Here the SimpleConsumer is perfect, because it allows connection pooling in the HTTP API web application with one pool per existing Kafka broker, and the web application can handle the required metadata management to know which pool to fetch a connection from for each used topic partition. This means connections to Kafka brokers can remain open/pooled, and connection/reconnection and metadata overhead is minimized.
>
> To achieve something similar with the new Kafka 0.9 consumer APIs, it would be good if they could:
> - provide a low-level call to connect to a specific broker and to read data from a topic+partition+offset
> OR
> - ensure that subscribe/unsubscribe calls are very cheap and can run without requiring any network traffic. If I subscribe to a topic partition whose leader is the same broker as that of the last topic partition in use on this consumer API connection, then the consumer API implementation should recognize this and should not do any disconnects/reconnects, just reuse the existing connection to that Kafka broker.
>
> Put differently, it should be possible to do external metadata handling in the consumer API client, and the client should be able to pool consumer API connections effectively by having one pool per Kafka broker.
>
> Greetings
> Valentin

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
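For reference, the seek-and-poll pattern sketched in the comment above can be mapped onto the new consumer API roughly as follows. This is only an illustrative sketch, not the API as proposed in this ticket: it uses the method names that the released KafkaConsumer ended up with (manual assignment is {{assign()}} rather than {{subscribe(topic, partition)}}), and the broker address, topic name, and offset are placeholders:

{code}
Properties props = new Properties();
props.put("bootstrap.servers", "broker1:9092"); // placeholder broker
props.put("key.deserializer",
    "org.apache.kafka.common.serialization.ByteArrayDeserializer");
props.put("value.deserializer",
    "org.apache.kafka.common.serialization.ByteArrayDeserializer");

KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(props);

// Manual partition assignment: no consumer group coordination, no
// rebalance, and no network round trip for the assignment itself.
TopicPartition tp = new TopicPartition("mytopic", 0); // placeholder topic
consumer.assign(Collections.singletonList(tp));

// seek() only updates the in-memory fetch position; the actual
// network fetch happens on the next poll().
consumer.seek(tp, 12345L); // placeholder offset
ConsumerRecords<byte[], byte[]> records = consumer.poll(100);

// Re-assigning to another partition led by the same broker can reuse
// the already-open connection, which is the property this ticket asks for.
consumer.assign(Collections.singletonList(new TopicPartition("mytopic", 1)));
{code}

Whether re-assignment avoids reconnects in practice depends on the client keeping its per-broker connections cached across assignments, which is the behavior Valentin's pooling use case requires.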