Hi,

I’ve read and become somewhat indoctrinated by the possibility discussed in I Heart Logs of having an event stream written to Kafka feed a process that loads a database. That process should, I think, store the last offset it has processed in its downstream database rather than in ZooKeeper. The sequence of API calls I’ve found to accomplish this doesn’t match my expectations from reading the book:
1. Instantiate a KafkaConsumer with some randomish group.id and enable.auto.commit=false.
2. Subscribe to the event stream topic.
3. Call poll(0) and discard whatever comes back (to trigger partition assignment so that the next step works).
4. Call seek() with the offset loaded from the database.
5. Call poll(N) in a loop, handling messages (rough code sketch at the end of this message).

Is this the right approach? I can’t help feeling I’m missing something obvious, because the book makes it sound like this is a common pattern, yet I seem to be using the API against its intended purpose. For the time being I have just one broker, one ZooKeeper node, one partition, and a handful of topics while I experiment.

Thanks for your help,

-- Daniel Lyons
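
P.S. Here is roughly what I have, as a minimal sketch of the five steps above. The topic name ("events"), broker address, deserializers, and the two database helpers at the bottom are all placeholders, not my real code:

    import java.util.Collections;
    import java.util.Properties;
    import java.util.UUID;

    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.common.TopicPartition;

    public class DbLoader {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");        // placeholder: local broker
            props.put("group.id", "db-loader-" + UUID.randomUUID()); // step 1: randomish group.id
            props.put("enable.auto.commit", "false");                // offsets live in the database, not ZK
            props.put("key.deserializer",
                      "org.apache.kafka.common.serialization.StringDeserializer");
            props.put("value.deserializer",
                      "org.apache.kafka.common.serialization.StringDeserializer");

            long startOffset = loadOffsetFromDatabase();             // hypothetical helper

            KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
            consumer.subscribe(Collections.singletonList("events")); // step 2
            consumer.poll(0);                                        // step 3: discard; forces assignment
            TopicPartition tp = new TopicPartition("events", 0);     // single partition, as in my setup
            consumer.seek(tp, startOffset);                          // step 4: resume where the DB left off

            while (true) {                                           // step 5: the processing loop
                ConsumerRecords<String, String> records = consumer.poll(500);
                for (ConsumerRecord<String, String> record : records) {
                    // apply the event and store record.offset() + 1
                    // in the same database transaction
                    applyToDatabase(record);
                }
            }
        }

        // hypothetical stubs standing in for the real database layer
        static long loadOffsetFromDatabase() { return 0L; }
        static void applyToDatabase(ConsumerRecord<String, String> record) { }
    }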