Hi,

I’ve read, and become somewhat indoctrinated by, the possibility discussed in 
“I Heart Logs” of having an event stream written to Kafka feed a process that 
loads a database. That process should, I think, store the last offset it has 
processed in its downstream database rather than in ZooKeeper. But the 
sequence of API calls I’ve found to accomplish this doesn’t match my 
expectations from reading the book (sketched in code after the list):

  1. Instantiate a KafkaConsumer with some randomish group.id and 
enable.auto.commit=false.
  2. Subscribe to the event stream topic.
  3. Call poll(0) and discard whatever comes back (to force a partition 
assignment, which the next step needs — poll takes only a timeout, not a 
topic).
  4. Call seek(partition, offset) with the offset read from the database.
  5. Call poll(timeout) inside a loop, handling messages.
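
To be concrete, here is roughly what that sequence looks like in Java. This is 
only a minimal sketch of what I mean: the topic name "events", partition 0, 
and the nextOffset lookup are placeholders for my actual setup, and it uses 
the older poll(long) overload:

    import java.util.Collections;
    import java.util.Properties;
    import java.util.UUID;

    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.common.TopicPartition;

    public class DbLoader {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");
            // Step 1: randomish group.id, no automatic offset commits
            props.put("group.id", "db-loader-" + UUID.randomUUID());
            props.put("enable.auto.commit", "false");
            props.put("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
            props.put("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");

            KafkaConsumer<String, String> consumer =
                new KafkaConsumer<>(props);

            // Step 2: subscribe to the event stream topic
            consumer.subscribe(Collections.singletonList("events"));

            // Step 3: throwaway poll so the consumer receives its partition
            // assignment; seek() throws if called before that happens
            consumer.poll(0);

            // Step 4: resume from where the database says we left off
            long nextOffset = 0L; // placeholder: last stored offset + 1
            consumer.seek(new TopicPartition("events", 0), nextOffset);

            // Step 5: the main loop
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(1000);
                for (ConsumerRecord<String, String> record : records) {
                    // apply record.value() to the database and write
                    // record.offset() in the same transaction, so the data
                    // and the resume point can never disagree
                }
            }
        }
    }

The appeal of the pattern, as I understand the book, is that writing each row 
and its offset in one database transaction makes recovery after a crash exact.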

Is this the right approach? I can’t help feeling I’m missing something 
obvious: the book makes this sound like a common pattern, yet it feels like 
I’m using the API against its intended purpose.

For the time being, while I experiment, I have just one broker, one ZooKeeper 
node, and a handful of topics, each with a single partition.
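
Each topic is created with the stock script, something along these lines 
(with "events" standing in for the real topic names):

    bin/kafka-topics.sh --create --zookeeper localhost:2181 \
        --replication-factor 1 --partitions 1 --topic events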

Thanks for your help,

-- 
Daniel Lyons



