Re: Retrieve most-recent-n messages from kafka topic

2013-07-21 Thread Shane Moriah
Thanks Johan, I converted your code to vanilla Java with a few small modifications (included below in case anyone wants to use it) and ran it a few times. It seems to work OK for the quick-peek use case, but I wouldn't recommend anyone rely on its accuracy since I find, at least in our c

Re: Retrieve most-recent-n messages from kafka topic

2013-07-19 Thread Johan Lundahl
Here is my current (very hacky) piece of code handling this part:
    def getLastMessages(fetchSize: Int = 1): List[String] = {
      val sConsumer = new SimpleConsumer(clusterip, 9092, 1000, 1024000)
      val currentOffset = sConsumer.getOffsetsBefore(topic, 0, -1, 3)
      val fetchRequest = new F

Re: Retrieve most-recent-n messages from kafka topic

2013-07-19 Thread Shane Moriah
I have a similar use case to Johan. We do stream processing off the topics in the backend but I'd like to expose a recent sample of a topic's data to a front-end web-app (just in a synchronous, click-a-button-and-see-results fashion). If I can only start from the last file offset 500MB behind cur

Re: Retrieve most-recent-n messages from kafka topic

2013-07-19 Thread Johan Lundahl
I've had a similar use case where we want to browse and display the latest few messages in different topics in a webapp. This kind of works by doing as you describe: submitting a FetchRequest with an offset of messages_desired * avg_bytes_per_message plus a bit more. You'll get the ByteBuffer and
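For anyone skimming the thread, the offset arithmetic described above can be sketched roughly as follows. This is only an illustration, not code from the thread: the class name, method name, and the slackBytes and earliestOffset parameters are hypothetical, and it assumes (as I understand 0.7) that offsets are byte positions in the log, so the estimated start may not land on a message boundary.

```java
// Hypothetical helper: estimates where to start fetching so that roughly
// the last `messagesDesired` messages are covered. Not part of any Kafka API.
final class RecentOffsetEstimate {

    /**
     * @param latestOffset       latest offset reported by the broker (assumed to be a byte position)
     * @param earliestOffset     smallest offset still available on the broker
     * @param messagesDesired    how many recent messages we would like to see
     * @param avgBytesPerMessage rough average message size, an estimate supplied by the caller
     * @param slackBytes         the "bit more" safety margin mentioned in the post
     */
    static long estimateStartOffset(long latestOffset,
                                    long earliestOffset,
                                    int messagesDesired,
                                    long avgBytesPerMessage,
                                    long slackBytes) {
        // Bytes to step back from the end of the log.
        long window = Math.multiplyExact((long) messagesDesired, avgBytesPerMessage) + slackBytes;
        // Never go before the earliest offset the broker still has.
        return Math.max(earliestOffset, latestOffset - window);
    }
}
```

Since the average message size is only a guess, the fetch may return more or fewer than the desired number of messages, which fits the "kind of works" caveat.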

Re: Retrieve most-recent-n messages from kafka topic

2013-07-19 Thread David Arthur
There is no index-based access to messages in 0.7 like there is in 0.8. You have to start from a known good offset and iterate through the messages. What's your use case? Running a job periodically that reads the latest N messages from the queue? Is it impractical to run from the last known of
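As a rough illustration of the "start from a known good offset and iterate" approach, the sketch below keeps only the last N items seen while walking an iterator. It is a minimal sketch under stated assumptions: the LastN class is hypothetical, and the iterator stands in for whatever message iteration the consumer provides, so no Kafka API is used here.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

// Hypothetical helper: retains the most recent n elements of a forward-only stream.
final class LastN {

    static <T> List<T> lastN(Iterator<T> messages, int n) {
        if (n <= 0) {
            return new ArrayList<>();
        }
        // Bounded window; ArrayDeque rejects null elements, so messages must be non-null.
        Deque<T> window = new ArrayDeque<>(n);
        while (messages.hasNext()) {
            if (window.size() == n) {
                window.removeFirst(); // drop the oldest to make room
            }
            window.addLast(messages.next());
        }
        // Oldest-to-newest order of the retained messages.
        return new ArrayList<>(window);
    }
}
```

Memory stays bounded by N, but the whole range from the known offset still has to be read, which is the cost raised elsewhere in this thread.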

Retrieve most-recent-n messages from kafka topic

2013-07-19 Thread Shane Moriah
We're running Kafka 0.7 and I'm hitting some issues trying to access the newest n messages in a topic (or at least in a broker/partition combo) and wondering if my use case just isn't supported or if I'm missing something. What I'd like to be able to do is get the most recent offset from a broker/