Re: Retrieve most-recent-n messages from kafka topic

2013-07-19 Thread Johan Lundahl
Here is my current (very hacky) piece of code handling this part: def getLastMessages(fetchSize: Int = 1): List[String] = { val sConsumer = new SimpleConsumer(clusterip, 9092, 1000, 1024000) val currentOffset = sConsumer.getOffsetsBefore(topic, 0, -1, 3) val fetchRequest = new F

Re: Retrieve most-recent-n messages from kafka topic

2013-07-19 Thread Shane Moriah
I have a similar use-case to Johan. We do stream processing off the topics in the backend but I'd like to expose a recent sample of a topic's data to a front-end web-app (just in a synchronous, click-a-button-and-see-results fashion). If I can only start from the last file offset 500MB behind cur

RE: Duplicate Messages on the Consumer

2013-07-19 Thread Sybrandy, Casey
Hello, No, we couldn't check the broker logs because the data is obfuscated, so we can't just look at the files and tell. It looks like our dev system may be experiencing the same issue, so I did turn off the obfuscation and we'll monitor it. However, on our production system where we were see

Re: Retrieve most-recent-n messages from kafka topic

2013-07-19 Thread Johan Lundahl
I've had a similar use case where we want to browse and display the latest few messages in different topics in a webapp. This kind of works by doing as you describe: submitting a FetchRequest with an offset of messages_desired * avg_bytes_per_message plus a bit more. You'll get the ByteBuffer and
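
The offset arithmetic described above can be sketched roughly as follows. This is only an illustration, under the assumption that the look-back is subtracted from the latest known offset (offsets in 0.7 are byte positions in the log). The object and function names, and the slackBytes parameter standing in for "a bit more", are hypothetical and not part of any Kafka API. The result is clamped so it never falls before the earliest available offset.

```scala
object RecentOffsetSketch {
  // Hypothetical helper: estimates a byte offset to start fetching from so that
  // roughly `messagesDesired` messages lie between it and `latestOffset`.
  // Assumption: the look-back is subtracted from the latest offset.
  def estimateStartOffset(
      latestOffset: Long,      // newest offset reported by the broker
      earliestOffset: Long,    // oldest offset still available
      messagesDesired: Int,
      avgBytesPerMessage: Int, // caller's own estimate
      slackBytes: Int          // the "bit more" safety margin
  ): Long = {
    val lookback = messagesDesired.toLong * avgBytesPerMessage + slackBytes
    // Never go back further than the earliest available offset.
    math.max(earliestOffset, latestOffset - lookback)
  }
}
```

Note that an estimate like this is not guaranteed to land on a message boundary, which is likely why the approach only "kind of works"; the caller would still need to handle a partial or invalid first message.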

Re: Topic corruption from hardware failure (0.7.1)

2013-07-19 Thread Blake Smith
Just to close the loop here for posterity: 1. For the directory topic name corruption, it looks like there's still an outstanding issue in JIRA: https://issues.apache.org/jira/browse/KAFKA-411 2. Ensuring log recovery is run seems to be fixed in commit 75fc5eab35aa33cffd9c09a2070dfe287db0ef4e ( ht

Re: Retrieve most-recent-n messages from kafka topic

2013-07-19 Thread David Arthur
There is no index-based access to messages in 0.7 like there is in 0.8. You have to start from a known good offset and iterate through the messages. What's your use case? Running a job periodically that reads the latest N messages from the queue? Is it impractical to run from the last known of

Retrieve most-recent-n messages from kafka topic

2013-07-19 Thread Shane Moriah
We're running Kafka 0.7 and I'm hitting some issues trying to access the newest n messages in a topic (or at least in a broker/partition combo) and wondering if my use case just isn't supported or if I'm missing something. What I'd like to be able to do is get the most recent offset from a broker/

Re: Kafka consumer not consuming events

2013-07-19 Thread Nihit Purwar
Hello Jun, Sorry for the delay in getting the logs. Here are the 3 logs from the 3 servers with trace level as suggested: https://docs.google.com/file/d/0B5etsywBa-bkQnBESUJzNV9yRWc/edit?usp=sharing Please have a look and let us know if you need anything else to further debug this problem. Tha