Re: Hadoop Summit Meetups

2014-06-09 Thread Robert Hodges
Thanks Neha. I am looking at the API call you recommended. Cheers, Robert On Mon, Jun 9, 2014 at 12:42 PM, Neha Narkhede wrote: > Is there a convenient way to fetch the last message posted on a particular > topic across all partitions? > > Not really, unless the message itself has some sort o

Re: org.apache.zookeeper.KeeperException$BadVersionException

2014-06-09 Thread Bongyeon Kim
No, I can't see any ZK session expiration log. What do I have to do to prevent this? Can increasing 'zookeeper.session.timeout.ms' help? On Tue, Jun 10, 2014 at 12:58 PM, Jun Rao wrote: > This is probably related to kafka-1382. The root cause is likely ZK session > expiration in the broker. Did you

Re: Error: Could not find or load main class kafka.perf.ProducerPerformance

2014-06-09 Thread Prakash Gowri Shankor
Thanks Jun. That worked. I can run the perf producer now. On Mon, Jun 9, 2014 at 8:51 PM, Jun Rao wrote: > Was that a source download? Then you need to run ./gradlew jar first. > > Thanks, > > Jun > > > On Mon, Jun 9, 2014 at 4:57 PM, Prakash Gowri Shankor < > prakash.shan...@gmail.com> wrote:

Re: org.apache.zookeeper.KeeperException$BadVersionException

2014-06-09 Thread Jun Rao
This is probably related to kafka-1382. The root cause is likely ZK session expiration in the broker. Did you see any? Thanks, Jun On Mon, Jun 9, 2014 at 8:11 PM, Bongyeon Kim wrote: > Hi, team. > > I’m using 0.8.1. > I found some strange log repeatedly on server.log in one of my brokers and

Re: Error: Could not find or load main class kafka.perf.ProducerPerformance

2014-06-09 Thread Jun Rao
Was that a source download? Then you need to run ./gradlew jar first. Thanks, Jun On Mon, Jun 9, 2014 at 4:57 PM, Prakash Gowri Shankor < prakash.shan...@gmail.com> wrote: > Hi, > > Is the perf module bundled with 0.8.1.1 ? > When i try to run "kafka-producer-perf-test.sh" I see > *Error: Coul

Re: Handling un-decodable messages (Kafka 0.8.0)

2014-06-09 Thread Jun Rao
This is handled better in 0.8.1.1. In the same situation, you can resume the iteration on the next message. To do this in 0.8.0, you will have to get the binary data in ConsumerIterator and run the decoder in the consumer code. Thanks, Jun On Mon, Jun 9, 2014 at 9:40 AM, Christofer Hedbrandh w
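The 0.8.0 workaround described above (take the raw bytes from the iterator and run the decoder in consumer code) can be sketched independently of Kafka. This is only an illustration of the pattern, not the Kafka API: `decode_all` and `json_decoder` are hypothetical names, and JSON stands in for the Thrift decoder so the example stays self-contained.

```python
import json


def json_decoder(raw):
    """Stand-in decoder (assumption): the thread uses Thrift, JSON keeps this runnable."""
    return json.loads(raw.decode("utf-8"))


def decode_all(raw_messages, decoder):
    """Decode each raw payload in consumer code, so one bad message
    does not stop iteration over the messages that follow it."""
    good, bad = [], []
    for raw in raw_messages:
        try:
            good.append(decoder(raw))
        except Exception as exc:  # broad on purpose: custom decoders may raise anything
            bad.append((raw, exc))
    return good, bad


if __name__ == "__main__":
    payloads = [b'{"id": 1}', b"\xff\xfe not json", b'{"id": 2}']
    decoded, failed = decode_all(payloads, json_decoder)
    print(decoded)
    print(len(failed), "undecodable message(s) skipped")
```

Because the decoder runs outside the iterator, a decode failure is an ordinary exception in application code and the loop simply moves on to the next payload.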

org.apache.zookeeper.KeeperException$BadVersionException

2014-06-09 Thread Bongyeon Kim
Hi, team. I’m using 0.8.1. I found some strange log repeatedly on server.log in one of my brokers and it keeps logging until now. server.log == ... [2014-06-09 10:41:47,402] ERROR Conditional update of path /b

Re: Strange partitioning behavior with 0.8.1.1

2014-06-09 Thread Prakash Gowri Shankor
Thank you Guozhang. I've specified how I set and use the property in my previous mail. Can you tell me if that is fine? I also noticed that the kafka-console-producer.sh takes a custom property (key-value) on the command line. Would it help to set this property directly on the command line of the p

Re: Strange partitioning behavior with 0.8.1.1

2014-06-09 Thread Guozhang Wang
In the new producer we are changing the default behavior back to pure random partitioning and letting users customize their own partitioning schemes if they want. For now, reducing topic.metadata.refresh.interval.ms should help because the stickiness only persists until a metadata refresh. Guozhang
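To see why a shorter refresh interval spreads unkeyed messages more evenly, here is a toy simulation of the behavior described in this thread: a random partition is picked and then reused until the next metadata refresh. It is a simplified model under stated assumptions, not Kafka's producer code; `StickyRandomPartitioner` and `simulate` are hypothetical names, and the 8-partition topic is made up.

```python
import random


class StickyRandomPartitioner:
    """Toy model: pick a random partition and keep it until the refresh interval elapses."""

    def __init__(self, num_partitions, refresh_interval_ms, seed=None):
        self.num_partitions = num_partitions
        self.refresh_interval_ms = refresh_interval_ms
        self.picks = 0  # how many times a partition was (re)selected
        self._rng = random.Random(seed)
        self._current = None
        self._chosen_at_ms = None

    def partition(self, now_ms):
        # Re-pick only when nothing is cached yet or the interval has elapsed.
        expired = (
            self._chosen_at_ms is None
            or now_ms - self._chosen_at_ms >= self.refresh_interval_ms
        )
        if expired:
            self._current = self._rng.randrange(self.num_partitions)
            self._chosen_at_ms = now_ms
            self.picks += 1
        return self._current


def simulate(refresh_interval_ms, duration_ms, step_ms, num_partitions=8, seed=0):
    """Send one unkeyed message every step_ms; return (partitions used, number of picks)."""
    partitioner = StickyRandomPartitioner(num_partitions, refresh_interval_ms, seed)
    used = [partitioner.partition(t) for t in range(0, duration_ms, step_ms)]
    return used, partitioner.picks


if __name__ == "__main__":
    for interval_ms in (600_000, 1_000):
        used, picks = simulate(interval_ms, duration_ms=60_000, step_ms=100)
        print(interval_ms, "ms refresh:", picks, "pick(s),", len(set(used)), "distinct partition(s)")
```

With the 10-minute stickiness mentioned in the thread, a one-minute run lands entirely on one partition. With a 1000 ms interval the partition is re-picked 60 times in the same minute, so the load is spread over time even though each pick is still random.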

Re: Strange partitioning behavior with 0.8.1.1

2014-06-09 Thread Prakash Gowri Shankor
I have seen that mail thread. Here is what I tried: In my producer.properties I set topic.metadata.refresh.interval.ms=1000. I guess this means that a different partition will be selected every second. Then I restart my producer as: ./kafka-console-producer.sh --broker-list localhost:9092 --

Error: Could not find or load main class kafka.perf.ProducerPerformance

2014-06-09 Thread Prakash Gowri Shankor
Hi, Is the perf module bundled with 0.8.1.1? When I try to run "kafka-producer-perf-test.sh" I see *Error: Could not find or load main class kafka.perf.ProducerPerformance* I then tried to do a ./gradlew perf which resulted in: *Task 'perf' not found in root project 'kafka-0.8.1.1-src'.* Pleas

Re: Strange partitioning behavior with 0.8.1.1

2014-06-09 Thread Guozhang Wang
Kane is right, please see this FAQ for details: https://cwiki.apache.org/confluence/display/KAFKA/FAQ#FAQ-Whyisdatanotevenlydistributedamongpartitionswhenapartitioningkeyisnotspecified? On Mon, Jun 9, 2014 at 4:41 PM, Kane Kane wrote: > Last time I've checked it, producer sticks to partition

Re: Strange partitioning behavior with 0.8.1.1

2014-06-09 Thread Prakash Gowri Shankor
Is there a way to modify this duration ? This is not adhering to the "random" behavior that the documentation talks about. On Mon, Jun 9, 2014 at 4:41 PM, Kane Kane wrote: > Last time I've checked it, producer sticks to partition for 10 minutes. > > On Mon, Jun 9, 2014 at 4:13 PM, Prakash Gowri

Re: Strange partitioning behavior with 0.8.1.1

2014-06-09 Thread Kane Kane
Last time I've checked it, producer sticks to partition for 10 minutes. On Mon, Jun 9, 2014 at 4:13 PM, Prakash Gowri Shankor wrote: > Hi, > > This is with 0.8.1.1 and I ran the command line console consumer. > I have one broker, one producer and several consumers. I have one topic, > many partit

Strange partitioning behavior with 0.8.1.1

2014-06-09 Thread Prakash Gowri Shankor
Hi, This is with 0.8.1.1 and I ran the command line console consumer. I have one broker, one producer and several consumers. I have one topic, many partitions m, many consumers n, m=n, one consumer group defined for all the consumers. From using Kafka Monitor, I see that each partition is assign

Re: message.max.bytes and mirrormaker tool

2014-06-09 Thread Guozhang Wang
You can try to reduce batch.num.messages first and see if the throughput gets affected. For your second question, we do not have a good solution to that since the offsets are not consistent across data centers, as you said. One approach we have used is to have the consumer consume from both data centers, but k

Re: message.max.bytes and mirrormaker tool

2014-06-09 Thread Kane Kane
What would you recommend in this case? Bump up the max.message.bytes, use sync producer or lower batch.num.messages? Also does it matter on which side mirrormaker is located, source or target one? And another related question: as I understand, potentially offsets might be different for source and

Re: Hadoop Summit Meetups

2014-06-09 Thread Neha Narkhede
Is there a convenient way to fetch the last message posted on a particular topic across all partitions? Not really, unless the message itself has some sort of a timestamp. Even then, the order that the broker applies to the log is only guaranteed per partition per client. So it is tricky to know t

Handling un-decodable messages (Kafka 0.8.0)

2014-06-09 Thread Christofer Hedbrandh
I have a custom Decoder for my messages (Thrift). I want to be able to handle "bad" messages that I can't decode. When the ConsumerIterator encounters a bad message, the exception thrown by my Decoder bubbles up and I can catch it and handle it. Subsequent calls to the ConsumerIterator give me Ille

Re: Hadoop Summit Meetups

2014-06-09 Thread Robert Hodges
Hi Guozhang, Thanks for the response. Answers interpolated below. Cheers, Robert On Mon, Jun 9, 2014 at 8:15 AM, Guozhang Wang wrote: > Robert, > > Thanks for the description. Just want to clarify on some of the points > (assuming one transaction may include multiple messages below): > > 2) F

Re: [DISCUSS] Kafka Security Specific Features

2014-06-09 Thread Robert Withers
Yes, that sounds familiar as I helped write (minimally) S/MIME in Squeak (open source Smalltalk environment). This is what I was thinking in my alternative here, though I have a concern... Production may occur before the consumer is coded and executed. In the analogy of mail, the mail is sent be

Re: Hadoop Summit Meetups

2014-06-09 Thread Guozhang Wang
Robert, Thanks for the description. Just want to clarify on some of the points (assuming one transaction may include multiple messages below): 2) For the "one-to-one mapping" to work, can the consumer only read at transaction boundaries, i.e., all or none of the messages are returned to the consume

Re: Getting the KafkaStream ID

2014-06-09 Thread Jun Rao
We do have a jmx that reports the lag per partition. You could probably get the lag that way. Then, you just need to slow down the iteration on the fast partition. Thanks, Jun On Mon, Jun 9, 2014 at 4:07 AM, Bogdan Dimitriu (bdimitri) < bdimi...@cisco.com> wrote: > Certainly. > I know this may
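One way to picture the suggestion above (read the per-partition lag, then slow down iteration on the partitions that are ahead) is the small sketch below. It assumes the lag values have already been obtained somehow, for example from the JMX metric mentioned in the reply. The function `throttle_delays`, its parameters, and the linear scaling rule are illustrative assumptions, not part of Kafka.

```python
def throttle_delays(lag_by_partition, max_delay_s=1.0, tolerance=0):
    """Map each partition to a per-iteration sleep in seconds.

    The partition with the largest lag is the furthest behind and is never
    delayed. Partitions with less lag are ahead, so they are slowed in
    proportion to how far ahead they are.
    """
    if not lag_by_partition:
        return {}
    max_lag = max(lag_by_partition.values())
    delays = {}
    for partition, lag in lag_by_partition.items():
        gap = max_lag - lag  # how far this partition is ahead of the slowest one
        if max_lag == 0 or gap <= tolerance:
            delays[partition] = 0.0
        else:
            delays[partition] = max_delay_s * gap / max_lag
    return delays


if __name__ == "__main__":
    lags = {0: 1000, 1: 500, 2: 0}  # made-up lag values, in messages
    print(throttle_delays(lags, max_delay_s=2.0))
```

A consumer thread would sleep for its partition's delay between messages and recompute the delays whenever fresh lag values are read. The `tolerance` parameter avoids throttling over small, transient differences.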

Re: [DISCUSS] Kafka Security Specific Features

2014-06-09 Thread Todd Palino
It’s the same method used by S/MIME and many other encryption specifications with the potential for multiple recipients. The sender generates a session key, and uses that key to encrypt the message. The session key is then encrypted once for each recipient with that recipient’s public key. All of t

Re: Getting the KafkaStream ID

2014-06-09 Thread Robert Hodges
Hi Bogdan, It sounds as if you could implement a form of signaling between the consumers using a distributed barrier. This can be implemented using Kafka topics. For example you could create a control thread that posts the current high-water mark for all consumers into a special topic, which give

Re: Getting the KafkaStream ID

2014-06-09 Thread Bogdan Dimitriu (bdimitri)
Certainly. I know this may not sound like a great idea but I am running out of options here: I'm basically trying to implement a consumer throttle. My application consumes from a fairly high number of partitions from a number of consumer servers. The data is put in the partitions by a producer in a