Re: Maximum number of producers per topic per broker

2016-07-26 Thread Ismael Juma
Hi, It would be great if you could file a JIRA with additional details regarding the issue so that we could investigate if something could be done about it. Ismael On 24 Jul 2016 17:32, "Dodong Juan" wrote: > Just found out what was causing this, which is quite dangerous. The > additional 10

Re: Kafka Connect issues

2016-07-26 Thread Kristoffer Sjögren
We found very high CPU usage, which might be causing the problem. It seems to be spending a lot of cycles querying and parsing HDFS paths? On 24 Jul 2016 02:40, "Ewen Cheslack-Postava" wrote: That definitely sounds unusual -- rebalancing normally only happens either when a) there are new workers or b) the

Re: consumer.subscribe(Pattern p , ..) method fails with Authorizer

2016-07-26 Thread Ismael Juma
Ewen, that's right and that is being handled in https://github.com/apache/kafka/pull/1428. On Sun, Jul 24, 2016 at 1:41 AM, Ewen Cheslack-Postava wrote: > Manikumar, > > Yeah, that seems bad. Seems like maybe instead of moving to server-side > processing we should make the metadata request limit

Kafka 0.9.0.1 failing on new leader election

2016-07-26 Thread Sean Morris (semorris)
I have a setup with 2 brokers and it is going through leader re-election but seems to fail to complete. The behavior I start to see is that some publishes succeed but others fail with NotLeader exceptions like this: java.util.concurrent.ExecutionException: org.apache.kafka.common.errors.No

How to get the Global IP address of the Producer?

2016-07-26 Thread Gourab Chowdhury
Hi, I am working on a project that includes Apache Kafka. My consumer is Apache Spark, and I am trying to log the global IP from which each message was generated, to verify which device/system it came from. Of course Kafka knows the IP of the producer, since it can read the packets. Can it append it in the me

Fwd: Kafka (10.0) can NOT recover the old messages in a new topic, after i start a broker again.

2016-07-26 Thread Sergio Gonzalez
Hello, I hope that you are well. I have an issue with Kafka and I would like your help. I have a setup with three brokers, and I have one partition per topic with a replication factor of two. I created a Java application that creates topics and has producers that send the messages and consumer t

Streaming Application Design with Database Lookups

2016-07-26 Thread David Garcia
Hello, we are working on designs for several streaming applications and a common consideration is the need for occasional external database updates/lookups. For example…we would be processing a stream of events with some kind of local-id, and we occasionally need to resolve the local-id to a g

Kafka Streams multi-node

2016-07-26 Thread Davood Rafiei
Hi, I am a newbie to Kafka and Kafka Streams. I read the documentation to learn how it works in a multi-node environment. As a result I want to run the streams library on a cluster that consists of more than one node. From what I understood, I am trying to resolve the following conflicts: - Streams is a st

TopicMetadata Question

2016-07-26 Thread Chris Barlock
Starting with Kafka 0.8.2.1, we have some code that calls AdminUtils.fetchTopicMetadataFromZk. This returned a kafka.api.TopicMetadata object which provided some useful information about the topic. With 0.10.0.0, fetchTopicMetadataFromZk returns org.apache.kafka.common.requests.MetadataRespons

Re: Kafka Streams multi-node

2016-07-26 Thread David Garcia
http://docs.confluent.io/3.0.0/streams/architecture.html#parallelism-model You shouldn’t have to do anything. Simply starting a new thread will “rebalance” your streaming job. The job coordinates with tasks through Kafka itself. On 7/26/16, 12:42 PM, "Davood Rafiei" wrote: Hi,

Re: TopicMetadata Question

2016-07-26 Thread Radoslaw Gruchalski
Chris, There is a .topic() method available on that object: https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/common/requests/MetadataResponse.java#L323 – Best regards, Radek Gruchalski ra...@gruchalski.com de.linkedin.com/in/radgruchalski *Confidentiality:*This

Handling long commits during a rebalance

2016-07-26 Thread Joey Echeverria
We've been playing around with the new Consumer API and have hit an unfortunate bump in the road. When our onPartitionsRevoked() callback is called we'd like to be able to commit any data that we were processing to stable storage so we can then commit the offsets back to Kafka. This way we don't thr

Re: Kafka Streams multi-node

2016-07-26 Thread Matthias J. Sax
David's answer is correct. Just start the same application multiple times on different nodes and the library does the rest for you. Just one addition: as Kafka Streams is for standard application development, there is no need to run the application on the same nodes as your brokers are running (ie

kafka latency

2016-07-26 Thread Luo, Chao
Dear Kafka guys, I measured the latency of a Kafka system. The producer, Kafka servers, and consumer were running on different machines (AWS EC2 instances). The producer-to-consumer latency is around 250 milliseconds. Is that a normal value for a Kafka system? Can I do better, and how? Any comments or

Re: TopicMetadata Question

2016-07-26 Thread Chris Barlock
Thank you Radek, but that is also not very interesting. topic() returns just the topic name. If there is no "easy" equivalent to getting the 0.8 kafka.api.TopicMetadata, it looks like I could get some information from the PartitionMetadata that this TopicMetadata holds. Chris IBM Middleware

Kafka ack 0 is giving less trought put then ack=all

2016-07-26 Thread Nagu Kothapalli
Hi All, Kafka version: 0.9.1. I am running some benchmarking tests on the Kafka runtime (Producer) and noticed that ack=0 gives less throughput than ack=all. Please let me know if you have any idea about this.

Re: Kafka ack 0 is giving less trought put then ack=all

2016-07-26 Thread Vahid S Hashemian
Could it be related to this bug: https://issues.apache.org/jira/browse/KAFKA-3129? --Vahid From: Nagu Kothapalli To: users@kafka.apache.org Date: 07/26/2016 01:52 PM Subject:Kafka ack 0 is giving less trought put then ack=all Hi All, Kafka version : 0.9.1 I am running so

Re: Kafka Streams multi-node

2016-07-26 Thread Davood Rafiei
Thanks David and Matthias for the reply. To make sure that I understand correctly: - Each stream application is limited to only one node. On that node the whole stream execution DAG is processed. - If we want to parallelize our application, we start a new instance of the streams application, which will be similar

Re: Kafka Streams multi-node

2016-07-26 Thread Matthias J. Sax
See inline. -Matthias On 07/26/2016 11:10 PM, Davood Rafiei wrote: > Thanks David and Matthias for reply. To make sure that I understand > correctly: > - Each stream application is limited to only one node. In that node all > stream execution DAG is processed. An application is not limited to a

Re: Kafka Streams: Merging of partial results

2016-07-26 Thread Guozhang Wang
Just a few additions to what Eno said: if you are using the Streams DSL to code your applications, and you are aggregating based on the non-key of the input stream, the re-partitioning will be automatically conducted with the auto-generated internal repartition topic. So you are guaranteed that the

Re: Kafka (Streams) scalability

2016-07-26 Thread Guozhang Wang
As for KS, you can think of each single-threaded KS instance as a normal producer plus a normal consumer client, so you can do the math for capacity planning purposes assuming you understand your application traffic, and your state store write amplifications (if you use the default persistent key-v

Re: Streaming Application Design with Database Lookups

2016-07-26 Thread Guozhang Wang
Hello David, There is another remote store IO saving idea from a recent paper: https://scontent.xx.fbcdn.net/t39.2365-6/13331599_975087972607457_1796386216_n/Realtime_Data_Processing_at_Facebook.pdf "A monoid is an algebraic structure that has an identity element and is associative. When a monoi
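
The monoid definition quoted above (an identity element plus an associative combine operation) can be illustrated with a small sketch. It shows one plausible reading of how the property saves remote-store IO: events are folded into per-key partial aggregates in memory, and each key is then merged into the store with a single read-modify-write. The names CountSum, combine, pre_aggregate and merge_into_store are hypothetical, and a plain dict stands in for the remote store; none of this is taken from the paper or from Kafka Streams.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CountSum:
    """Hypothetical partial aggregate: event count and running total."""
    count: int = 0
    total: float = 0.0


# Identity element: combining it with any value leaves that value unchanged.
IDENTITY = CountSum()


def combine(a, b):
    """Associative combine: grouping of partial results does not matter."""
    return CountSum(a.count + b.count, a.total + b.total)


def pre_aggregate(events):
    """Fold (key, value) events into one in-memory partial aggregate per key."""
    partials = {}
    for key, value in events:
        partials[key] = combine(partials.get(key, IDENTITY), CountSum(1, value))
    return partials


def merge_into_store(store, partials):
    """Merge partials into the store: one read-modify-write per key,
    rather than one per event. `store` is a dict standing in for a
    remote database (an assumption made for this sketch)."""
    for key, partial in partials.items():
        store[key] = combine(store.get(key, IDENTITY), partial)
```

Because combine is associative, the same final aggregate results whether events are merged one at a time or in pre-aggregated batches, which is what makes the batching safe.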

Re: Handling long commits during a rebalance

2016-07-26 Thread craig w
We had to have one thread use the Consumer to poll and get records; it would then put them into an in-memory blocking queue, pause our subscription, and have separate threads pull work from the blocking queue. Meanwhile, the thread with the consumer would keep calling "poll" (getting no data back b/c w
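
The hand-off described above can be sketched roughly as follows. This is a simulation under stated assumptions, not the poster's code and not a real Kafka client: FakeConsumer is a hypothetical stand-in whose poll, pause and resume methods only mimic the behaviour described in the message, and the run function and its parameters are invented for illustration.

```python
import queue
import threading
import time


class FakeConsumer:
    """Hypothetical stand-in for a consumer client (not a real Kafka API)."""

    def __init__(self, records, batch_size=2):
        self._records = list(records)
        self._batch_size = batch_size
        self.paused = False
        self.polls_while_paused = 0

    def poll(self):
        # A paused consumer is still polled, but it returns no data.
        if self.paused:
            self.polls_while_paused += 1
            return []
        batch = self._records[:self._batch_size]
        self._records = self._records[self._batch_size:]
        return batch

    def pause(self):
        self.paused = True

    def resume(self):
        self.paused = False


def run(consumer, handle, expected, workers=2):
    """Poll on the calling thread and process records on worker threads.

    `expected` must equal the number of records the consumer will deliver,
    otherwise the loop never ends (a simplification for this sketch).
    """
    work = queue.Queue()
    results = []
    lock = threading.Lock()
    in_flight = 0  # records handed to workers but not yet processed

    def worker():
        nonlocal in_flight
        while True:
            record = work.get()
            if record is None:  # sentinel: shut this worker down
                return
            value = handle(record)
            with lock:
                results.append(value)
                in_flight -= 1

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()

    while True:
        with lock:
            if len(results) >= expected:
                break
            idle = in_flight == 0
        if consumer.paused and idle:
            consumer.resume()  # workers caught up, fetch again
        batch = consumer.poll()  # called every iteration, even while paused
        if batch:
            consumer.pause()  # stop fetching while workers are busy
            with lock:
                in_flight += len(batch)
            for record in batch:
                work.put(record)
        else:
            time.sleep(0.001)  # avoid a hot spin while waiting

    for _ in threads:
        work.put(None)
    for t in threads:
        t.join()
    return results
```

A call such as run(FakeConsumer(range(5)), lambda r: r * 10, expected=5) returns the five processed values in whatever order the workers finished them.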

Re: Handling long commits during a rebalance

2016-07-26 Thread Joey Echeverria
That's the direction we're looking at for normal commit processing, but how do you handle commits during a rebalance? Namely, do you initiate a commit during a call to onPartitionsRevoked? -Joey On Tue, Jul 26, 2016 at 5:51 PM, craig w wrote: > We had to have one thread use the Consumer to poll

Re: kafka latency

2016-07-26 Thread Stevo Slavić
Hello Chao, How did you measure latency? See also https://engineering.linkedin.com/kafka/benchmarking-apache-kafka-2-million-writes-second-three-cheap-machines Kind regards, Stevo Slavic. On Tue, Jul 26, 2016 at 9:52 PM, Luo, Chao wrote: > Dear Kafka guys, > > I measured the latency of kafka