custom serializer and deserializer

2015-02-25 Thread ankit tyagi
Hi, I want to use protobuf for serializing and deserializing Kafka events in 0.8.2.0. I can provide my custom serializer by setting KEY_SERIALIZER_CLASS_CONFIG and VALUE_SERIALIZER_CLASS_CONFIG, but how can I provide a custom Deserializer?

non-blocking sends when cluster is down

2015-02-25 Thread Gary Ogden
Say the entire Kafka cluster is down and there are no brokers to connect to. Is it possible to use the Java producer send method and not block until there's a timeout? Is it as simple as registering a callback method? We need the ability for our application to not have any kind of delay when sendin

Re: "at least once" consumer recommendations for a load of 5 K messages/second

2015-02-25 Thread Anand Somani
Sweet! that I would not depend on ZK more consumption anymore. Thanks for the response Gwen, I will take a look at the link you have provided. From what I have read so far, for my scenario to work correctly I would have multiple partitions and a consumer per partition, is that correct? So for me

Re: "at least once" consumer recommendations for a load of 5 K messages/second

2015-02-25 Thread Gwen Shapira
I don't have good numbers, but I noticed that I usually scale number of partitions by the consumer rates and not by producer rate. Writing to HDFS can be a bit slow (30MB/s is pretty typical, IIRC), so if I need to write 5G a second, I need at least 15 consumers, which means at least 15 partitions
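
The sizing rule in this reply (scale the partition count by what the consumers can absorb, not by what the producers emit) can be sketched as a small calculation. The class name, the method name, and the 450 MB/s target are illustrative assumptions and do not come from the thread; only the 30 MB/s per-consumer HDFS write rate is taken from the message.

```java
public class PartitionSizing {
    /**
     * Minimum number of consumers (and so of partitions, with one consumer
     * per partition) needed to keep up with the target rate. Rounds up,
     * because a fractional consumer is not possible.
     */
    public static int minPartitions(double targetMBps, double perConsumerMBps) {
        if (targetMBps < 0 || perConsumerMBps <= 0) {
            throw new IllegalArgumentException("rates must be non-negative and per-consumer rate positive");
        }
        return (int) Math.ceil(targetMBps / perConsumerMBps);
    }

    public static void main(String[] args) {
        // 450 MB/s is an assumed target; 30 MB/s is the HDFS write rate quoted above.
        System.out.println(minPartitions(450.0, 30.0));
    }
}
```

Rounding up matters here: a target of 100 MB/s at 30 MB/s per consumer needs 4 consumers, not 3.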

Announcing the Confluent Platform built on Apache Kafka

2015-02-25 Thread Neha Narkhede
Folks, We, at Confluent, are excited to announce the release of Confluent Platform 1.0 built around Apache Kafka - http://blog.confluent.io/2015/02/25/announcing-the-confluent-platform-1-0/ We also published a detailed two-part guide on how you can put Kafka to use in your o

generate specific throughput load

2015-02-25 Thread Josh J
Hi, Is there a way to generate a specified amount of throughput? I'm using the Stats class here to measure the throughput. Though I need to

Re: generate specific throughput load

2015-02-25 Thread Magnus Edenhill
Hi, the rdkafka_performance tool from librdkafka's examples [1] lets you do this with something like: rdkafka_performance -P -b -t [-p ] -r -s That's the producer side; if you want performance measurements on the consumer side as well, you do: rdkafka_performance -C -b -t -p -o There is an

Re: Kafka High Level Consumer

2015-02-25 Thread Joseph Lawson
Doh that was probably my bad Pranay! A misinterpretation of some old consumer code. btw, jruby-kafka is now at 1.1.1 with proper support for deleting the offset, setting the auto_offset_reset and whitelist/blacklist topics. It's packed up in a nice gem file that includes all Kafka and log4j p

Tips for working with Kafka and data streams

2015-02-25 Thread Jay Kreps
Hey guys, One thing we tried to do along with the product release was start to put together a practical guide for using Kafka. I wrote this up here: http://blog.confluent.io/2015/02/25/stream-data-platform-1/ I'd like to keep expanding on this as good practices emerge and we learn more stuff. So

Re: Announcing the Confluent Platform built on Apache Kafka

2015-02-25 Thread Joseph Lawson
This is really awesome stuff. It's great to see y'all growing! Thank you and congratulations! From: Neha Narkhede Sent: Wednesday, February 25, 2015 1:31 PM To: users@kafka.apache.org; d...@kafka.apache.org Subject: Announcing the Confluent Platform bui

Re: Announcing the Confluent Platform built on Apache Kafka

2015-02-25 Thread Andrew Otto
Wow, .deb packages. I love you. > On Feb 25, 2015, at 14:48, Joseph Lawson wrote: > > This is really awesome stuff. It's great to see y'all growing! Thank you > and congratulations! > > > From: Neha Narkhede > Sent: Wednesday, February 25, 2015 1:3

Re: generate specific throughput load

2015-02-25 Thread Jiangjie Qin
There is this ProducerPerformance class coming with new java producer. You can go to KAFKA_HOME/bin and use the following command: ./kafka-run-class.sh org.apache.kafka.clients.tools.ProducerPerformance USAGE: java org.apache.kafka.clients.tools.ProducerPerformance topic_name num_records record_si
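
For readers who want to see what generating "a specified throughput" means mechanically, the pacing idea behind a rate-limited load generator can be sketched in plain Java. This is not the ProducerPerformance implementation; the class and method names are hypothetical, and the `Runnable` stands in for whatever actually sends one record.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.LockSupport;

public class ThrottledLoad {
    /** Nanoseconds between consecutive sends needed to hit the target rate. */
    public static long intervalNanos(long recordsPerSecond) {
        if (recordsPerSecond <= 0) {
            throw new IllegalArgumentException("recordsPerSecond must be positive");
        }
        return TimeUnit.SECONDS.toNanos(1) / recordsPerSecond;
    }

    /**
     * Calls sendOne numRecords times, pacing each call against a fixed
     * schedule (start + i * interval) so that a slow send does not push
     * every later send back. Returns the elapsed time in nanoseconds.
     */
    public static long run(long numRecords, long recordsPerSecond, Runnable sendOne) {
        long interval = intervalNanos(recordsPerSecond);
        long start = System.nanoTime();
        for (long i = 0; i < numRecords; i++) {
            long due = start + i * interval;
            long wait;
            // parkNanos may return early, so re-check until the slot is due.
            while ((wait = due - System.nanoTime()) > 0) {
                LockSupport.parkNanos(wait);
            }
            sendOne.run();
        }
        return System.nanoTime() - start;
    }
}
```

Pacing against an absolute schedule rather than sleeping a fixed interval after each send keeps the average rate close to the target even when individual sends take a variable amount of time.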

RE: Announcing the Confluent Platform built on Apache Kafka

2015-02-25 Thread Aditya Auradkar
Congrats! From: Andrew Otto [ao...@wikimedia.org] Sent: Wednesday, February 25, 2015 12:06 PM To: users@kafka.apache.org Cc: d...@kafka.apache.org Subject: Re: Announcing the Confluent Platform built on Apache Kafka Wow, .deb packages. I love you. > On

Re: Tips for working with Kafka and data streams

2015-02-25 Thread Christian Csar
I wouldn't say no to some discussion of encryption. We're running on Azure EventHubs (with preparations for Kinesis for EC2, and Kafka for deployments in customer datacenters when needed) so can't just use disk level encryption (which would have its own overhead). We're putting all of our messages

Re: non-blocking sends when cluster is down

2015-02-25 Thread Guozhang Wang
Hi Gary, The Java producer will block on send() when the buffer is full and block.on.buffer.full = true ( http://kafka.apache.org/documentation.html#newproducerconfigs). If you set the config to false the send() call will throw a BufferExhaustedException which, in your case, can be caught and igno
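
The setting described in this reply can be sketched as producer properties. This is a minimal sketch that only builds the configuration with the standard library; the class name, the broker address `localhost:9092`, and the choice of byte-array serializers are assumptions made for illustration, not details from the thread.

```java
import java.util.Properties;

public class NonBlockingProducerConfig {
    /** Builds new-producer properties with blocking on a full buffer turned off. */
    public static Properties build(String bootstrapServers) {
        Properties props = new Properties();
        props.put("bootstrap.servers", bootstrapServers);
        // Serializer choice is an assumption; any serializer class would do here.
        props.put("key.serializer", "org.apache.kafka.common.serialization.ByteArraySerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.ByteArraySerializer");
        // With this set to false, send() throws BufferExhaustedException
        // instead of blocking when the buffer is full, as the reply describes.
        props.put("block.on.buffer.full", "false");
        return props;
    }

    public static void main(String[] args) {
        // With kafka-clients on the classpath, these properties would be handed
        // to the producer constructor, and each send() call wrapped in a
        // try/catch for BufferExhaustedException.
        System.out.println(build("localhost:9092"));
    }
}
```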

Re: custom serializer and deserializer

2015-02-25 Thread Guozhang Wang
Only the consumer needs deserializer classes. The current Java consumer is still under development but when it is finished you will find the corresponding KEY_DESERIALIZER_CLASS_CONFIG / VALUE_DESERIALIZER_CLASS_CONFIG in ConsumerConfig. Guozhang On Wed, Feb 25, 2015 at 4:54 AM, ankit tyagi wrot
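
To show how a serializer on the producer side pairs with a deserializer on the consumer side, here is a standard-library-only sketch. The two interfaces are hypothetical stand-ins that mirror the general shape of a Kafka-style serializer and deserializer; they are not the Kafka classes themselves, and UTF-8 strings stand in for protobuf messages.

```java
import java.nio.charset.StandardCharsets;

public class EventCodec {
    /** Hypothetical stand-in for a producer-side serializer. */
    public interface SimpleSerializer<T> {
        byte[] serialize(String topic, T data);
    }

    /** Hypothetical stand-in for a consumer-side deserializer. */
    public interface SimpleDeserializer<T> {
        T deserialize(String topic, byte[] data);
    }

    public static final class StringEventSerializer implements SimpleSerializer<String> {
        @Override
        public byte[] serialize(String topic, String data) {
            // A protobuf version would return message.toByteArray() here.
            return data == null ? null : data.getBytes(StandardCharsets.UTF_8);
        }
    }

    public static final class StringEventDeserializer implements SimpleDeserializer<String> {
        @Override
        public String deserialize(String topic, byte[] data) {
            // A protobuf version would call the generated class's parseFrom(data) here.
            return data == null ? null : new String(data, StandardCharsets.UTF_8);
        }
    }
}
```

The point of the pairing is that the deserializer must be the exact inverse of the serializer, including for null payloads, since the two run in different processes and share nothing but the bytes.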

Re: Tips for working with Kafka and data streams

2015-02-25 Thread Jay Kreps
Hey Christian, That makes sense. I agree that would be a good area to dive into. Are you primarily interested in network level security or encryption on disk? -Jay On Wed, Feb 25, 2015 at 1:38 PM, Christian Csar wrote: > I wouldn't say no to some discussion of encryption. We're running on Azur

Broker Exceptions

2015-02-25 Thread Zakee
Need to know if I should be worried about these or ignore them. I see tons of these exceptions/warnings in the broker logs, not sure what causes them and what could be done to fix them. ERROR [ReplicaFetcherThread-3-5], Error for partition [TestTopic] to broker 5:class kafka.common.NotLeaderForP

Re: broker restart problems

2015-02-25 Thread Zakee
Do you have the property auto.leader.rebalance.enable=true set in brokers? Thanks -Zakee On Tue, Feb 24, 2015 at 11:47 PM, ZhuGe wrote: > Hi all:We have a cluster of 3 brokers(id : 0,1,2). We restart(simply use > stop.sh and start.sh in bin directory) broker 1. The broker started > successfully

Re: Broker Exceptions

2015-02-25 Thread Jiangjie Qin
These messages are usually caused by leader migration. I think as long as you don't see this lasting forever and get a bunch of under-replicated partitions, it should be fine. Jiangjie (Becket) Qin On 2/25/15, 4:07 PM, "Zakee" wrote: >Need to know if I should I be worried about this or ignore

Re: Tips for working with Kafka and data streams

2015-02-25 Thread Tong Li
+2, these kinds of articles coming from the ones who created Kafka always provide great value to Kafka users and developers. For my 2 cents, I would love to see one or two articles for developers who are involved in Kafka development on the topics of how to develop test cases and how to run them, what

Re: Tips for working with Kafka and data streams

2015-02-25 Thread Christian Csar
The questions we get from customers typically end up being general so we break out our answer into network level and on disk scenarios. On disk/at rest scenario may just be use full disk encryption at the OS level and Kafka doesn't need to worry about it. But documenting any issues around it would

Re: Broker Exceptions

2015-02-25 Thread Zakee
Thanks, Jiangjie. Yes, I do see under-replicated partitions usually shooting up every hour. Anything that I could try to reduce it? How does "num.replica.fetchers" affect the replica sync? Currently have configured 7 on each of 5 brokers. -Zakee On Wed, Feb 25, 2015 at 4:17 PM, Jiangjie Qin wrote: > These me

Re: Broker Exceptions

2015-02-25 Thread Jiangjie Qin
I don’t think num.replica.fetchers will help in this case. Increasing number of fetcher threads will only help in cases where you have a large amount of data coming into a broker and more replica fetcher threads will help keep up. We usually only use 1-2 for each broker. But in your case, it looks

Re: Tips for working with Kafka and data streams

2015-02-25 Thread Julio Castillo
Although full disk encryption appears to be an easy solution, in our case that may not be sufficient. For cases where the actual payload needs to be encrypted, the cost of encryption is paid by the consumer and producers. Further complicating the matter would be the handling of encryption keys, etc

RE: broker restart problems

2015-02-25 Thread ZhuGe
We did not have this setting in the property file, so it should be false. BTW, does this setting mean periodically invoking the 'preferred replica leader election tool'? And how should I solve the "out of sync" problem of the broker? > Date: Wed, 25 Feb 2015 16:09:42 -0800 > Subject: Re: broker restart

Re: Tips for working with Kafka and data streams

2015-02-25 Thread Christian Csar
Yeah, we do have scenarios where we use customer specific keys so our envelopes end up containing key identification information for accessing our key repository. I'll certainly follow any changes you propose in this area with interest, but I'd expect that sort of centralized key thing to be fairly

Re: How to measure performance metrics

2015-02-25 Thread Otis Gospodnetic
Have a look at http://blog.sematext.com/2015/02/10/kafka-0-8-2-monitoring/ There are also various open-source projects. Otis -- Monitoring * Alerting * Anomaly Detection * Centralized Log Management Solr & Elasticsearch Support * http://sematext.com/ On Wed, Feb 25, 2015 at 12:40 AM, Bhuvana Bas

Re: NetworkProcessorAvgIdlePercent

2015-02-25 Thread Jun Rao
Then you may want to consider increasing num.io.threads and num.network.threads. Thanks, Jun On Tue, Feb 24, 2015 at 7:48 PM, Zakee wrote: > Similar pattern for that too. Mostly hovering below. > > -Zakee > > On Tue, Feb 24, 2015 at 2:43 PM, Jun Rao wrote: > > > What about RequestHandlerAvgId

Re: NetworkProcessorAvgIdlePercent

2015-02-25 Thread Zakee
Well, currently I have configured 14 threads for both io and network. Do you think we should consider more? Thanks -Zakee On Wed, Feb 25, 2015 at 6:22 PM, Jun Rao wrote: > Then you may want to consider increasing num.io.threads > and num.network.threads. > > Thanks, > > Jun > > On Tue, Feb 24, 20

Re: Latest offset is frozen

2015-02-25 Thread Stuart Reynolds
Doh! Was assuming there was only 1 partition... Need to read all the partitions. On Mon, Feb 23, 2015 at 3:21 PM, Jun Rao wrote: > Hmm, when the tail offset is frozen, does it freeze forever? Also, do you > get the same frozen offset if you run the GetOffsetShell command? > > Thanks, > > Jun > >