Re: Cache Memory Kafka Process

2015-07-27 Thread Nilesh Chhapru
Hi Ewen, I am using 3 brokers with 12 topics and about 120-125 partitions, without any replication, and the message size is approx 15MB/message. The problem is that when the cache memory increases and reaches the maximum available, performance starts degrading. Also, I am using a Storm spout as consume

Number of kafka topics/partitions supported per cluster of n nodes

2015-07-27 Thread Prabhjot Bharaj
Hi, I'm looking for a benchmark that explains how many topics and partitions in total can be created in a cluster of n nodes, given that the message size varies between x and y bytes; how this varies with heap size; and how it affects system performance. e.g. the resu

Re: Cache Memory Kafka Process

2015-07-27 Thread Daniel Compton
http://www.linuxatemyram.com may be a helpful resource to explain this better. On Tue, 28 Jul 2015 at 5:32 AM Ewen Cheslack-Postava wrote: > Having the OS cache the data in Kafka's log files is useful since it means > that data doesn't need to be read back from disk when consumed. This is > good

Re: multiple producer throughput

2015-07-27 Thread Yuheng Du
The message size is 100 bytes and each producer sends out 50million messages. It's the number used by the "benchmarking kafka" post. http://engineering.linkedin.com/kafka/benchmarking-apache-kafka-2-million-writes-second-three-cheap-machines Thanks. On Mon, Jul 27, 2015 at 4:15 PM, Prabhjot Bhara
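The volume implied by those numbers can be sanity-checked with quick arithmetic (figures taken from the thread; the 13 MB/s rate is the 20-node throughput reported in the thread and is used here only as an illustration):

```shell
# Back-of-envelope check of the benchmark volume described in the thread.
MESSAGES=50000000        # messages per producer
MSG_BYTES=100            # bytes per message
TOTAL_BYTES=$((MESSAGES * MSG_BYTES))
echo "per-producer data: $TOTAL_BYTES bytes (~4.7 GiB)"
# At an illustrative 13 MB/s, sending that volume takes roughly:
awk -v b="$TOTAL_BYTES" 'BEGIN { printf "%.0f seconds\n", b / 13000000 }'
```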

Re: multiple producer throughput

2015-07-27 Thread Prabhjot Bharaj
Hi, Have you tried with acks=1 and -1 as well? Please share the numbers and the message size Regards, Prabcs On Jul 27, 2015 10:24 PM, "Yuheng Du" wrote: > Hi, > > I am running 40 producers on 40 nodes cluster. The messages are sent to 6 > brokers in another cluster. The producers are running P
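For reference, the acks values being compared are a producer-side setting; a minimal sketch of the three durability levels (property name from the new producer config, value illustrative):

```properties
# acks=0  : producer does not wait for any acknowledgement (fastest, least durable)
# acks=1  : leader acknowledges the write before responding
# acks=-1 : ("all") leader waits for the full in-sync replica set (slowest, most durable)
acks=1
```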

Re: Controlled Shutdown Tool?

2015-07-27 Thread Andrew Otto
Ah, thank you, SIGTERM is what I was looking for. The docs are unclear on that; it would be useful to fix them. Thanks! > On Jul 27, 2015, at 14:59, Binh Nguyen Van wrote: > > You can initiate controlled shutdown by run bin/kafka-server-stop.sh. This > will send a SIGTERM to broker to tell

Re: Controlled Shutdown Tool?

2015-07-27 Thread Binh Nguyen Van
You can initiate controlled shutdown by running bin/kafka-server-stop.sh. This will send a SIGTERM to the broker to tell it to do the controlled shutdown. I also got confused before and had to look at the code to figure that out. I think it would be better if we added this to the documentation. -Binh On Mon, Jul
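A sketch of the two equivalent ways to trigger this (paths assume a standard Kafka distribution layout; KAFKA_PID is hypothetical):

```shell
# Option 1: the bundled script locates the broker process and sends it SIGTERM.
bin/kafka-server-stop.sh

# Option 2: send SIGTERM yourself (find the pid with jps or ps).
kill -SIGTERM "$KAFKA_PID"
# With controlled.shutdown.enable=true, the broker migrates partition
# leadership to other replicas before exiting.
```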

Re: Controlled Shutdown Tool?

2015-07-27 Thread Sriharsha Chintalapani
Controlled shutdown is built into the broker. When this config is set to true, the broker makes a request to the controller to initiate the controlled shutdown, waits until the request succeeds, and in case of failure retries the shutdown controlled.shutdown.max.retries times. https://github.com/apache/kafka/blob/0.8.

Re: Controlled Shutdown Tool?

2015-07-27 Thread Andrew Otto
Thanks! But how do I initiate a controlled shutdown on a running broker? Editing server.properties is not going to cause this to happen. Don’t I have to tell the broker to shut down nicely? All I really want to do is tell the controller to move leadership to other replicas, so I can shutdown

Re: Controlled Shutdown Tool?

2015-07-27 Thread Sriharsha Chintalapani
You can set controlled.shutdown.enable to true in Kafka’s server.properties; this is enabled by default from 0.8.2 onwards. You can also set the maximum retries using controlled.shutdown.max.retries, which defaults to 3. Thanks, Harsha On July 27, 2015 at 11:42:32 AM, Andrew Otto (ao...@wikimedia.org)
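The settings described above go in the broker's server.properties; a minimal fragment with the defaults discussed in the thread (the backoff setting is an additional related knob, included here as an assumption worth verifying against your version's docs):

```properties
# server.properties (enabled out of the box from 0.8.2 onwards)
controlled.shutdown.enable=true
controlled.shutdown.max.retries=3
controlled.shutdown.retry.backoff.ms=5000
```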

Controlled Shutdown Tool?

2015-07-27 Thread Andrew Otto
I’m working on packaging 0.8.2.1 for Wikimedia, and in doing so I’ve noticed that kafka.admin.ShutdownBroker doesn’t exist anymore. From what I can tell, this has been intentionally removed in favor of a JMX(?) config “controlled.shutdown.enable”. It is unclear from the documentation how one i

Re: Log Deletion Behavior

2015-07-27 Thread Mayuresh Gharat
Hi Jiefu, The topic will stay forever. You can run a delete-topic operation to get rid of the topic. Thanks, Mayuresh On Mon, Jul 27, 2015 at 11:19 AM, JIEFU GONG wrote: > Mayuresh, > > Yes, it seems like I misunderstood the behavior of log deletion but indeed > my log segments were deleted afte
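The delete-topic operation mentioned above can be sketched as follows (topic name and ZooKeeper address are hypothetical; on the Kafka versions of this era, deletion also requires delete.topic.enable=true on the brokers):

```shell
# Delete a topic; brokers must have delete.topic.enable=true for this to take effect.
bin/kafka-topics.sh --delete --zookeeper localhost:2181 --topic my-topic
```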

Re: Log Deletion Behavior

2015-07-27 Thread JIEFU GONG
Mayuresh, Yes, it seems like I misunderstood the behavior of log deletion but indeed my log segments were deleted after a specified amount of time. I have a small follow-up question, it seems that when the logs are deleted the topic persists and can be republished too -- is there a configuration f

Re: deleting data automatically

2015-07-27 Thread Yuheng Du
Thank you! On Mon, Jul 27, 2015 at 1:43 PM, Ewen Cheslack-Postava wrote: > As I mentioned, adjusting any settings such that files are small enough > that you don't get the benefits of append-only writes or file > creation/deletion become a bottleneck might affect performance. It looks > like the

Re: deleting data automatically

2015-07-27 Thread Ewen Cheslack-Postava
As I mentioned, adjusting any settings such that files are small enough that you don't get the benefits of append-only writes or file creation/deletion become a bottleneck might affect performance. It looks like the default setting for log.segment.bytes is 1GB, so given fast enough cleanup of old l
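The settings under discussion, with illustrative values (defaults as stated in the thread and the 0.8.x docs; verify against your version before relying on them):

```properties
log.segment.bytes=1073741824            # 1 GiB per segment (default)
log.retention.check.interval.ms=300000  # how often to check for deletable segments
log.retention.hours=168                 # delete segments older than 7 days (default)
```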

Re: Best practices - Using kafka (with http server) as source-of-truth

2015-07-27 Thread Ewen Cheslack-Postava
Hi Prabhjot, Confluent has a REST proxy with docs that may give some guidance: http://docs.confluent.io/1.0/kafka-rest/docs/intro.html The new producer that it uses is very efficient, so you should be able to get pretty good throughput. You take a bit of a hit due to the overhead of sending data t

Re: deleting data automatically

2015-07-27 Thread Yuheng Du
Thank you! What performance impact will there be if I change log.segment.bytes? Thanks. On Mon, Jul 27, 2015 at 1:25 PM, Ewen Cheslack-Postava wrote: > I think log.cleanup.interval.mins was removed in the first 0.8 release. It > sounds like you're looking at outdated docs. Search for > log.retenti

Re: Cache Memory Kafka Process

2015-07-27 Thread Ewen Cheslack-Postava
Having the OS cache the data in Kafka's log files is useful since it means that data doesn't need to be read back from disk when consumed. This is good for the latency and throughput of consumers. Usually this caching works out pretty well, keeping the latest data from your topics in cache and only
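On Linux, the distinction Ewen describes is visible directly in `free`, which separates page cache from application memory (this also answers the cron question from the original post):

```shell
# Memory listed under buff/cache is the OS page cache holding, among other
# things, Kafka's log segments. It is reclaimed automatically under pressure.
free -m
# Dropping it by hand (sync; echo 1 > /proc/sys/vm/drop_caches) just forces
# consumers to re-read from disk and is generally counterproductive for Kafka.
```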

Re: deleting data automatically

2015-07-27 Thread Ewen Cheslack-Postava
I think log.cleanup.interval.mins was removed in the first 0.8 release. It sounds like you're looking at outdated docs. Search for log.retention.check.interval.ms here: http://kafka.apache.org/documentation.html As for setting the values too low hurting performance, I'd guess it's probably only an

Re: deleting data automatically

2015-07-27 Thread Yuheng Du
If I want to get higher throughput, should I increase the log.segment.bytes? I don't see log.retention.check.interval.ms, but there is log.cleanup.interval.mins, is that what you mean? If I set log.roll.ms or log.cleanup.interval.mins too small, will it hurt the throughput? Thanks. On Fri, Jul 2

multiple producer throughput

2015-07-27 Thread Yuheng Du
Hi, I am running 40 producers on a 40-node cluster. The messages are sent to 6 brokers in another cluster. The producers are running the ProducerPerformance test. When 20 nodes are running, the throughput is around 13MB/s, and when running 40 nodes, the throughput is around 9MB/s. I have set log.reten
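The ProducerPerformance run described above looks roughly like this per node (flag names as in the 0.8.x perf tool and broker addresses hypothetical; verify with bin/kafka-producer-perf-test.sh --help on your version):

```shell
# One producer node's share of the benchmark: 50M messages of 100 bytes each.
bin/kafka-producer-perf-test.sh \
  --broker-list broker1:9092,broker2:9092 \
  --topics perf-test \
  --messages 50000000 \
  --message-size 100 \
  --threads 1
```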

Re: New consumer - offset one gets in poll is not offset one is supposed to commit

2015-07-27 Thread Jason Gustafson
Hey Stevo, I agree that it's a little unintuitive that what you are committing is the next offset that should be read from and not the one that has already been read. We're probably constrained in that we already have a consumer which implements this behavior. Would it help if we added a method on

Re: Log Deletion Behavior

2015-07-27 Thread Mayuresh Gharat
Hi Jiefu, Any update on this? Were you able to delete those log segments? Thanks, Mayuresh On Fri, Jul 24, 2015 at 7:14 PM, Mayuresh Gharat wrote: > To add on, the main thing here is you should be using only one of these > properties. > > Thanks, > > Mayuresh > > On Fri, Jul 24, 2015 at 6:47

Java API for fetching Consumer group from Kafka Server(Not Zookeeper)

2015-07-27 Thread swati.suman2
Hi Jiangjie, kafka.admin.ConsumerGroupCommand is a Scala class. Could you please point me to a Java API for fetching consumer groups from the Kafka server? Best Regards, Swati Suman

Best practices - Using kafka (with http server) as source-of-truth

2015-07-27 Thread Prabhjot Bharaj
Hi Folks, I would like to understand the best practices when using Kafka as the source of truth, given that I want to pump data into Kafka using HTTP methods. What are the current production configurations for such a use case:- 1. Kafka-http-client - is it scalable the way Nginx is?

Cache Memory Kafka Process

2015-07-27 Thread Nilesh Chhapru
Hi All, I am facing issues with the Kafka broker process taking a lot of cache memory. I just wanted to know if the process really needs that much cache memory, or whether I can clear the OS-level cache by setting a cron. Regards, Nilesh Chhapru.

Re: Choosing brokers when creating topics

2015-07-27 Thread Jilin Xie
Hi Ewen, Thanks for your reply. I've been using the kafka-reassign-partitions tool. But --replica-assignment is exactly what I'm looking for. Thanks On Mon, Jul 27, 2015 at 3:58 PM, Ewen Cheslack-Postava wrote: > Try the --replica-assignment option for kafka-topics.s

Re: Choosing brokers when creating topics

2015-07-27 Thread Ewen Cheslack-Postava
Try the --replica-assignment option for kafka-topics.sh. It allows you to specify which brokers to assign as replicas instead of relying on the assignments being made automatically. -Ewen On Mon, Jul 27, 2015 at 12:25 AM, Jilin Xie wrote: > Hi > Is it possible to choose which brokers to u
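A sketch of the option Ewen describes (topic name and broker ids 1-4 are hypothetical; the format is one comma-separated entry per partition, with replica broker ids joined by colons):

```shell
# Create a topic with 2 partitions, pinning replicas to chosen broker ids:
# partition 0 -> replicas on brokers 1,2 ; partition 1 -> replicas on brokers 3,4
bin/kafka-topics.sh --create --zookeeper localhost:2181 --topic pinned-topic \
  --replica-assignment 1:2,3:4
```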

Choosing brokers when creating topics

2015-07-27 Thread Jilin Xie
Hi Is it possible to choose which brokers to use when creating a topic? The general command of creating topic is: *bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic test* What I'm looking for is: *bin/kafka-topics.sh --create . "--br