Re: large amount of disk space freed on restart

2013-05-23 Thread Jun Rao
I haven't seen this issue before. We do have ~1K topics in one of the Kafka clusters at LinkedIn. Thanks, Jun On Thu, May 23, 2013 at 11:05 AM, Jason Rosenberg wrote: > Yeah, that's what it looks like to me (looking at the code). So, I'm > guessing it's some os level caching, resource recycl

Re: Partitioning and scale

2013-05-23 Thread Milind Parikh
The number of files the OS has to manage, I suppose. Why wouldn't you use consistent hashing with deliberately engineered collisions to generate a limited number of topics / partitions and filter at the consumer level? Regards Milind On May 23, 2013 4:22 PM, "Timothy Chen" wrote: > Hi Neha, > > Not sure
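The idea above (a bounded number of partitions, with filtering on the consumer side) can be sketched roughly as follows. This is only an illustration: `SessionBucketing`, `partitionFor`, and `payloadsFor` are hypothetical names, not Kafka APIs, and the sketch uses plain hash-modulo bucketing rather than a true consistent-hash ring, so changing the partition count would remap most sessions.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.CRC32;

// Hypothetical helper, not part of Kafka: maps many session ids onto a
// small, fixed set of partitions and filters on the consumer side.
final class SessionBucketing {

    // Deterministically map a session id to one of numPartitions buckets.
    // Many sessions deliberately collide on the same partition.
    static int partitionFor(String sessionId, int numPartitions) {
        CRC32 crc = new CRC32();
        crc.update(sessionId.getBytes(StandardCharsets.UTF_8));
        // CRC32.getValue() is a non-negative long, so the modulo is in range.
        return (int) (crc.getValue() % numPartitions);
    }

    // Consumer-side filter: each record is {sessionId, payload}; keep only
    // the payloads that belong to the session we care about.
    static List<String> payloadsFor(String targetSessionId, List<String[]> records) {
        List<String> result = new ArrayList<>();
        for (String[] record : records) {
            if (record[0].equals(targetSessionId)) {
                result.add(record[1]);
            }
        }
        return result;
    }
}
```

Because the mapping is deterministic, all events for one session land on the same partition, which keeps per-session ordering, while the number of files on the broker stays bounded by the partition count rather than by the number of sessions.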

Re: Partitioning and scale

2013-05-23 Thread Timothy Chen
Hi Neha, Not sure if this sounds crazy, but if we'd like to have the events for the same session id go to the same partition, one way could be that each session key creates its own topic with a single partition; there could therefore be millions of topics, each with a single partition. I wonder what would be

Re: consumer offset not saved in zk

2013-05-23 Thread rk vishu
Neha, I see the point. I verified the zkclient version from the Kafka build and found that it is 0.2. I updated my client app's POM to include the following (corrected from 0.1): <dependency> <groupId>com.101tec</groupId> <artifactId>zkclient</artifactId> <version>0.2</version> </dependency> Thank you very much for the inputs and help. On Thu, May 23, 2013 at 3:23 PM, Neha N

Re: consumer offset not saved in zk

2013-05-23 Thread Neha Narkhede
Can you please try the following - ./sbt clean assembly-package-dependency package Thanks, Neha On Thu, May 23, 2013 at 3:16 PM, rk vishu wrote: > Neha, > > Thanks for pointing out the log4j. I turned on logs at INFO level. Now I > see some warnings as below. > > WARN [Kafka-consumer-autocom

Re: consumer offset not saved in zk

2013-05-23 Thread rk vishu
Neha, Thanks for pointing out the log4j. I turned on logs at INFO level. Now I see some warnings as below. WARN [Kafka-consumer-autocommit-1] (Logging.scala:88) - [1_BELC02K41GGDKQ4.sea.corp.expecn.com-1369346576173-22148419], exception during commitOffsets java.lang.NoSuchMethodError: org.I0Itec

Re: More Kafka Benchmarking Goodness

2013-05-23 Thread Neha Narkhede
Thanks for sharing this, Jason. I had performed similar benchmarks back in 2011 (Results are on slide 27 here - https://cwiki.apache.org/confluence/download/attachments/2786/F_1330_Narkhede_Kafka+%281%29.pptx?version=1&modificationDate=1358352559000). My tests saturated the 1Gb network link bet

Re: consumer offset not saved in zk

2013-05-23 Thread Neha Narkhede
You don't want to override the default configs. Also, it seems like something else is wrong with your setup. Could you share the log4j logs of your consumer? Meanwhile, can you check whether you can use the console consumer successfully? Thanks, Neha On Thu, May 23, 2013 at 2:31 PM, rk vishu wrote: >

Re: heterogenous kafka cluster?

2013-05-23 Thread Maxime Brugidou
Have you thought about integrating Kafka into a distributed resource management framework like Hadoop YARN (which would probably leverage HDFS) or Mesos? On May 23, 2013 11:31 PM, "Neha Narkhede" wrote: > This paper talks about how to do that - > http://www.ssrc.ucsc.edu/Papers/weil-sc06.pdf > It

Re: consumer offset not saved in zk

2013-05-23 Thread rk vishu
My ZK directory listing is as below. Looks like the offsets path is not even created. [zk: localhost:2181(CONNECTED) 0] ls / [hadoop-ha, hbase, zookeeper, consumers, controller, storm, brokers, controller_epoch] [zk: localhost:2181(CONNECTED) 1] ls /consumers [1, das-service] [zk: localhost:2181(CONNEC

Re: heterogenous kafka cluster?

2013-05-23 Thread Neha Narkhede
This paper talks about how to do that - http://www.ssrc.ucsc.edu/Papers/weil-sc06.pdf It will be interesting to see what part of it Kafka can adopt, if any. Thanks, Neha On Fri, May 17, 2013 at 11:28 PM, Jason Rosenberg wrote: > Letting each broker have a weight sounds like a great idea. > > S

Re: consumer offset not saved in zk

2013-05-23 Thread rk vishu
Neha, below are my properties. I tried adding consumer.timeout.ms=3000 or 1 also. Properties props = new Properties(); props.put("zookeeper.connect", a_zookeeper); props.put("group.id", "1"); props.put("zookeeper.session.timeout.ms", "4000"); props.put("zookee

Re: Failed to launch 0.8 stable server

2013-05-23 Thread Neha Narkhede
Even if they are wrong, were you pointing to a zookeeper cluster where another Kafka cluster was started up? Thanks Neha On Thu, May 23, 2013 at 7:53 AM, Yu, Libo wrote: > I will answer my own question. It turns out the zookeeper addresses in > server.properties are > wrong. But it is not ea

Re: More Kafka Benchmarking Goodness

2013-05-23 Thread Oleg Ruchovets
Can you please share the Kafka version you used for the tests? On Thu, May 23, 2013 at 8:57 PM, Jason Weiss wrote: > Folks, > > As I posted to the group here yesterday, my 3 server test in AWS produced > an average of 273,132 events per second with a fixed-size 2K message > payload. (Please see

Re: large amount of disk space freed on restart

2013-05-23 Thread Jason Rosenberg
Yeah, that's what it looks like to me (looking at the code). So, I'm guessing it's some os level caching, resource recycling. Have you ever heard of this happening? One thing that might be different in my usage from the norm is a relatively large number of topics (e.g. ~2K topics). Jason On T

More Kafka Benchmarking Goodness

2013-05-23 Thread Jason Weiss
Folks, As I posted to the group here yesterday, my 3 server test in AWS produced an average of 273,132 events per second with a fixed-size 2K message payload. (Please see that thread for details.) In order to determine the horizontal scalability, I added an additional server and MORE producer c
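To put the quoted event rate in bandwidth terms, here is a small worked calculation. It is a sketch under stated assumptions: "2K" is taken to mean 2,048 bytes, only payload bytes are counted (no protocol or replication overhead), and the figure is the aggregate across the test, not per server. `ThroughputEstimate` is a hypothetical name.

```java
// Hypothetical helper for back-of-the-envelope throughput arithmetic.
final class ThroughputEstimate {

    // Payload bytes moved per second at a given event rate.
    static long bytesPerSecond(long eventsPerSecond, long payloadBytes) {
        return eventsPerSecond * payloadBytes;
    }

    // Convert bytes per second to decimal gigabits per second.
    static double gigabitsPerSecond(long bytesPerSecond) {
        return bytesPerSecond * 8 / 1e9;
    }

    public static void main(String[] args) {
        // 273,132 events/s comes from the post; 2,048 bytes is an assumption.
        long bps = bytesPerSecond(273_132L, 2_048L);
        System.out.printf("%d bytes/s, about %.2f Gbit/s%n", bps, gigabitsPerSecond(bps));
    }
}
```

Under these assumptions the aggregate payload rate works out to 559,374,336 bytes per second, or roughly 4.47 Gbit/s.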

Re: orders of launching kafka servers and zookeepers

2013-05-23 Thread Marc Labbe
For the sake of the discussion and for others reading this... In a live/production environment, I guess it is safe to say that if ZK is down for any period of time, the best bet is to also stop Kafka and restart it once ZK is back up? On Thu, May 23, 2013 at 12:26 PM, Neha Narkhede wrote: > If

Re: kafka 0.8 - unable to run most scala test cases

2013-05-23 Thread Rob Withers
Yes, Neha. Balaji figured out the issue. src/test/scala was set up as a source folder when the sbt plugin approach to developer setup is used. This is incorrect. It ought to be src/test/scala/unit that is set up as a source folder. All 131 tests pass green, now. thanks, rob On May 23,

Re: kafka 0.8 - unable to run most scala test cases

2013-05-23 Thread Neha Narkhede
You can find all unit tests under core/test. Just look for *Test.scala. Thanks, Neha On May 23, 2013 9:59 AM, "Rob Withers" wrote: > I am using 0.8 in eclipse. I used the approach suggested to use the sbt > plugin and that works great (thank you to whomever recommended that). It > pulled Scala

RE: eclipse project/classpath files for 0.8?

2013-05-23 Thread Withers, Robert
Ahh, yes, Andrea - it was you who suggested to me to use the sbt plugin approach. I finally got around to it and this worked perfectly the first time, other than the scala unit tests. Thanks very much for explaining it to me. Thanks, rob -Original Message- From: Andrea Gazzarini [m

kafka 0.8 - unable to run most scala test cases

2013-05-23 Thread Rob Withers
I am using 0.8 in eclipse. I used the approach suggested to use the sbt plugin and that works great (thank you to whomever recommended that). It pulled Scala 2.9.3. However, I am only able to run 2 common scala tests: TopicTest and ConfigTest. How can I find and run the other scala tests? t

Re: orders of launching kafka servers and zookeepers

2013-05-23 Thread Neha Narkhede
If you merely do a rolling bounce of the zookeeper cluster while keeping a quorum, Kafka will recover automatically. Thanks, Neha On May 23, 2013 9:21 AM, "Marc Labbe" wrote: > Thanks for the answer, I was looking for this information on my side as > well. > > If, for some reason, the ZK cluster restarts

Re: orders of launching kafka servers and zookeepers

2013-05-23 Thread Marc Labbe
Thanks for the answer, I was looking for this information on my side as well. If, for some reason, the ZK cluster restarts completely, how should we deal with Kafka? Should we restart it, stop it before the ZK restart or will Kafka recover automatically? This is mainly a question for a constantly

Re: Offset in high level consumer

2013-05-23 Thread Neha Narkhede
The other option is the JMX bean that exposes the lag. Also, Kafka provides at-least-once guarantees, so even if your consumer lags occasionally, you will eventually receive all messages. You need to provision enough consumers so that they don't fall behind. Thanks, Neha On May 23, 2013 5:30 AM, "arathi

Re: are topics and partitions dynamic?

2013-05-23 Thread Neha Narkhede
You can specify it in server.properties and set it to true. For more information on 0.8 configs, look here http://kafka.apache.org/08/configuration.html Thanks, Neha On May 23, 2013 8:04 AM, "Noel Golding" wrote: > I am currently using kafka-0.8.0. I don't see a reference to > auto.create.topics

Re: are topics and partitions dynamic?

2013-05-23 Thread Noel Golding
I am currently using kafka-0.8.0. I don't see a reference to auto.create.topics.enable in server.properties. Can you tell me if that is the file where I should be adding this property? Also, what are the acceptable values, e.g. 1, 0, true, false? Thanks in advance -Noel

RE: Failed to launch 0.8 stable server

2013-05-23 Thread Yu, Libo
I will answer my own question. It turns out the zookeeper addresses in server.properties are wrong. But it is not easy to tell from the error message. Regards, Libo From: Yu, Libo [ICG-IT] Sent: Thursday, May 23, 2013 10:48 AM To: 'users@kafka.apache.org' Subject: Failed to launch 0.8 stable s

Failed to launch 0.8 stable server

2013-05-23 Thread Yu, Libo
Hi, After the zookeepers were up, when I tried to launch the kafka server, I got this error: [2013-05-23 10:45:24,072] FATAL Fatal error during KafkaServerStable startup. Prepare to shutdown (kafka.server.KafkaServerStartable) java.lang.RuntimeException: A broker is already registered on the path /

Re: large amount of disk space freed on restart

2013-05-23 Thread Jun Rao
Jason, Kafka closes the file handles of all deleted files. Otherwise, the broker would quickly run out of file handles. Thanks, Jun On Wed, May 22, 2013 at 10:17 PM, Jason Rosenberg wrote: > So, does this indicate kafka (or the jvm itself) is not aggressively > closing file handles of deleted files

Re: Apache Kafka in AWS

2013-05-23 Thread Jason Weiss
Bummer. Yes, but it will be several days. I'll post back to the forum with a URL once I'm done. Jason On 5/23/13 10:11 AM, "Jun Rao" wrote: >Jason, > >Unfortunately, Apache mailing lists don't support attachments. Could you >document your experience (with the graphs) in a blog (or a wiki pag

Re: Apache Kafka in AWS

2013-05-23 Thread Jun Rao
Jason, Unfortunately, Apache mailing lists don't support attachments. Could you document your experience (with the graphs) in a blog (or a wiki page in Kafka)? Thanks, Jun On Thu, May 23, 2013 at 2:00 AM, Jason Weiss wrote: > Jun, > > Here is a screenshot from AWS's statistics (per-minute sa

Re: consumer offset not saved in zk

2013-05-23 Thread Jun Rao
You are looking at the wrong path in ZK. The correct path for consumer offset is /consumers/[groupId]/offsets/[topic]/[partitionId] -> long (offset). For more details on our ZK layout, see https://cwiki.apache.org/confluence/display/KAFKA/Kafka+data+structures+in+Zookeeper Thanks, Jun On Wed, M
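The path layout described above can be written out as a tiny helper, which may make it easier to check the right znode from a ZooKeeper shell. This is a minimal sketch: `ConsumerOffsetPath` and `offsetPath` are hypothetical names, not part of Kafka, and the topic name in the usage note is made up.

```java
// Hypothetical helper: builds the ZooKeeper path under which the
// high-level consumer stores its offset, following the layout
// /consumers/[groupId]/offsets/[topic]/[partitionId] quoted above.
final class ConsumerOffsetPath {

    static String offsetPath(String groupId, String topic, int partitionId) {
        return "/consumers/" + groupId + "/offsets/" + topic + "/" + partitionId;
    }
}
```

For example, with group id `1` and an assumed topic named `events`, partition 0 would be looked up at `/consumers/1/offsets/events/0`; the value stored there is the offset as a long.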

Re: consumer offset not saved in zk

2013-05-23 Thread Neha Narkhede
I suspect you had auto.commit.enable=false and consumer.timeout.ms=1. Can you confirm the values for the above configs in your example? Thanks, Neha On May 22, 2013 11:22 PM, "rk vishu" wrote: > Hello All, > > I recently started experimenting Kafka for my usecase. I am running 0.8 in > two n
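One way to rule out the two settings mentioned above is to inspect the consumer `Properties` before they are handed to the consumer. The sketch below is an assumption-laden illustration: `ConsumerConfigSketch` and its methods are hypothetical, and the fallback values (`true` for `auto.commit.enable`, `-1` for `consumer.timeout.ms`) are what I understand the 0.8 defaults to be; please verify them against the 0.8 configuration page.

```java
import java.util.Properties;

// Hypothetical helper, not part of Kafka: builds a minimal consumer
// configuration and checks the two settings under suspicion.
final class ConsumerConfigSketch {

    // Only the required settings are set; everything else keeps its default.
    static Properties build(String zookeeperConnect, String groupId) {
        Properties props = new Properties();
        props.setProperty("zookeeper.connect", zookeeperConnect);
        props.setProperty("group.id", groupId);
        return props;
    }

    // Offsets are only committed automatically when this is true
    // (assumed default: true when the key is absent).
    static boolean autoCommitEnabled(Properties props) {
        return Boolean.parseBoolean(props.getProperty("auto.commit.enable", "true"));
    }

    // Consumer timeout in milliseconds (assumed default: -1, block indefinitely).
    static long consumerTimeoutMs(Properties props) {
        return Long.parseLong(props.getProperty("consumer.timeout.ms", "-1"));
    }
}
```

If `autoCommitEnabled` returns false, or `consumerTimeoutMs` returns a very small positive value such as 1, the consumer may stop or never commit before any offset reaches ZooKeeper, which would match the symptom in this thread.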

Re: Offset in high level consumer

2013-05-23 Thread arathi maddula
Hi Neha, Thanks for the quick reply. Could you tell me if there is some way of determining the offset for a consumer from a high level Java consumer class apart from ConsumerOffsetChecker tool? This tool can be run only from the command line. Is it possible to use this in a Java class? I write st

RE: Apache Kafka in AWS

2013-05-23 Thread Jason Weiss
Jun, Here is a screenshot from AWS's statistics (per-minute sampling is the finest granularity I believe that they chart). I don't have a screenshot of the top output. This shows when I added a 4th machine to the cluster with the same number of clients, my CPU utilization fell, but remained co