Re: SocketTimeoutException with kafka-producer-perf-test.sh

2015-08-21 Thread Prabhjot Bharaj
Hi, Never mind. I have solved this problem by referring to an earlier question, where Neha had suggested using a higher value for request-timeout-ms. Regards, prabcs
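For reference, a sketch of raising that timeout on the old perf tool (flag names as in the 0.8.x ProducerPerformance options, worth checking against --help on your build; the broker list and topic are placeholders):

    # raise the produce request timeout from the default to 30s
    bin/kafka-producer-perf-test.sh \
      --broker-list broker1:9092,broker2:9092 \
      --topics perf-test \
      --messages 1000000 \
      --request-timeout-ms 30000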

kafka-producer-perf-test.sh - No visible difference between request-num-acks 1 and -1

2015-08-21 Thread Prabhjot Bharaj
Hi, I'm using Kafka 0.8.2.1 with the default zookeeper build that comes along with the bundle. I have set up a 5-machine cluster, and on the same 5 machines I'm also running zookeeper. I am trying to see what is the maximum produce throughput I can get on this 5-node cluster. I have created only
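A sketch of the comparison being described, assuming the old 0.8.x perf tool and a placeholder topic; the only change between the two runs is --request-num-acks:

    # acks = 1: leader acknowledges as soon as it has the message
    bin/kafka-producer-perf-test.sh --broker-list host1:9092 \
      --topics perf-test --messages 5000000 --request-num-acks 1

    # acks = -1: leader also waits for the in-sync replicas
    bin/kafka-producer-perf-test.sh --broker-list host1:9092 \
      --topics perf-test --messages 5000000 --request-num-acks -1

Note that on a topic with replication factor 1 the two settings behave identically, since there are no followers to wait for.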

Re: SocketTimeoutException with kafka-producer-perf-test.sh

2015-08-21 Thread Prabhjot Bharaj
This link already had the answer I was searching for: http://grokbase.com/t/kafka/users/145vmz70cb/java-net-sockettimeoutexception-in-broker

Re: Any drawbacks of running Kafka consumer in a web container

2015-08-21 Thread Hisham Mardam-Bey
I've done this (and still do) in production; it works fine. On Thu, Aug 20, 2015 at 6:21 PM, Venkat K wrote: > Does anyone run a Kafka consumer in a web container like Tomcat in their production environment (processing millions of messages per hour)? I am wondering if the web c

Re: is the SSL support feature ready to use in the kafka-trunk branch

2015-08-21 Thread Ben Stopford
Hi Qi, Trunk seems fairly stable. There are guidelines here, which include how to generate keys: https://cwiki.apache.org/confluence/display/KAFKA/Deploying+SSL+for+Kafka Your server config needs these properties (also
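A sketch of the kind of server-side SSL properties the wiki page describes (the names below follow the SSL support as it landed in 0.9 and may differ slightly on trunk at the time; all paths and passwords are placeholders):

    # server.properties
    listeners=PLAINTEXT://host:9092,SSL://host:9093
    ssl.keystore.location=/var/private/ssl/server.keystore.jks
    ssl.keystore.password=keystore-secret
    ssl.key.password=key-secret
    ssl.truststore.location=/var/private/ssl/server.truststore.jks
    ssl.truststore.password=truststore-secret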

Painfully slow kafka recovery

2015-08-21 Thread Jörg Wagner
Hey everyone, here's my crosspost from IRC. Our setup: 3 Kafka 0.8.2 brokers with zookeeper, powerful hardware (20 cores, 27 log disks each). We use a handful of topics, but only one topic is utilized heavily. It features a replication factor of 2 and 600 partitions. Our issue: if one Kafka was down

Re: Painfully slow kafka recovery

2015-08-21 Thread Gwen Shapira
By default, num.replica.fetchers = 1. This means only one thread per broker is fetching data from leaders, so it may take a while for the recovering machine to catch up and rejoin the ISR. If you have bandwidth to spare, try increasing this value. Regarding "no data flowing into kafka" -
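For example (the value 4 is illustrative; the setting goes in each broker's server.properties and takes effect on restart):

    # more fetcher threads per source broker speeds up replica catch-up
    num.replica.fetchers=4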

Re: Painfully slow kafka recovery

2015-08-21 Thread Rajasekar Elango
We are seeing the same behavior in a 5-broker cluster when losing one broker. In our case, we are losing the broker as well as the kafka data dir. Jörg Wagner, are you losing just the broker, or the kafka data dir as well? Gwen, we have also observed that the latency of messages arriving at consumers goes up by 10x when

Re: Painfully slow kafka recovery

2015-08-21 Thread Gwen Shapira
I suspect that in general the broker may be busier, since it needs to handle more partitions now, plus the extra replication. It could be good to dig into the specifics of the latency - there's a request log that you can turn on, I believe.
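The request log referred to here is controlled through log4j; a sketch of enabling it, using the logger and appender names from the log4j.properties shipped with the broker (worth verifying against your own copy):

    # config/log4j.properties: log every request with its processing time
    log4j.logger.kafka.request.logger=TRACE, requestAppender
    log4j.additivity.kafka.request.logger=false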

Raid vs individual disks

2015-08-21 Thread Prabhjot Bharaj
Hi, I've gone through the details about RAID and individual disks in the ops section of the documentation. But I would like to know what performance boost we can get with individual disks. Is anybody using Kafka with multiple disks, or is everyone RAIDing them into 1 big disk? Regards, Prabcs

Re: kafka-producer-perf-test.sh - No visible difference between request-num-acks 1 and -1

2015-08-21 Thread Tao Feng
Hi Prabhjot, Do you intend to use the old producer performance microbenchmark? Thanks, -Tao

My emails don't seem to go through

2015-08-21 Thread Rajiv Kurian
Wondering why my emails to the mailing list don't go through.

Re: Raid vs individual disks

2015-08-21 Thread Todd Palino
At LinkedIn, we are using a RAID-10 of 14 disks. This is using software RAID. I recently did some performance testing with RAID 0, 5, and 6. I found that 5 and 6 underperformed significantly, possibly due to the parity calculations. RAID 0 had a sizable performance gain over RAID 10, and I would expect

Re: My emails don't seem to go through

2015-08-21 Thread David Luu
I got your email from the list?

Re: Raid vs individual disks

2015-08-21 Thread Chi Hoang
We are running with a JBOD configuration, and it is not recommended for the following reasons:
- any volume failure causes an unclean shutdown and requires lengthy recovery
- data is not distributed consistently across volumes, so you could have skew within a broker
We are planning to switch to a
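For context, JBOD here means pointing log.dirs at one directory per physical disk, versus a single directory on a RAID volume; a sketch with placeholder paths:

    # JBOD: one log dir per disk; Kafka spreads partitions across them
    log.dirs=/data/disk1/kafka-logs,/data/disk2/kafka-logs,/data/disk3/kafka-logs

    # RAID alternative: a single log dir on the array
    # log.dirs=/raid/kafka-logs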

Re: My emails don't seem to go through

2015-08-21 Thread Rajiv Kurian
Weird, I don't see my emails on the mailing list archive.

Higher CPU going from 0.8.1 to 0.8.2.1

2015-08-21 Thread Rajiv Kurian
We upgraded a 9-broker cluster from version 0.8.1 to version 0.8.2.1. Actually, we cherry-picked the commit at 41ba26273b497e4cbcc947c742ff6831b7320152 to get zkClient 0.5, because we ran into a bug (https://issues.apache.org/jira/browse/KAFKA-824). Right after the update, the CPU spiked q

Re: Higher CPU going from 0.8.1 to 0.8.2.1

2015-08-21 Thread Rajiv Kurian
The only thing I notice in the logs which is a bit unsettling is a roughly once-a-second rate of messages of the type "Closing socket connection to some-ip-address". I used to see these messages before, but it seems like it's more often than usual. Also, all the clients that it seems to close connectio

Re: Higher CPU going from 0.8.1 to 0.8.2.1

2015-08-21 Thread Tao Feng
Have you done any profiling on your broker process? Any hot code path differences between these two versions? Thanks, -Tao

Re: Higher CPU going from 0.8.1 to 0.8.2.1

2015-08-21 Thread Rajiv Kurian
All the new brokers are running 0.8.2.1, so I can only profile the new version, and not the old one any more without reverting the change on some of the brokers. Restarting brokers causes clients to lose a select few messages, so it's not very desirable. Profiling the new brokers using jVisualVm (

bootstrap.servers for the new Producer

2015-08-21 Thread Kishore Senji
If one of the brokers we specify in the bootstrap servers list is down, there is a chance that the Producer (a brand-new instance with no prior metadata) will never be able to publish anything to Kafka until that broker is up, because the logic for getting the initial metadata is based on some rando

Re: bootstrap.servers for the new Producer

2015-08-21 Thread Ewen Cheslack-Postava
Are you seeing this in practice or is this just a concern about the way the code currently works? If the broker is actually down and the host is rejecting connections, the situation you describe shouldn't be a problem. It's true that the NetworkClient chooses a fixed nodeIndexOffset, but the expect

Re: bootstrap.servers for the new Producer

2015-08-21 Thread Kishore Senji
Thank you, Ewen. This behavior is something that I'm observing: I see continuous connect failures to the dead broker in the logs. The important thing here is that I'm starting a brand-new instance of the Producer after a broker is down (so no prior metadata), with that down broker also as part of the bo
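The usual mitigation is to list several brokers, so a fresh producer with no prior metadata can bootstrap from whichever entry it can reach; a minimal sketch of the producer config (hosts are placeholders):

    # any one reachable entry is enough to fetch the initial metadata
    bootstrap.servers=broker1:9092,broker2:9092,broker3:9092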