Kafka 0.9 consumer generates EOF exceptions when parsing responses

2016-09-01 Thread Rajiv Kurian
I've seen these before but we have recently moved to the 0.9 consumer code for one of our big Kafka use cases and now we see about 1 million EOF exceptions in a 5 minute period. This can't be very good for performance. My guess is that these exceptions are expected since it uses DataInputStream to
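The truncated note above points at a plausible mechanism: `DataInputStream` signals an incomplete read only by throwing `EOFException`, so a parser that probes a partially filled buffer can throw on every attempt. A minimal stdlib sketch of that failure mode (the length-prefix framing here is illustrative, not Kafka's actual response parsing):

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;

public class EofDemo {
    // Try to parse a 4-byte length prefix from a buffer that may be truncated.
    // DataInputStream reports "not enough bytes yet" only via EOFException, so
    // a client polling a partially filled network buffer in a tight loop can
    // generate one exception per read attempt - millions over a few minutes.
    static int tryReadSize(byte[] partial) {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(partial));
        try {
            return in.readInt();
        } catch (EOFException e) {
            return -1; // caller must retry once more bytes arrive
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(tryReadSize(new byte[] {0, 0}));        // truncated -> -1
        System.out.println(tryReadSize(new byte[] {0, 0, 0, 42})); // complete  -> 42
    }
}
```

Exception construction (stack-trace capture in particular) is what makes this costly, which is why exception-per-partial-read parsing shows up in profiles.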

Kafka 0.9 consumer gets stuck in epoll

2016-08-29 Thread Rajiv Kurian
We had a Kafka 0.9 consumer stuck in the epoll native call under the following circumstances. 1. It was started bootstrapped with a cluster with 3 brokers A, B and C with ids 1,2,3. 2. Change the assignment of the brokers to some topic partitions. Seek to the beginning of each topic partition. 3.

Deleting a topic on the 0.8x brokers

2016-07-14 Thread Rajiv Kurian
We plan to stop using a particular Kafka topic running on a certain subset of a 0.8.2.x cluster. This topic is served by 9 brokers (leaders + replicas) and these 9 brokers have no other topics on them. Once we have stopped sending and consuming traffic from this topic (and hence the 9 brokers) what

Re: [DISCUSS] Java 8 as a minimum requirement

2016-06-16 Thread Rajiv Kurian
+1 On Thu, Jun 16, 2016 at 1:45 PM, Ismael Juma wrote: > Hi all, > > I would like to start a discussion on making Java 8 a minimum requirement > for Kafka's next feature release (let's say Kafka 0.10.1.0 for now). This > is the first discussion on the topic so the idea is to understand how > peo

Debugging high log flush latency

2016-04-27 Thread Rajiv Kurian
We monitor the log flush latency p95 on all our Kafka nodes and occasionally we see it creep up from the regular figure of under 15 ms to above 150 ms. Restarting the node usually doesn't help. It seems to fix itself over time but we are not quite sure about the underlying reason. It's bytes-in/se

Re: Is there a tool to list the topic-partitions on a particular broker

2016-04-21 Thread Rajiv Kurian
"kafka-manager", which was open sourced > by > > Yahoo. > > https://github.com/yahoo/kafka-manager > > > > > >> On 21 April 2016 at 08:07, Rajiv Kurian wrote: > >> > >> The kafka-topics.sh tool lists topics and where the partitions a

Is there a tool to list the topic-partitions on a particular broker

2016-04-20 Thread Rajiv Kurian
The kafka-topics.sh tool lists topics and where the partitions are. Is there a similar tool where I could give it a broker id and it would give me all the topic-partitions on it? I want to bring down a few brokers but before doing that I want to make sure that I've migrated all topics away from it,
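There is no stock per-broker listing tool in that era, so one workaround is to filter `kafka-topics.sh --describe` output yourself. A hedged sketch, assuming the 0.8.x column layout (`Topic: … Partition: … Leader: … Replicas: … Isr: …`):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PartitionsOnBroker {
    // Scan lines shaped like `kafka-topics.sh --describe` output and collect
    // the topic-partitions whose replica list includes brokerId. The column
    // labels assumed here ("Topic:", "Partition:", "Replicas:") are based on
    // 0.8.x output and may differ in other releases.
    static List<String> partitionsOn(List<String> describeLines, String brokerId) {
        List<String> found = new ArrayList<>();
        for (String line : describeLines) {
            String topic = null, partition = null, replicas = null;
            String[] tokens = line.trim().split("\\s+");
            for (int i = 0; i + 1 < tokens.length; i++) {
                if (tokens[i].equals("Topic:")) topic = tokens[i + 1];
                else if (tokens[i].equals("Partition:")) partition = tokens[i + 1];
                else if (tokens[i].equals("Replicas:")) replicas = tokens[i + 1];
            }
            if (topic == null || partition == null || replicas == null) continue;
            if (Arrays.asList(replicas.split(",")).contains(brokerId)) {
                found.add(topic + "-" + partition);
            }
        }
        return found;
    }
}
```

Feed it the captured `--describe` output; an empty result for a broker id suggests nothing is left on that broker (barring an in-flight reassignment).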

Re: two questions

2016-03-21 Thread Rajiv Kurian
They don't work with the old brokers. We made the assumption that they did and had to roll back. On Mon, Mar 21, 2016 at 10:42 AM, Alexis Midon < alexis.mi...@airbnb.com.invalid> wrote: > Hi Ismael, > > could you elaborate on "newer clients don't work with older brokers > though."? doc pointers a

Re: Kafka 0.9.0.1 broker 0.9 consumer location of consumer group data

2016-03-14 Thread Rajiv Kurian
Thanks Jason. I'll try to upgrade and see if it helps. On Mon, Mar 14, 2016 at 12:04 PM, Jason Gustafson wrote: > I think this is the one: https://issues.apache.org/jira/browse/KAFKA-2978. > > -Jason > > On Mon, Mar 14, 2016 at 11:54 AM, Rajiv Kurian wrote: > > >

Re: Kafka 0.9.0.1 broker 0.9 consumer location of consumer group data

2016-03-14 Thread Rajiv Kurian
@Jason, can you please point me to the bug that you were talking about in 0.9.0.0? On Mon, Mar 14, 2016 at 11:36 AM, Rajiv Kurian wrote: > No I haven't. It's still running the 0.9.0 client. I'll try upgrading if > it sounds like an old bug. > > On Mon, Mar 14, 2016 a

Re: Kafka 0.9.0.1 broker 0.9 consumer location of consumer group data

2016-03-14 Thread Rajiv Kurian
ents to 0.9.0.1? > > -Jason > > On Mon, Mar 14, 2016 at 11:18 AM, Rajiv Kurian wrote: > > > Has any one run into similar problems. I have experienced the same > problem > > again. This time when I use kafka-consumer-groups.sh tool it says that my > > consumer grou

Re: Kafka 0.9.0.1 broker 0.9 consumer location of consumer group data

2016-03-14 Thread Rajiv Kurian
. Again I have a single consumer group per topic with a single consumer in that group. Wondering it this causes some edge case. This consumer is up as of now, so I don't know why it would say it is rebalancing. On Wed, Mar 9, 2016 at 11:05 PM, Rajiv Kurian wrote: > Thanks! That worked.

Re: Kafka 0.9.0.1 broker 0.9 consumer location of consumer group data

2016-03-09 Thread Rajiv Kurian
sh --bootstrap-server localhost:9092 --list > --new-consumer > > > On Thu, Mar 10, 2016 at 12:02 PM, Rajiv Kurian wrote: > > > Hi Guozhang, > > > > I tried using the kafka-consumer-groups.sh --list command and it says I > > have no consumer groups set up at

Re: Kafka 0.9.0.1 broker 0.9 consumer location of consumer group data

2016-03-09 Thread Rajiv Kurian
etadata you can use the ConsumerGroupCommand, wrapped > in bin/kafka-consumer-groups.sh. > > Guozhang > > On Wed, Mar 9, 2016 at 5:48 PM, Rajiv Kurian wrote: > > > Don't think I made my questions clear: > > > > On Kafka 0.9.0.1 broker and 0.9 consumer how do I tell what my

Re: Kafka 0.9.0.1 broker 0.9 consumer location of consumer group data

2016-03-09 Thread Rajiv Kurian
o it i.e. it was not provisioned from before. Messages currently are only being sent to partition 0 even though there are 8 partitions per topic. Thanks, Rajiv On Wed, Mar 9, 2016 at 4:30 PM, Rajiv Kurian wrote: > Also forgot to mention that when I do consume with the console consumer I > do see d

Re: Kafka 0.9.0.1 broker 0.9 consumer location of consumer group data

2016-03-09 Thread Rajiv Kurian
Also forgot to mention that when I do consume with the console consumer I do see data coming through. On Wed, Mar 9, 2016 at 3:44 PM, Rajiv Kurian wrote: > I am running the 0.9.0.1 broker with the 0.9 consumer. I am using the > subscribe feature on the consumer to subscribe to a topic

Kafka 0.9.0.1 broker 0.9 consumer location of consumer group data

2016-03-09 Thread Rajiv Kurian
I am running the 0.9.0.1 broker with the 0.9 consumer. I am using the subscribe feature on the consumer to subscribe to a topic with 8 partitions. consumer.subscribe(Arrays.asList(myTopic)); I have a single consumer group for said topic and a single process subscribed with 8 partitions. When I u

Re: Kafka protocol fetch request max wait.

2016-02-05 Thread Rajiv Kurian
t what I find on the JIRA. > > -Jason > > On Fri, Feb 5, 2016 at 9:50 AM, Rajiv Kurian wrote: > > > I've updated Kafka-3159 with my findings. > > > > Thanks, > > Rajiv > > > > On Thu, Feb 4, 2016 at 10:25 PM, Rajiv Kurian > wrote: > &

Re: FW: 0.9 consumer log spam - Marking the coordinator dead

2016-02-05 Thread Rajiv Kurian
Thanks for the update Ismael. On Fri, Feb 5, 2016 at 10:31 AM, Ismael Juma wrote: > Hi Rajiv, > > Jun just sent a message about 0.9.0.1. It should be out soon if everything > goes well. > > Ismael > > On Fri, Feb 5, 2016 at 5:48 PM, Rajiv Kurian wrote: > > > Hi

Re: Kafka protocol fetch request max wait.

2016-02-05 Thread Rajiv Kurian
I've updated Kafka-3159 with my findings. Thanks, Rajiv On Thu, Feb 4, 2016 at 10:25 PM, Rajiv Kurian wrote: > I think I found out when the problem happens. When a broker that is sent a > fetch request has no messages for any of the partitions it is being asked > messages f

Re: FW: 0.9 consumer log spam - Marking the coordinator dead

2016-02-05 Thread Rajiv Kurian
Hi Ismael, Is there a maven release planned soon? We've seen this problem too and it is rather disconcerting. Thanks, Rajiv On Fri, Feb 5, 2016 at 5:15 AM, Ismael Juma wrote: > Hi Simon, > > It may be worth trying the 0.9.0 branch as it includes a number of > important fixes to the new consume

Re: Kafka protocol fetch request max wait.

2016-02-04 Thread Rajiv Kurian
n Thu, Feb 4, 2016 at 8:58 PM, Rajiv Kurian wrote: > I actually restarted my application with the consumer config I mentioned > at https://issues.apache.org/jira/browse/KAFKA-3159 and I can't get it to > use high CPU any more :( Not quite sure about how to proceed. I'll try to &

Re: Kafka protocol fetch request max wait.

2016-02-04 Thread Rajiv Kurian
he problem happens under those conditions. On Thu, Feb 4, 2016 at 8:40 PM, Rajiv Kurian wrote: > Hey Jason, > > Yes I checked for error codes. There were none. The message was perfectly > legal as parsed by my hand written parser. I also verified the size of the > response which was exact

Re: Kafka protocol fetch request max wait.

2016-02-04 Thread Rajiv Kurian
p with a way to reproduce > it, that will help immensely. > > Also, would you mind updating KAFKA-3159 with your findings about the high > CPU issue? If the problem went away after a configuration change, does it > come back when those changes are reverted? > > Thanks, > Jason &

Re: Kafka protocol fetch request max wait.

2016-02-04 Thread Rajiv Kurian
v On Thu, Feb 4, 2016 at 4:56 PM, Rajiv Kurian wrote: > And just like that it stopped happening even though I didn't change any of > my code. I had filed https://issues.apache.org/jira/browse/KAFKA-3159 > where the stock 0.9 kafka consumer was using very high CPU and seeing a lot &g

Re: Kafka protocol fetch request max wait.

2016-02-04 Thread Rajiv Kurian
tting the same problem (lots of empty messages) even though we asked the broker to park the request till enough bytes came through. On Thu, Feb 4, 2016 at 3:21 PM, Rajiv Kurian wrote: > I am writing a Kafka consumer client using the document at > https://cwiki.apache.org/confluence/disp

Kafka protocol fetch request max wait.

2016-02-04 Thread Rajiv Kurian
I am writing a Kafka consumer client using the document at https://cwiki.apache.org/confluence/display/KAFKA/A+Guide+To+The+Kafka+Protocol One place where I am having problems is the fetch request itself. I am able to send fetch requests and can get fetch responses that I can parse properly, but i

Trying to figure out the protocol.

2016-02-01 Thread Rajiv Kurian
I am trying to write a Kafka client (specifically a consumer) and am using the protocol document at https://cwiki.apache.org/confluence/display/KAFKA/A+Guide+To+The+Kafka+Protocol . Had a specific question on the Offset Request API: OffsetRequest => ReplicaId [TopicName [Partition Time MaxNumberO

Re: Getting very poor performance from the new Kafka consumer

2016-01-27 Thread Rajiv Kurian
early at the rate that you're > reporting. That might be a factor of the number of partitions, so I'll do > some investigation. > > -Jason > > On Wed, Jan 27, 2016 at 8:40 AM, Rajiv Kurian wrote: > > > Hi Guozhang, > > > > The Github link I pasted was fr

Re: Getting very poor performance from the new Kafka consumer

2016-01-27 Thread Rajiv Kurian
On Tue, Jan 26, 2016 at 10:44 PM, Guozhang Wang wrote: > Rajiv, > > Could you try to build the new consumer from 0.9.0 branch and see if the > issue can be re-produced? > > Guozhang > > On Mon, Jan 25, 2016 at 9:46 PM, Rajiv Kurian wrote: > > > The

Re: Stuck consumer with new consumer API in 0.9

2016-01-26 Thread Rajiv Kurian
Thanks Jun. On Tue, Jan 26, 2016 at 3:48 PM, Jun Rao wrote: > Rajiv, > > We haven't released 0.9.0.1 yet. To try the fix, you can build a new client > jar off the 0.9.0 branch. > > Thanks, > > Jun > > On Mon, Jan 25, 2016 at 12:03 PM, Rajiv Kurian wrote:

Re: Getting very poor performance from the new Kafka consumer

2016-01-25 Thread Rajiv Kurian
The exception seems to be thrown here https://github.com/apache/kafka/blob/0.9.0/clients/src/main/java/org/apache/kafka/common/record/MemoryRecords.java#L236 Is this not expected to hit often? On Mon, Jan 25, 2016 at 9:22 PM, Rajiv Kurian wrote: > Wanted to add that we are not using a

Re: Getting very poor performance from the new Kafka consumer

2016-01-25 Thread Rajiv Kurian
poor performance. On Mon, Jan 25, 2016 at 9:20 PM, Rajiv Kurian wrote: > We are using the new kafka consumer with the following config (as logged > by kafka) > > metric.reporters = [] > > metadata.max.age.ms = 30 > > va

Getting very poor performance from the new Kafka consumer

2016-01-25 Thread Rajiv Kurian
We are using the new kafka consumer with the following config (as logged by kafka) metric.reporters = [] metadata.max.age.ms = 30 value.deserializer = class org.apache.kafka.common.serialization.ByteArrayDeserializer group.id = myGroup.id partition.assignmen

Re: Stuck consumer with new consumer API in 0.9

2016-01-25 Thread Rajiv Kurian
Gustafson wrote: > Hey Rajiv, the bug was on the client. Here's a link to the JIRA: > https://issues.apache.org/jira/browse/KAFKA-2978. > > -Jason > > On Mon, Jan 25, 2016 at 11:42 AM, Rajiv Kurian wrote: > > > Hi Jason, > > > > Was this a server bu

Re: Stuck consumer with new consumer API in 0.9

2016-01-25 Thread Rajiv Kurian
Hi Jason, Was this a server bug or a client bug? Thanks, Rajiv On Mon, Jan 25, 2016 at 11:23 AM, Jason Gustafson wrote: > Apologies for the late arrival to this thread. There was a bug in the > 0.9.0.0 release of Kafka which could cause the consumer to stop fetching > from a partition after a

Kafka 0.9 client producer compatibility with Kafka 0.8.2 broker

2016-01-13 Thread Rajiv Kurian
We just upgraded one of our Kafka client producers from 0.8.2 to 0.9. Our broker is still running 0.8.2. I knew that the new 0.9 consumer requires the new broker and I was under the impression that the new producer would still work with the old broker. However this doesn't seem to be the case. I k

Re: fallout from upgrading to the new Kafka producers

2016-01-13 Thread Rajiv Kurian
just searching "producer" in the > release notes: > > http://mirror.stjschools.org/public/apache/kafka/0.9.0.0/RELEASE_NOTES.html > > > Guozhang > > > On Tue, Jan 12, 2016 at 6:00 PM, Rajiv Kurian wrote: > > > Thanks Guozhan. I have upgraded to

Re: fallout from upgrading to the new Kafka producers

2016-01-12 Thread Rajiv Kurian
otify" > the client, etc. > > Guozhang > > > > > On Mon, Jan 11, 2016 at 1:08 PM, Rajiv Kurian wrote: > > > We have recently upgraded some of our applications to use the Kafka 0.8.2 > > Java producers from the old Java wrappers over Scala producers. > > &g

fallout from upgrading to the new Kafka producers

2016-01-11 Thread Rajiv Kurian
We have recently upgraded some of our applications to use the Kafka 0.8.2 Java producers from the old Java wrappers over Scala producers. We've noticed these log messages on our application since the upgrade: 2016-01-11T20:56:43.023Z WARN [producer-network-thread | producer-2] [s.o.a.kafka.common

Re: 0.9 consumer reading a range of log messages

2016-01-06 Thread Rajiv Kurian
g to be called from a single thread any blocking call can starve other multiplexed partitions even if their brokers are fine. So this could lead to one down broker causing the entire consumer to come to a grind. A user might instead decide to give up after some timeout and do something m

Re: 0.9 consumer reading a range of log messages

2016-01-06 Thread Rajiv Kurian
th " > log.segment.delete.delay.ms." If you can estimate the incoming message > rate, then you can probably tune these settings to get a retention policy > closer to what you're looking for. See here for more info on broker > configuration: https://kafka.apache.org/document

0.9 consumer reading a range of log messages

2016-01-06 Thread Rajiv Kurian
I want to use the new 0.9 consumer for a particular application. My use case is the following: i) The TopicPartition I need to poll has a short log say 10 mins odd (log.retention.minutes is set to 10). ii) I don't use a consumer group i.e. I manage the partition assignment myself. iii) Whenever
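For the manual-assignment half of that use case, the 0.9 consumer exposes `assign()` and `seekToBeginning()` instead of group-managed `subscribe()`. A sketch against the 0.9 client API (the topic name and broker address are placeholders, `seekToBeginning` became collection-based in later releases, and this needs a live broker, so treat it as an outline rather than a runnable program):

```java
import java.util.Arrays;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;

public class ManualAssignSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder address
        props.put("key.deserializer",
                "org.apache.kafka.common.serialization.ByteArrayDeserializer");
        props.put("value.deserializer",
                "org.apache.kafka.common.serialization.ByteArrayDeserializer");
        // No subscribe() and no meaningful group.id: partition assignment is
        // managed by the application, so the group coordinator stays out of it.
        KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(props);
        TopicPartition tp = new TopicPartition("myTopic", 0); // hypothetical topic
        consumer.assign(Arrays.asList(tp));
        consumer.seekToBeginning(tp); // 0.9 signature: varargs TopicPartition
        while (true) {
            ConsumerRecords<byte[], byte[]> records = consumer.poll(100);
            for (ConsumerRecord<byte[], byte[]> r : records) {
                // replay the short (~10 min) log from the start on each
                // (re)assignment; offsets are never committed to Kafka
            }
        }
    }
}
```

With a short retention window, note that the earliest offset moves as segments are deleted, so "beginning" is a moving target between polls.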

Re: Fallout from upgrading to kafka 0.9 from 0.8.2.3

2015-12-17 Thread Rajiv Kurian
Dec 17, 2015 5:56 PM, "Rajiv Kurian" wrote: > > > Yes we are in the process of upgrading to the new producers. But the > > problem seems deeper than a compatibility issue. We have one environment > > where the old producers work with the new 0.9 broker. Further wh

Re: Fallout from upgrading to kafka 0.9 from 0.8.2.3

2015-12-17 Thread Rajiv Kurian
> > Also, when those producers had the issue, were there any other things weird > in the broker (e.g., broker's ZK session expires)? > > Thanks, > > Jun > > On Thu, Dec 17, 2015 at 2:37 PM, Rajiv Kurian wrote: > > > I can't think of anything special about th

Re: Fallout from upgrading to kafka 0.9 from 0.8.2.3

2015-12-17 Thread Rajiv Kurian
gt; > Jun > > On Thu, Dec 17, 2015 at 12:41 PM, Rajiv Kurian wrote: > > > The topic which stopped working had clients that were only using the old > > Java producer that is a wrapper over the Scala producer. Again it seemed > to > > work perfectly in another of o

Re: Fallout from upgrading to kafka 0.9 from 0.8.2.3

2015-12-17 Thread Rajiv Kurian
12:23 PM, Jun Rao wrote: > Are you using the new java producer? > > Thanks, > > Jun > > On Thu, Dec 17, 2015 at 9:58 AM, Rajiv Kurian wrote: > > > Hi Jun, > > Answers inline: > > > > On Thu, Dec 17, 2015 at 9:41 AM, Jun Rao wrote: > &

Re: Fallout from upgrading to kafka 0.9 from 0.8.2.3

2015-12-17 Thread Rajiv Kurian
(total guess). Again our other (lower traffic) cluster that was upgraded was totally fine so it doesn't seem like it happens all the time. > > Jun > > > On Tue, Dec 15, 2015 at 12:52 PM, Rajiv Kurian wrote: > > > We had to revert to 0.8.3 because three of our topic

Re: Kafka 0.9 consumer API question

2015-12-15 Thread Rajiv Kurian
Then it's just one copy to convert it to a list, and we can > fix this by adding the assign() variant I suggested above. > > By the way, here's a link to the JIRA I created: > https://issues.apache.org/jira/browse/KAFKA-2991. > > -Jason > > On Tue, Dec 15, 201

Re: Kafka 0.9 consumer API question

2015-12-15 Thread Rajiv Kurian
ofiling shows that it is actually a problem. Are your partition > assignments generally very large? > > -Jason > > > On Tue, Dec 15, 2015 at 1:32 PM, Rajiv Kurian wrote: > > > We are trying to use the Kafka 0.9 consumer API to poll specific > > partitions. We consume

Kafka 0.9 consumer API question

2015-12-15 Thread Rajiv Kurian
We are trying to use the Kafka 0.9 consumer API to poll specific partitions. We consume partitions based on our own logic instead of delegating that to Kafka. One of our use cases is handling a change in the partitions that we consume. This means that sometimes we need to consume additional partiti

Re: Fallout from upgrading to kafka 0.9 from 0.8.2.3

2015-12-15 Thread Rajiv Kurian
ce all the brokers were running 0.9 code with inter.broker.protocol.version=0.8.2.X I restarted them one by one with the 0.8.2.3 broker code. This however like I mentioned did not fix the three broken topics. On Mon, Dec 14, 2015 at 3:13 PM, Rajiv Kurian wrote: > Now that it has been a bit longe

Re: Fallout from upgrading to kafka 0.9 from 0.8.2.3

2015-12-14 Thread Rajiv Kurian
Now that it has been a bit longer, the spikes I was seeing are gone but the CPU and network in/out on the three brokers that were showing the spikes are still much higher than before the upgrade. Their CPUs have increased from around 1-2% to 12-20%. The network in on the same brokers has gone up fr

Fallout from upgrading to kafka 0.9 from 0.8.2.3

2015-12-14 Thread Rajiv Kurian
I upgraded one of our Kafka clusters (9 nodes) from 0.8.2.3 to 0.9 following the instructions at http://kafka.apache.org/documentation.html#upgrade Most things seem to work fine based on our metrics. Something I noticed is that the network out on 3 of the nodes goes up every 5-6 minutes. I see a c

Re: Any gotchas upgrading to 0.9?

2015-12-14 Thread Rajiv Kurian
Scratch that. On more careful observation I do see this in the logs: inter.broker.protocol.version = 0.8.2.X On Mon, Dec 14, 2015 at 10:25 AM, Rajiv Kurian wrote: > I am in the process of updating to 0.9 and had another question. > > The docs at http://kafka.apache.org/documenta

Re: Any gotchas upgrading to 0.9?

2015-12-14 Thread Rajiv Kurian
; upgrade path is server-first, clients later. > > Filed https://issues.apache.org/jira/browse/KAFKA-2923 to update the > upgrade doc to include it. > > Guozhang > > On Tue, Dec 1, 2015 at 11:14 AM, Rajiv Kurian wrote: > > > Thanks folks. Anywhere I can read about client

Re: Any gotchas upgrading to 0.9?

2015-12-01 Thread Rajiv Kurian
m the ISR. The mechanism of detecting > > > slow replicas has changed - if a replica starts lagging behind the > leader > > > for longer than replica.lag.time.max.ms, then it is considered too > slow > > > and is removed from the ISR. So even if there is a spike in traffic

Re: Any gotchas upgrading to 0.9?

2015-12-01 Thread Rajiv Kurian
e of magic defaults. For example we have a certain cluster dedicated to serving a single important topic and we'd hate for it to be throttled because of incorrect defaults. Thanks, Rajiv On Tue, Dec 1, 2015 at 8:54 AM, Rajiv Kurian wrote: > I saw the upgrade path documentation at > ht

Re: Any gotchas upgrading to 0.9?

2015-12-01 Thread Rajiv Kurian
I saw the upgrade path documentation at http://kafka.apache.org/documentation.html and that kind of answers (1). Not sure if there is anything about client compatibility though. On Tue, Dec 1, 2015 at 8:51 AM, Rajiv Kurian wrote: > I plan to upgrade both the server and clients to 0.9. Ha

Any gotchas upgrading to 0.9?

2015-12-01 Thread Rajiv Kurian
I plan to upgrade both the server and clients to 0.9. Had a few questions before I went ahead with the upgrade: 1. Do all brokers need to be on 0.9? Currently we are running 0.8.2. We'd ideally like to convert only a few brokers to 0.9 and only if we don't see problems convert the rest. 2. Is it

Re: Partial broker shutdown causing producers to stall even with replicas

2015-11-16 Thread Rajiv Kurian
Yes I think so. We specifically upgraded the Kafka broker with a patch to avoid the ZK client NPEs. Guess not all of them are fixed. The Kafka broker becoming a zombie even if one ZK node is bad is especially terrible. On Tuesday, November 17, 2015, Mahdi Ben Hamida wrote: > Hello, > > See below

Re: Questions on the new Kafka consumer

2015-10-13 Thread Rajiv Kurian
consumer's position what does the consumer group do? Do I still have to specify it all the time? Thanks, Rajiv On Tue, Oct 13, 2015 at 1:14 PM, Rajiv Kurian wrote: > I was reading the documentation for the new Kafka consumer API at > https://github.com/apache/kafka/blob/trunk/clients/src

Questions on the new Kafka consumer

2015-10-13 Thread Rajiv Kurian
I was reading the documentation for the new Kafka consumer API at https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/clients/consumer/KafkaConsumer.java and came across this: "Each Kafka consumer must specify a consumer group that it belongs to." Currently we use Ka

Re: Kafka 0.9.0 release branch

2015-10-13 Thread Rajiv Kurian
A bit off topic but does this release contain the new single threaded consumer that supports the poll interface? Thanks! On Mon, Oct 12, 2015 at 1:31 PM, Jun Rao wrote: > Hi, Everyone, > > As we are getting closer to the 0.9.0 release, we plan to cut an 0.9.0 > release branch in about two weeks

Re: Debugging high log flush latency on a broker.

2015-09-22 Thread Rajiv Kurian
ng blocks from that file back onto the > filesystem free list (or whatever data structure it is these days (-: ). > > -Steve > > On Tue, Sep 22, 2015 at 11:46:49AM -0700, Rajiv Kurian wrote: > > Also any hints on how I can find the exact topic/partitions assigned to &

Debugging high log flush latency on a broker.

2015-09-22 Thread Rajiv Kurian
I have a particular broker(version 0.8.2.1) in a cluster receiving about 15000 messages/second of around 100 bytes each (bytes-in / messages-in). This broker has bursts of really high log flush latency p95s. The latency sometimes goes to above 1.5 seconds from a steady state of < 20 ms. Running

Re: Debugging periodic high kafka log flush times

2015-09-21 Thread Rajiv Kurian
a 100 bytes each) I'd think we are over-provisioned. But we still see periodic jumps in log flush latency. Any hints on what else we might measure/check etc to figure this out? On Thu, Sep 17, 2015 at 4:39 PM, Rajiv Kurian wrote: > We have a 9 node cluster running 0.8.2.1 that does ar

Debugging periodic high kafka log flush times

2015-09-17 Thread Rajiv Kurian
We have a 9 node cluster running 0.8.2.1 that does around 545 thousand messages(kafka-messages-in) per second. Each of our brokers has 30 G of memory and 16 cores. We give the brokers themselves 2G of heap. Each broker ranges from around 33 - 40% cpu utilization. The values for both kafka-bytes-in

Re: Closing connection messages

2015-09-17 Thread Rajiv Kurian
ith one of the larger > sets of changes). In general, I'd say you should just ignore it for now. > > -Todd > > > On Wed, Sep 16, 2015 at 9:55 PM, Rajiv Kurian wrote: > > > My broker logs are full of messages of the following type of log message: > > > > INF

Closing connection messages

2015-09-17 Thread Rajiv Kurian
My broker logs are full of messages of the following type of log message: INFO [kafka-network-thread-9092-1] [kafka.network.Processor ]: Closing socket connection to /some_ip_that_I_know. I see at least one every 4-5 seconds. Something I identified was that the ip of the closed cl

Re: Higher CPU going from 0.8.1 to 0.8.2.1

2015-08-21 Thread Rajiv Kurian
iling on your broker process? Any hot code path > differences between these two versions? > > Thanks, > -Tao > > On Fri, Aug 21, 2015 at 3:59 PM, Rajiv Kurian > wrote: > > > The only thing I notice in the logs which is a bit unsettling is about a > > once a sec

Re: Higher CPU going from 0.8.1 to 0.8.2.1

2015-08-21 Thread Rajiv Kurian
s to close connections with are running the Java wrapper over the Scala SimpleConsumer. Is there any logging I can enable to understand why exactly these connections are being closed so often? Thanks, Rajiv On Fri, Aug 21, 2015 at 3:50 PM, Rajiv Kurian wrote: > We upgraded a 9 broker cluster from

Higher CPU going from 0.8.1 to 0.8.2.1

2015-08-21 Thread Rajiv Kurian
We upgraded a 9 broker cluster from version 0.8.1 to version 0.8.2.1. Actually we cherry-picked the commit at 41ba26273b497e4cbcc947c742ff6831b7320152 to get zkClient 0.5 because we ran into a bug described at https://issues.apache.org/jira/browse/KAFKA-824 Right after the update the CPU spiked q

Re: My emails don't seem to go through

2015-08-21 Thread Rajiv Kurian
Weird I don't see my emails on the mailing list archive. On Fri, Aug 21, 2015 at 2:13 PM, David Luu wrote: > I got your email from the list? > > On Fri, Aug 21, 2015 at 1:56 PM, Rajiv Kurian > wrote: > > > Wondering why my emails to the mailing list don't go thro

My emails don't seem to go through

2015-08-21 Thread Rajiv Kurian
Wondering why my emails to the mailing list don't go through.

Re: Monitoring kafka metrics

2015-08-13 Thread Rajiv Kurian
The problem was that the metric names had all changed in the latest version. Fixing the names seems to have done it. On Thu, Aug 13, 2015 at 3:13 PM, Rajiv Kurian wrote: > Aah that seems like a red herring - seems like the underlying cause is > that the MBeans I was trying to poll (throu

Re: Monitoring kafka metrics

2015-08-13 Thread Rajiv Kurian
erver":type="BrokerTopicMetrics",name="My_Topic-FailedProduceRequestsPerSec" On Thu, Aug 13, 2015 at 2:27 PM, Rajiv Kurian wrote: > Till recently we were on 0.8.1 and updated to 0.8.2.1 > > Everything seems to work but I am no longer seeing metrics reported fr

Monitoring kafka metrics

2015-08-13 Thread Rajiv Kurian
Till recently we were on 0.8.1 and updated to 0.8.2.1 Everything seems to work but I am no longer seeing metrics reported from the broker that was updated to the new version. My config file has the following lines: kafka.metrics.polling.interval.secs=5 kafka.metrics.reporters=kafka.metrics.Kafka

Re: Trying to debug a kafka failure

2015-08-13 Thread Rajiv Kurian
as to why the restart fixed it either. On Wed, Aug 12, 2015 at 1:52 PM, Rajiv Kurian wrote: > We run around 10 kafka brokers running 0.8.1. > > This morning we had a failure were some of the partitions were under > replicated. We narrowed the problem down to the following metrics on tha

Re: Best way to replace a single broker

2015-05-14 Thread Rajiv Kurian
er this is done (all replicas are in sync) you can trigger leader > election (or preferred replica election, whatever it is called) if it does > not happen automatically. > > -- > Andrey Yegorov > > On Thu, May 14, 2015 at 11:12 AM, Rajiv Kurian > wrote: > > >

Best way to replace a single broker

2015-05-14 Thread Rajiv Kurian
Hi all, Sometimes we need to replace a kafka broker because it turns out to be a bad instance. What is the best way of doing this? We have been using the kafka-reassign-partitions.sh to migrate all topics to the new list of brokers which is the (old list + the new instance - the bad instance). T
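The reassignment itself is driven by a JSON file passed to `kafka-reassign-partitions.sh --execute --reassignment-json-file`. A sketch of the file's shape (topic name and broker ids are placeholders): the minimal-movement approach is to take the current assignment and replace the bad instance's id wherever it appears in a replica list, rather than regenerating assignments over the whole new broker list, which reshuffles partitions that didn't need to move.

```json
{
  "version": 1,
  "partitions": [
    {"topic": "myTopic", "partition": 0, "replicas": [2, 3, 4]},
    {"topic": "myTopic", "partition": 1, "replicas": [3, 4, 2]}
  ]
}
```

Once the new replicas are in sync, preferred-replica election restores leadership balance if it doesn't happen automatically.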

Kafka log flush time outlier

2015-05-13 Thread Rajiv Kurian
I have a single broker in a cluster of 9 brokers that has a log-flush-time-99th of 260 ms or more. Other brokers have a log-flush-time-99th of less than 30 ms. The misbehaving broker is running on the same kind of machine (c3.4x on Ec2) that the other ones are running on. It's bytes-in, bytes-out,

What is the expected way to deal with out of space on brokers

2015-04-06 Thread Rajiv Kurian
I have had some brokers die because of lack of disk space. The logs for all partitions were way higher (5G+) than I would have expected given the how I configured them for (100 MB size AND 1h rollover). What is the recommended way of recovering from this error. Should I delete certain log files a
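Worth noting when sizing disks: the size-based limits are per partition, so a broker's worst case is roughly (partitions hosted) × (retained bytes + one in-progress segment), since a segment must roll before it becomes eligible for deletion. A hedged broker-config sketch of the "100 MB and 1h" intent (values are illustrative):

```properties
# Per-partition cap on retained log data; oldest segments beyond it are deleted.
log.retention.bytes=104857600
# Time-based retention; whichever of the size/time limits trips first wins.
log.retention.hours=1
# Roll a new segment at 100 MB; only rolled segments can be deleted.
log.segment.bytes=104857600
# Also roll on age, so low-traffic partitions still produce deletable segments.
log.roll.hours=1
```

Deleting log files by hand from under a running broker is risky; restarting the broker after fixing retention and freeing space is the safer recovery path.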

Re: Kafka 0.9 consumer API

2015-03-26 Thread Rajiv Kurian
; producer that can be probably re-used for the consumer as well. > > org.apache.kafka.clients.producer.internals.BufferPool > > Please feel free to add more comments on KAFKA-2045. > > Guozhang > > > On Tue, Mar 24, 2015 at 12:21 PM, Rajiv Kurian > wrote: > >

Re: Kafka 0.9 consumer API

2015-03-24 Thread Rajiv Kurian
ou think? > > As for de-compression, today we allocate new buffer for each de-compressed > message, and it may be required to do de-compression with re-useable buffer > with memory control. I will create a ticket under KAFKA-1326 for that. > > Guozhang > > Guozhang > > &g

Re: Kafka 0.9 consumer API

2015-03-22 Thread Rajiv Kurian
ually get > > equivalent stuff out of the jvm allocator's own pooling and/or escape > > analysis once we are doing the allocation on demand. So it would be good > to > > show a real performance improvement on the newer JVMs before deciding to > go > > this route. > &

Re: Kafka 0.9 consumer API

2015-03-22 Thread Rajiv Kurian
ce improvement on the newer JVMs before deciding to go > this route. > > -Jay > > > On Sat, Mar 21, 2015 at 9:17 PM, Rajiv Kurian > wrote: > > > Just a follow up - I have implemented a pretty hacky prototype It's too > > unclean to share right now but I can

Re: Kafka 0.9 consumer API

2015-03-21 Thread Rajiv Kurian
I also have absolutely zero copies in user space once the data lands from the socket onto the ByteBuffer. On Sat, Mar 21, 2015 at 11:16 AM, Rajiv Kurian wrote: > I had a few more thoughts on the new API. Currently we use kafka to > transfer really compact messages - around 25-35 bytes each

Re: Kafka 0.9 consumer API

2015-03-21 Thread Rajiv Kurian
> > > On Friday, March 20, 2015 4:20 PM, Rajiv Kurian > wrote: > > > Awesome - can't wait for this version to be out! > > On Fri, Mar 20, 2015 at 12:22 PM, Jay Kreps wrote: > > > The timeout in the poll call is more or less the timeout used by the >

Re: Kafka 0.9 consumer API

2015-03-20 Thread Rajiv Kurian
> well > > as run periodic jobs. For the periodic jobs to run we need a guarantee on > > how much time the poll call can take at most. > > > > Thanks! > > > > On Fri, Mar 20, 2015 at 6:59 AM, Rajiv Kurian > > wrote: > > > > > Thanks! &

Re: Kafka 0.9 consumer API

2015-03-20 Thread Rajiv Kurian
. When this API is available we plan to use a single thread to get data from kafka, process them as well as run periodic jobs. For the periodic jobs to run we need a guarantee on how much time the poll call can take at most. Thanks! On Fri, Mar 20, 2015 at 6:59 AM, Rajiv Kurian wrote: > Tha

Re: Kafka 0.9 consumer API

2015-03-20 Thread Rajiv Kurian
rk in progress is documented here: > > > > > > On Thu, Mar 19, 2015 at 7:18 PM, Rajiv Kurian > > > wrote: > > > >> Is there a link to the proposed new consumer non-blocking API? > >> > >> Thanks, > >> Rajiv > >> > > > > >

Kafka 0.9 consumer API

2015-03-19 Thread Rajiv Kurian
Is there a link to the proposed new consumer non-blocking API? Thanks, Rajiv

Recommended way of handling brokers coming up / down in the SimpleConsumer

2015-02-10 Thread Rajiv Kurian
I am using the SimpleConsumer to consume specific partitions on specific processes. The workflow is kind of like this: i) An external arbiter assigns partitions to a specific processes. It provides the guarantees of: a) All partitions are consumed by the cluster. b) A single partition is onl

Re: SimpleConsumer leaks sockets on an UnresolvedAddressException

2015-01-27 Thread Rajiv Kurian
:0.8.0] On Tue, Jan 27, 2015 at 10:19 AM, Rajiv Kurian wrote: > I am using 0.8.1. The source is here: > https://github.com/apache/kafka/blob/0.8.1/core/src/main/scala/kafka/consumer/SimpleConsumer.scala > > Here is the definition of disconnect(): > private def disconnect

Re: SimpleConsumer leaks sockets on an UnresolvedAddressException

2015-01-27 Thread Rajiv Kurian
Tue, Jan 27, 2015 at 9:03 AM, Guozhang Wang wrote: > Rajiv, > > Which version of Kafka are you using? I just checked SimpleConsumer's code, > and in its close() function, disconnect() is called, which will close the > socket. > > Guozhang > > > On Mon, Jan 26, 20

Re: SimpleConsumer leaks sockets on an UnresolvedAddressException

2015-01-26 Thread Rajiv Kurian
) { logger.error(e); // Assume UnresolvedAddressException. if (consumer != null) { simpleConsumer.close(); simpleConsumer = null; } } } } On Mon, Jan 26, 2015 at 2:27 PM, Rajiv Kurian wrote: > Here is my typical flow: > void run() { > if (simpleConsume

Re: Kafka sending messages with zero copy

2015-01-26 Thread Rajiv Kurian
this proposal, it would be great if you can upload some > implementation patch for the CAS idea and show some memory usage / perf > differences. > > Guozhang > > On Sun, Dec 14, 2014 at 9:27 PM, Rajiv Kurian > wrote: > > > Resuscitating this thread. I've done som

SimpleConsumer leaks sockets on an UnresolvedAddressException

2015-01-26 Thread Rajiv Kurian
Here is my typical flow: void run() { if (simpleConsumer == null) { simpleConsumer = new SimpleConsumer(host, port, (int) kafkaSocketTimeout, kafkaRExeiveBufferSize, clientName); } try { // Do stuff with simpleConsumer. } catch (Exception e) { if (consumer != null) { si
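The flattened snippet above null-checks one reference (`consumer`) but closes a different one (`simpleConsumer`), so a failure path can skip `close()` and leak the socket. A stdlib sketch of the leak-safe shape, using a hypothetical stand-in class since the real `SimpleConsumer` needs a broker:

```java
public class CloseOnErrorDemo {
    // Stand-in for kafka.javaapi.consumer.SimpleConsumer: just enough surface
    // to show the close-on-failure pattern. Hypothetical, not the real class.
    static class FakeConsumer {
        boolean closed = false;
        void doWork(boolean fail) { if (fail) throw new RuntimeException("send failed"); }
        void close() { closed = true; }
    }

    // One iteration of the run() loop from the snippet: reuse the consumer on
    // success; on any failure close the SAME reference that was used and
    // return null so the next iteration recreates it. The bug in the original
    // was null-checking a different variable than the one being closed.
    static FakeConsumer runOnce(FakeConsumer consumer, boolean fail) {
        try {
            consumer.doWork(fail);
            return consumer;
        } catch (RuntimeException e) {
            consumer.close();
            return null;
        }
    }
}
```

The same shape with the real class guarantees the socket is released before the reference is dropped, whatever exception the address resolution throws.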

Re: Trying to figure out kafka latency issues

2014-12-30 Thread Rajiv Kurian
ient to see what is happening when the pauses occur. > > Alternately if you can find a reproducible test case we can turn into a > JIRA someone else may be willing to dive in. > > -Jay > > On Tue, Dec 30, 2014 at 4:37 PM, Rajiv Kurian > wrote: > > > Got it. I had to ena
