Re: Zookeeper DNS TTL

2016-03-19 Thread David DeMaagd
https://issues.apache.org/jira/browse/ZOOKEEPER-1356 which was closed as a dupe of https://issues.apache.org/jira/browse/ZOOKEEPER-338 are relevant to this... It's a zk client issue, and there are things you can do to avoid having to reconfigure the clients while you're bouncing them (CNAMEs and t

Re: GC pauses and rebalance failures

2014-04-14 Thread David DeMaagd
ne can finish. > > > On Mon, Apr 14, 2014 at 12:58 PM, David DeMaagd wrote: > > > Correct - heavy client GC leads to numerous problems. There's > > two things you can do: > > > > 1) Tune the client JVM better to get GC to a more reasonable level >

Re: GC pauses and rebalance failures

2014-04-14 Thread David DeMaagd
Correct - heavy client GC leads to numerous problems. There's two things you can do: 1) Tune the client JVM better to get GC to a more reasonable level 2) Increase the zookeeper session timeout value (this is generally a work-around for #1, but it can buy you time to dig into it) -- Dave D

Re: where is my kafka ip address in ZK. I get nothing in a broker ID

2014-02-12 Thread David DeMaagd
That information is in that node, not under it (you want a get() instead of a get_children())... -- Dave DeMaagd | S'aite Reliability Engineering, Y'all ddema...@linkedin.com | 818 262 7958 (davidmontgom...@gmail.com - Thu, Feb 13, 2014 at 06:09:07AM +0800) > Hi, > > I am using kafka 8.0. > >
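The get() versus get_children() distinction above can be sketched in Python. This is only a minimal sketch under stated assumptions: `zk` is any kazoo-style client whose get(path) returns a (data, stat) tuple and whose get_children(path) returns a list of child names; the /brokers/ids path and the JSON payload are assumed for a 0.8 broker registration; and both helper names are hypothetical.

```python
import json


def read_broker_registration(zk, broker_id, base_path="/brokers/ids"):
    """Return the registration stored IN the broker's znode (not under it).

    `zk` is assumed to be a kazoo-style client: get(path) returns a
    (data_bytes, stat) tuple. `base_path` is an assumed default.
    """
    path = "{}/{}".format(base_path, broker_id)
    data, _stat = zk.get(path)  # the payload lives in the node itself
    text = data.decode("utf-8")
    try:
        return json.loads(text)  # assumed JSON payload
    except ValueError:
        return text  # fall back to the raw string if it is not JSON


def list_broker_ids(zk, base_path="/brokers/ids"):
    """Return the child names under base_path; these are IDs, not addresses."""
    return sorted(zk.get_children(base_path))
```

Calling get_children() on a single broker's node would return an empty list, which matches the "I get nothing" symptom in the subject line; the address only shows up in the data returned by get().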

Re: Is there a way to get the offset of a consumer of a topic?

2013-12-04 Thread David DeMaagd
You can use either the MaxLag MBean (0.8): http://kafka.apache.org/documentation.html#monitoring Or the ConsumerOffsetChecker (0.7 or 0.8, can't seem to find a doc reference for it): ./kafka-run-class.sh kafka.tools.ConsumerOffsetChecker ... -- Dave DeMaagd | S'aite Reliability Engineering,

Re: Kafka Important Metrics

2013-07-03 Thread David DeMaagd
I've also used jolokia, http://jolokia.org/, though it can get a little slow to respond if you don't use it right. Have rolled a JMX/HTTP 'data dumper' from scratch (can be done in a couple hundred lines of Java without too much issue)... -- Dave DeMaagd ddema...@linkedin.com | 818 262 7958 (c

Re: Log4J setting for kafka

2013-07-01 Thread David DeMaagd
The danger of using a size based rollover (unless you set the size and log rollover to be fairly high) is that in case of problems, the actual cause of the problem might get rolled off the end by the time you get to it (kafka can be very chatty in some kinds of failure cases). That is probably the

Re: Fetch request with correlation id 1171437 from client ReplicaFetcherThread-0-1 on partition [meetme,0] failed due to Leader not local for partition

2013-06-28 Thread David DeMaagd
hour has passed since short downtime and I still see the exception in > kafka service logs. > > Thanks, > Vadim > > > On Fri, Jun 28, 2013 at 11:25 AM, David DeMaagd wrote: > > > Getting kafka.common.NotLeaderForPartitionException for a time after a > >

Re: Fetch request with correlation id 1171437 from client ReplicaFetcherThread-0-1 on partition [meetme,0] failed due to Leader not local for partition

2013-06-28 Thread David DeMaagd
Getting kafka.common.NotLeaderForPartitionException for a time after a node is brought back on line (especially if it is a short downtime) is normal - that is because the consumers have not yet completely picked up the new leader information. It should settle shortly. -- Dave DeMaagd ddema...@l

Re: exception report

2013-05-13 Thread David DeMaagd
The lost+found directory is part of the Linux extN filesystem semantics, and yes, it would be a terrible idea to try to remove it - it is automatically there at the top level of a disk mount point. Because it being there will mess up kafka, it is a good idea to create a subdirectory there that

Re: Kafka Monitoring, 0.7 vs. 0.8 JMX

2013-05-08 Thread David DeMaagd
I think there's really two angles to look at this from... 1) What is 'important' to monitor? Meaning, what subset of these are important/critical for being able to tell system health (things you want to set alerts on), what subset are nice to have for overall health and capacity planning (things

Re: java, oom, gc stop the world

2013-01-22 Thread David DeMaagd
It's worth noting that we currently run kafka at LinkedIn with a 5G heap (not 3G, still using the CMS GC though - should update that), and the info on that wiki is aimed at 0.7. We are actively working on things for 0.8 - don't have a 'this works for us', much less a 'recommendation' there

Re: zookeeper interactions

2012-12-17 Thread David DeMaagd
The zookeeper connections are persistent, so it depends on the number of clients more than the data flow rate on the producer side. If you are using a VIP based producer, then there is no connection from the producer process to zookeeper at all. If you are using a zookeeper based producer, then yo

Re: FW: Zookeeper Configuration Question

2012-12-10 Thread David DeMaagd
If you're using the zkCli.sh, something like this will create the namespace: [zk: localhost:12913(CONNECTED) 1] create /namespace '' Created /namespace If you're using another interface, the actual command may vary. -- Dave DeMaagd ddema...@linkedin.com | 818 262 7958 (casey.sybra...@six3syst
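For "another interface", the same namespace creation might look like the following from Python. This is a rough sketch, not a definitive implementation: `zk` is assumed to be a kazoo-style client exposing exists(path) (returning None when the node is absent) and create(path, value), and `ensure_namespace` is a hypothetical helper name.

```python
def ensure_namespace(zk, path="/namespace"):
    """Create `path` with an empty payload if it is missing.

    `zk` is assumed to be a kazoo-style client exposing exists(path)
    and create(path, value). Returns True if the node was created.
    """
    if zk.exists(path) is not None:
        return False  # already present, nothing to do
    zk.create(path, b"")  # empty value, like `create /namespace ''`
    return True
```

The exists-then-create check is not atomic, so if several processes may create the namespace at the same time, the caller should also be prepared for create() to fail because the node already exists.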