When is the status of isr updated

2014-03-25 Thread Jie Li
Hello, We observed that in some cases the ISR was outdated: healthy brokers may be missing from the ISR, or dead brokers may stay in the ISR for a while. My guess is that the ISR list is only updated upon a few events, e.g. when a message is written to the partition, or when the brokers come online? Tha

Re: Some brokers taking a long time to restart

2014-03-25 Thread Jie Li
Thanks will make a note of this and find the log later. On Tue, Mar 25, 2014 at 8:58 PM, Neha Narkhede wrote: > Hi Jie, > > If a broker shuts down cleanly, it leaves a .clean_shutdown file in the > broker's data directory (log.dirs). On startup, a broker skips recovery of > log files if it finds

Re: Can I delete old log files manually?

2014-03-25 Thread Neha Narkhede
I have my purge set to 3 days and there are logs in there that are 3 days old but haven't been purged yet. Older log segments are purged only when the log segment rolls over. This is controlled by the log.segment.bytes config. You might want to lower this in your cluster to let Kafka purge the se

Re: producers limit

2014-03-25 Thread Neha Narkhede
You shouldn't have any problem with that. We frequently have 10s of thousands of producer connections to a 8-10 node cluster at all times. You might have to bump up the limit for the number of open file handles per broker though. Thanks, Neha On Tue, Mar 25, 2014 at 3:41 PM, Kane Kane wrote: >
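
A minimal sketch for checking the open file handle limit mentioned above, assuming a Unix-like host and that the check runs as the same user as the broker process. The function name is hypothetical; it only reads the limit, and raising it is left to the operating system's own tools.

```python
import resource


def open_file_limits():
    """Return the (soft, hard) limits on open file descriptors for this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


soft, hard = open_file_limits()
# Each producer connection uses one descriptor on the broker, in addition
# to the descriptors the broker holds for its log and index files.
print(f"soft limit: {soft}, hard limit: {hard}")
```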

Re: Some brokers taking a long time to restart

2014-03-25 Thread Neha Narkhede
Hi Jie, If a broker shuts down cleanly, it leaves a .clean_shutdown file in the broker's data directory (log.dirs). On startup, a broker skips recovery of log files if it finds that file. If it doesn't, then it goes through a recovery process where the last segment for every partition is checked f

Re: ZooKeeper connect/disconnect pattern

2014-03-25 Thread Neha Narkhede
Could you send around the corresponding broker log during the same timeframe? There are several connection attempts to zookeeper. Are you sure there are no consumers running at the same time? Also, how many brokers were started in this timeframe? Thanks, Neha On Tue, Mar 25, 2014 at 12:55 PM, To

Re: dynamic configure per topic problem

2014-03-25 Thread 陈小军
Thanks. I got it. Best Regards, Jerry -----Original Message----- From: "Neha Narkhede" To: "users@kafka.apache.org"; Cc: "陈小军"; Sent: 2014-03-26 (Wednesday) 02:08:46 Subject: Re: dynamic configure per topic problem Filed https://issues.apache.org/jira/browse/KAFKA-1325 to track this. On Tue, Mar 25, 2014 at 9

Some brokers taking a long time to restart

2014-03-25 Thread Jie Li
Hello, This is Jie from Pinterest. We did an upgrade from 0.8.0 to 0.8.1 today with a rolling restart, but found some brokers took a very long time to restart while some just restarted immediately. Here are some debug logs: [2014-03-25 23:17:13,381] DEBUG Adding index entry 41731335 => 248915234 t

producers limit

2014-03-25 Thread Kane Kane
Is there a recommended cap for the number of concurrent producer threads? We plan to have around 4000 connections across the cluster writing to Kafka; I assume there shouldn't be any performance implications related to that? Thanks.

Re: Separate broker replication traffic from producer/consumer traffic

2014-03-25 Thread Jay Kreps
No not at the moment. Are you seeing a problem that this would resolve? -Jay On Tue, Mar 25, 2014 at 2:55 PM, Otto Mok wrote: > Hi all, > > Is there any way to configure the brokers such that producers & consumers > are talking via IP1, while the brokers are replicating between themselves > us

Separate broker replication traffic from producer/consumer traffic

2014-03-25 Thread Otto Mok
Hi all, Is there any way to configure the brokers such that producers & consumers are talking via IP1, while the brokers are replicating between themselves using IP2? I see there are broker settings for host.name and advertised.host.name, but it doesn't look like these settings do what I'm lo

ZooKeeper connect/disconnect pattern

2014-03-25 Thread Tom Amon
Again, thank you for your patience. Is the following pattern normal for a broker that is booting? This is from my zookeeper log. It seems to connect and disconnect multiple times in rapid succession. The last message is a disconnect message with no subsequent connect. Other zookeeper boxes don't

Can I delete old log files manually?

2014-03-25 Thread Tom Amon
My apologies for mail bombing the list. I'm banging my head against a production issue. In short, can I delete old log and index files manually? I have two brokers out of five that are hosting ~1200 partitions each, though based on my understanding from a previous email they should really only be

Re: KeeperException

2014-03-25 Thread Tom Amon
Maybe I'm reading it wrong but doesn't the error indicate that the node exists? That one is ok. It just means that a zk path doesn't exist. In this particular case, the path is not expected to always exist. Thanks, Jun

Re: Reinstating ephemeral nodes and watchers on zk session timeout

2014-03-25 Thread Neha Narkhede
Here are the remaining issues still in progress. I believe we should be able to do a release mid April. Thanks, Neha On Tue, Mar 25, 2014 at 10:28 AM, Bae

Re: Reinstating ephemeral nodes and watchers on zk session timeout

2014-03-25 Thread Bae, Jae Hyeon
Do you have any ETA for 0.8.1.1? On Tue, Mar 25, 2014 at 9:53 AM, Neha Narkhede wrote: > You are probably hitting https://issues.apache.org/jira/browse/KAFKA-1317. > We are trying to fix it in time for 0.8.1.1. > > Thanks, > Neha > > > On Tue, Mar 25, 2014 at 9:45 AM, Bae, Jae Hyeon > wrote: >

Re: dynamic configure per topic problem

2014-03-25 Thread Neha Narkhede
Filed https://issues.apache.org/jira/browse/KAFKA-1325 to track this. On Tue, Mar 25, 2014 at 9:31 AM, Neha Narkhede wrote: > What's the unit of 'retention.ms', second, minute or this is a bug? > > Our documentation is a little confusing on the log configs. See this- > > [image: Inline image 1]

Re: Adding replicas to existing topic

2014-03-25 Thread Neha Narkhede
Good to hear! We would like to add a tool to increase replication factor in the future, so this is just a stop-gap :) On Tue, Mar 25, 2014 at 9:49 AM, Marc Labbe wrote: > ... and it works like a charm :-) > > > On Tue, Mar 25, 2014 at 12:28 PM, Marc Labbe wrote: > > > Great! I'll give it a try

Re: Reinstating ephemeral nodes and watchers on zk session timeout

2014-03-25 Thread Neha Narkhede
You are probably hitting https://issues.apache.org/jira/browse/KAFKA-1317. We are trying to fix it in time for 0.8.1.1. Thanks, Neha On Tue, Mar 25, 2014 at 9:45 AM, Bae, Jae Hyeon wrote: > ZkEventThread is blocked with the following stack trace: > > "ZkClient-EventThread-18-localhost:2181" da

Re: Adding replicas to existing topic

2014-03-25 Thread Marc Labbe
... and it works like a charm :-) On Tue, Mar 25, 2014 at 12:28 PM, Marc Labbe wrote: > Great! I'll give it a try. > > cheers, > marc > > > On Fri, Mar 21, 2014 at 1:10 PM, Neha Narkhede wrote: > >> Marc, >> >> I included the notes on increasing replication factor in the docs. It >> seems >> to

Re: Reinstating ephemeral nodes and watchers on zk session timeout

2014-03-25 Thread Bae, Jae Hyeon
ZkEventThread is blocked with the following stack trace: "ZkClient-EventThread-18-localhost:2181" daemon prio=5 tid=7fb31b95c000 nid=0x1194a6000 waiting on condition [1194a5000] java.lang.Thread.State: WAITING (parking) at sun.misc.Unsafe.park(Native Method) - parking to wait for <7c220180

Re: Memory consumption in Kafka

2014-03-25 Thread Neha Narkhede
Hi LCassa, That is expected. Most modern operating systems reserve all free memory as "pagecache". Since Kafka reads/writes to the filesystem sequentially at high throughput, most of the page cache is used up at all times. That is good as well as expected. This is different from the memory footpr

Re: dynamic configure per topic problem

2014-03-25 Thread Neha Narkhede
What's the unit of 'retention.ms', second, minute or this is a bug? Our documentation is a little confusing on the log configs. See this- [image: Inline image 1] The log property is in ms but the server default it maps to is in minutes. Same is true for segment.ms as well. We could either improv

Re: Adding replicas to existing topic

2014-03-25 Thread Marc Labbe
Great! I'll give it a try. cheers, marc On Fri, Mar 21, 2014 at 1:10 PM, Neha Narkhede wrote: > Marc, > > I included the notes on increasing replication factor in the docs. It seems > to work fine locally though I haven't tested it in a production setting. > Here you go - > > http://kafka.apach

Re: Reinstating ephemeral nodes and watchers on zk session timeout

2014-03-25 Thread Bae, Jae Hyeon
Nope, linux doesn't work. Let me debug why it's not triggered. On Tue, Mar 25, 2014 at 9:21 AM, Bae, Jae Hyeon wrote: > Hm... I cannot reproduce in my local, I downloaded kafka_2.8.0-0.8.1 > package but it didn't work. Let me try in my linux machine. > > > On Mon, Mar 24, 2014 at 6:11 PM, Neha

Re: Reinstating ephemeral nodes and watchers on zk session timeout

2014-03-25 Thread Bae, Jae Hyeon
Hm... I cannot reproduce in my local, I downloaded kafka_2.8.0-0.8.1 package but it didn't work. Let me try in my linux machine. On Mon, Mar 24, 2014 at 6:11 PM, Neha Narkhede wrote: > I think you are trying to introduce a session expiration, then could you > try to do the following and see if y

Re: dynamic configure per topic problem

2014-03-25 Thread Jun Rao
The unit is milli-sec. The deletion only happens when the log segment is rolled though. Thanks, Jun On Tue, Mar 25, 2014 at 12:59 AM, 陈小军 wrote: > Hi, >Today I test the dynamic configure for each topic, I set the ' > retention.ms' attribution for one topic, > [irt...@xseed170.kdev
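
Since the unit is milliseconds, a value for 'retention.ms' can be derived from a human-readable period. This is a small sketch only; the function name and the 3-day period are illustrative assumptions, not values from the thread.

```python
from datetime import timedelta


def retention_ms(period: timedelta) -> int:
    """Convert a retention period to whole milliseconds."""
    # Integer division by a 1 ms timedelta avoids float rounding.
    return period // timedelta(milliseconds=1)


# Illustrative period: 3 days expressed in milliseconds.
print(retention_ms(timedelta(days=3)))  # 259200000
```

As noted in the reply, old data is removed only once the log segment is rolled, so segments can outlive this period on a topic with little traffic.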

Re: KeeperException

2014-03-25 Thread Jun Rao
That one is ok. It just means that a zk path doesn't exist. In this particular case, the path is not expected to always exist. Thanks, Jun On Tue, Mar 25, 2014 at 12:09 AM, Tom Amon wrote: > Whenever a session is expired by ZooKeeper I see the following messages > (one per consumer I think) i

Re: Log size and retention

2014-03-25 Thread Jun Rao
Yes, on average, each broker will host (total partitions * replication factor) /brokers partitions. Thanks, Jun On Tue, Mar 25, 2014 at 12:04 AM, Tom Amon wrote: > Not really, it doesn't say anything about replication. Should I assume that > replication follows the same rules? In other words,
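
As a rough worked example of the average stated above (a sketch only; the function name and the sample cluster numbers are hypothetical and not taken from the thread):

```python
def avg_partition_replicas_per_broker(total_partitions: int,
                                      replication_factor: int,
                                      brokers: int) -> float:
    """Average number of partition replicas hosted by each broker."""
    if brokers <= 0:
        raise ValueError("brokers must be positive")
    return total_partitions * replication_factor / brokers


# Hypothetical cluster: 1000 partitions, replication factor 3, 5 brokers.
print(avg_partition_replicas_per_broker(1000, 3, 5))  # 600.0
```

This is only an average; the actual count on a given broker depends on how replicas were assigned.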

Re: Question about manual tracking of Offset

2014-03-25 Thread Jun Rao
While you are consuming messages, you should use MessageAndOffset.nextOffset() when saving offsets. Thanks, Jun On Mon, Mar 24, 2014 at 10:43 PM, Krishna Raj wrote: > Hi experts & Kafka Dev team, > > Have a very quick question and need your help in designing a consumer. I am > trying to keep t
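
The idea can be illustrated with a small Python model. It is not the Kafka client API: the class below is a hypothetical stand-in for MessageAndOffset, and it assumes that nextOffset() is the offset of the message following the current one. Saving that value, rather than the current offset, means a restarted consumer resumes after the last processed message instead of reprocessing it.

```python
from dataclasses import dataclass


@dataclass
class MessageAndOffset:
    """Hypothetical stand-in for Kafka's class of the same name."""
    payload: bytes
    offset: int

    def next_offset(self) -> int:
        # Assumption: the next message sits at the current offset plus one.
        return self.offset + 1


def consume(batch, saved_offset):
    """Process messages at or after saved_offset; return the offset to save."""
    for m in batch:
        if m.offset < saved_offset:
            continue  # already processed before a restart
        # ... process m.payload here ...
        saved_offset = m.next_offset()
    return saved_offset


batch = [MessageAndOffset(b"a", 0),
         MessageAndOffset(b"b", 1),
         MessageAndOffset(b"c", 2)]
print(consume(batch, 0))  # 3
```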

dynamic configure per topic problem

2014-03-25 Thread 陈小军
Hi, Today I tested the dynamic configuration for each topic. I set the 'retention.ms' attribute for one topic: [irt...@xseed170.kdev bin]$ ./kafka-topics.sh --describe --zookeeper 10.96.250.215:10013/nelo2-kafka --topic nelo2-crash-logs Topic:nelo2-crash-logs PartitionCount:6Re

KeeperException

2014-03-25 Thread Tom Amon
Whenever a session is expired by ZooKeeper I see the following messages (one per consumer I think) in the ZooKeeper log: 2014-03-25 00:05:12,953 - INFO [ProcessThread:-1:PrepRequestProcessor@419] - Got user-level KeeperException when processing sessionid:0x344f675fcee0164 type:create cxid:0x7566

Re: Log size and retention

2014-03-25 Thread Tom Amon
Not really, it doesn't say anything about replication. Should I assume that replication follows the same rules? In other words, the maximum number of partitions on a single node is (partitions/brokers * replication factor)? Does the f