The kafka-server-start.sh script does not configure the GC settings or heap size mentioned earlier. Setting them there is probably a good idea, though.
Thanks,
Neha

On Tue, Mar 26, 2013 at 9:47 AM, Yonghui Zhao <zhaoyong...@gmail.com> wrote:

> kafka server is started by bin/kafka-server-start.sh. No gc setting.
> On 2013-3-26 at 11:40 PM, "Neha Narkhede" <neha.narkh...@gmail.com> wrote:
>
>> Did you have a gc pause around that time on the server? What are your
>> server's current gc settings?
>>
>> Thanks,
>> Neha
>>
>> On Mon, Mar 25, 2013 at 8:48 PM, Yonghui Zhao <zhaoyong...@gmail.com> wrote:
>> > Thanks Neha, btw have you seen this exception? We didn't restart any
>> > service; it happened in the middle of the night.
>> >
>> > java.lang.RuntimeException: A broker is already registered on the path /brokers/ids/0. This probably indicates that you either have configured a brokerid that is already in use, or else you have shutdown this broker and restarted it faster than the zookeeper timeout so it appears to be re-registering.
>> >     at kafka.server.KafkaZooKeeper.registerBrokerInZk(KafkaZooKeeper.scala:57)
>> >     at kafka.server.KafkaZooKeeper$SessionExpireListener.handleNewSession(KafkaZooKeeper.scala:100)
>> >     at org.I0Itec.zkclient.ZkClient$4.run(ZkClient.java:472)
>> >     at org.I0Itec.zkclient.ZkEventThread.run(ZkEventThread.java:71)
>> > [2013-03-26 02:07:19,155] INFO re-registering broker info in ZK for broker 0 (kafka.server.KafkaZooKeeper)
>> > [2013-03-26 02:07:19,155] INFO Registering broker /brokers/ids/0 (kafka.server.KafkaZooKeeper)
>> > [2013-03-26 02:07:19,611] INFO conflict in /brokers/ids/0 data: 127.0.0.1-1364234839275:127.0.0.1:9093 stored data: 127.0.0.1-1364227372971:127.0.0.1:9093 (kafka.utils.ZkUtils$)
>> > [2013-03-26 02:07:19,611] ERROR Error handling event ZkEvent[New session event sent to kafka.server.KafkaZooKeeper$SessionExpireListener@40f8c9bf] (org.I0Itec.zkclient.ZkEventThread)
>> > java.lang.RuntimeException: A broker is already registered on the path /brokers/ids/0.
>> > This probably indicates that you either have configured a brokerid that is already in use, or else you have shutdown this broker and restarted it faster than the zookeeper timeout so it appears to be re-registering.
>> >     at kafka.server.KafkaZooKeeper.registerBrokerInZk(KafkaZooKeeper.scala:57)
>> >     at kafka.server.KafkaZooKeeper$SessionExpireListener.handleNewSession(KafkaZooKeeper.scala:100)
>> >     at org.I0Itec.zkclient.ZkClient$4.run(ZkClient.java:472)
>> >     at org.I0Itec.zkclient.ZkEventThread.run(ZkEventThread.java:71)
>> >
>> > 2013/3/26 Neha Narkhede <neha.narkh...@gmail.com>
>> >
>> >> That really depends on your consumer application's memory allocation
>> >> patterns. If it is a thin wrapper over a Kafka consumer, I would imagine
>> >> you can get away with using CMS for the tenured generation and parallel
>> >> collector for the new generation with a small heap like 1gb or so.
>> >>
>> >> Thanks,
>> >> Neha
>> >>
>> >> On Monday, March 25, 2013, Yonghui Zhao wrote:
>> >>
>> >> > Any suggestion on consumer side?
>> >> > On 2013-3-25 at 9:49 PM, "Neha Narkhede" <neha.narkh...@gmail.com> wrote:
>> >> >
>> >> > > For Kafka 0.7 in production at LinkedIn, we use a heap of size 3G, new
>> >> > > gen 256 MB, CMS collector with occupancy of 70%.
>> >> > >
>> >> > > Thanks,
>> >> > > Neha
>> >> > >
>> >> > > On Sunday, March 24, 2013, Yonghui Zhao wrote:
>> >> > >
>> >> > > > Hi Jun,
>> >> > > >
>> >> > > > I used kafka-server-start.sh to start kafka, and there is only one jvm
>> >> > > > setting, "-Xmx512M".
>> >> > > >
>> >> > > > Do you have a recommended GC setting? Usually our server has 32GB or
>> >> > > > 64GB RAM.
>> >> > > >
>> >> > > > 2013/3/22 Jun Rao <jun...@gmail.com>
>> >> > > >
>> >> > > > > A typical reason for frequent rebalancing is consumer-side GC.
>> >> > > > > If so, you will see logs in the consumer saying something like
>> >> > > > > "expired session" for ZK. Occasional rebalances are fine. Too many
>> >> > > > > rebalances can slow down the consumption and you will need to tune
>> >> > > > > your GC setting.
>> >> > > > >
>> >> > > > > Thanks,
>> >> > > > >
>> >> > > > > Jun
>> >> > > > >
>> >> > > > > On Thu, Mar 21, 2013 at 11:07 PM, Yonghui Zhao <zhaoyong...@gmail.com> wrote:
>> >> > > > >
>> >> > > > > > Yes, before consumer exception:
>> >> > > > > >
>> >> > > > > > 2013/03/21 12:07:17.909 INFO [ZookeeperConsumerConnector] [] 0_lg-mc-db01.bj-1363784482043-f98c7868 end rebalancing consumer 0_lg-mc-db01.bj-1363784482043-f98c7868 try #0
>> >> > > > > > 2013/03/21 12:07:17.911 INFO [ZookeeperConsumerConnector] [] 0_lg-mc-db01.bj-1363784482043-f98c7868 begin rebalancing consumer 0_lg-mc-db01.bj-1363784482043-f98c7868 try #0
>> >> > > > > > 2013/03/21 12:07:17.934 INFO [FetcherRunnable] [] FetchRunnable-0 start fetching topic: sms part: 0 offset: 43667888259 from 127.0.0.1:9093
>> >> > > > > > 2013/03/21 12:07:17.940 INFO [SimpleConsumer] [] Reconnect in multifetch due to socket error:
>> >> > > > > > java.nio.channels.ClosedByInterruptException
>> >> > > > > >     at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:201)
>> >> > > > > >
>> >> > > > > >
>> >> > > > > > 2013/03/21 12:07:17.978 INFO [ZookeeperConsumerConnector] [] 0_lg-mc-db01.bj-1363784482043-f98c7868 end rebalancing consumer 0_lg-mc-db01.bj-1363784482043-f98c7868 try #0
>> >> > > > > > 2013/03/21 12:07:18.004 INFO [FetcherRunnable] [] FetchRunnable-0 start fetching topic: sms part: 0 offset: 43667888259 from 127.0.0.1:9093
>> >> > > > > > 2013/03/21 12:07:18.066 INFO [ZookeeperConsumerConnector] [] 0_lg-mc-db01.bj-1363784482043-f98c7868 begin rebalancing consumer 0_lg-mc-db01.bj-1363784482043-f98c7868 try #0
>> >> > > > > > 2013/03/21 12:07:18.176 INFO [SimpleConsumer] [] Reconnect in multifetch due to socket error:
>> >> > > > > > java.nio.channels.ClosedByInterruptException
>> >> > > > > >     at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:201)
>> >> > > > > >
>> >> > > > > >
>> >> > > > > > So you think it is normal? How can we avoid this exception?
>> >> > > > > >
>> >> > > > > > I used 4 partitions in kafka; should I use only 1 partition?
>> >> > > > > >
>> >> > > > > > 2013/3/22 Jun Rao <jun...@gmail.com>
>> >> > > > > >
>> >> > > > > > > Do you see any rebalances in the consumer? Each rebalance will
>> >> > > > > > > interrupt existing fetcher threads first.
>> >> > > > > > >
>> >> > > > > > > Thanks,
>> >> > > > > > >
>> >> > > > > > > Jun
>> >> > > > > > >
>> >> > > > > > > On Thu, Mar 21, 2013 at 9:40 PM, Yonghui Zhao <zhaoyong...@gmail.com
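
For reference, the settings discussed in this thread (3G heap, 256 MB new generation, CMS at 70% occupancy for the broker; roughly 1 GB with CMS plus the parallel new-generation collector for a thin consumer) could be written as HotSpot JVM flags roughly as below. This is only a sketch under assumptions: the shell variable names and the GC log path are hypothetical, `-Xms`, `-XX:+UseCMSInitiatingOccupancyOnly`, and the GC logging flags are additions not mentioned in the thread, and how the flags reach the JVM depends on your Kafka version's start scripts (shown only as a commented-out guess via `KAFKA_OPTS`; check `bin/kafka-run-class.sh`).

```shell
#!/bin/sh
# Broker heap as described in the thread: 3G total, 256 MB new generation.
# -Xms is an assumption (fixes the initial heap to the maximum).
BROKER_HEAP_OPTS="-Xms3g -Xmx3g -Xmn256m"

# CMS for the tenured generation, started at 70% occupancy, with the
# parallel (ParNew) collector for the new generation.
# UseCMSInitiatingOccupancyOnly is an addition, not from the thread.
CMS_OPTS="-XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=70 -XX:+UseCMSInitiatingOccupancyOnly"

# GC logging makes it possible to check for a pause around a ZooKeeper
# session expiration. The log path is hypothetical.
GC_LOG_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:/tmp/kafka-gc.log"

BROKER_JVM_OPTS="$BROKER_HEAP_OPTS $CMS_OPTS $GC_LOG_OPTS"

# Thin consumer wrapper: small heap (about 1 GB) with the same collectors.
CONSUMER_JVM_OPTS="-Xmx1g -XX:+UseParNewGC -XX:+UseConcMarkSweepGC"

echo "broker:   $BROKER_JVM_OPTS"
echo "consumer: $CONSUMER_JVM_OPTS"

# Assumption: the start scripts pick up JVM options from an environment
# variable; the name varies by Kafka version, so verify before using.
# export KAFKA_OPTS="$BROKER_JVM_OPTS"
# bin/kafka-server-start.sh config/server.properties
```

With GC logging enabled, a long pause close to the time of an "expired session" message would point to GC as the cause of the session expiration and the rebalances that follow.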