Could you try again with the controlled shutdown config set to true?
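For reference, a minimal sketch of the broker-side setting, assuming a 0.8.1-era server.properties. As far as I know the property is documented as controlled.shutdown.enable (the thread below writes it as enable.controlled.shutdown); the two retry settings are shown with what I believe are their documented defaults and are optional.

```properties
# server.properties on each broker (sketch, not taken from Raj's cluster)
# Ask the controller to move partition leadership off this broker before it stops.
controlled.shutdown.enable=true
# Optional tuning; values assumed to match the 0.8.1 defaults.
controlled.shutdown.max.retries=3
controlled.shutdown.retry.backoff.ms=5000
```

Note that this only takes effect when the broker is stopped gracefully (for example with a normal SIGTERM), not when the process is killed hard.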

On Mon, Aug 11, 2014 at 3:14 PM, Tanneru, Raj <rstann...@ebay.com> wrote:

> Hi Guozhang,
>
> I didn't set enable.controlled.shutdown to true. Yes I am shutting down 1
> broker at a time slowly. However I begin the test(2 clients producing
> messages) long time after taking down the brokers. I see the below debug
> message on live broker once in a while.
>
> [2014-08-11 15:09:56,078] DEBUG [KafkaApi-1] Error while fetching metadata for [item_topic_0,0]. Possible cause: null (kafka.server.KafkaApis)
>
> And on the broker that was shut down I see below INFO messages. There are
> no error messages though.
>
> [2014-08-08 11:16:10,516] INFO [Kafka Server 4], shutting down (kafka.server.KafkaServer)
> [2014-08-08 11:16:10,518] INFO [Socket Server on Broker 4], Shutting down (kafka.network.SocketServer)
> [2014-08-08 11:16:10,735] INFO [Socket Server on Broker 4], Shutdown completed (kafka.network.SocketServer)
> [2014-08-08 11:16:10,737] INFO [Kafka Request Handler on Broker 4], shutting down (kafka.server.KafkaRequestHandlerPool)
> [2014-08-08 11:16:10,741] INFO [Kafka Request Handler on Broker 4], shut down completely (kafka.server.KafkaRequestHandlerPool)
> [2014-08-08 11:16:10,922] INFO [Replica Manager on Broker 4]: Shut down (kafka.server.ReplicaManager)
> [2014-08-08 11:16:10,923] INFO [ReplicaFetcherManager on broker 4] shutting down (kafka.server.ReplicaFetcherManager)
> [2014-08-08 11:16:10,925] INFO [ReplicaFetcherManager on broker 4] shutdown completed (kafka.server.ReplicaFetcherManager)
> [2014-08-08 11:16:10,927] INFO [Replica Manager on Broker 4]: Shutted down completely (kafka.server.ReplicaManager)
> [2014-08-08 11:16:10,949] INFO [Kafka Server 4], shut down completed (kafka.server.KafkaServer)
>
> Thanks,
> Raj Tanneru
>
> -----Original Message-----
> From: Guozhang Wang [mailto:wangg...@gmail.com]
> Sent: Monday, August 11, 2014 11:27 AM
> To: users@kafka.apache.org
> Subject: Re: Error while producing messages
>
> Hi Raj,
>
> I have a couple of more questions for you:
>
> 1. On the server configs, did you set enable.controlled.shutdown to true
>    or not?
> 2. When you shutdown just one broker, did you see any errors? Here I am
>    assuming you are not shutting down brokers too quickly, but shutdown one
>    broker at a time, wait for the systems to resume back to normal state (i.e.
>    no more transient errors or warnings), and then the next one.
>
> Guozhang
>
>
> On Mon, Aug 11, 2014 at 10:08 AM, Tanneru, Raj <rstann...@ebay.com> wrote:
>
> > I shutdown 3 out 5. With 2 brokers I start seeing failures after
> > successfully sending some messages. Not all messages are failing. I
> > wanted to understand the case/s when we log below message? If you
> > notice there is difference in send/receive buffer size of actual and
> > requested. I don’t see this message in log for successful messages. Is
> > it normal?
> >
> > [2014-08-08 16:47:35,179] DEBUG Accepted connection from /10.254.243.142 on /10.66.107.231:9092. sendBufferSize [actual|requested]: [131071|1048576] recvBufferSize [actual|requested]: [131071|1048576] (kafka.network.Acceptor)
> >
> > -----Original Message-----
> > From: Guozhang Wang [mailto:wangg...@gmail.com]
> > Sent: Monday, August 11, 2014 10:01 AM
> > To: users@kafka.apache.org
> > Subject: Re: Error while producing messages
> >
> > I am not sure I understand completely. How many brokers did you
> > shutdown out of the total number of 5 brokers? With single-partition
> > topics, if the replication factor is 3, this partition will be hosted on
> > 3 brokers.
> >
> > Guozhang
> >
> >
> > On Mon, Aug 11, 2014 at 9:46 AM, Tanneru, Raj <rstann...@ebay.com> wrote:
> >
> > > Sorry I should have provided this information. All my topics have
> > > single partition meaning there are 2 nodes that already have topic
> > > partition. It's just that the node having 3rd partition is down. So
> > > if a message fails there is no reason why other message should
> > > succeed, unless there is network saturation, broker resource
> > > exhaustion etc.
> > >
> > > Thanks,
> > > Raj Tanneru
> > >
> > > -----Original Message-----
> > > From: Guozhang Wang [mailto:wangg...@gmail.com]
> > > Sent: Sunday, August 10, 2014 10:14 PM
> > > To: users@kafka.apache.org
> > > Subject: Re: Error while producing messages
> > >
> > > With replication of 3, partitions that happen to be only hosted on
> > > those 3 shutdown brokers would be not available, unless controlled
> > > shutdown is used.
> > >
> > > Guozhang
> > >
> > >
> > > On Sat, Aug 9, 2014 at 10:59 AM, Tanneru, Raj <rstann...@ebay.com> wrote:
> > >
> > > > Guozhang,
> > > >
> > > > Our replication factor is 3.
> > > >
> > > > Thanks,
> > > > Raj Tanneru
> > > >
> > > >
> > > > -------- Original message --------
> > > > From: Guozhang Wang
> > > > Date:08/09/2014 10:21 AM (GMT-08:00)
> > > > To: users@kafka.apache.org
> > > > Subject: Re: Error while producing messages
> > > >
> > > > Raj,
> > > >
> > > > What is your replication factor setting in kafka brokers?
> > > >
> > > > Guozhang
> > > >
> > > >
> > > > On Fri, Aug 8, 2014 at 5:02 PM, Tanneru, Raj <rstann...@ebay.com> wrote:
> > > >
> > > > > Hi,
> > > > >
> > > > > I am trying to do capacity sizing estimate for our kafka cluster.
> > > > > I started with 5 broker cluster and 3 node zk. Used a simple
> > > > > java based producer to send messages to 5 topics that are
> > > > > created in the cluster. I used 2 client machines with 100 worker
> > > > > threads each sending messages continuously. I didn't see any
> > > > > exceptions or issues when all 5 brokers are up. When I take down
> > > > > 3 brokers out of 5, and 4 topics out of 5. I am seeing below in
> > > > > broker logs.
> > > > >
> > > > > [2014-08-08 16:47:34,977] DEBUG Closing connection from /10.254.243.142:33944 (kafka.network.Processor)
> > > > >
> > > > > [2014-08-08 16:47:35,179] DEBUG Accepted connection from /10.254.243.142 on /10.66.107.231:9092. sendBufferSize [actual|requested]: [131071|1048576] recvBufferSize [actual|requested]: [131071|1048576] (kafka.network.Acceptor)
> > > > >
> > > > > [2014-08-08 16:47:35,179] DEBUG Processor 860 listening to new connection from /10.254.243.142:33947 (kafka.network.Processor)
> > > > >
> > > > > [2014-08-08 16:47:35,179] DEBUG [KafkaApi-1] Error while fetching metadata for [item_topic_0,0]. Possible cause: null (kafka.server.KafkaApis)
> > > > >
> > > > > [2014-08-08 16:47:35,207] INFO Closing socket connection to /10.254.243.142. (kafka.network.Processor)
> > > > >
> > > > > [2014-08-08 16:47:35,207] DEBUG Closing connection from /10.254.243.142:33947 (kafka.network.Processor)
> > > > >
> > > > > [2014-08-08 16:47:35,294] DEBUG Accepted connection from /10.254.243.142 on /10.66.107.231:9092. sendBufferSize [actual|requested]: [131071|1048576] recvBufferSize [actual|requested]: [131071|1048576] (kafka.network.Acceptor)
> > > > >
> > > > > On the client machine(producer) I am seeing below error. Again
> > > > > only one topic and 2 broker nodes are running.
> > > > >
> > > > > Exception in thread "pool-1-thread-107" kafka.common.FailedToSendMessageException: Failed to send messages after 3 tries.
> > > > >
> > > > > Sending message 181171284512 for topic item_topic_0
> > > > >
> > > > >     at kafka.producer.async.DefaultEventHandler.handle(DefaultEventHandler.scala:90)
> > > > >     at kafka.producer.Producer.send(Producer.scala:76)
> > > > >     at kafka.javaapi.producer.Producer.send(Producer.scala:33)
> > > > >     at com.ebay.cassini.feeder.nrt.kafka.producer.WorkerThread.produceKafkaMessage(WorkerThread.java:33)
> > > > >     at com.ebay.cassini.feeder.nrt.kafka.producer.WorkerThread.run(WorkerThread.java:27)
> > > > >     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> > > > >     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> > > > >     at java.lang.Thread.run(Thread.java:662)
> > > > >
> > > > > Exception in thread "pool-1-thread-108" kafka.common.FailedToSendMessageException: Failed to send messages after 3 tries.
> > > > >
> > > > > Sending message 221252534837 for topic item_topic_0
> > > > >
> > > > >     at kafka.producer.async.DefaultEventHandler.handle(DefaultEventHandler.scala:90)
> > > > >     at kafka.producer.Producer.send(Producer.scala:76)
> > > > >     at kafka.javaapi.producer.Producer.send(Producer.scala:33)
> > > > >     at com.ebay.cassini.feeder.nrt.kafka.producer.WorkerThread.produceKafkaMessage(WorkerThread.java:33)
> > > > >     at com.ebay.cassini.feeder.nrt.kafka.producer.WorkerThread.run(WorkerThread.java:27)
> > > > >     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> > > > >     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> > > > >     at java.lang.Thread.run(Thread.java:662)
> > > > >
> > > > > Not all messages are erroring out. Only a few messages are failing.
> > > > > Any idea what could be going on?
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Raj Tanneru
> > > >
> > > >
> > > > --
> > > > -- Guozhang
> > >
> > >
> > > --
> > > -- Guozhang
> >
> >
> > --
> > -- Guozhang
>
>
> --
> -- Guozhang

--
-- Guozhang