Yes, it just says "INFO Reconnect due to socket error". But why does this happen, and where does it come from? My ZooKeeper and Storm have no problem working with each other.
2013/10/11 Jun Rao <jun...@gmail.com>:

The log you posted for the second broker didn't say why it crashed. Is that all you got?

Thanks,

Jun

On Thu, Oct 10, 2013 at 9:22 PM, Jiang Jacky <jiang0...@gmail.com> wrote:

Hi, Guys,
I am currently running into a Kafka server issue.
I have a 5-node cluster, and ZooKeeper is running without any problem. I manually boot each node with the command "JMX_PORT=9997 bin/kafka-server-start.sh config/server-x.properties &".

The scenario is:
The first node boots fine.
Once I boot the second node, it crashes with the error below:

[2013-10-11 04:02:17,200] INFO [Replica Manager on Broker 0]: Handling LeaderAndIsr request Name:LeaderAndIsrRequest;Version:0;Controller:0;ControllerEpoch:30416;CorrelationId:5;ClientId:id_0-host_null-port_9092;PartitionState:(test-kafka,0) -> (LeaderAndIsrInfo:(Leader:1,ISR:1,LeaderEpoch:90,ControllerEpoch:30411),ReplicationFactor:1),AllReplicas:1);Leaders:id:1,host:localhost,port:9092 (kafka.server.ReplicaManager)
[2013-10-11 04:02:17,204] WARN No previously checkpointed highwatermark value found for topic test-kafka partition 0. Returning 0 as the highwatermark (kafka.server.HighwaterMarkCheckpoint)
[2013-10-11 04:02:17,205] INFO [ReplicaFetcherManager on broker 0] Removing fetcher for partition [test-kafka,0] (kafka.server.ReplicaFetcherManager)
[2013-10-11 04:02:17,214] INFO [Kafka Log on Broker 0], Truncated log segment /tmp/kafka-logs/test-kafka-0/00000000000000000000.log to target offset 0 (kafka.log.Log)
[2013-10-11 04:02:17,235] INFO [ReplicaFetcherManager on broker 0] Adding fetcher for partition [test-kafka,0], initOffset 0 to broker 1 with fetcherId 0 (kafka.server.ReplicaFetcherManager)
[2013-10-11 04:02:17,236] INFO [Replica Manager on Broker 0]: Handled leader and isr request Name:LeaderAndIsrRequest;Version:0;Controller:0;ControllerEpoch:30416;CorrelationId:5;ClientId:id_0-host_null-port_9092;PartitionState:(test-kafka,0) -> (LeaderAndIsrInfo:(Leader:1,ISR:1,LeaderEpoch:90,ControllerEpoch:30411),ReplicationFactor:1),AllReplicas:1);Leaders:id:1,host:localhost,port:9092 (kafka.server.ReplicaManager)
[2013-10-11 04:02:17,240] INFO [ReplicaFetcherThread-0-1], Starting (kafka.server.ReplicaFetcherThread)
[2013-10-11 04:02:17,266] INFO [Replica Manager on Broker 0]: Handling LeaderAndIsr request Name:LeaderAndIsrRequest;Version:0;Controller:0;ControllerEpoch:30416;CorrelationId:6;ClientId:id_0-host_null-port_9092;PartitionState:(test-kafka,0) -> (LeaderAndIsrInfo:(Leader:1,ISR:1,LeaderEpoch:91,ControllerEpoch:30416),ReplicationFactor:1),AllReplicas:1);Leaders:id:1,host:localhost,port:9092 (kafka.server.ReplicaManager)
[2013-10-11 04:02:17,267] INFO [ReplicaFetcherManager on broker 0] Removing fetcher for partition [test-kafka,0] (kafka.server.ReplicaFetcherManager)
[2013-10-11 04:02:17,268] INFO [Kafka Log on Broker 0], Truncated log segment /tmp/kafka-logs/test-kafka-0/00000000000000000000.log to target offset 0 (kafka.log.Log)
[2013-10-11 04:02:17,268] INFO [ReplicaFetcherManager on broker 0] Adding fetcher for partition [test-kafka,0], initOffset 0 to broker 1 with fetcherId 0 (kafka.server.ReplicaFetcherManager)
[2013-10-11 04:02:17,269] INFO [Replica Manager on Broker 0]: Handled leader and isr request Name:LeaderAndIsrRequest;Version:0;Controller:0;ControllerEpoch:30416;CorrelationId:6;ClientId:id_0-host_null-port_9092;PartitionState:(test-kafka,0) -> (LeaderAndIsrInfo:(Leader:1,ISR:1,LeaderEpoch:91,ControllerEpoch:30416),ReplicationFactor:1),AllReplicas:1);Leaders:id:1,host:localhost,port:9092 (kafka.server.ReplicaManager)
[2013-10-11 04:02:17,269] ERROR [Kafka Request Handler 0 on Broker 0], Exception when handling request (kafka.server.KafkaRequestHandler)
[2013-10-11 04:02:47,284] INFO Reconnect due to socket error: (kafka.consumer.SimpleConsumer)
java.net.SocketTimeoutException
        at sun.nio.ch.SocketAdaptor$SocketInputStream.read(SocketAdaptor.java:229)
        at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:103)
        at java.nio.channels.Channels$ReadableByteChannelImpl.read(Channels.java:385)
        at kafka.utils.Utils$.read(Utils.scala:394)
        at kafka.network.BoundedByteBufferReceive.readFrom(BoundedByteBufferReceive.scala:54)
        at kafka.network.Receive$class.readCompletely(Transmission.scala:56)
        at kafka.network.BoundedByteBufferReceive.readCompletely(BoundedByteBufferReceive.scala:29)
        at kafka.network.BlockingChannel.receive(BlockingChannel.scala:100)
        at kafka.consumer.SimpleConsumer.liftedTree1$1(SimpleConsumer.scala:73)
        at kafka.consumer.SimpleConsumer.kafka$consumer$SimpleConsumer$$sendRequest(SimpleConsumer.scala:71)
        at kafka.consumer.SimpleConsumer$$anonfun$fetch$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(SimpleConsumer.scala:110)
        at kafka.consumer.SimpleConsumer$$anonfun$fetch$1$$anonfun$apply$mcV$sp$1.apply(SimpleConsumer.scala:110)
        at kafka.consumer.SimpleConsumer$$anonfun$fetch$1$$anonfun$apply$mcV$sp$1.apply(SimpleConsumer.scala:110)
        at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:33)
        at kafka.consumer.SimpleConsumer$$anonfun$fetch$1.apply$mcV$sp(SimpleConsumer.scala:109)
        at kafka.consumer.SimpleConsumer$$anonfun$fetch$1.apply(SimpleConsumer.scala:109)
        at kafka.consumer.SimpleConsumer$$anonfun$fetch$1.apply(SimpleConsumer.scala:109)
        at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:33)
        at kafka.consumer.SimpleConsumer.fetch(SimpleConsumer.scala:108)
        at kafka.server.AbstractFetcherThread.processFetchRequest(AbstractFetcherThread.scala:96)
        at kafka.server.AbstractFetcherThread.doWork(AbstractFetcherThread.scala:88)
        at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:51)
[2013-10-11 04:02:47,292] ERROR [Kafka Request Handler 1 on Broker 0], Exception when handling request (kafka.server.KafkaRequestHandler)

Then I boot the third node through to the last one, and everything is fine except for the second node.

After that, I tried stopping the servers one by one. I first stopped the broken node; then one of the healthy nodes showed the same error as the broken node, and which node breaks is random. I stopped that broken node again, and yet another random node broke with the same error.
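(As a minimal sanity check, assuming the zookeeper-shell.sh wrapper bundled in Kafka's bin directory, or ZooKeeper's own zkCli.sh, and ZooKeeper at localhost:2181, one could look at which broker ids are actually registered while the nodes are up; every live broker should appear under /brokers/ids.)

  # List the broker ids currently registered in ZooKeeper
  bin/zookeeper-shell.sh localhost:2181 ls /brokers/ids
  # Show the registration data for a specific broker id, e.g. broker 0
  bin/zookeeper-shell.sh localhost:2181 get /brokers/ids/0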
When I try to produce messages, I get the errors below:

[2013-10-11 04:13:12,876] INFO Fetching metadata from broker id:0,host:localhost,port:9092 with correlation id 15 for 1 topic(s) Set(my-replicated-topic) (kafka.client.ClientUtils$)
[2013-10-11 04:13:12,876] INFO Connected to localhost:9092 for producing (kafka.producer.SyncProducer)
[2013-10-11 04:13:12,886] INFO Disconnecting from localhost:9092 (kafka.producer.SyncProducer)
[2013-10-11 04:13:12,886] INFO Closing socket connection to /127.0.0.1. (kafka.network.Processor)
[2013-10-11 04:13:12,887] WARN Error while fetching metadata [{TopicMetadata for topic my-replicated-topic -> No partition metadata for topic my-replicated-topic due to kafka.common.LeaderNotAvailableException}] for topic [my-replicated-topic]: class kafka.common.LeaderNotAvailableException (kafka.producer.BrokerPartitionInfo)
[2013-10-11 04:13:12,887] ERROR Failed to collate messages by topic, partition due to: Failed to fetch topic metadata for topic: my-replicated-topic (kafka.producer.async.DefaultEventHandler)
[2013-10-11 04:13:12,887] INFO Back off for 100 ms before retrying send. Remaining retries = 0 (kafka.producer.async.DefaultEventHandler)
[2013-10-11 04:13:12,988] INFO Fetching metadata from broker id:0,host:localhost,port:9092 with correlation id 16 for 1 topic(s) Set(my-replicated-topic) (kafka.client.ClientUtils$)
[2013-10-11 04:13:12,989] INFO Connected to localhost:9092 for producing (kafka.producer.SyncProducer)
[2013-10-11 04:13:12,999] INFO Disconnecting from localhost:9092 (kafka.producer.SyncProducer)
[2013-10-11 04:13:12,999] INFO Closing socket connection to /127.0.0.1. (kafka.network.Processor)
[2013-10-11 04:13:13,000] WARN Error while fetching metadata [{TopicMetadata for topic my-replicated-topic -> No partition metadata for topic my-replicated-topic due to kafka.common.LeaderNotAvailableException}] for topic [my-replicated-topic]: class kafka.common.LeaderNotAvailableException (kafka.producer.BrokerPartitionInfo)
[2013-10-11 04:13:13,000] ERROR Failed to send requests for topics my-replicated-topic with correlation ids in [9,16] (kafka.producer.async.DefaultEventHandler)
[2013-10-11 04:13:13,001] ERROR Error in handling batch of 1 events (kafka.producer.async.ProducerSendThread)
kafka.common.FailedToSendMessageException: Failed to send messages after 3 tries.
        at kafka.producer.async.DefaultEventHandler.handle(DefaultEventHandler.scala:90)
        at kafka.producer.async.ProducerSendThread.tryToHandle(ProducerSendThread.scala:104)
        at kafka.producer.async.ProducerSendThread$$anonfun$processEvents$3.apply(ProducerSendThread.scala:87)
        at kafka.producer.async.ProducerSendThread$$anonfun$processEvents$3.apply(ProducerSendThread.scala:67)
        at scala.collection.immutable.Stream.foreach(Stream.scala:254)
        at kafka.producer.async.ProducerSendThread.processEvents(ProducerSendThread.scala:66)
        at kafka.producer.async.ProducerSendThread.run(ProducerSendThread.scala:44)
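(Since the producer is failing on LeaderNotAvailableException, it can help to see whether the partition currently has a leader and which replicas are in the ISR. A minimal sketch, assuming Kafka's bundled topic tools; older 0.8.0 releases ship kafka-list-topic.sh, while later releases use kafka-topics.sh --describe:)

  # Kafka 0.8.0: show leader, replicas and ISR for the topic
  bin/kafka-list-topic.sh --zookeeper localhost:2181 --topic my-replicated-topic
  # Later releases: the same information via kafka-topics.sh
  bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic my-replicated-topic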
I configured everything according to the documentation. Here are the settings copied from one of my nodes:

broker.id=3

############################# Socket Server Settings #############################

port=9092

num.network.threads=2
num.io.threads=2

socket.send.buffer.bytes=1048576
socket.receive.buffer.bytes=1048576
socket.request.max.bytes=104857600

log.dir=/tmp/kafka-logs
num.partitions=1

log.flush.interval.messages=10000
log.flush.interval.ms=1000
log.retention.hours=168
log.segment.bytes=536870912
log.cleanup.interval.mins=1

zookeeper.connect=localhost:2181
zookeeper.connection.timeout.ms=1000000

kafka.metrics.polling.interval.secs=5
kafka.metrics.reporters=kafka.metrics.KafkaCSVMetricsReporter
kafka.csv.metrics.dir=/tmp/kafka_metrics
kafka.csv.metrics.reporter.enabled=false

Can someone tell me what happened? Appreciate it!
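(One thing worth double-checking, purely as a sketch and only if the brokers run on the same machine: since the logs show everything on localhost, each server-x.properties would need its own broker.id, port, and log.dir, otherwise the brokers collide on the socket and the log directory. Hypothetical per-broker overrides for two colocated brokers might look like:)

  # server-0.properties (hypothetical values)
  broker.id=0
  port=9092
  log.dir=/tmp/kafka-logs-0

  # server-1.properties (hypothetical values)
  broker.id=1
  port=9093
  log.dir=/tmp/kafka-logs-1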