Here is what I see in the Kafka log:

[2015-05-14 04:11:27,752] ERROR Closing socket for /10.180.195.32 because of error (kafka.network.Processor)
java.io.IOException: Connection reset by peer
        at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
        at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
        at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
        at sun.nio.ch.IOUtil.read(IOUtil.java:197)
        at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:379)
        at kafka.utils.Utils$.read(Utils.scala:375)
        at kafka.network.BoundedByteBufferReceive.readFrom(BoundedByteBufferReceive.scala:54)
        at kafka.network.Processor.read(SocketServer.scala:347)
        at kafka.network.Processor.run(SocketServer.scala:245)
        at java.lang.Thread.run(Thread.java:745)
[2015-05-14 04:11:27,753] INFO Closing socket connection to /10.180.195.32. (kafka.network.Processor)
[2015-05-14 04:16:06,537] INFO Closing socket connection to /10.180.195.32. (kafka.network.Processor)
[2015-05-14 04:16:06,604] INFO Closing socket connection to /10.180.195.32. (kafka.network.Processor)
[2015-05-14 04:16:32,370] INFO Closing socket connection to /10.180.195.33. (kafka.network.Processor)
[2015-05-14 04:16:32,452] INFO Closing socket connection to /10.180.195.33. (kafka.network.Processor)
[2015-05-14 04:16:32,810] INFO Closing socket connection to /10.180.195.33. (kafka.network.Processor)
[2015-05-14 04:16:32,931] INFO Closing socket connection to /10.180.195.33. (kafka.network.Processor)
[2015-05-14 04:36:40,586] INFO Closing socket connection to /10.180.195.33. (kafka.network.Processor)
[2015-05-14 04:39:49,016] INFO Closing socket connection to /10.180.195.33. (kafka.network.Processor)
[2015-05-14 04:43:38,166] INFO Closing socket connection to /10.180.195.32. (kafka.network.Processor)
[2015-05-14 04:43:38,392] INFO [ReplicaFetcherManager on broker 1018019533] Removed fetcher for partitions [argos-parser,0],[argos-raw,0] (kafka.server.ReplicaFetcherManager)
[2015-05-14 04:43:40,746] INFO Closing socket connection to /10.180.195.33. (kafka.network.Processor)
[2015-05-14 04:43:40,855] INFO Closing socket connection to /10.180.195.33. (kafka.network.Processor)
[2015-05-14 04:43:40,957] INFO Closing socket connection to /10.180.195.33. (kafka.network.Processor)

On Thu, May 14, 2015 at 4:55 AM, Shekar Tippur <ctip...@gmail.com> wrote:

> Here is the complete log:
>
> http://pastebin.com/nX7twETm
>
> Interesting, I see a leader not available exception instead of the earlier
> one.
>
> ./container_1431601903660_0001_01_000002/samza-container-0.log:2015-05-14 04:53:41 BrokerPartitionInfo [WARN] Error while fetching metadata partition 0 leader: none replicas: 1018019532 (sprdargas402.corp.intuit.net:6667) isr: isUnderReplicated: true for topic partition [__samza_checkpoint_ver_1_for_Argos_1,0]: [class kafka.common.LeaderNotAvailableException]
>
> - Shekar
>
> On Wed, May 13, 2015 at 7:52 PM, Naveen S <navg...@gmail.com> wrote:
>
>> Hey Shekar,
>> Can you paste the entire stacktrace/log? Were there any other errors?
>> On Wed, May 13, 2015 at 6:04 PM Shekar Tippur <ctip...@gmail.com> wrote:
>>
>> > Hello,
>> >
>> > I seem to have come across an issue with replication. We have 2 nodes
>> > where Kafka and YARN run.
>> >
>> > We have enabled replication on Kafka (replication factor = 2). For
>> > testing redundancy, we shut down the broker01 server.
>> > In the YARN application logs, we see the
>> > exception kafka.common.ReplicaNotAvailableException
>> >
>> > Incoming topic:
>> >
>> > /opt/kafka/bin/kafka-topics.sh --zookeeper localhost:2181 --topic raw --describe
>> >
>> > Topic:raw PartitionCount:1 ReplicationFactor:2 Configs:
>> >
>> > Topic: argos-raw Partition: 0 Leader: 1018019533 Replicas: 1018019533,1018019532 Isr: 1018019533,1018019532
>> >
>> > Outgoing topic:
>> >
>> > /opt/kafka/bin/kafka-topics.sh --zookeeper localhost:2181 --topic parser --describe
>> >
>> > Topic:parser PartitionCount:1 ReplicationFactor:2 Configs:
>> >
>> > Topic: argos-parser Partition: 0 Leader: 1018019533 Replicas: 1018019533,1018019532 Isr: 1018019533,1018019532
>> >
>> > Any idea on why this could be happening?
>> >
>> > - Shekar
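
For re-checking the topics after broker01 is shut down, something like the following might help compare the leader and ISR per partition. It is only a rough sketch: it assumes the `--describe` output has the same layout as quoted above (other Kafka versions may print it differently), and `parse_describe` and its field names are made up for illustration, they are not part of Kafka or Samza.

```python
import re

# One partition line of `kafka-topics.sh --describe`, in the layout quoted above.
# Assumption: a missing leader is printed as -1 (or "none").
PARTITION_RE = re.compile(
    r"Topic:\s*(?P<topic>\S+)\s+Partition:\s*(?P<partition>\d+)\s+"
    r"Leader:\s*(?P<leader>-?\w+)\s+"
    r"Replicas:\s*(?P<replicas>[\d,]*)\s+Isr:\s*(?P<isr>[\d,]*)"
)


def parse_describe(text):
    """Return one dict per partition line found in `text` (hypothetical helper)."""
    rows = []
    for line in text.splitlines():
        m = PARTITION_RE.search(line)
        if m is None:
            continue  # summary lines ("Topic:raw PartitionCount:1 ...") do not match
        replicas = [b for b in m.group("replicas").split(",") if b]
        isr = [b for b in m.group("isr").split(",") if b]
        leader = m.group("leader")
        rows.append({
            "topic": m.group("topic"),
            "partition": int(m.group("partition")),
            "leader": leader,
            "replicas": replicas,
            "isr": isr,
            # No usable leader means producers/consumers cannot reach the partition.
            "has_leader": leader not in ("-1", "none"),
            # Fewer in-sync replicas than assigned replicas.
            "under_replicated": len(isr) < len(replicas),
        })
    return rows


# Sample text copied from the describe output quoted in this thread.
sample = (
    "Topic:raw\tPartitionCount:1\tReplicationFactor:2\tConfigs:\n"
    "\tTopic: argos-raw\tPartition: 0\tLeader: 1018019533\t"
    "Replicas: 1018019533,1018019532\tIsr: 1018019533,1018019532\n"
)
for row in parse_describe(sample):
    print(row)
```

Running the describe command again with one broker down and feeding the output through this should show whether the remaining broker actually took over as leader, and whether the ISR shrank to a single broker.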