I don't know if that is your problem, but I had this output when my brokers
couldn't talk to each other...

ZooKeeper was using the FQDNs, but my brokers didn't know the FQDNs of
the other brokers...

If you look at your broker info in ZooKeeper (get /brokers/ids/#ID_OF_BROKER),
can you ping/connect to the value of the "host" key from your other brokers?
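
For example, with the ZooKeeper CLI (the broker id, hostname, and field values
below are made up, and the exact JSON varies a little across versions):

    bin/zkCli.sh -server zkhost:2181 get /brokers/ids/1
    {"jmx_port":-1,"timestamp":"1404933000000","host":"broker1.example.com","version":1,"port":9092}

Then, from each of the other brokers, check that the "host" value resolves and
is reachable on the advertised port:

    ping broker1.example.com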



François Langelier
Software Engineering Student - École de Technologie Supérieure
<http://www.etsmtl.ca/>
Captain, Club Capra <http://capra.etsmtl.ca/>
VP Communications - CS Games <http://csgames.org> 2014
Jeux de Génie <http://www.jdgets.com/> 2011 to 2014
Treasurer, Fraternité du Piranha <http://fraternitedupiranha.com/> 2012-2014
Organizing Committee, Olympiades ÉTS 2012
Quebec Engineering Competition 2012 - Senior Competition


On 9 July 2014 15:17, hsy...@gmail.com <hsy...@gmail.com> wrote:

> I have the same problem. I didn't dig deeper, but I saw this happen when I
> launch Kafka in daemon mode. I found that daemon mode just launches Kafka
> with nohup. Not quite clear why this happens.
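>
> (For reference, a sketch of what the daemon path in the launch scripts
> boils down to; the JVM flags and the output file name here are made up:
>
>     nohup java ... kafka.Kafka config/server.properties > kafka.out 2>&1 &
>
> so the broker runs the same way, just detached from the terminal.)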
>
>
> On Wed, Jul 9, 2014 at 9:59 AM, Lung, Paul <pl...@ebay.com> wrote:
>
> > Yup. In fact, I just ran the test program again while the Kafka broker is
> > still running, using the same user of course. I was able to get up to 10K
> > connections with the test program. The test program uses the same Java NIO
> > library that the broker does, so the machine is capable of handling that
> > many connections. The only issue I saw was that the NIO
> > ServerSocketChannel is a bit slow at accepting connections once the total
> > connection count gets to around 4K, but this could be because I put the
> > ServerSocketChannel in the same Selector as the 4K SocketChannels (a
> > sketch of that kind of accept loop follows the stack trace below). So
> > sometimes on the client side, I see:
> >
> > java.io.IOException: Connection reset by peer
> >         at sun.nio.ch.FileDispatcher.write0(Native Method)
> >         at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
> >         at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:122)
> >         at sun.nio.ch.IOUtil.write(IOUtil.java:93)
> >         at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:352)
> >         at FdTest$ClientThread.run(FdTest.java:108)
> >
> >
> > But all I have to do is sleep for a bit on the client, and then retry.
> > However, 4K does seem like a magic number, since that seems to be the
> > number the Kafka broker machine can handle before it gives me the
> > "Too Many Open Files" error and eventually crashes.
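> >
> > A minimal sketch of that kind of single-Selector accept loop, for
> > reference (hypothetical class name and port; this is not the actual
> > FdTest):
> >
> > import java.net.InetSocketAddress;
> > import java.nio.ByteBuffer;
> > import java.nio.channels.SelectionKey;
> > import java.nio.channels.Selector;
> > import java.nio.channels.ServerSocketChannel;
> > import java.nio.channels.SocketChannel;
> > import java.util.Iterator;
> >
> > public class AcceptLoopSketch {
> >     public static void main(String[] args) throws Exception {
> >         Selector selector = Selector.open();
> >         ServerSocketChannel server = ServerSocketChannel.open();
> >         server.socket().bind(new InetSocketAddress(9999));
> >         server.configureBlocking(false);
> >         // The listening channel shares one Selector with every
> >         // accepted channel, as described above.
> >         server.register(selector, SelectionKey.OP_ACCEPT);
> >         ByteBuffer buf = ByteBuffer.allocate(4096);
> >         while (true) {
> >             selector.select();
> >             Iterator<SelectionKey> it = selector.selectedKeys().iterator();
> >             while (it.hasNext()) {
> >                 SelectionKey key = it.next();
> >                 it.remove();
> >                 if (key.isAcceptable()) {
> >                     // accept() returns null if nothing is pending.
> >                     SocketChannel c = server.accept();
> >                     if (c != null) {
> >                         c.configureBlocking(false);
> >                         c.register(selector, SelectionKey.OP_READ);
> >                     }
> >                 } else if (key.isReadable()) {
> >                     buf.clear();
> >                     // A closed peer shows up as read() == -1.
> >                     if (((SocketChannel) key.channel()).read(buf) < 0) {
> >                         key.cancel();
> >                         key.channel().close();
> >                     }
> >                 }
> >             }
> >         }
> >     }
> > }
> >
> > Once thousands of client channels are registered on the same Selector,
> > each select() pass has more keys to scan, which would explain the slower
> > accepts around 4K connections.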
> >
> > Paul Lung
> >
> > On 7/8/14, 9:29 PM, "Jun Rao" <jun...@gmail.com> wrote:
> >
> > >Does your test program run as the same user as the Kafka broker?
> > >
> > >Thanks,
> > >
> > >Jun
> > >
> > >
> > >On Tue, Jul 8, 2014 at 1:42 PM, Lung, Paul <pl...@ebay.com> wrote:
> > >
> > >> Hi Guys,
> > >>
> > >> I'm seeing the following errors from the 0.8.1.1 broker. This occurs
> > >> most often on the controller machine. Then the controller process
> > >> crashes, and the controller bounces to other machines, which causes
> > >> those machines to crash. Looking at the file descriptors being held by
> > >> the process, it's only around 4000 or so (looking at . There aren't a
> > >> whole lot of connections in TIME_WAIT states, and I've increased the
> > >> ephemeral port range to "16000 - 64000" via
> > >> "/proc/sys/net/ipv4/ip_local_port_range". I've written a Java test
> > >> program to see how many sockets and files I can open. The socket count
> > >> is definitely limited by the ephemeral port range, which was around 22K
> > >> at the time. But I can open tons of files, since the open file limit of
> > >> the user is set to 100K.
> > >>
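> > >> (For the descriptor and limit checks, something along these lines,
> > >> with a made-up pid, run as the broker's user:
> > >>
> > >>     ls /proc/12345/fd | wc -l
> > >>     ulimit -n
> > >>     cat /proc/sys/net/ipv4/ip_local_port_range
> > >> )
> > >>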
> > >> So given that I can theoretically open 48K sockets and probably 90K
> > >> files, and I only see around 4K total for the Kafka broker, I'm really
> > >> confused as to why I'm seeing this error. Is there some internal Kafka
> > >> limit that I don't know about?
> > >>
> > >> Paul Lung
> > >>
> > >>
> > >>
> > >> java.io.IOException: Too many open files
> > >>         at sun.nio.ch.ServerSocketChannelImpl.accept0(Native Method)
> > >>         at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:163)
> > >>         at kafka.network.Acceptor.accept(SocketServer.scala:200)
> > >>         at kafka.network.Acceptor.run(SocketServer.scala:154)
> > >>         at java.lang.Thread.run(Thread.java:679)
> > >>
> > >> [2014-07-08 13:07:21,534] ERROR Error in acceptor (kafka.network.Acceptor)
> > >>
> > >> java.io.IOException: Too many open files
> > >>         at sun.nio.ch.ServerSocketChannelImpl.accept0(Native Method)
> > >>         at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:163)
> > >>         at kafka.network.Acceptor.accept(SocketServer.scala:200)
> > >>         at kafka.network.Acceptor.run(SocketServer.scala:154)
> > >>         at java.lang.Thread.run(Thread.java:679)
> > >>
> > >> [2014-07-08 13:07:21,563] ERROR [ReplicaFetcherThread-3-2124488], Error
> > >> for partition [bom__021____active_80__32__mini____activeitem_lvs_qn,0]
> > >> to broker 2124488:class kafka.common.NotLeaderForPartitionException
> > >> (kafka.server.ReplicaFetcherThread)
> > >>
> > >> [2014-07-08 13:07:21,558] FATAL [Replica Manager on Broker 2140112]:
> > >> Error writing to highwatermark file:  (kafka.server.ReplicaManager)
> > >>
> > >> java.io.FileNotFoundException:
> > >> /ebay/cronus/software/cronusapp_home/kafka/kafka-logs/replication-offset-checkpoint.tmp
> > >> (Too many open files)
> > >>         at java.io.FileOutputStream.open(Native Method)
> > >>         at java.io.FileOutputStream.<init>(FileOutputStream.java:209)
> > >>         at java.io.FileOutputStream.<init>(FileOutputStream.java:160)
> > >>         at java.io.FileWriter.<init>(FileWriter.java:90)
> > >>         at kafka.server.OffsetCheckpoint.write(OffsetCheckpoint.scala:37)
> > >>         at kafka.server.ReplicaManager$$anonfun$checkpointHighWatermarks$2.apply(ReplicaManager.scala:447)
> > >>         at kafka.server.ReplicaManager$$anonfun$checkpointHighWatermarks$2.apply(ReplicaManager.scala:444)
> > >>         at scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:772)
> > >>         at scala.collection.immutable.Map$Map1.foreach(Map.scala:109)
> > >>         at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:771)
> > >>         at kafka.server.ReplicaManager.checkpointHighWatermarks(ReplicaManager.scala:444)
> > >>         at kafka.server.ReplicaManager$$anonfun$1.apply$mcV$sp(ReplicaManager.scala:94)
> > >>         at kafka.utils.KafkaScheduler$$anon$1.run(KafkaScheduler.scala:100)
> > >>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> > >>         at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:351)
> > >>         at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:178)
> > >>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:165)
> > >>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:267)
> > >>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
> > >>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
> > >>         at java.lang.Thread.run(Thread.java:679)
> > >>
> > >>
> > >>
> > >>
> >
> >
>
