I have tried changing the port, to no avail. As you can see in the first email, running `netstat -tulpn` right before Kafka is invoked shows that nothing is listening on port 9999. Also, since startup works about 90% of the time, it seems unlikely that the cause is other software that is sometimes running at boot and sometimes not.
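In case it helps anyone reproduce this, here is a minimal sketch of a stricter pre-start check than reading `netstat -tulpn` output: it simply tries to bind the port itself, which is roughly what the JMX agent has to do. The function name `port_is_free` and the choice of binding to `0.0.0.0` are my own assumptions for illustration; this is not part of Kafka or of my startup script, and it only reports the state of the port at the instant it runs.

```python
import socket


def port_is_free(port: int, host: str = "0.0.0.0") -> bool:
    """Return True if a TCP socket can bind to (host, port) right now.

    Hypothetical helper for illustration only; not part of Kafka.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            # The bind failed (for example "Address already in use"),
            # so another process would likely hit the same error.
            return False
    # The probe socket is closed on leaving the with-block,
    # which releases the port again.
    return True


if __name__ == "__main__":
    JMX_PORT = 9999  # same value as JMX_PORT in the start command
    print(f"port {JMX_PORT} free: {port_is_free(JMX_PORT)}")
```

A check like this could be logged right before `kafka-server-start.sh` is called. Since Kafka runs for a short while before exiting, a clean result would only show that the port was free at that moment, not that it stays free.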

On Thu, Jun 29, 2017 at 1:51 AM, Tom Bentley <t.j.bent...@gmail.com> wrote:

> Have you tried changing the configured JMX port? After all, it's possible
> the conflict is between kafka and some other software running on the same
> server.
>
> On 28 June 2017 at 21:06, Eric Coan <ec...@instructure.com> wrote:
>
> > Hello,
> >
> > Unfortunately Kafka does indeed startup and run for a little bit before
> > crashing with the above exception, so doing one simple check wouldn't work.
> > I could theoretically keep this script running forever, and constantly
> > checking for it being up. However that's really a hacky solution, and I'd
> > prefer to not do that if I don't have too.
> >
> > On Wed, Jun 28, 2017 at 1:43 PM, M. Manna <manme...@gmail.com> wrote:
> >
> > > Can you not put a service wrapper for startup? It will attempt a restart
> > > if the executable isn't up and running successfully.
> > >
> > > I am not familiar with Unix side, but in Windows you can use a powershell
> > > to utilise such thing. It's a better approach.
> > >
> > > Let me know what you think.
> > >
> > > On 28 Jun 2017 8:34 pm, "Eric Coan" <ec...@instructure.com> wrote:
> > >
> > > > I am using the same configuration for all brokers. However, each broker
> > > > is running on a completely separate host (I'm not running all three
> > > > brokers on the same host). I can get all three running if I manually
> > > > start kafka again, however it's just occasionally on boot one fails to
> > > > start with this error.
> > > >
> > > > On Wed, Jun 28, 2017 at 1:25 PM, M. Manna <manme...@gmail.com> wrote:
> > > >
> > > > > Aren't u using the same JMX port 9999 for all brokers? I dont think it
> > > > > will work for more than 1 broker.
> > > > >
> > > > > On 28 Jun 2017 8:22 pm, "Eric Coan" <ec...@instructure.com> wrote:
> > > > >
> > > > > > Hey,
> > > > > >
> > > > > > No worries. I'm starting the brokers with a script yes (that ends up
> > > > > > generating the command I pasted:
> > > > > >
> > > > > > ```
> > > > > > KAFKA_JMX_OPTS="-Dcom.sun.management.jmxremote=true \
> > > > > > -Dcom.sun.management.jmxremote.authenticate=false \
> > > > > > -Dcom.sun.management.jmxremote.ssl=false \
> > > > > > -Djava.rmi.server.hostname=$FQDN \
> > > > > > -Djava.net.preferIPv4Stack=true" JMX_PORT=9999 SCALA_VERSION=2.12.2 \
> > > > > > JAVA_HOME=/usr \
> > > > > > $KAFKA_INSTALL_PATH//bin/kafka-server-start.sh -daemon \
> > > > > > $KAFKA_INSTALL_PATH/config/server.properties --override \
> > > > > > zookeeper.connect="XX.XX.XX.XX:XX" --override broker.id="$broker_id" \
> > > > > > --override listeners="SSL://$LOCAL_IPV4:9092" --override broker.rack="$AZ"
> > > > > > ```
> > > > > >
> > > > > > The script beforehand populates the variables such as the FQDN, the
> > > > > > broker Id, Zookeeper IPs to connect to, Kafka Install Path, etc. The
> > > > > > important part of the command really is:
> > > > > >
> > > > > > ```
> > > > > > KAFKA_JMX_OPTS="..." JMX_PORT=9999 SCALA_VERSION=2.12.2 JAVA_HOME=/usr \
> > > > > > $KAFKA_INSTALL_PATH/bin/kafka-server-start.sh -daemon ..
> > > > > > ```
> > > > > >
> > > > > > On Wed, Jun 28, 2017 at 1:08 PM, M. Manna <manme...@gmail.com> wrote:
> > > > > >
> > > > > > > Please forgive my autocorrect options :(
> > > > > > >
> > > > > > > On 28 Jun 2017 8:06 pm, "M. Manna" <manme...@gmail.com> wrote:
> > > > > > >
> > > > > > > Hi,
> > > > > > >
> > > > > > > OS is not an issue, I have a 3 broker setup and I have experienced
> > > > > > > this too.
> > > > > > >
> > > > > > > How are toy atarting the brokers? Is this a concurrent start or have
> > > > > > > you got some startup scriptto bring up all the brokers?
> > > > > > >
> > > > > > > KR,
> > > > > > >
> > > > > > > On 28 Jun 2017 6:47 pm, "Eric Coan" <ec...@instructure.com> wrote:
> > > > > > >
> > > > > > > > Hello,
> > > > > > > >
> > > > > > > > I've recently been doing research into getting our Kafka cluster
> > > > > > > > running outside of Mesos (for a couple of reasons). However I'm
> > > > > > > > noticing about 10% of the time Kafka fails to start on boot (or more
> > > > > > > > accurately starts, and immediately exits). I find it weird since all
> > > > > > > > brokers are using the exact same configuration, on the same OS
> > > > > > > > (Ubuntu 16.04)
> > > > > > > >
> > > > > > > > There's nothing in my LOG4J directory, however I did find a singular
> > > > > > > > log line within $KAFKA_DIR/logs/kafkaServer.out that shed the actual
> > > > > > > > light as to why it's failing:
> > > > > > > >
> > > > > > > > ```
> > > > > > > > Error: Exception thrown by the agent : java.rmi.server.ExportException:
> > > > > > > > Port already in use: 9999; nested exception is:
> > > > > > > >         java.net.BindException: Address already in use (Bind failed)
> > > > > > > > ```
> > > > > > > >
> > > > > > > > However, I can verify nothing is running on this port right before
> > > > > > > > invocation using netstat -tulpn which shows:
> > > > > > > >
> > > > > > > > ```
> > > > > > > > upstart.sh[1127]: Active Internet connections (only servers)
> > > > > > > > upstart.sh[1127]: Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Pr
> > > > > > > > upstart.sh[1127]: tcp        0      0 127.0.0.1:17123         0.0.0.0:*               LISTEN      1419/p
> > > > > > > > upstart.sh[1127]: tcp        0      0 127.0.0.1:8400          0.0.0.0:*               LISTEN      1125/c
> > > > > > > > upstart.sh[1127]: tcp        0      0 127.0.0.1:8500          0.0.0.0:*               LISTEN      1125/c
> > > > > > > > upstart.sh[1127]: tcp        0      0 0.0.0.0:53              0.0.0.0:*               LISTEN      1215/d
> > > > > > > > upstart.sh[1127]: tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      1111/s
> > > > > > > > upstart.sh[1127]: tcp        0      0 127.0.0.1:8600          0.0.0.0:*               LISTEN      1125/c
> > > > > > > > upstart.sh[1127]: tcp        0      0 127.0.0.1:8126          0.0.0.0:*               LISTEN      1418/t
> > > > > > > > upstart.sh[1127]: tcp6       0      0 :::8301                 :::*                    LISTEN      1125/c
> > > > > > > > upstart.sh[1127]: tcp6       0      0 :::53                   :::*                    LISTEN      1215/d
> > > > > > > > upstart.sh[1127]: tcp6       0      0 :::22                   :::*                    LISTEN      1111/s
> > > > > > > > upstart.sh[1127]: udp        0      0 0.0.0.0:53              0.0.0.0:*                           1215/d
> > > > > > > > upstart.sh[1127]: udp        0      0 0.0.0.0:68              0.0.0.0:*                           973/dh
> > > > > > > > upstart.sh[1127]: udp        0      0 10.32.104.144:123       0.0.0.0:*                           1341/n
> > > > > > > > upstart.sh[1127]: udp        0      0 127.0.0.1:123           0.0.0.0:*                           1341/n
> > > > > > > > upstart.sh[1127]: udp        0      0 0.0.0.0:123             0.0.0.0:*                           1341/n
> > > > > > > > upstart.sh[1127]: udp        0      0 127.0.0.1:8600          0.0.0.0:*                           1125/c
> > > > > > > > upstart.sh[1127]: udp6       0      0 :::54933                :::*                                1441/j
> > > > > > > > upstart.sh[1127]: udp6       0      0 127.0.0.1:8125          :::*                                1420/p
> > > > > > > > upstart.sh[1127]: udp6       0      0 :::53                   :::*                                1215/d
> > > > > > > > upstart.sh[1127]: udp6       0      0 :::8301                 :::*                                1125/c
> > > > > > > > upstart.sh[1127]: udp6       0      0 fe80::898:21ff:fec0:123 :::*                                1341/n
> > > > > > > > upstart.sh[1127]: udp6       0      0 ::1:123                 :::*                                1341/n
> > > > > > > > upstart.sh[1127]: udp6       0      0 :::123                  :::*                                1341/n
> > > > > > > > ```
> > > > > > > >
> > > > > > > > I can also verify the network of the box itself is up, and working
> > > > > > > > as programs like the consul-agent do in fact spawn, and connect to
> > > > > > > > their clusters before kafka even gets invoked.
> > > > > > > >
> > > > > > > > For reference I'm using the built in `kafka-server-start.sh` script,
> > > > > > > > and invoking it like so (IPs cut out):
> > > > > > > >
> > > > > > > > ```
> > > > > > > > KAFKA_JMX_OPTS="-Dcom.sun.management.jmxremote=true \
> > > > > > > > -Dcom.sun.management.jmxremote.authenticate=false \
> > > > > > > > -Dcom.sun.management.jmxremote.ssl=false \
> > > > > > > > -Djava.rmi.server.hostname=kafka-i-0617a6aaa98f63c21.insops.net \
> > > > > > > > -Djava.net.preferIPv4Stack=true" JMX_PORT=9999 SCALA_VERSION=2.12.2 \
> > > > > > > > JAVA_HOME=/usr \
> > > > > > > > $KAFKA_INSTALL_PATH//bin/kafka-server-start.sh -daemon \
> > > > > > > > $KAFKA_INSTALL_PATH/config/server.properties --override \
> > > > > > > > zookeeper.connect="XX.XX.XX.XX:XX" --override \
> > > > > > > > broker.id="the-broker-test" --override \
> > > > > > > > listeners="SSL://$LOCAL_IPV4:9092" --override broker.rack="$AZ"
> > > > > > > > ```
> > > > > > > >
> > > > > > > > I'm not really sure where else to check for problems as it's only
> > > > > > > > happening on some boots, and only logging the one line mentioned
> > > > > > > > above.
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > >
> > > > > > > > --
> > > > > > > > *Eric Coan*
> > > > > > > > *E: ec...@instructure.com <ec...@instructure.com>*
> > > > > > > > *O:* *801.869.5000 <//801.869.5000>*
> > > > > > > > <http://instructure.com/>

--
*Eric Coan*
*E: ec...@instructure.com <ec...@instructure.com>*
*O:* *801.869.5000 <//801.869.5000>*
<http://instructure.com/>