Hello,

Unfortunately, Kafka does start up and run for a little while before
crashing with the above exception, so a single check wouldn't work.
I could theoretically keep this script running forever, constantly
checking that Kafka is still up. However, that's a really hacky solution,
and I'd prefer not to do that if I don't have to.
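
For reference, a bounded version of that check could look roughly like the
sketch below, which gives up after a fixed number of attempts rather than
polling forever. The function names (`start_with_retries`, `stays_up`) and the
`CHECKS`/`DELAY` variables are made up for illustration, and it assumes the
start command returns once the process has been launched (as
`kafka-server-start.sh -daemon` does).

```shell
#!/bin/sh
# Sketch only: retry a start command a bounded number of times, treating the
# start as successful only if a health check keeps passing for a short window.
# All names below are hypothetical, not part of Kafka.

# stays_up CHECK_CMD CHECKS DELAY
# Sleeps DELAY seconds, then runs CHECK_CMD; repeats CHECKS times.
# Returns 1 as soon as one check fails, 0 if every check passes.
stays_up() {
  su_check=$1; su_checks=$2; su_delay=$3
  su_i=0
  while [ "$su_i" -lt "$su_checks" ]; do
    sleep "$su_delay"
    sh -c "$su_check" || return 1
    su_i=$((su_i + 1))
  done
  return 0
}

# start_with_retries START_CMD CHECK_CMD ATTEMPTS
# Runs START_CMD, then watches it with stays_up; retries up to ATTEMPTS times.
# CHECKS and DELAY (defaults 6 and 5 seconds) control the watch window.
start_with_retries() {
  sr_start=$1; sr_check=$2; sr_attempts=$3
  sr_n=1
  while [ "$sr_n" -le "$sr_attempts" ]; do
    sh -c "$sr_start"
    if stays_up "$sr_check" "${CHECKS:-6}" "${DELAY:-5}"; then
      return 0
    fi
    echo "attempt $sr_n of $sr_attempts: process did not stay up" >&2
    sr_n=$((sr_n + 1))
  done
  return 1
}
```

I'd call it with the same start command quoted further down and a check such
as `pgrep -f 'kafka[.]Kafka' >/dev/null`, which assumes the broker's main class
shows up in the process command line. It is still a workaround and doesn't
explain why the port is taken in the first place.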

On Wed, Jun 28, 2017 at 1:43 PM, M. Manna <manme...@gmail.com> wrote:

> Can you not put a service wrapper around startup? It will attempt a restart if
> the executable isn't up and running successfully.
>
> I am not familiar with the Unix side, but on Windows you can use PowerShell
> for this kind of thing. It's a better approach.
>
> Let me know what you think.
>
> On 28 Jun 2017 8:34 pm, "Eric Coan" <ec...@instructure.com> wrote:
>
> > I am using the same configuration for all brokers. However, each broker is
> > running on a completely separate host (I'm not running all three brokers on
> > the same host). I can get all three running if I manually start Kafka
> > again; it's just that occasionally on boot one fails to start with this
> > error.
> >
> > On Wed, Jun 28, 2017 at 1:25 PM, M. Manna <manme...@gmail.com> wrote:
> >
> > > Aren't you using the same JMX port 9999 for all brokers? I don't think it
> > > will work for more than one broker.
> > >
> > >
> > >
> > > On 28 Jun 2017 8:22 pm, "Eric Coan" <ec...@instructure.com> wrote:
> > >
> > > > Hey,
> > > >
> > > > No worries. Yes, I'm starting the brokers with a script (which ends up
> > > > generating the command I pasted):
> > > >
> > > > ```
> > > >
> > > > KAFKA_JMX_OPTS="-Dcom.sun.management.jmxremote=true
> > > > -Dcom.sun.management.jmxremote.authenticate=false
> > > > -Dcom.sun.management.jmxremote.ssl=false -Djava.rmi.server.hostname=$FQDN
> > > >  -Djava.net.preferIPv4Stack=true" JMX_PORT=9999 SCALA_VERSION=2.12.2
> > > > JAVA_HOME=/usr
> > > > $KAFKA_INSTALL_PATH//bin/kafka-server-start.sh -daemon
> > > > $KAFKA_INSTALL_PATH/config/server.properties --override
> > > > zookeeper.connect="XX.XX.XX.XX:XX" --override broker.id="$broker_id"
> > > > --override
> > > > listeners="SSL://$LOCAL_IPV4:9092" --override broker.rack="$AZ"
> > > > ```
> > > >
> > > > The script beforehand populates the variables such as the FQDN, the broker
> > > > ID, ZooKeeper IPs to connect to, Kafka install path, etc. The important
> > > > part of the command really is:
> > > >
> > > > ```
> > > > KAFKA_JMX_OPTS="..." JMX_PORT=9999 SCALA_VERSION=2.12.2 JAVA_HOME=/usr
> > > > $KAFKA_INSTALL_PATH/bin/kafka-server-start.sh -daemon ..
> > > > ```
> > > >
> > > > On Wed, Jun 28, 2017 at 1:08 PM, M. Manna <manme...@gmail.com> wrote:
> > > >
> > > > > Please forgive my autocorrect options :(
> > > > >
> > > > > On 28 Jun 2017 8:06 pm, "M. Manna" <manme...@gmail.com> wrote:
> > > > >
> > > > > Hi,
> > > > >
> > > > > > The OS is not the issue; I have a three-broker setup and I have experienced
> > > > > > this too.
> > > > > >
> > > > > > How are you starting the brokers? Is this a concurrent start, or have you
> > > > > > got some startup script to bring up all the brokers?
> > > > >
> > > > > KR,
> > > > >
> > > > > On 28 Jun 2017 6:47 pm, "Eric Coan" <ec...@instructure.com> wrote:
> > > > >
> > > > > > Hello,
> > > > > >
> > > > > > > I've recently been doing research into getting our Kafka cluster running
> > > > > > > outside of Mesos (for a couple of reasons). However, I'm noticing that about
> > > > > > > 10% of the time Kafka fails to start on boot (or, more accurately, starts
> > > > > > > and immediately exits). I find it weird since all brokers are using the
> > > > > > > exact same configuration, on the same OS (Ubuntu 16.04).
> > > > > > >
> > > > > > > There's nothing in my LOG4J directory; however, I did find a single log
> > > > > > > line within $KAFKA_DIR/logs/kafkaServer.out that sheds some light on
> > > > > > > why it's failing:
> > > > > >
> > > > > > ```
> > > > > > > Error: Exception thrown by the agent : java.rmi.server.ExportException: Port already in use: 9999; nested exception is:
> > > > > > >         java.net.BindException: Address already in use (Bind failed)
> > > > > > ```
> > > > > >
> > > > > > > However, I can verify nothing is running on this port right before
> > > > > > > invocation using netstat -tulpn, which shows:
> > > > > >
> > > > > > ```
> > > > > >  upstart.sh[1127]: Active Internet connections (only servers)
> > > > > >  upstart.sh[1127]: Proto Recv-Q Send-Q Local Address
> > >  Foreign
> > > > > > Address         State       PID/Pr
> > > > > >  upstart.sh[1127]: tcp        0      0 127.0.0.1:17123
> > > >  0.0.0.0:*
> > > > > >            LISTEN      1419/p
> > > > > >  upstart.sh[1127]: tcp        0      0 127.0.0.1:8400
> > > > 0.0.0.0:*
> > > > > >            LISTEN      1125/c
> > > > > >  upstart.sh[1127]: tcp        0      0 127.0.0.1:8500
> > > > 0.0.0.0:*
> > > > > >            LISTEN      1125/c
> > > > > >  upstart.sh[1127]: tcp        0      0 0.0.0.0:53
> > > > 0.0.0.0:*
> > > > > >            LISTEN      1215/d
> > > > > >  upstart.sh[1127]: tcp        0      0 0.0.0.0:22
> > > > 0.0.0.0:*
> > > > > >            LISTEN      1111/s
> > > > > >  upstart.sh[1127]: tcp        0      0 127.0.0.1:8600
> > > > 0.0.0.0:*
> > > > > >            LISTEN      1125/c
> > > > > >  upstart.sh[1127]: tcp        0      0 127.0.0.1:8126
> > > > 0.0.0.0:*
> > > > > >            LISTEN      1418/t
> > > > > >  upstart.sh[1127]: tcp6       0      0 :::8301
>  :::*
> > > > > >             LISTEN      1125/c
> > > > > >  upstart.sh[1127]: tcp6       0      0 :::53
>  :::*
> > > > > >             LISTEN      1215/d
> > > > > >  upstart.sh[1127]: tcp6       0      0 :::22
>  :::*
> > > > > >             LISTEN      1111/s
> > > > > >  upstart.sh[1127]: udp        0      0 0.0.0.0:53
> > > > 0.0.0.0:*
> > > > > >                        1215/d
> > > > > >  upstart.sh[1127]: udp        0      0 0.0.0.0:68
> > > > 0.0.0.0:*
> > > > > >                        973/dh
> > > > > >  upstart.sh[1127]: udp        0      0 10.32.104.144:123
> > > >  0.0.0.0:*
> > > > > >                        1341/n
> > > > > >  upstart.sh[1127]: udp        0      0 127.0.0.1:123
> > > >  0.0.0.0:*
> > > > > >                        1341/n
> > > > > >  upstart.sh[1127]: udp        0      0 0.0.0.0:123
> > > >  0.0.0.0:*
> > > > > >                        1341/n
> > > > > >  upstart.sh[1127]: udp        0      0 127.0.0.1:8600
> > > > 0.0.0.0:*
> > > > > >                        1125/c
> > > > > >  upstart.sh[1127]: udp6       0      0 :::54933
> :::*
> > > > > >                         1441/j
> > > > > >  upstart.sh[1127]: udp6       0      0 127.0.0.1:8125
> > :::*
> > > > > >                         1420/p
> > > > > >  upstart.sh[1127]: udp6       0      0 :::53
>  :::*
> > > > > >                         1215/d
> > > > > >  upstart.sh[1127]: udp6       0      0 :::8301
>  :::*
> > > > > >                         1125/c
> > > > > >  upstart.sh[1127]: udp6       0      0 fe80::898:21ff:fec0:123
> :::*
> > > > > >                         1341/n
> > > > > >  upstart.sh[1127]: udp6       0      0 ::1:123
>  :::*
> > > > > >                         1341/n
> > > > > >  upstart.sh[1127]: udp6       0      0 :::123
> :::*
> > > > > >                         1341/n
> > > > > > ```
> > > > > >
> > > > > > > I can also verify that the network of the box itself is up and working, as
> > > > > > > programs like the consul-agent do in fact spawn and connect to their
> > > > > > > clusters before Kafka even gets invoked.
> > > > > > >
> > > > > > > For reference, I'm using the built-in `kafka-server-start.sh` script and
> > > > > > > invoking it like so (IPs cut out):
> > > > > >
> > > > > > ```
> > > > > > KAFKA_JMX_OPTS="-Dcom.sun.management.jmxremote=true
> > > > > > -Dcom.sun.management.jmxremote.authenticate=false
> > > > > > > -Dcom.sun.management.jmxremote.ssl=false -Djava.rmi.server.hostname=kafka-i-0617a6aaa98f63c21.insops.net
> > > > > > > -Djava.net.preferIPv4Stack=true" JMX_PORT=9999 SCALA_VERSION=2.12.2
> > > > > > JAVA_HOME=/usr
> > > > > > $KAFKA_INSTALL_PATH//bin/kafka-server-start.sh -daemon
> > > > > > $KAFKA_INSTALL_PATH/config/server.properties --override
> > > > > > zookeeper.connect="XX.XX.XX.XX:XX" --override
> > > > > > broker.id="the-broker-test" --override
> > > > > > listeners="SSL://$LOCAL_IPV4:9092" --override broker.rack="$AZ"
> > > > > > ```
> > > > > >
> > > > > > > I'm not really sure where else to check for problems, as it's only happening
> > > > > > > on some boots and only logging the one line mentioned above.
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > >
> > > > > > --
> > > > > > *Eric Coan*
> > > > > > *E: ec...@instructure.com <ec...@instructure.com>*
> > > > > > *O:* *801.869.5000 <//801.869.5000>*
> > > > > > <http://instructure.com/>
> > > > > >
> > > > >
> > > >
> > > >
> > > >
> > > >
> > >
> >
> >
> >
> >
>



-- 
*Eric Coan*
*E: ec...@instructure.com <ec...@instructure.com>*
*O:* *801.869.5000 <//801.869.5000>*
<http://instructure.com/>
