Could you put a service wrapper around the startup? It would attempt a restart
if the executable isn't up and running successfully.

I am not familiar with the Unix side, but on Windows you can use a PowerShell
script to set up such a wrapper. I think it's a better approach than relying
on a single start attempt at boot.
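
For the Unix side, the same idea might look roughly like the sketch below. I
have not run it against your setup: `retry_start` is a made-up helper name, and
it assumes Kafka is started in the foreground (without `-daemon`), so that a
non-zero exit status actually reaches the wrapper.

```shell
#!/bin/sh
# Hypothetical helper (not part of Kafka): run a command and retry it
# if it exits with a non-zero status.
# Usage: retry_start MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
retry_start() {
    max_attempts=$1
    delay=$2
    shift 2
    attempt=1
    while [ "$attempt" -le "$max_attempts" ]; do
        # Run in the foreground so the exit status reflects the real outcome.
        if "$@"; then
            return 0
        fi
        echo "attempt $attempt of $max_attempts failed" >&2
        attempt=$((attempt + 1))
        # Only wait if another attempt remains.
        if [ "$attempt" -le "$max_attempts" ]; then
            sleep "$delay"
        fi
    done
    return 1
}
```

For example, `retry_start 5 10 "$KAFKA_INSTALL_PATH/bin/kafka-server-start.sh"
"$KAFKA_INSTALL_PATH/config/server.properties"` would try up to five times,
ten seconds apart, and give up with status 1 after the last failure.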

Let me know what you think.

On 28 Jun 2017 8:34 pm, "Eric Coan" <ec...@instructure.com> wrote:

> I am using the same configuration for all brokers. However, each broker is
> running on a completely separate host (I'm not running all three brokers on
> the same host). I can get all three running if I manually start Kafka
> again; it's just that occasionally, on boot, one fails to start with this
> error.
>
> On Wed, Jun 28, 2017 at 1:25 PM, M. Manna <manme...@gmail.com> wrote:
>
> > Aren't you using the same JMX port 9999 for all brokers? I don't think it
> > will work for more than 1 broker.
> >
> >
> >
> > On 28 Jun 2017 8:22 pm, "Eric Coan" <ec...@instructure.com> wrote:
> >
> > > Hey,
> > >
> > > No worries. I'm starting the brokers with a script, yes (one that ends up
> > > generating the command I pasted):
> > >
> > > ```
> > >
> > > KAFKA_JMX_OPTS="-Dcom.sun.management.jmxremote=true
> > > -Dcom.sun.management.jmxremote.authenticate=false
> > > -Dcom.sun.management.jmxremote.ssl=false -Djava.rmi.server.hostname=$FQDN
> > > -Djava.net.preferIPv4Stack=true" JMX_PORT=9999 SCALA_VERSION=2.12.2
> > > JAVA_HOME=/usr
> > > $KAFKA_INSTALL_PATH/bin/kafka-server-start.sh -daemon
> > > $KAFKA_INSTALL_PATH/config/server.properties --override
> > > zookeeper.connect="XX.XX.XX.XX:XX" --override broker.id="$broker_id"
> > > --override
> > > listeners="SSL://$LOCAL_IPV4:9092" --override broker.rack="$AZ"
> > > ```
> > >
> > > The script beforehand populates the variables such as the FQDN, the
> > > broker ID, ZooKeeper IPs to connect to, Kafka install path, etc. The
> > > important part of the command really is:
> > >
> > > ```
> > > KAFKA_JMX_OPTS="..." JMX_PORT=9999 SCALA_VERSION=2.12.2 JAVA_HOME=/usr
> > > $KAFKA_INSTALL_PATH/bin/kafka-server-start.sh -daemon ..
> > > ```
> > >
> > > On Wed, Jun 28, 2017 at 1:08 PM, M. Manna <manme...@gmail.com> wrote:
> > >
> > > > Please forgive my autocorrect options :(
> > > >
> > > > On 28 Jun 2017 8:06 pm, "M. Manna" <manme...@gmail.com> wrote:
> > > >
> > > > Hi,
> > > >
> > > > OS is not an issue; I have a 3-broker setup and I have experienced
> > > > this too.
> > > >
> > > > How are you starting the brokers? Is this a concurrent start, or have
> > > > you got some startup script to bring up all the brokers?
> > > >
> > > > KR,
> > > >
> > > > On 28 Jun 2017 6:47 pm, "Eric Coan" <ec...@instructure.com> wrote:
> > > >
> > > > > Hello,
> > > > >
> > > > > I've recently been doing research into getting our Kafka cluster
> > > > > running outside of Mesos (for a couple of reasons). However, I'm
> > > > > noticing that about 10% of the time Kafka fails to start on boot (or,
> > > > > more accurately, starts and immediately exits). I find it weird since
> > > > > all brokers are using the exact same configuration, on the same OS
> > > > > (Ubuntu 16.04).
> > > > >
> > > > > There's nothing in my LOG4J directory; however, I did find a single
> > > > > log line within $KAFKA_DIR/logs/kafkaServer.out that sheds some light
> > > > > on why it's failing:
> > > > >
> > > > > ```
> > > > > Error: Exception thrown by the agent : java.rmi.server.ExportException:
> > > > > Port already in use: 9999; nested exception is:
> > > > >         java.net.BindException: Address already in use (Bind failed)
> > > > > ```
> > > > >
> > > > > However, I can verify nothing is running on this port right before
> > > > > invocation using netstat -tulpn which shows:
> > > > >
> > > > > ```
> > > > >  upstart.sh[1127]: Active Internet connections (only servers)
> > > > >  upstart.sh[1127]: Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Pr
> > > > >  upstart.sh[1127]: tcp        0      0 127.0.0.1:17123         0.0.0.0:*               LISTEN      1419/p
> > > > >  upstart.sh[1127]: tcp        0      0 127.0.0.1:8400          0.0.0.0:*               LISTEN      1125/c
> > > > >  upstart.sh[1127]: tcp        0      0 127.0.0.1:8500          0.0.0.0:*               LISTEN      1125/c
> > > > >  upstart.sh[1127]: tcp        0      0 0.0.0.0:53              0.0.0.0:*               LISTEN      1215/d
> > > > >  upstart.sh[1127]: tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      1111/s
> > > > >  upstart.sh[1127]: tcp        0      0 127.0.0.1:8600          0.0.0.0:*               LISTEN      1125/c
> > > > >  upstart.sh[1127]: tcp        0      0 127.0.0.1:8126          0.0.0.0:*               LISTEN      1418/t
> > > > >  upstart.sh[1127]: tcp6       0      0 :::8301                 :::*                    LISTEN      1125/c
> > > > >  upstart.sh[1127]: tcp6       0      0 :::53                   :::*                    LISTEN      1215/d
> > > > >  upstart.sh[1127]: tcp6       0      0 :::22                   :::*                    LISTEN      1111/s
> > > > >  upstart.sh[1127]: udp        0      0 0.0.0.0:53              0.0.0.0:*                           1215/d
> > > > >  upstart.sh[1127]: udp        0      0 0.0.0.0:68              0.0.0.0:*                           973/dh
> > > > >  upstart.sh[1127]: udp        0      0 10.32.104.144:123       0.0.0.0:*                           1341/n
> > > > >  upstart.sh[1127]: udp        0      0 127.0.0.1:123           0.0.0.0:*                           1341/n
> > > > >  upstart.sh[1127]: udp        0      0 0.0.0.0:123             0.0.0.0:*                           1341/n
> > > > >  upstart.sh[1127]: udp        0      0 127.0.0.1:8600          0.0.0.0:*                           1125/c
> > > > >  upstart.sh[1127]: udp6       0      0 :::54933                :::*                                1441/j
> > > > >  upstart.sh[1127]: udp6       0      0 127.0.0.1:8125          :::*                                1420/p
> > > > >  upstart.sh[1127]: udp6       0      0 :::53                   :::*                                1215/d
> > > > >  upstart.sh[1127]: udp6       0      0 :::8301                 :::*                                1125/c
> > > > >  upstart.sh[1127]: udp6       0      0 fe80::898:21ff:fec0:123 :::*                                1341/n
> > > > >  upstart.sh[1127]: udp6       0      0 ::1:123                 :::*                                1341/n
> > > > >  upstart.sh[1127]: udp6       0      0 :::123                  :::*                                1341/n
> > > > > ```
> > > > >
> > > > > I can also verify the network of the box itself is up and working, as
> > > > > programs like the consul-agent do in fact spawn and connect to their
> > > > > clusters before Kafka even gets invoked.
> > > > >
> > > > > For reference, I'm using the built-in `kafka-server-start.sh` script
> > > > > and invoking it like so (IPs cut out):
> > > > >
> > > > > ```
> > > > > KAFKA_JMX_OPTS="-Dcom.sun.management.jmxremote=true
> > > > > -Dcom.sun.management.jmxremote.authenticate=false
> > > > > -Dcom.sun.management.jmxremote.ssl=false
> > > > > -Djava.rmi.server.hostname=kafka-i-0617a6aaa98f63c21.insops.net
> > > > > -Djava.net.preferIPv4Stack=true" JMX_PORT=9999 SCALA_VERSION=2.12.2
> > > > > JAVA_HOME=/usr
> > > > > $KAFKA_INSTALL_PATH/bin/kafka-server-start.sh -daemon
> > > > > $KAFKA_INSTALL_PATH/config/server.properties --override
> > > > > zookeeper.connect="XX.XX.XX.XX:XX" --override
> > > > > broker.id="the-broker-test" --override
> > > > > listeners="SSL://$LOCAL_IPV4:9092" --override broker.rack="$AZ"
> > > > > ```
> > > > >
> > > > > I'm not really sure where else to check for problems, as it's only
> > > > > happening on some boots, and only logging the one line mentioned above.
> > > > >
> > > > > Thanks,
> > > > >
> > > > >
> > > > > --
> > > > > *Eric Coan*
> > > > > *E: ec...@instructure.com <ec...@instructure.com>*
> > > > > *O:* *801.869.5000 <//801.869.5000>*
> > > > > <http://instructure.com/>
> > > > >
> > > >
> > >
> > >
> > >
> > >
> >
>
>
>
>
