Hi Jacob,
That could be one reason, but what about a kernel failure or some other cause? My question was not about determining the best environment to run on, but whether it would be possible to fail fast should this type of issue pop up.
Regards.
On June 8, 2018 7:43:11 PM Jacob Sheck <shec0...@gmail.com> wrote:
What do you mean by "The issue appears when one of the brokers starts being impacted by environmental issues within the server it's running on (for whatever reason)"?
You should consider Kafka a first-tier service; it shouldn't be deployed on shared resources. There are a lot of opinions about containers, VMs, and bare metal, but regardless, your Kafka brokers should be isolated so they don't become resource-starved.
On Fri, Jun 8, 2018 at 7:52 AM Enrique Medina Montenegro <
e.medin...@gmail.com> wrote:
Hi,
I was wondering if there is a proper way, or best practice, to fail a broker fast when it's unresponsive (think of the server it's running on having issues). Let me describe the scenario I'm currently facing.
This is a 4-broker cluster running Kafka 1.1 with 5 ZK nodes, everything running on containers (but I believe the same could apply to VMs or even bare metal). The issue appears when one of the brokers starts being impacted by environmental issues within the server it's running on (for whatever reason), which makes it almost unresponsive, yet still "alive enough" to stay in the cluster and be considered by the other brokers.
So you cannot kill the broker (or the container) because the server it runs on basically times out all commands, and your only choice is to restart or even stop the whole server, but due to operational procedures, that may take some time.
Therefore, is there any configuration that could be applied so that such a broker is "kicked out" of the cluster, even when the broker itself still claims to be "alive"?
The final consequence is that my cluster is literally down until I manage
to have the server restarted.
Thanks for the support.