There's no easy way to kick a running broker out of the cluster.

If you block that broker's ability to connect to ZooKeeper, though, you
might effectively get that once the configured session timeout
(zookeeper.session.timeout.ms, 6 seconds by default I think) expires.
You could do that with iptables rules on the ZK hosts, or on the brokers,
or whatever hook you have for that.
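
As a rough sketch of the iptables idea, here is a small Python helper that
only builds the commands you would run on each ZK host; it does not execute
anything. The function name, the broker IP, and the use of port 2181
(ZooKeeper's usual client port) are assumptions for illustration, so adjust
them to your setup.

```python
# Sketch only: builds iptables commands to cut one broker off from ZooKeeper.
# Assumptions (not from the thread): ZK listens on client port 2181, and the
# rules are applied on the ZK hosts, since the sick broker host may be too
# unresponsive to accept commands.

ZK_CLIENT_PORT = 2181  # assumed default ZooKeeper client port


def zk_block_rule(broker_ip, zk_port=ZK_CLIENT_PORT, delete=False):
    """Return the iptables argv for a rule dropping TCP traffic from
    broker_ip to the local ZooKeeper port.

    With delete=True the same rule spec is returned with -D, which removes
    the rule again once the broker's server has been restarted.
    """
    # -I inserts at the top of the INPUT chain, so the DROP is evaluated
    # before any later ACCEPT rules, including ones for established
    # connections.
    action = "-D" if delete else "-I"
    return [
        "iptables", action, "INPUT",
        "-p", "tcp",
        "-s", broker_ip,
        "--dport", str(zk_port),
        "-j", "DROP",
    ]


if __name__ == "__main__":
    broker = "10.0.0.12"  # hypothetical IP of the unresponsive broker
    print(" ".join(zk_block_rule(broker)))               # block
    print(" ".join(zk_block_rule(broker, delete=True)))  # unblock later
```

Once the broker's heartbeats stop reaching ZK, its session should expire
after the session timeout, and the rest of the cluster should then treat
the broker as gone. Remember to remove the rule afterwards.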


On Fri, Jun 8, 2018 at 10:52 AM, Enrique Medina Montenegro <
e.medin...@gmail.com> wrote:

> Hi Jacob,
>
> That could be a reason, but what about a kernel failure or any other
> cause? My question was not about the best environment to run in,
> but whether it would be possible to fail fast should this type of issue
> pop up.
>
> Regards.
>
>
> On June 8, 2018 7:43:11 PM Jacob Sheck <shec0...@gmail.com> wrote:
>
>> What do you mean by "The issue appears when one of the brokers starts
>> being impacted
>> by environmental issues within the server it's running into (for whatever
>> reason)"?
>>
>> You should consider Kafka to be a first-tier service; it shouldn't be
>> deployed on shared resources.  There are a lot of opinions about
>> containers, VMs, and bare metal, but regardless, your Kafka brokers should
>> be isolated so they don't become resource starved.
>>
>> On Fri, Jun 8, 2018 at 7:52 AM Enrique Medina Montenegro <
>> e.medin...@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> I was wondering if there is a proper way or best practice to make a
>>> broker fail fast when it's unresponsive (for example, when the server
>>> it's running on has issues). Let me describe the scenario I'm currently
>>> facing.
>>>
>>> This is a 4-broker cluster using Kafka 1.1 with 5 ZK nodes, everything
>>> running in containers (but the same could apply to VMs or even bare
>>> metal, I believe). The issue appears when one of the brokers starts being
>>> impacted by environmental issues within the server it's running on (for
>>> whatever reason), which makes it almost unresponsive, but still "alive
>>> enough" to stay in the cluster and be considered by the other brokers.
>>>
>>> So you cannot kill the broker (or the container) because the server it
>>> runs on basically times out all the commands, and your only choice is to
>>> restart or even stop the full server, but due to operational procedures,
>>> that may take some time.
>>>
>>>
>>> Therefore, is there any configuration that could be applied so that such
>>> a broker is "kicked out" of the cluster even while the broker itself is
>>> still trying to stay "alive"?
>>>
>>> The final consequence is that my cluster is literally down until I manage
>>> to have the server restarted.
>>>
>>> Thanks for the support.
>>>
>>
>
>
>


-- 

Brett Rann

Senior DevOps Engineer


Zendesk International Ltd

395 Collins Street, Melbourne VIC 3000 Australia

Mobile: +61 (0) 418 826 017
