Using the `zookeeper-shell.sh` script, I checked the `/brokers/ids` path;
it listed only the id of the unaffected broker.
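
For anyone who wants to script this check, here is a minimal Python sketch.
It assumes the output of `ls /brokers/ids` contains a line holding a
bracketed list such as `[1]`; the function names, the sample output and the
expected broker ids 1 to 3 are illustrative assumptions, not copied from the
cluster.

```python
import re


def parse_broker_ids(ls_output: str) -> set:
    """Return the broker ids found in the output of `ls /brokers/ids`.

    Assumption: the child list is printed on its own line as a bracketed,
    comma-separated list, e.g. "[1, 2, 3]". The last such line is used.
    """
    lists = re.findall(r"^\[([\d, ]*)\]\s*$", ls_output, flags=re.MULTILINE)
    if not lists:
        raise ValueError("no bracketed child list found in output")
    return {int(tok) for tok in lists[-1].split(",") if tok.strip()}


def missing_brokers(ls_output: str, expected) -> list:
    """Return the expected broker ids that are not registered, sorted."""
    return sorted(set(expected) - parse_broker_ids(ls_output))


# Hypothetical shell output: only one broker is still registered.
sample = "Connecting to localhost:2181\n\nWATCHER::\n\n[1]\n"
print(missing_brokers(sample, expected=[1, 2, 3]))
```

A broker that is running but reported as missing here has lost its
ephemeral znode and has not re-registered.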

On Fri, Nov 24, 2017 at 12:49 PM, Kamal <kamal.chandraprak...@gmail.com>
wrote:

> Hi Kafka Users,
>
>     In our production cluster, we faced the error below on 2 out of 3
> brokers. After this error, the ISRs were not updated and we were not able
> to create new topics, as the replication factor is higher than the number
> of available brokers.
>
> The session between Kafka and ZooKeeper expired. During reconnect, the
> error below occurred:
>
> *[2017-11-23 04:48:39,180] INFO Session establishment complete on server
> x.x.x.76/x.x.x.76:10056, sessionid = 0x25fe606b0dd0000, negotiated timeout
> = 20000 (org.apache.zookeeper.ClientCnxn)*
> *[2017-11-23 04:48:39,181] INFO zookeeper state changed (SyncConnected)
> (org.I0Itec.zkclient.ZkClient)*
> *[2017-11-23 04:48:39,183] INFO re-registering broker info in ZK for
> broker 3 (kafka.server.KafkaHealthcheck$SessionExpireListener)*
> *[2017-11-23 04:48:39,183] INFO Creating /brokers/ids/3 (is it secure?
> false) (kafka.utils.ZKCheckedEphemeral)*
> *[2017-11-23 04:48:39,186] INFO Result of znode creation is: NODEEXISTS
> (kafka.utils.ZKCheckedEphemeral)*
> *[2017-11-23 04:48:39,186] ERROR Error handling event ZkEvent[New session
> event sent to kafka.server.KafkaHealthcheck$SessionExpireListener@58b411d0]
> (org.I0Itec.zkclient.ZkEventThread)*
> *java.lang.RuntimeException: A broker is already registered on the path
> /brokers/ids/3. This probably indicates that you either have configured a
> brokerid that is already in use, or else you have shutdown this broker and
> restarted it faster than the zookeeper timeout so it appears to be
> re-registering.*
> *        at kafka.utils.ZkUtils.registerBrokerInZk(ZkUtils.scala:408)*
> *        at kafka.utils.ZkUtils.registerBrokerInZk(ZkUtils.scala:394)*
> *        at
> kafka.server.KafkaHealthcheck.register(KafkaHealthcheck.scala:71)*
> *        at
> kafka.server.KafkaHealthcheck$SessionExpireListener.handleNewSession(KafkaHealthcheck.scala:105)*
> *        at org.I0Itec.zkclient.ZkClient$6.run(ZkClient.java:736)*
> *        at org.I0Itec.zkclient.ZkEventThread.run(ZkEventThread.java:72)*
>
>
> After this exception, all ISR updates were skipped.
>
> *[2017-11-23 04:48:39,340] INFO New leader is 3
> (kafka.server.ZookeeperLeaderElector$LeaderChangeListener)*
> *[2017-11-23 04:49:00,008] INFO New leader is 2
> (kafka.server.ZookeeperLeaderElector$LeaderChangeListener)*
> *[2017-11-23 04:49:12,774] INFO Partition
> [CHANNEL_CLIENT_LISTENER_CHANGE,0] on broker 3: Cached zkVersion [2] not
> equal to that in zookeeper, skip updating ISR (kafka.cluster.Partition)*
>
>
> ZooKeeper is deployed in quorum mode (3 ZK servers). On one of the
> ZooKeeper servers, we saw the errors below (kafka-zk default session
> timeout: 20 s). The other two ZK servers seem fine.
>
> https://pastebin.com/9YQABiTL
>
> Finally, we restarted the two affected brokers. We are using Kafka version
> 0.10.2.1 and ZooKeeper version 3.4.9.
> Are these session errors fixed in the latest version (1.0.0)? What
> precautionary steps can we take to avoid these errors?
>
> Regards,
> Kamal C
>
