Hello all,
I have two Kafka clusters in operation, test and prod. Each consists of 3
nodes, is independent, and runs its own ZooKeeper ensemble. My prod cluster
is running fine. My test cluster is half-broken and I'm struggling to fix it. I
could wipe it, but I would prefer to understand what's wrong and fix it.
I'm not sure what broke my test cluster. I had several network disconnections /
split-brains, but Kafka always recovered fine. The causes of the network
issues are unrelated and still being investigated (layer-2 storms, etc.).
I then upgraded ZooKeeper and Kafka to the latest versions, and when trying to
rebalance a topic across brokers I started to notice the problems. I'm not sure
when they really started, before or after the upgrade.
I ran the upgrade as per the official documentation (rolling upgrade, bumping
inter.broker.protocol.version and log.message.format.version gradually).
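To double-check what each broker ended up with after the last bump, a small
helper along these lines could be run on every node. This is only a sketch:
show_version_settings is a hypothetical helper name, and the default path
/appl/kafka/config/server.properties is an assumption based on the install
prefix, not something confirmed above.

```shell
#!/bin/sh
# Print the two version-related settings from a broker properties file.
# ASSUMPTION: the default path is guessed from the /appl/kafka install prefix.
show_version_settings() {
    props="${1:-/appl/kafka/config/server.properties}"
    # Only match active (uncommented) settings at the start of a line.
    grep -E '^(inter\.broker\.protocol\.version|log\.message\.format\.version)=' "$props"
}
```

Calling show_version_settings with no argument reads the assumed default file;
passing a path checks another file. All three brokers should report the same
values once the rolling upgrade is complete.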
------------------------------------------------------
cat /etc/os-release
NAME="SLES"
VERSION="12-SP4"
VERSION_ID="12.4"
PRETTY_NAME="SUSE Linux Enterprise Server 12 SP4"
ID="sles"
ANSI_COLOR="0;32"
CPE_NAME="cpe:/o:suse:sles:12:sp4"
------------------------------------------------------
Versions:
- ZooKeeper 3.5.6
- Kafka 2.12-2.3.1 (Kafka 2.3.1 built for Scala 2.12)
------------------------------------------------------
Brokers 1 and 2 (of brokers 1, 2 and 3) log this entry repeatedly, in rapid succession:
[2019-12-05 09:56:54,967] ERROR [KafkaApi-1] Number of alive brokers '0' does not meet the required replication factor '1' for the offsets topic (configured via 'offsets.topic.replication.factor'). This error can be ignored if the cluster is starting up and not all brokers are up yet. (kafka.server.KafkaApis)
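Since the message reports 0 alive brokers, one thing worth comparing is the
list of broker ids currently registered under /brokers/ids in ZooKeeper. A
minimal sketch, assuming zookeeper-shell.sh sits next to the other Kafka
scripts in /appl/kafka/bin and that $ZKLIST is the same connect string used in
the reassignment command; parse_broker_ids and list_registered_brokers are
hypothetical helper names:

```shell
#!/bin/sh
# Extract the broker id list from zookeeper-shell output read on stdin.
# The CLI prints connection chatter first; the ids arrive as a line
# such as "[1, 2, 3]". The output is the ids separated by spaces.
parse_broker_ids() {
    grep -E '^\[[0-9, ]*\]$' | tail -n 1 | sed -e 's/[][ ]//g' -e 's/,/ /g'
}

# Ask ZooKeeper which brokers are currently registered.
# ASSUMPTION: script location and $ZKLIST as described in the lead-in.
list_registered_brokers() {
    /appl/kafka/bin/zookeeper-shell.sh "$ZKLIST" ls /brokers/ids 2>/dev/null | parse_broker_ids
}
```

Running list_registered_brokers on a healthy 3-node cluster would be expected
to print all three ids; an empty or partial list would point at broker
registration rather than at the offsets topic itself.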
------------------------------------------------------
Java is java-1_8_0-openjdk-1.8.0.222-27.35.2.x86_64 from SLES. I have also
tried Oracle Java jdk1.8.0_231, with the same result.
------------------------------------------------------
When trying to verify a reassignment, I get this very suspicious error:
root@vmgato701a01:/appl/kafka/bin # ./kafka-reassign-partitions.sh --zookeeper $ZKLIST --reassignment-json-file /tmp/r7.json --verify
Status of partition reassignment:
Partitions reassignment failed due to com.fasterxml.jackson.databind.ext.Java7Support.getDeserializerForJavaNioFilePath(Ljava/lang/Class;)Lcom/fasterxml/jackson/databind/JsonDeserializer;
java.lang.AbstractMethodError: com.fasterxml.jackson.databind.ext.Java7Support.getDeserializerForJavaNioFilePath(Ljava/lang/Class;)Lcom/fasterxml/jackson/databind/JsonDeserializer;
    at com.fasterxml.jackson.databind.ext.OptionalHandlerFactory.findDeserializer(OptionalHandlerFactory.java:122)
    at com.fasterxml.jackson.databind.deser.BasicDeserializerFactory.findOptionalStdDeserializer(BasicDeserializerFactory.java:1589)
    at com.fasterxml.jackson.databind.deser.BasicDeserializerFactory.findDefaultDeserializer(BasicDeserializerFactory.java:1812)
    at com.fasterxml.jackson.databind.deser.BeanDeserializerFactory.findStdDeserializer(BeanDeserializerFactory.java:161)
    at com.fasterxml.jackson.databind.deser.BeanDeserializerFactory.createBeanDeserializer(BeanDeserializerFactory.java:125)
    at com.fasterxml.jackson.databind.deser.DeserializerCache._createDeserializer2(DeserializerCache.java:411)
    at com.fasterxml.jackson.databind.deser.DeserializerCache._createDeserializer(DeserializerCache.java:349)
    at com.fasterxml.jackson.databind.deser.DeserializerCache._createAndCache2(DeserializerCache.java:264)
    at com.fasterxml.jackson.databind.deser.DeserializerCache._createAndCacheValueDeserializer(DeserializerCache.java:244)
    at com.fasterxml.jackson.databind.deser.DeserializerCache.findValueDeserializer(DeserializerCache.java:142)
    at com.fasterxml.jackson.databind.DeserializationContext.findRootValueDeserializer(DeserializationContext.java:477)
    at com.fasterxml.jackson.databind.ObjectMapper._findRootDeserializer(ObjectMapper.java:4178)
    at com.fasterxml.jackson.databind.ObjectMapper._readMapAndClose(ObjectMapper.java:3997)
    at com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:3079)
    at kafka.utils.Json$.parseBytesAs(Json.scala:73)
    at kafka.zk.ReassignPartitionsZNode$.decode(ZkData.scala:407)
    at kafka.zk.KafkaZkClient.getPartitionReassignment(KafkaZkClient.scala:795)
    at kafka.admin.ReassignPartitionsCommand$.checkIfPartitionReassignmentSucceeded(ReassignPartitionsCommand.scala:355)
    at kafka.admin.ReassignPartitionsCommand$.verifyAssignment(ReassignPartitionsCommand.scala:97)
    at kafka.admin.ReassignPartitionsCommand$.verifyAssignment(ReassignPartitionsCommand.scala:90)
    at kafka.admin.ReassignPartitionsCommand$.main(ReassignPartitionsCommand.scala:61)
    at kafka.admin.ReassignPartitionsCommand.main(ReassignPartitionsCommand.scala)
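An AbstractMethodError usually means that a class was compiled against a
different version of a library than the one loaded at runtime, so one thing
that could be checked is whether the upgrade left several versions of the same
jar (for example jackson-databind) on the classpath. A minimal sketch, assuming
the jars live in /appl/kafka/libs (a guess based on the install prefix) and
follow the usual <artifact>-<version>.jar naming; find_duplicate_jars is a
hypothetical helper name:

```shell
#!/bin/sh
# List artifacts that appear under more than one version in a jar directory.
# ASSUMPTION: default directory guessed from the /appl/kafka install prefix,
# and jar names follow the "<artifact>-<version>.jar" pattern.
find_duplicate_jars() {
    libdir="${1:-/appl/kafka/libs}"
    ls "$libdir" \
        | grep -E '\.jar$' \
        | sed -E 's/-[0-9][0-9A-Za-z.]*\.jar$//' \
        | sort | uniq -d
}
```

Any artifact name printed by find_duplicate_jars is present in at least two
versions and would be a candidate for cleanup; no output means no duplicates
were found under that naming assumption.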
Help appreciated.