Hello all,

I have two Kafka clusters, test and prod. Each has 3 nodes; the two clusters
are independent and run their own ZooKeeper ensembles. My prod cluster is
running fine. My test cluster is half-broken and I'm struggling to fix it. I
could wipe it, but I'd prefer to understand what's wrong and fix it.

I'm not sure what broke my test cluster. I had several network disconnections /
split-brains, but Kafka always recovered fine. The causes of the network
issues are a separate matter and still being investigated (layer-2 storms, etc).

I then upgraded ZooKeeper and Kafka to the latest versions, and while trying to
rebalance a topic across brokers I started to notice problems. I'm not sure
whether they started before or after the upgrade.

I ran the upgrade as per the official docs (rolling upgrade, moving up
inter.broker.protocol.version and log.message.format.version gradually).
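
A minimal sketch of how the two settings could be read back from a broker's
config, to confirm where the rolling upgrade left them on each node. The
function name show_upgrade_versions and the path in the example comment are
hypothetical; the path is only guessed from the /appl/kafka/bin directory
shown further down.

```shell
#!/bin/sh
# Print the two version settings that a rolling upgrade moves up step by step.
# Commented-out lines in the properties file are ignored.
show_upgrade_versions() {
    props_file="$1"
    grep -E '^(inter\.broker\.protocol\.version|log\.message\.format\.version)=' "$props_file"
}

# Example (hypothetical path, adjust to the actual broker config):
# show_upgrade_versions /appl/kafka/config/server.properties
```

Running it on each broker and comparing the output would show whether any
node was left behind on an older value.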

------------------------------------------------------
cat /etc/os-release 
NAME="SLES"
VERSION="12-SP4"
VERSION_ID="12.4"
PRETTY_NAME="SUSE Linux Enterprise Server 12 SP4"
ID="sles"
ANSI_COLOR="0;32"
CPE_NAME="cpe:/o:suse:sles:12:sp4"

------------------------------------------------------
Versions:
- zookeeper 3.5.6
- kafka 2.12-2.3.1 (Kafka 2.3.1, Scala 2.12 build)

------------------------------------------------------

Brokers 1 and 2 (we have brokers 1, 2 and 3) log many of these entries in
rapid succession:

[2019-12-05 09:56:54,967] ERROR [KafkaApi-1] Number of alive brokers '0' does not meet the required replication factor '1' for the offsets topic (configured via 'offsets.topic.replication.factor'). This error can be ignored if the cluster is starting up and not all brokers are up yet. (kafka.server.KafkaApis)

------------------------------------------------------
Java is java-1_8_0-openjdk-1.8.0.222-27.35.2.x86_64 from SLES. I have also
tried Oracle Java jdk1.8.0_231, with the same result.

------------------------------------------------------
When trying to verify a reassignment I get this very suspicious error:

root@vmgato701a01:/appl/kafka/bin # ./kafka-reassign-partitions.sh --zookeeper $ZKLIST --reassignment-json-file /tmp/r7.json --verify

Status of partition reassignment:
Partitions reassignment failed due to com.fasterxml.jackson.databind.ext.Java7Support.getDeserializerForJavaNioFilePath(Ljava/lang/Class;)Lcom/fasterxml/jackson/databind/JsonDeserializer;
java.lang.AbstractMethodError: com.fasterxml.jackson.databind.ext.Java7Support.getDeserializerForJavaNioFilePath(Ljava/lang/Class;)Lcom/fasterxml/jackson/databind/JsonDeserializer;
        at com.fasterxml.jackson.databind.ext.OptionalHandlerFactory.findDeserializer(OptionalHandlerFactory.java:122)
        at com.fasterxml.jackson.databind.deser.BasicDeserializerFactory.findOptionalStdDeserializer(BasicDeserializerFactory.java:1589)
        at com.fasterxml.jackson.databind.deser.BasicDeserializerFactory.findDefaultDeserializer(BasicDeserializerFactory.java:1812)
        at com.fasterxml.jackson.databind.deser.BeanDeserializerFactory.findStdDeserializer(BeanDeserializerFactory.java:161)
        at com.fasterxml.jackson.databind.deser.BeanDeserializerFactory.createBeanDeserializer(BeanDeserializerFactory.java:125)
        at com.fasterxml.jackson.databind.deser.DeserializerCache._createDeserializer2(DeserializerCache.java:411)
        at com.fasterxml.jackson.databind.deser.DeserializerCache._createDeserializer(DeserializerCache.java:349)
        at com.fasterxml.jackson.databind.deser.DeserializerCache._createAndCache2(DeserializerCache.java:264)
        at com.fasterxml.jackson.databind.deser.DeserializerCache._createAndCacheValueDeserializer(DeserializerCache.java:244)
        at com.fasterxml.jackson.databind.deser.DeserializerCache.findValueDeserializer(DeserializerCache.java:142)
        at com.fasterxml.jackson.databind.DeserializationContext.findRootValueDeserializer(DeserializationContext.java:477)
        at com.fasterxml.jackson.databind.ObjectMapper._findRootDeserializer(ObjectMapper.java:4178)
        at com.fasterxml.jackson.databind.ObjectMapper._readMapAndClose(ObjectMapper.java:3997)
        at com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:3079)
        at kafka.utils.Json$.parseBytesAs(Json.scala:73)
        at kafka.zk.ReassignPartitionsZNode$.decode(ZkData.scala:407)
        at kafka.zk.KafkaZkClient.getPartitionReassignment(KafkaZkClient.scala:795)
        at kafka.admin.ReassignPartitionsCommand$.checkIfPartitionReassignmentSucceeded(ReassignPartitionsCommand.scala:355)
        at kafka.admin.ReassignPartitionsCommand$.verifyAssignment(ReassignPartitionsCommand.scala:97)
        at kafka.admin.ReassignPartitionsCommand$.verifyAssignment(ReassignPartitionsCommand.scala:90)
        at kafka.admin.ReassignPartitionsCommand$.main(ReassignPartitionsCommand.scala:61)
        at kafka.admin.ReassignPartitionsCommand.main(ReassignPartitionsCommand.scala)
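
An AbstractMethodError generally means a class compiled against one version
of a library is running against an incompatible version of it, so one thing
that may be worth ruling out here is more than one Jackson version sitting in
the Kafka libs directory after the upgrade. A minimal sketch of such a check
follows; the function name find_duplicate_jackson and the libs path in the
example comment are hypothetical, and this is only a possible cause, not a
confirmed diagnosis.

```shell
#!/bin/sh
# List Jackson artifacts that appear in more than one version in a directory.
# Strips the trailing "-<version>.jar" from each jar name, then prints every
# artifact name that occurs more than once.
find_duplicate_jackson() {
    libs_dir="$1"
    for jar in "$libs_dir"/jackson-*.jar; do
        [ -e "$jar" ] || continue   # skip the unexpanded glob if nothing matches
        basename "$jar" | sed -E 's/-[0-9][0-9A-Za-z.]*\.jar$//'
    done | sort | uniq -d
}

# Example (hypothetical path, adjust to the actual install):
# find_duplicate_jackson /appl/kafka/libs
```

Empty output would mean every Jackson artifact is present in a single
version; any name printed is present at least twice.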

Help appreciated.
