Hello again,

The problem is still present: something is badly broken in the handling of JSON files. Details:

I have done multiple downgrades and rolling upgrades, and the situation is now:

* Zookeeper 3.4.14
* Kafka 2.12-2.3.1 (contains libs/zookeeper-3.4.14.jar)

The cluster is all "happy": balanced, no skew, etc. I have kafka-manager and everything looks OK there.

I then create a "testing" topic:

    export ZKLIST="srv1:2181,srv2:2181,srv3:2181"
    /appl/kafka/bin/kafka-topics.sh --zookeeper $ZKLIST --create --replication-factor 3 --partitions 1 --topic testing

I can raise the number of partitions fine:

    /appl/kafka/bin/kafka-topics.sh --zookeeper $ZKLIST --alter --topic testing --partitions 3

It then looks OK:

    /appl/kafka/bin/kafka-topics.sh --zookeeper $ZKLIST --describe --topic testing
    Topic:testing  PartitionCount:3  ReplicationFactor:3  Configs:
        Topic: testing  Partition: 0  Leader: 1  Replicas: 1,2,3  Isr: 1,2,3
        Topic: testing  Partition: 1  Leader: 2  Replicas: 2,3,1  Isr: 2,3,1
        Topic: testing  Partition: 2  Leader: 3  Replicas: 3,1,2  Isr: 3,1,2

Any operation involving a JSON file now fails, although the same operations worked fine on Kafka 2.12-2.1.0. I should point out that my JSON files pass lint tests and work OK.

Help needed!

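A possible sanity check, offered as a sketch rather than a confirmed diagnosis: an AbstractMethodError like the one in the trace below generally means a class was loaded from a different library version than the one its caller was compiled against. After several downgrades and rolling upgrades, it may be worth checking whether the Kafka libs directory still holds more than one version of the same jar. The function name `find_duplicate_jars` and the `KAFKA_LIBS` variable are hypothetical, and the path `/appl/kafka/libs` is assumed from the `/appl/kafka/bin` prefix used above.

```shell
#!/bin/sh
# Print artifact names that appear with more than one version in a directory.
# Assumption: jars are named <artifact>-<version>[-classifier].jar, where the
# version starts with a digit (e.g. jackson-databind-2.10.0.jar). This is a
# naming heuristic, so unusual versions (e.g. 1.4.0-1) may go undetected.
find_duplicate_jars() {
    libdir=$1
    for f in "$libdir"/*.jar; do
        [ -e "$f" ] || continue      # skip the literal pattern if nothing matches
        basename "$f"
    done \
        | sed -E 's/-[0-9][0-9A-Za-z.]*(-[^0-9].*)?\.jar$/\1/' \
        | sort \
        | uniq -d                    # keep only names seen more than once
}

# KAFKA_LIBS is a hypothetical override; the default path is an assumption.
find_duplicate_jars "${KAFKA_LIBS:-/appl/kafka/libs}"
```

If this prints a name such as one of the jackson artifacts, listing the matching files in that directory shows which versions are present side by side. An empty result only means this heuristic found nothing; it does not rule out a classpath problem elsewhere.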
Thanks,
Charles

    /appl/kafka/bin/kafka-reassign-partitions.sh --zookeeper $ZKLIST --reassignment-json-file ~buec/tmp/r7.json --execute
    Current partition replica assignment

    {"version":1,"partitions":[{"topic":"testing","partition":0,"replicas":[1,2,3],"log_dirs":["any","any","any"]},{"topic":"testing","partition":2,"replicas":[3,1,2],"log_dirs":["any","any","any"]},{"topic":"testing","partition":1,"replicas":[2,3,1],"log_dirs":["any","any","any"]}]}

    Save this to use as the --reassignment-json-file option during rollback
    Partitions reassignment failed due to com.fasterxml.jackson.databind.ext.Java7Support.getSerializerForJavaNioFilePath(Ljava/lang/Class;)Lcom/fasterxml/jackson/databind/JsonSerializer;
    java.lang.AbstractMethodError: com.fasterxml.jackson.databind.ext.Java7Support.getSerializerForJavaNioFilePath(Ljava/lang/Class;)Lcom/fasterxml/jackson/databind/JsonSerializer;
        at com.fasterxml.jackson.databind.ext.OptionalHandlerFactory.findSerializer(OptionalHandlerFactory.java:92)
        at com.fasterxml.jackson.databind.ser.BasicSerializerFactory.findOptionalStdSerializer(BasicSerializerFactory.java:439)
        at com.fasterxml.jackson.databind.ser.BasicSerializerFactory.findSerializerByPrimaryType(BasicSerializerFactory.java:374)
        at com.fasterxml.jackson.databind.ser.BeanSerializerFactory._createSerializer2(BeanSerializerFactory.java:226)
        at com.fasterxml.jackson.databind.ser.BeanSerializerFactory.createSerializer(BeanSerializerFactory.java:165)
        at com.fasterxml.jackson.databind.SerializerProvider._createUntypedSerializer(SerializerProvider.java:1385)
        at com.fasterxml.jackson.databind.SerializerProvider._createAndCacheUntypedSerializer(SerializerProvider.java:1336)
        at com.fasterxml.jackson.databind.SerializerProvider.findValueSerializer(SerializerProvider.java:510)
        at com.fasterxml.jackson.databind.SerializerProvider.findTypedValueSerializer(SerializerProvider.java:713)
        at com.fasterxml.jackson.databind.ser.DefaultSerializerProvider.serializeValue(DefaultSerializerProvider.java:308)
        at com.fasterxml.jackson.databind.ObjectMapper._configAndWriteValue(ObjectMapper.java:3893)
        at com.fasterxml.jackson.databind.ObjectMapper.writeValueAsBytes(ObjectMapper.java:3231)
        at kafka.utils.Json$.encodeAsBytes(Json.scala:115)
        at kafka.zk.ReassignPartitionsZNode$.encode(ZkData.scala:403)
        at kafka.zk.KafkaZkClient.createPartitionReassignment(KafkaZkClient.scala:843)
        at kafka.admin.ReassignPartitionsCommand.reassignPartitions(ReassignPartitionsCommand.scala:635)
        at kafka.admin.ReassignPartitionsCommand$.executeAssignment(ReassignPartitionsCommand.scala:221)
        at kafka.admin.ReassignPartitionsCommand$.executeAssignment(ReassignPartitionsCommand.scala:205)
        at kafka.admin.ReassignPartitionsCommand$.main(ReassignPartitionsCommand.scala:65)
        at kafka.admin.ReassignPartitionsCommand.main(ReassignPartitionsCommand.scala)

On 05.12.19 at 17:33, Ismael Juma wrote:
> Are you using ZooKeeper 3.5.6 client libraries with Kafka 2.3.1? Kafka
> 2.3.1 ships with ZooKeeper 3.4.x.
>
> Ismael
>
> On Thu, Dec 5, 2019 at 8:18 AM Charles Bueche <cbue...@gmail.com> wrote:
>
>> Hello all,
>>
>> I do have two Kafka clusters in action, test and prod. The two are formed
>> by 3 nodes each, are independent and run their own zookeeper setups. My
>> prod cluster is running fine. My test cluster is half-broken and I'm
>> struggling to fix it. I could wipe it but I prefer to understand what's
>> wrong and fix it.
>>
>> I'm not sure what broke my test cluster. I had several network
>> disconnections / split-brains but Kafka always recovered fine. The reasons
>> for the network issues are independent and still being investigated
>> (layer-2 storms, etc).
>>
>> So I upgraded my zookeeper and kafka to the latest versions and when
>> trying to rebalance a topic across brokers I started to notice the
>> problems. Not sure really when they started, before or after the upgrade.
>>
>> I ran the upgrade as for the official doc (rolling upgrade, moving up
>> inter.broker.protocol.version and log.message.format.version gradually).
>>
>> ------------------------------------------------------
>> cat /etc/os-release
>> NAME="SLES"
>> VERSION="12-SP4"
>> VERSION_ID="12.4"
>> PRETTY_NAME="SUSE Linux Enterprise Server 12 SP4"
>> ID="sles"
>> ANSI_COLOR="0;32"
>> CPE_NAME="cpe:/o:suse:sles:12:sp4"
>>
>> ------------------------------------------------------
>> versions :
>> - zookeeper 3.5.6
>> - kafka 2.12-2.3.1
>>
>> ------------------------------------------------------
>>
>> many rapid log entries on brokers 1 & 2 (we have 1, 2, 3)
>>
>> [2019-12-05 09:56:54,967] ERROR [KafkaApi-1] Number of alive brokers '0'
>> does not meet the required replication factor '1' for the offsets topic
>> (configured via 'offsets.topic.replication.factor'). This error can be
>> ignored if the cluster is starting up and not all brokers are up yet.
>> (kafka.server.KafkaApis)
>>
>> ------------------------------------------------------
>> java is java-1_8_0-openjdk-1.8.0.222-27.35.2.x86_64 from SLES. I have
>> tried Oracle Java jdk1.8.0_231 with the same issue.
>>
>> ------------------------------------------------------
>> when trying to see a reassignment I have this very suspect error :
>>
>> root@vmgato701a01:/appl/kafka/bin # ./kafka-reassign-partitions.sh --zookeeper $ZKLIST --reassignment-json-file /tmp/r7.json --verify
>>
>> Status of partition reassignment:
>> Partitions reassignment failed due to com.fasterxml.jackson.databind.ext.Java7Support.getDeserializerForJavaNioFilePath(Ljava/lang/Class;)Lcom/fasterxml/jackson/databind/JsonDeserializer;
>> java.lang.AbstractMethodError: com.fasterxml.jackson.databind.ext.Java7Support.getDeserializerForJavaNioFilePath(Ljava/lang/Class;)Lcom/fasterxml/jackson/databind/JsonDeserializer;
>>     at com.fasterxml.jackson.databind.ext.OptionalHandlerFactory.findDeserializer(OptionalHandlerFactory.java:122)
>>     at com.fasterxml.jackson.databind.deser.BasicDeserializerFactory.findOptionalStdDeserializer(BasicDeserializerFactory.java:1589)
>>     at com.fasterxml.jackson.databind.deser.BasicDeserializerFactory.findDefaultDeserializer(BasicDeserializerFactory.java:1812)
>>     at com.fasterxml.jackson.databind.deser.BeanDeserializerFactory.findStdDeserializer(BeanDeserializerFactory.java:161)
>>     at com.fasterxml.jackson.databind.deser.BeanDeserializerFactory.createBeanDeserializer(BeanDeserializerFactory.java:125)
>>     at com.fasterxml.jackson.databind.deser.DeserializerCache._createDeserializer2(DeserializerCache.java:411)
>>     at com.fasterxml.jackson.databind.deser.DeserializerCache._createDeserializer(DeserializerCache.java:349)
>>     at com.fasterxml.jackson.databind.deser.DeserializerCache._createAndCache2(DeserializerCache.java:264)
>>     at com.fasterxml.jackson.databind.deser.DeserializerCache._createAndCacheValueDeserializer(DeserializerCache.java:244)
>>     at com.fasterxml.jackson.databind.deser.DeserializerCache.findValueDeserializer(DeserializerCache.java:142)
>>     at com.fasterxml.jackson.databind.DeserializationContext.findRootValueDeserializer(DeserializationContext.java:477)
>>     at com.fasterxml.jackson.databind.ObjectMapper._findRootDeserializer(ObjectMapper.java:4178)
>>     at com.fasterxml.jackson.databind.ObjectMapper._readMapAndClose(ObjectMapper.java:3997)
>>     at com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:3079)
>>     at kafka.utils.Json$.parseBytesAs(Json.scala:73)
>>     at kafka.zk.ReassignPartitionsZNode$.decode(ZkData.scala:407)
>>     at kafka.zk.KafkaZkClient.getPartitionReassignment(KafkaZkClient.scala:795)
>>     at kafka.admin.ReassignPartitionsCommand$.checkIfPartitionReassignmentSucceeded(ReassignPartitionsCommand.scala:355)
>>     at kafka.admin.ReassignPartitionsCommand$.verifyAssignment(ReassignPartitionsCommand.scala:97)
>>     at kafka.admin.ReassignPartitionsCommand$.verifyAssignment(ReassignPartitionsCommand.scala:90)
>>     at kafka.admin.ReassignPartitionsCommand$.main(ReassignPartitionsCommand.scala:61)
>>     at kafka.admin.ReassignPartitionsCommand.main(ReassignPartitionsCommand.scala)
>>
>> Help appreciated.
>> -- Charles Bueche <cbli...@bueche.ch>