Hello again, issue found. I install Kafka with an Ansible playbook. A supplemental module I installed was kafka-http-metrics-reporter-1.1.0.jar from https://github.com/pcross616/kafka-http-metrics-reporter, to expose some JMX metrics over HTTP (because monitoring JMX from Nagios is so 1990...).
My Ansible playbook had a typo, and kafka-http-metrics-reporter-1.1.0.jar ended up named libs/" <<<<< yes, a double quote as filename. So the classpath built by /appl/kafka/bin/kafka-run-class.sh loaded quite a bunch of buggy/old classes from this badly named file. As soon as I removed it, everything worked fine. Thanks to Mark for pointing me at the classpath; running with "-verbose:class" showed the issue.

Regards,
Charles

On 12.12.19 at 13:54, Charles Bueche wrote:
> Mark,
>
> thanks for helping. I suspect something from the environment as well, but
> can't identify any suspect trace.
>
> I use the tarballs from the download pages, but I also tried building
> trunk and 2.2.2, and both exhibit the issue.
>
> I deploy my Kafka using symlinks, e.g. /appl/kafka -->
> /appl/kafka_2.12-2.2.2, then everything is started via /appl/kafka
>
> I don't set or see any classpath variable. The ps -ef output for 2.2.2 is
> listed below. I don't see anything suspect in the jackson part, marked
> between asterisks. Is there another place where I should look for parasitic
> libs? Of course older versions are lying around.
>
> I have the feeling that since 2.2 something still references the old
> jackson.
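For anyone hitting something similar: kafka-run-class.sh assembles its classpath by appending what it finds under libs/, so a mis-named file gets picked up like any jar. Below is a minimal sketch of the effect; the loop is a simplification of what the real script does, and the paths are made up for the demonstration:

```shell
# Simulate how a stray file in libs/ lands on the classpath.
# This loop is a simplified stand-in for kafka-run-class.sh.
libs=$(mktemp -d)
touch "$libs/kafka-clients-2.2.2.jar"
touch "$libs/\""            # the accidentally mis-named file

CLASSPATH=""
for file in "$libs"/*; do
  CLASSPATH="$CLASSPATH:$file"
done

# The '"' entry is on the classpath like any real jar,
# and the JVM will happily scan it for classes.
case "$CLASSPATH" in
  *'"'*) echo "stray file on classpath" ;;
esac
rm -rf "$libs"
```

Starting the broker with "-verbose:class" then prints the origin of every loaded class, which is what exposed the bad file here.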
> > /usr/lib64/jvm/java-1.8.0-openjdk-1.8.0/bin/java -Xmx6g -Xms6g > -XX:MetaspaceSize=96m -XX:+UseG1GC -XX:MaxGCPauseMillis=20 > -XX:InitiatingHeapOccupancyPercent=35 -XX:G1HeapRegionSize=16M > -XX:MinMetaspaceFreeRatio=50 -XX:MaxMetaspaceFreeRatio=80 -server > -XX:+UseG1GC -XX:MaxGCPauseMillis=20 > -XX:InitiatingHeapOccupancyPercent=35 -XX:+ExplicitGCInvokesConcurrent > -Djava.awt.headless=true -Xloggc:/userlogs/kafka/logs/kafkaServer-gc.log > -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps > -XX:+PrintGCTimeStamps -XX:+UseGCLogFileRotation > -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=100M > -Dcom.sun.management.jmxremote > -Dcom.sun.management.jmxremote.authenticate=true > -Dcom.sun.management.jmxremote.ssl=false > -Dcom.sun.management.jmxremote.password.file=/appl/kafka/config/jmxremote.password > -Dcom.sun.management.jmxremote.access.file=/appl/kafka/config/jmxremote.access > -Dcom.sun.management.jmxremote.port=9999 > -Dkafka.logs.dir=/userlogs/kafka/logs > -Dlog4j.configuration=file:/appl/kafka/bin/../config/log4j.properties > -cp > 
/appl/kafka/bin/../libs/":/appl/kafka/bin/../libs/activation-1.1.1.jar:/appl/kafka/bin/../libs/aopalliance-repackaged-2.5.0-b42.jar:/appl/kafka/bin/../libs/argparse4j-0.7.0.jar:/appl/kafka/bin/../libs/audience-annotations-0.5.0.jar:/appl/kafka/bin/../libs/commons-lang3-3.8.1.jar:/appl/kafka/bin/../libs/connect-api-2.2.2.jar:/appl/kafka/bin/../libs/connect-basic-auth-extension-2.2.2.jar:/appl/kafka/bin/../libs/connect-file-2.2.2.jar:/appl/kafka/bin/../libs/connect-json-2.2.2.jar:/appl/kafka/bin/../libs/connect-runtime-2.2.2.jar:/appl/kafka/bin/../libs/connect-transforms-2.2.2.jar:/appl/kafka/bin/../libs/guava-20.0.jar:/appl/kafka/bin/../libs/hk2-api-2.5.0-b42.jar:/appl/kafka/bin/../libs/hk2-locator-2.5.0-b42.jar:/appl/kafka/bin/../libs/hk2-utils-2.5.0-b42.jar:*/appl/kafka/bin/../libs/jackson-annotations-2.10.0.jar:/appl/kafka/bin/../libs/jackson-core-2.10.0.jar:/appl/kafka/bin/../libs/jackson-databind-2.10.0.jar:/appl/kafka/bin/../libs/jackson-datatype-jdk8-2.10.0.jar:/appl/kafka/bin/../libs/jackson-jaxrs-base-2.10.0.jar:/appl/kafka/bin/../libs/jackson-jaxrs-json-provider-2.10.0.jar:/appl/kafka/bin/../libs/jackson-module-jaxb-annotations-2.10.0.jar*:/appl/kafka/bin/../libs/jakarta.activation-api-1.2.1.jar:/appl/kafka/bin/../libs/jakarta.xml.bind-api-2.3.2.jar:/appl/kafka/bin/../libs/javassist-3.22.0-CR2.jar:/appl/kafka/bin/../libs/javax.annotation-api-1.2.jar:/appl/kafka/bin/../libs/javax.inject-1.jar:/appl/kafka/bin/../libs/javax.inject-2.5.0-b42.jar:/appl/kafka/bin/../libs/javax.servlet-api-3.1.0.jar:/appl/kafka/bin/../libs/javax.ws.rs-api-2.1.1.jar:/appl/kafka/bin/../libs/javax.ws.rs-api-2.1.jar:/appl/kafka/bin/../libs/jaxb-api-2.3.0.jar:/appl/kafka/bin/../libs/jersey-client-2.27.jar:/appl/kafka/bin/../libs/jersey-common-2.27.jar:/appl/kafka/bin/../libs/jersey-container-servlet-2.27.jar:/appl/kafka/bin/../libs/jersey-container-servlet-core-2.27.jar:/appl/kafka/bin/../libs/jersey-hk2-2.27.jar:/appl/kafka/bin/../libs/jersey-media-jaxb-2.27.jar:/appl/kafka/bin/../lib
s/jersey-server-2.27.jar:/appl/kafka/bin/../libs/jetty-client-9.4.14.v20181114.jar:/appl/kafka/bin/../libs/jetty-continuation-9.4.14.v20181114.jar:/appl/kafka/bin/../libs/jetty-http-9.4.14.v20181114.jar:/appl/kafka/bin/../libs/jetty-io-9.4.14.v20181114.jar:/appl/kafka/bin/../libs/jetty-security-9.4.14.v20181114.jar:/appl/kafka/bin/../libs/jetty-server-9.4.14.v20181114.jar:/appl/kafka/bin/../libs/jetty-servlet-9.4.14.v20181114.jar:/appl/kafka/bin/../libs/jetty-servlets-9.4.14.v20181114.jar:/appl/kafka/bin/../libs/jetty-util-9.4.14.v20181114.jar:/appl/kafka/bin/../libs/jopt-simple-5.0.4.jar:/appl/kafka/bin/../libs/kafka-clients-2.2.2.jar:/appl/kafka/bin/../libs/kafka-log4j-appender-2.2.2.jar:/appl/kafka/bin/../libs/kafka-streams-2.2.2.jar:/appl/kafka/bin/../libs/kafka-streams-examples-2.2.2.jar:/appl/kafka/bin/../libs/kafka-streams-scala_2.12-2.2.2.jar:/appl/kafka/bin/../libs/kafka-streams-test-utils-2.2.2.jar:/appl/kafka/bin/../libs/kafka-tools-2.2.2.jar:/appl/kafka/bin/../libs/kafka_2.12-2.2.2-sources.jar:/appl/kafka/bin/../libs/kafka_2.12-2.2.2.jar:/appl/kafka/bin/../libs/log4j-1.2.17.jar:/appl/kafka/bin/../libs/lz4-java-1.5.0.jar:/appl/kafka/bin/../libs/maven-artifact-3.6.0.jar:/appl/kafka/bin/../libs/metrics-core-2.2.0.jar:/appl/kafka/bin/../libs/osgi-resource-locator-1.0.1.jar:/appl/kafka/bin/../libs/plexus-utils-3.1.0.jar:/appl/kafka/bin/../libs/reflections-0.9.11.jar:/appl/kafka/bin/../libs/rocksdbjni-5.15.10.jar:/appl/kafka/bin/../libs/scala-library-2.12.8.jar:/appl/kafka/bin/../libs/scala-logging_2.12-3.9.0.jar:/appl/kafka/bin/../libs/scala-reflect-2.12.8.jar:/appl/kafka/bin/../libs/slf4j-api-1.7.25.jar:/appl/kafka/bin/../libs/slf4j-log4j12-1.7.25.jar:/appl/kafka/bin/../libs/snappy-java-1.1.7.2.jar:/appl/kafka/bin/../libs/validation-api-1.1.0.Final.jar:/appl/kafka/bin/../libs/zkclient-0.11.jar:/appl/kafka/bin/../libs/zookeeper-3.4.13.jar:/appl/kafka/bin/../libs/zstd-jni-1.3.8-1.jar > 
-Djava.security.auth.login.config=/appl/kafka/config/kafka_server_jaas.conf
> kafka.Kafka /appl/kafka/config/server.properties
>
> On 12.12.19 at 13:10, Mark Anderson wrote:
>> Jackson was updated to 2.10 in the latest Kafka release. The method
>> mentioned no longer exists in 2.10.
>>
>> Do you have multiple versions of Jackson on the classpath?
>>
>> On Thu, 12 Dec 2019, 11:09 Charles Bueche, <cbli...@bueche.ch> wrote:
>>
>>> Hello again,
>>>
>>> I've spent hours debugging this and have no clue...
>>>
>>> * Kafka 2.12-2.1.0: kafka-reassign-partitions.sh works fine
>>> * Anything newer (2.2.2, 2.3.1, trunk) breaks any operation involving
>>> JSON. Stack dump below.
>>>
>>> Help please.
>>>
>>> Charles
>>>
>>> On 11.12.19 at 16:58, Charles Bueche wrote:
>>>> Hello again,
>>>>
>>>> the problem is still present; something is badly broken in the handling
>>>> of JSON files. Details:
>>>>
>>>> I have done multiple downgrades and rolling upgrades, and the situation
>>>> is now:
>>>>
>>>> * Zookeeper 3.4.14
>>>> * Kafka 2.12-2.3.1 (contains libs/zookeeper-3.4.14.jar)
>>>>
>>>> The cluster is all "happy": balanced, no skew, etc. I have
>>>> kafka-manager and everything is ok.
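Mark's question about multiple Jackson versions can be answered mechanically by grepping every jar under libs/ for the class named in the stack trace. A hedged sketch follows; the /appl/kafka path and the helper name are mine, not part of any Kafka tooling:

```shell
# List every jar in a directory that contains a given class file.
find_class_in_jars() {
  dir=$1
  class=$2
  for jar in "$dir"/*.jar; do
    [ -f "$jar" ] || continue
    # unzip -l prints the archive's file listing; -F matches literally.
    if unzip -l "$jar" 2>/dev/null | grep -qF "$class"; then
      echo "$jar"
    fi
  done
}

# The class from the AbstractMethodError; adjust the path to your install.
find_class_in_jars /appl/kafka/libs \
  'com/fasterxml/jackson/databind/ext/Java7Support.class'
```

If more than one file shows up (or a non-jar sneaks in, as happened here), which copy wins depends on classpath order, and that is exactly how an old Jackson class can shadow the 2.10 one and trigger an AbstractMethodError.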
>>>>
>>>> I then create a "testing" topic:
>>>>
>>>> export ZKLIST="srv1:2181,srv2:2181,srv3:2181"
>>>> /appl/kafka/bin/kafka-topics.sh --zookeeper $ZKLIST --create
>>>> --replication-factor 3 --partitions 1 --topic testing
>>>>
>>>> I can raise the partition count fine:
>>>>
>>>> /appl/kafka/bin/kafka-topics.sh --zookeeper $ZKLIST --alter --topic
>>>> testing --partitions 3
>>>>
>>>> It then looks ok:
>>>>
>>>> /appl/kafka/bin/kafka-topics.sh --zookeeper $ZKLIST --describe
>>>> --topic testing
>>>> Topic:testing PartitionCount:3 ReplicationFactor:3 Configs:
>>>> Topic: testing Partition: 0 Leader: 1 Replicas: 1,2,3 Isr: 1,2,3
>>>> Topic: testing Partition: 1 Leader: 2 Replicas: 2,3,1 Isr: 2,3,1
>>>> Topic: testing Partition: 2 Leader: 3 Replicas: 3,1,2 Isr: 3,1,2
>>>>
>>>> Attempting any operation involving a json file breaks, while it worked
>>>> fine on Kafka 2.12-2.1.0. To be clear, my json files pass lint tests
>>>> and work ok.
>>>>
>>>> Help needed!
>>>>
>>>> Thanks,
>>>> Charles
>>>>
>>>> /appl/kafka/bin/kafka-reassign-partitions.sh --zookeeper $ZKLIST
>>>> --reassignment-json-file ~buec/tmp/r7.json --execute
>>>> Current partition replica assignment
>>>>
>>>> {"version":1,"partitions":[{"topic":"testing","partition":0,"replicas":[1,2,3],"log_dirs":["any","any","any"]},{"topic":"testing","partition":2,"replicas":[3,1,2],"log_dirs":["any","any","any"]},{"topic":"testing","partition":1,"replicas":[2,3,1],"log_dirs":["any","any","any"]}]}
>>>> Save this to use as the --reassignment-json-file option during rollback
>>>> Partitions reassignment failed due to
>>>> com.fasterxml.jackson.databind.ext.Java7Support.getSerializerForJavaNioFilePath(Ljava/lang/Class;)Lcom/fasterxml/jackson/databind/JsonSerializer;
>>>> java.lang.AbstractMethodError:
>>>> com.fasterxml.jackson.databind.ext.Java7Support.getSerializerForJavaNioFilePath(Ljava/lang/Class;)Lcom/fasterxml/jackson/databind/JsonSerializer;
>>>> at
com.fasterxml.jackson.databind.ext.OptionalHandlerFactory.findSerializer(OptionalHandlerFactory.java:92)
>>>> at com.fasterxml.jackson.databind.ser.BasicSerializerFactory.findOptionalStdSerializer(BasicSerializerFactory.java:439)
>>>> at com.fasterxml.jackson.databind.ser.BasicSerializerFactory.findSerializerByPrimaryType(BasicSerializerFactory.java:374)
>>>> at com.fasterxml.jackson.databind.ser.BeanSerializerFactory._createSerializer2(BeanSerializerFactory.java:226)
>>>> at com.fasterxml.jackson.databind.ser.BeanSerializerFactory.createSerializer(BeanSerializerFactory.java:165)
>>>> at com.fasterxml.jackson.databind.SerializerProvider._createUntypedSerializer(SerializerProvider.java:1385)
>>>> at com.fasterxml.jackson.databind.SerializerProvider._createAndCacheUntypedSerializer(SerializerProvider.java:1336)
>>>> at com.fasterxml.jackson.databind.SerializerProvider.findValueSerializer(SerializerProvider.java:510)
>>>> at com.fasterxml.jackson.databind.SerializerProvider.findTypedValueSerializer(SerializerProvider.java:713)
>>>> at com.fasterxml.jackson.databind.ser.DefaultSerializerProvider.serializeValue(DefaultSerializerProvider.java:308)
>>>> at com.fasterxml.jackson.databind.ObjectMapper._configAndWriteValue(ObjectMapper.java:3893)
>>>> at com.fasterxml.jackson.databind.ObjectMapper.writeValueAsBytes(ObjectMapper.java:3231)
>>>> at kafka.utils.Json$.encodeAsBytes(Json.scala:115)
>>>> at kafka.zk.ReassignPartitionsZNode$.encode(ZkData.scala:403)
>>>> at kafka.zk.KafkaZkClient.createPartitionReassignment(KafkaZkClient.scala:843)
>>>> at kafka.admin.ReassignPartitionsCommand.reassignPartitions(ReassignPartitionsCommand.scala:635)
>>>> at kafka.admin.ReassignPartitionsCommand$.executeAssignment(ReassignPartitionsCommand.scala:221)
>>>> at kafka.admin.ReassignPartitionsCommand$.executeAssignment(ReassignPartitionsCommand.scala:205)
>>>> at kafka.admin.ReassignPartitionsCommand$.main(ReassignPartitionsCommand.scala:65)
>>>> at kafka.admin.ReassignPartitionsCommand.main(ReassignPartitionsCommand.scala)
>>>>
>>>> On 05.12.19 at 17:33, Ismael Juma wrote:
>>>>> Are you using ZooKeeper 3.5.6 client libraries with Kafka 2.3.1? Kafka
>>>>> 2.3.1 ships with ZooKeeper 3.4.x.
>>>>>
>>>>> Ismael
>>>>>
>>>>> On Thu, Dec 5, 2019 at 8:18 AM Charles Bueche <cbue...@gmail.com> wrote:
>>>>>> Hello all,
>>>>>>
>>>>>> I have two Kafka clusters in action, test and prod. The two are formed
>>>>>> of 3 nodes each, are independent, and run their own zookeeper setups. My
>>>>>> prod cluster is running fine. My test cluster is half-broken and I'm
>>>>>> struggling to fix it. I could wipe it, but I prefer to understand what's
>>>>>> wrong and fix it.
>>>>>>
>>>>>> I'm not sure what broke my test cluster. I had several network
>>>>>> disconnections / split-brains, but Kafka always recovered fine. The
>>>>>> reasons for the network issues are separate and still being investigated
>>>>>> (layer-2 storms, etc.).
>>>>>>
>>>>>> So I upgraded my zookeeper and kafka to the latest versions, and when
>>>>>> trying to rebalance a topic across brokers I started to notice the
>>>>>> problems. I'm not sure when they really started, before or after the
>>>>>> upgrade. I ran the upgrade as per the official doc (rolling upgrade,
>>>>>> moving up inter.broker.protocol.version and log.message.format.version
>>>>>> gradually).
>>>>>> ------------------------------------------------------
>>>>>> cat /etc/os-release
>>>>>> NAME="SLES"
>>>>>> VERSION="12-SP4"
>>>>>> VERSION_ID="12.4"
>>>>>> PRETTY_NAME="SUSE Linux Enterprise Server 12 SP4"
>>>>>> ID="sles"
>>>>>> ANSI_COLOR="0;32"
>>>>>> CPE_NAME="cpe:/o:suse:sles:12:sp4"
>>>>>>
>>>>>> ------------------------------------------------------
>>>>>> versions:
>>>>>> - zookeeper 3.5.6
>>>>>> - kafka 2.12-2.3.1
>>>>>>
>>>>>> ------------------------------------------------------
>>>>>>
>>>>>> many rapid log entries on brokers 1 & 2 (we have 1, 2, 3)
>>>>>>
>>>>>> [2019-12-05 09:56:54,967] ERROR [KafkaApi-1] Number of alive brokers '0'
>>>>>> does not meet the required replication factor '1' for the offsets topic
>>>>>> (configured via 'offsets.topic.replication.factor'). This error can be
>>>>>> ignored if the cluster is starting up and not all brokers are up yet.
>>>>>> (kafka.server.KafkaApis)
>>>>>>
>>>>>> ------------------------------------------------------
>>>>>> java is java-1_8_0-openjdk-1.8.0.222-27.35.2.x86_64 from SLES. I have
>>>>>> tried Oracle Java jdk1.8.0_231 with the same issue.
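On Ismael's point above: the client version Kafka actually speaks is determined by the zookeeper jar shipped in its libs/ directory, not by the ensemble it connects to. A small sketch to read it off the jar name (the helper name and the /appl/kafka path are mine, for illustration):

```shell
# Print the ZooKeeper client version bundled with a Kafka install,
# derived from the zookeeper jar name in its libs/ directory.
zk_client_version() {
  for jar in "$1"/zookeeper-*.jar; do
    [ -f "$jar" ] || continue
    basename "$jar" .jar | sed 's/^zookeeper-//'
  done
}

# Against the 2.2.2 install from this thread, this would print 3.4.13
# (see zookeeper-3.4.13.jar in the ps output) even though the ensemble
# runs ZooKeeper 3.5.6.
zk_client_version /appl/kafka/libs
```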
>>>>>>
>>>>>> ------------------------------------------------------
>>>>>> when trying to see a reassignment I have this very suspect error:
>>>>>>
>>>>>> root@vmgato701a01:/appl/kafka/bin # ./kafka-reassign-partitions.sh
>>>>>> --zookeeper $ZKLIST --reassignment-json-file /tmp/r7.json --verify
>>>>>>
>>>>>> Status of partition reassignment:
>>>>>> Partitions reassignment failed due to
>>>>>> com.fasterxml.jackson.databind.ext.Java7Support.getDeserializerForJavaNioFilePath(Ljava/lang/Class;)Lcom/fasterxml/jackson/databind/JsonDeserializer;
>>>>>> java.lang.AbstractMethodError:
>>>>>> com.fasterxml.jackson.databind.ext.Java7Support.getDeserializerForJavaNioFilePath(Ljava/lang/Class;)Lcom/fasterxml/jackson/databind/JsonDeserializer;
>>>>>> at com.fasterxml.jackson.databind.ext.OptionalHandlerFactory.findDeserializer(OptionalHandlerFactory.java:122)
>>>>>> at com.fasterxml.jackson.databind.deser.BasicDeserializerFactory.findOptionalStdDeserializer(BasicDeserializerFactory.java:1589)
>>>>>> at com.fasterxml.jackson.databind.deser.BasicDeserializerFactory.findDefaultDeserializer(BasicDeserializerFactory.java:1812)
>>>>>> at com.fasterxml.jackson.databind.deser.BeanDeserializerFactory.findStdDeserializer(BeanDeserializerFactory.java:161)
>>>>>> at com.fasterxml.jackson.databind.deser.BeanDeserializerFactory.createBeanDeserializer(BeanDeserializerFactory.java:125)
>>>>>> at com.fasterxml.jackson.databind.deser.DeserializerCache._createDeserializer2(DeserializerCache.java:411)
>>>>>> at com.fasterxml.jackson.databind.deser.DeserializerCache._createDeserializer(DeserializerCache.java:349)
>>>>>> at com.fasterxml.jackson.databind.deser.DeserializerCache._createAndCache2(DeserializerCache.java:264)
>>>>>> at com.fasterxml.jackson.databind.deser.DeserializerCache._createAndCacheValueDeserializer(DeserializerCache.java:244)
>>>>>> at
com.fasterxml.jackson.databind.deser.DeserializerCache.findValueDeserializer(DeserializerCache.java:142)
>>>>>> at com.fasterxml.jackson.databind.DeserializationContext.findRootValueDeserializer(DeserializationContext.java:477)
>>>>>> at com.fasterxml.jackson.databind.ObjectMapper._findRootDeserializer(ObjectMapper.java:4178)
>>>>>> at com.fasterxml.jackson.databind.ObjectMapper._readMapAndClose(ObjectMapper.java:3997)
>>>>>> at com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:3079)
>>>>>> at kafka.utils.Json$.parseBytesAs(Json.scala:73)
>>>>>> at kafka.zk.ReassignPartitionsZNode$.decode(ZkData.scala:407)
>>>>>> at kafka.zk.KafkaZkClient.getPartitionReassignment(KafkaZkClient.scala:795)
>>>>>> at kafka.admin.ReassignPartitionsCommand$.checkIfPartitionReassignmentSucceeded(ReassignPartitionsCommand.scala:355)
>>>>>> at kafka.admin.ReassignPartitionsCommand$.verifyAssignment(ReassignPartitionsCommand.scala:97)
>>>>>> at kafka.admin.ReassignPartitionsCommand$.verifyAssignment(ReassignPartitionsCommand.scala:90)
>>>>>> at kafka.admin.ReassignPartitionsCommand$.main(ReassignPartitionsCommand.scala:61)
>>>>>> at kafka.admin.ReassignPartitionsCommand.main(ReassignPartitionsCommand.scala)
>>>>>>
>>>>>> Help appreciated.
>>>
>>> --
>>> Charles Bueche <cbli...@bueche.ch>

--
Charles Bueche <cbli...@bueche.ch>
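A closing note for readers: the AbstractMethodError above fires inside Jackson while (de)serializing the reassignment data, not because of the file's content, so lint-clean JSON is expected here. Still, a quick structural check before --execute rules the file out entirely. This is a hedged sketch; the helper name and the checks are illustrative, not part of the Kafka tooling:

```shell
# Sanity-check a reassignment file before passing it to --execute.
check_reassignment_json() {
  python3 - "$1" <<'PY'
import json, sys
data = json.load(open(sys.argv[1]))
assert data.get("version") == 1, "version must be 1"
parts = data.get("partitions")
assert isinstance(parts, list) and parts, "partitions must be a non-empty list"
for p in parts:
    assert {"topic", "partition", "replicas"} <= set(p), "missing key in %r" % p
print("ok: %d partition entries" % len(parts))
PY
}

# Example with the assignment shape printed by kafka-reassign-partitions.sh:
cat > /tmp/reassign-example.json <<'EOF'
{"version":1,"partitions":[{"topic":"testing","partition":0,"replicas":[1,2,3]}]}
EOF
check_reassignment_json /tmp/reassign-example.json
```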