Geliba Uilte created KAFKA-13486:
------------------------------------

             Summary: Kafka Connect: Failed to start task due to NPE
                 Key: KAFKA-13486
                 URL: https://issues.apache.org/jira/browse/KAFKA-13486
             Project: Kafka
          Issue Type: Bug
          Components: KafkaConnect
    Affects Versions: 2.7.1
         Environment: Kubernetes, custom docker image
            Reporter: Geliba Uilte

I have a Kafka Connect cluster with three workers running on Kubernetes. The workers communicate with each other using the pods' IPs (internal IPs, 192.X.X.X). Sometimes, pods are rescheduled to a different node. I am not sure whether this is related to the issue, but I think it causes the pod's IP to change, so Kafka Connect needs to rebalance. Occasionally, tasks fail due to an NPE.

From the connectors/:connector/status REST API, I can see this trace:
{code:java}
    at org.apache.kafka.connect.runtime.Worker.startTask(Worker.java:517)
    at org.apache.kafka.connect.runtime.distributed.DistributedHerder.startTask(DistributedHerder.java:1258)
    at org.apache.kafka.connect.runtime.distributed.DistributedHerder.access$1700(DistributedHerder.java:127)
    at org.apache.kafka.connect.runtime.distributed.DistributedHerder$10.call(DistributedHerder.java:1273)
    at org.apache.kafka.connect.runtime.distributed.DistributedHerder$10.call(DistributedHerder.java:1269)
    at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at java.base/java.lang.Thread.run(Thread.java:834)
{code}

It looks like the issue is similar to KAFKA-10323, and it seems the NPE is thrown from [here|https://github.com/apache/kafka/blob/2.7.1/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/Worker.java#L517].

--
This message was sent by Atlassian Jira
(v8.20.1#820001)