Hi community,

I have a Connect cluster deployed in a 'cloud-like' environment where an instance can die at any time, but a new instance is automatically respawned immediately afterwards (within the min...). This obviously leads to an eager rebalance. I am on Kafka 2.1.1 for both client and server.

After the rebalance completes, my connector runs at full speed: all partitions are being consumed and there is no lag. But:
- some of my tasks are now carrying an extra partition.
- some of the tasks (the same number as above) fail with this error:

{"id":28,"state":"FAILED","worker_id":"hostname.zyx.com:310563","trace":"java.lang.NullPointerException\n\tat org.apache.kafka.common.config.AbstractConfig.propsToMap(AbstractConfig.java:98)\n\tat org.apache.kafka.common.config.AbstractConfig.<init>(AbstractConfig.java:59)\n\tat org.apache.kafka.connect.runtime.TaskConfig.<init>(TaskConfig.java:51)\n\tat org.apache.kafka.connect.runtime.Worker.startTask(Worker.java:418)\n\tat org.apache.kafka.connect.runtime.distributed.DistributedHerder.startTask(DistributedHerder.java:865)\n\tat org.apache.kafka.connect.runtime.distributed.DistributedHerder.access$1600(DistributedHerder.java:110)\n\tat org.apache.kafka.connect.runtime.distributed.DistributedHerder$13.call(DistributedHerder.java:880)\n\tat org.apache.kafka.connect.runtime.distributed.DistributedHerder$13.call(DistributedHerder.java:876)\n\tat

Basically, it looks like the partition reassignment has assigned those same partitions to 2 different workers. Any pointers as to what direction to look in, or what might have happened, are appreciated.

Kafka version: 2.1.1

Thank you,
--
Abdoulaye Diallo
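
P.S. For anyone who wants to check for the same symptom, here is a minimal sketch of picking the FAILED tasks out of a connector status payload. It assumes the payload has the shape returned by the Connect REST API's GET /connectors/{name}/status endpoint (a "tasks" array whose entries carry "id", "state", "worker_id" and, for failed tasks, "trace"). The connector name, host names and the helper name failed_tasks are hypothetical, made up for illustration.

```python
def failed_tasks(status):
    """Return (task id, worker_id, first trace line) for every FAILED task."""
    result = []
    for task in status.get("tasks", []):
        if task.get("state") == "FAILED":
            trace = task.get("trace", "")
            # The first line of the trace is the exception class/message.
            first_line = trace.splitlines()[0] if trace else ""
            result.append((task["id"], task.get("worker_id"), first_line))
    return result


# Hypothetical status payload, trimmed to the fields used above.
sample = {
    "name": "my-connector",
    "connector": {"state": "RUNNING", "worker_id": "host-a:8083"},
    "tasks": [
        {"id": 27, "state": "RUNNING", "worker_id": "host-a:8083"},
        {
            "id": 28,
            "state": "FAILED",
            "worker_id": "host-b:8083",
            "trace": "java.lang.NullPointerException\n\tat org.apache.kafka."
                     "common.config.AbstractConfig.propsToMap(AbstractConfig.java:98)",
        },
    ],
}

for task_id, worker, error in failed_tasks(sample):
    print(f"task {task_id} on {worker} failed: {error}")
```

This only reports which tasks failed and on which worker; it does not explain why the same partitions ended up on two workers.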