Hi community,

I have a Kafka Connect cluster deployed in a 'cloud-like' environment where an
instance can die at any time, but a new instance is automatically respawned
right after (within the min...). This obviously leads to an eager
rebalance. I am on Kafka 2.1.1 for both client and server.

After the rebalance completes, my connector runs at full speed: all
partitions are being consumed and there is no lag. But:

- some of my tasks are now carrying an extra partition.
- some of the tasks (the same number as above) fail with this error:

{"id":28,"state":"FAILED","worker_id":"hostname.zyx.com:310563","trace":"java.lang.NullPointerException\n\tat
org.apache.kafka.common.config.AbstractConfig.propsToMap(AbstractConfig.java:98)\n\tat
org.apache.kafka.common.config.AbstractConfig.<init>(AbstractConfig.java:59)\n\tat
org.apache.kafka.connect.runtime.TaskConfig.<init>(TaskConfig.java:51)\n\tat
org.apache.kafka.connect.runtime.Worker.startTask(Worker.java:418)\n\tat
org.apache.kafka.connect.runtime.distributed.DistributedHerder.startTask(DistributedHerder.java:865)\n\tat
org.apache.kafka.connect.runtime.distributed.DistributedHerder.access$1600(DistributedHerder.java:110)\n\tat
org.apache.kafka.connect.runtime.distributed.DistributedHerder$13.call(DistributedHerder.java:880)\n\tat
org.apache.kafka.connect.runtime.distributed.DistributedHerder$13.call(DistributedHerder.java:876)\n\tat

Basically, it looks like the rebalance has reassigned some of those
partitions to 2 different workers.
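
For anyone trying to reproduce this, here is a minimal sketch of how the
failed tasks could be grouped by worker. It assumes the JSON document
returned by the Connect REST endpoint GET /connectors/{name}/status, whose
"tasks" entries have the same shape as the object pasted above. The
connector name, host names and task ids in the sample are hypothetical,
and the function name is my own.

```python
import json
from collections import defaultdict


def failed_tasks_by_worker(status):
    """Group the ids of FAILED tasks by the worker that reported them.

    `status` is assumed to be the parsed body of
    GET /connectors/{name}/status on a Connect worker.
    """
    failed = defaultdict(list)
    for task in status.get("tasks", []):
        if task.get("state") == "FAILED":
            failed[task.get("worker_id")].append(task["id"])
    return dict(failed)


# Hypothetical status document, for illustration only.
sample = json.loads("""
{
  "name": "my-connector",
  "connector": {"state": "RUNNING", "worker_id": "host-a:8083"},
  "tasks": [
    {"id": 27, "state": "RUNNING", "worker_id": "host-a:8083"},
    {"id": 28, "state": "FAILED", "worker_id": "host-b:8083",
     "trace": "java.lang.NullPointerException"},
    {"id": 29, "state": "FAILED", "worker_id": "host-b:8083",
     "trace": "java.lang.NullPointerException"}
  ]
}
""")

print(failed_tasks_by_worker(sample))
```

A failed task can then be restarted individually with
POST /connectors/{name}/tasks/{id}/restart, although that only works around
the symptom and does not explain the NullPointerException.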

Any pointers on where to look or what might have happened are
appreciated.

Kafka version: 2.1.1

Thank you,
-- 
Abdoulaye Diallo
