Hello, I have had problems running my Samza application on the cluster. The application starts fine and so the main event loop. As soon I start to send messages to Kafka, Samza doesn’t start the Kafka system consumer (there are no logs that state that). The CPU usage for all containers is about 100% even if I stop producers. It is like the container is stuck and can’t start the consumer. However Samza can set the offsets for different partitions. For example in a container I see:
o.a.samza.system.kafka.GetOffset - Able to successfully read from offset 0 for topic and partition [test_topic,40]. Using it to instantiate consumer. o.a.samza.system.kafka.BrokerProxy - Starting BrokerProxy for node1.cluster.com:9092 o.a.samza.system.kafka.GetOffset - Validating offset 0 for topic and partition [test_topic,19] o.a.samza.system.kafka.GetOffset - Able to successfully read from offset 0 for topic and partition [test_topic,19]. Using it to instantiate consumer. o.a.samza.container.SamzaContainer - Entering run loop. Here is the configuration I use: { yarn.container.count=24, systems.kafka.samza.key.serde=int, systems.kafka.consumer.zookeeper.connect=localhost:2181/, serializers.registry.int.class=org.apache.samza.serializers.StringSerdeFactory, systems.kafka.samza.factory=org.apache.samza.system.kafka.KafkaSystemFactory, task.drop.deserialization.errors=true, yarn.container.memory.mb=1024, task.inputs=kafka.test_topic, job.factory.class=org.apache.samza.job.yarn.YarnJobFactory, yarn.package.path=hdfs://node1.cluster.com:8020/my-app-0.0.1-dist.tar.gz, task.class=com.company.test.Task, systems.kafka.samza.msg.serde=json, job.name=test, serializers.registry.json.class=org.apache.samza.serializers.JsonSerdeFactory, systems.kafka.producer.bootstrap.servers=node1.cluster.com:9092,node2.cluster.com:9092,node3.cluster.com:9092, } The application was working before a server crash. I tried to clean all Zookeeper data and restart everything. Do you have any idea why the consumer doesn’t work? Regards Davide