Hello Team,

I am testing the scaling of Samza jobs in the standalone deployment model. The jobs are deployed in Kubernetes with ZooKeeper as the job coordinator. The input Kafka topic has two partitions, and I am running two Samza instances. The configuration supplied to both instances is identical except for app.id, which is 1 and 2 respectively.

I expected each Samza instance to process one of the two partitions. In reality, both instances subscribe to partitions 0 and 1, and I see duplicate messages in the output topic. Is there anything that I am missing?

Samza Version: 1.3.0
Kafka Version: 2.1
Zookeeper Version: 3.4

Configuration:

app.id=1
app.name=test
app.class=insights.stream.AlertStreamApplication
job.default.system=kafka
job.coordinator.factory=org.apache.samza.zk.ZkJobCoordinatorFactory
task.name.grouper.factory=org.apache.samza.container.grouper.task.GroupByContainerIdsFactory

# Task Checkpoint
task.checkpoint.factory=org.apache.samza.checkpoint.kafka.KafkaCheckpointManagerFactory
job.coordinator.zk.connect=<ZK IP>:2181
systems.kafka.producer.bootstrap.servers=<Kafka IP>:9092

# Kafka System
systems.kafka.samza.factory=org.apache.samza.system.kafka.KafkaSystemFactory
systems.kafka.default.stream.replication.factor=1

# Input Streams
streams.alert.samza.system=kafka
streams.alert.samza.physical.name=alert

# Output Streams
streams.events.samza.system=kafka
streams.events.samza.physical.name=events

Thanks,
Anoop