Hello! I have run into the following problem. I have 3 brokers and 1 ZooKeeper node. The topics have 10 partitions and a replication factor of 3. The Streams app runs with 10 threads, exactly_once, and a commit interval of 1000 ms. When I run the Streams app, the join of two of my topics doesn't work for one specific message, but it works for all other messages. If I read the topics manually, the messages needed for the join are there. What could cause this problem?
I noticed the following. Before the messages were written to the topic, broker2 went down and was restarted. The message I am interested in is in partition 1. This is the information I have for partition 1:

Topic: fc-id-to-delivery-key-lookup Partition: 1 Leader: 2 Replicas: 2,0,1 Isr: 0,1,2

Maybe this is because broker2 is the leader for partition 1 but the partition does not actually exist on broker2? Is that possible, and how can I check it? Or is there another reason?

Also, before broker2 was restarted, I saw many errors in its logs, such as:

INFO Opening socket connection to server kafka-zookeeper:2181. Will not attempt to authenticate using SASL (unknown error) (org.apache.zookeeper.ClientCnxn)
WARN Session 0x1000b358e830002 for server kafka-zookeeper:2181, unexpected error, closing socket connection and attempting reconnect (org.apache.zookeeper.ClientCnxn)
...
ERROR [ReplicaFetcher replicaId=2, leaderId=1, fetcherId=0] Error for partition __transaction_state-12 at offset 0 (kafka.server.ReplicaFetcherThread)
org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server does not host this topic-partition.
(the same for all topics)
...
WARN [SocketServer brokerId=2] Unexpected error from /10.8.1.1; closing connection (org.apache.kafka.common.network.Selector)

After the restart there was one:

ERROR Could not submit metrics to Kafka topic __confluent.support.metrics: Failed to construct kafka producer (io.confluent.support.metrics.BaseMetricsReporter)
...

and many:

ERROR [Broker id=2] Received LeaderAndIsrRequest with correlation id 1 from controller 1 epoch 1 for partition __consumer_offsets-29 (last update controller epoch 1) but cannot become follower since the new leader -1 is unavailable. (state.change.logger)
....

and many:

[2019-03-14 12:55:48,399] ERROR [Broker id=2] Received LeaderAndIsrRequest with correlation id 1 from controller 1 epoch 1 for partition flow-control-streams-KSTREAM-OUTEROTHER-0000000039-store-changelog-5 (last update controller epoch 1) but cannot become follower since the new leader -1 is unavailable. (state.change.logger)

Thanks
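
One way to sanity-check the partition 1 line quoted above (it looks like `kafka-topics --describe` output) is to confirm that the leader appears in both the Replicas and Isr lists. Below is a minimal Python sketch of that check. The function names `parse_describe_line` and `leader_is_in_sync` are made up for illustration, and the regular expression assumes the line has exactly the layout shown in this post.

```python
import re

# Assumed layout (taken from the line quoted in this post):
# Topic: <name> Partition: <n> Leader: <id> Replicas: <ids> Isr: <ids>
_DESCRIBE_RE = re.compile(
    r"Topic:\s*(?P<topic>\S+)\s+"
    r"Partition:\s*(?P<partition>\d+)\s+"
    r"Leader:\s*(?P<leader>-?\d+)\s+"
    r"Replicas:\s*(?P<replicas>[\d,]+)\s+"
    r"Isr:\s*(?P<isr>[\d,]*)"
)


def _to_ids(csv):
    """Turn '2,0,1' into [2, 0, 1]; an empty string becomes []."""
    return [int(x) for x in csv.split(",") if x]


def parse_describe_line(line):
    """Parse one partition line into a dict (hypothetical helper)."""
    m = _DESCRIBE_RE.search(line)
    if m is None:
        raise ValueError("line does not match the assumed describe layout")
    return {
        "topic": m.group("topic"),
        "partition": int(m.group("partition")),
        "leader": int(m.group("leader")),
        "replicas": _to_ids(m.group("replicas")),
        "isr": _to_ids(m.group("isr")),
    }


def leader_is_in_sync(info):
    """True if the leader is an assigned replica and is listed in the ISR."""
    return info["leader"] in info["replicas"] and info["leader"] in info["isr"]


if __name__ == "__main__":
    line = ("Topic: fc-id-to-delivery-key-lookup Partition: 1 "
            "Leader: 2 Replicas: 2,0,1 Isr: 0,1,2")
    info = parse_describe_line(line)
    print(info["leader"], info["isr"], leader_is_in_sync(info))
```

For the line quoted above the check passes: leader 2 is in Replicas 2,0,1 and in Isr 0,1,2. That suggests the cluster metadata considers broker2 an in-sync replica for partition 1 at the time the line was captured. It does not by itself prove that the log directory on broker2 is intact.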