Hi,

We are testing a Kafka cluster under high traffic loads (20000+ messages/second), and we frequently see brokers dropping persistently out of the ISR for some partitions.

Looking at the code, I noticed that the processFetchRequest method of class AbstractFetcherThread throws a KafkaException if processPartitionData raises any exception other than CorruptRecordException. In my understanding, this causes the fetcher thread to terminate, so the replica updates it was performing stop and the broker is removed from the ISR lists of the affected partitions.

Couldn't we, in this case as well, just log an error message and add the partition to 'partitionsWithError' (as is done in the 'case OFFSET_OUT_OF_RANGE' and 'case _' branches)?
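To illustrate the behaviour I have in mind, here is a minimal, self-contained sketch in plain Java. It is not the actual Kafka code: the class FetcherSketch, the PartitionProcessor interface and the processAll method are hypothetical names made up for this example, and partitions are represented as plain strings. The only point is that a failure on one partition gets logged and recorded in partitionsWithError, while the loop (and therefore the thread running it) carries on with the remaining partitions.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

public class FetcherSketch {
    /** Hypothetical stand-in for the per-partition work done in processPartitionData. */
    public interface PartitionProcessor {
        void process(String partition, long offset) throws Exception;
    }

    /**
     * Processes every fetched partition. A failure on one partition is logged and
     * recorded instead of being rethrown, so the caller keeps running.
     * Returns the partitions that failed, in the order they were encountered.
     */
    public static Set<String> processAll(Map<String, Long> fetched, PartitionProcessor processor) {
        Set<String> partitionsWithError = new LinkedHashSet<>();
        for (Map.Entry<String, Long> entry : fetched.entrySet()) {
            try {
                processor.process(entry.getKey(), entry.getValue());
            } catch (Exception e) {
                // Log and remember the partition rather than aborting the whole loop.
                System.err.println("Error processing partition " + entry.getKey() + ": " + e);
                partitionsWithError.add(entry.getKey());
            }
        }
        return partitionsWithError;
    }

    public static void main(String[] args) {
        Map<String, Long> fetched = new LinkedHashMap<>();
        fetched.put("topic-0", 10L);
        fetched.put("topic-1", 20L);

        // Simulate a failure on a single partition only.
        Set<String> failed = processAll(fetched, (partition, offset) -> {
            if (partition.equals("topic-1")) {
                throw new IllegalStateException("simulated failure");
            }
        });
        System.out.println("partitionsWithError = " + failed);
    }
}
```

Whether it is safe to continue after an arbitrary exception in the real fetcher is exactly what I am asking about; the sketch only shows the control flow.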
Ciprian.