Hi,
Bumping the discussion thread on KIP-501, which is about avoiding out-of-sync or offline partitions when follower fetch requests are not processed in time by the leader replica. This issue has occurred several times in multiple production environments (at Uber, Yelp, Twitter, etc.).
KIP-501 is located here <https://cwiki.apache.org/confluence/display/KAFKA/KIP-501+Avoid+out-of-sync+or+offline+partitions+when+follower+fetch+requests+are+not+processed+in+time>. You may want to look at the earlier mail discussion threads here <https://mail-archives.apache.org/mod_mbox/kafka-dev/202002.mbox/%3Cpony-9f4e96e457398374499ab892281453dcaa7dc679-11722f366b06d9f46bcb5905ff94fd6ab167598e%40dev.kafka.apache.org%3E> and here <https://mail-archives.apache.org/mod_mbox/kafka-dev/202002.mbox/%3CCAM-aUZnJ4z%2B_ztjF6sXSL61M1me0ogWZ1BV6%2BoV45rJMG8EoZA%40mail.gmail.com%3E>. Please take a look; I would like to hear your feedback and suggestions.
Thanks,
Satish.