Hi,

We recently found an issue on one of our 3.0.17 clusters where the Message Flusher falls off the event loop, eventually resulting in OOM on a number of nodes. We have seen this happen twice so far.
CASSANDRA-14855 <https://issues.apache.org/jira/browse/CASSANDRA-14855> has all the details (including heap dump analysis). We backported ImmediateFlusher from trunk as a fix. We would like feedback (on the JIRA) on whether this fix is recommended, or whether there is a better one.

Thanks,
Sumanth