Internode messages that are received by a node but are not processed within rpc_timeout are dropped rather than processed, because the coordinator node will no longer be waiting for a response. If the coordinator node does not receive the number of responses required by the Consistency Level before rpc_timeout, it will return a TimedOutException to the client.
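To make the behaviour described above concrete, here is a rough sketch in Python. It is only a simplified model of that description, not Cassandra's actual code: the function names, the millisecond parameters, and the local TimedOutException class are all made up for illustration.

```python
class TimedOutException(Exception):
    """Hypothetical stand-in for the error the coordinator returns to the client."""


def quorum(replication_factor: int) -> int:
    # QUORUM is a majority of the replicas: floor(RF / 2) + 1.
    return replication_factor // 2 + 1


def replica_should_drop(queued_ms: float, rpc_timeout_ms: float) -> bool:
    # A replica drops a message that has waited longer than rpc_timeout,
    # since the coordinator has stopped waiting for the reply by then.
    return queued_ms > rpc_timeout_ms


def coordinate_write(queue_delays_ms, required_acks: int, rpc_timeout_ms: float) -> int:
    # Count the replicas that process the mutation in time (assumed model:
    # one queue delay per replica, and a processed mutation always acks).
    acks = sum(
        1 for delay in queue_delays_ms
        if not replica_should_drop(delay, rpc_timeout_ms)
    )
    if acks < required_acks:
        raise TimedOutException(
            f"got {acks} of {required_acks} required responses"
        )
    return acks
```

With RF=1, `quorum(1)` is 1, so in this model a single dropped mutation leaves the coordinator short of its required responses and `coordinate_write` raises instead of returning.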
I understand that, but that’s where this makes no sense. I’m running with RF=1, and CL=QUORUM, which means each update goes to one node, and I need one response for a success. I have many thousands of dropped mutation messages, but no TimedOutExceptions thrown back to the client. If I have GC problems, or other issues that are making my cluster unresponsive, I can deal with that. But having writes that fail and no error is clearly not acceptable. How is it possible to be getting errors and not be informed about them?

Thanks,
Robert