I'm on Apache Cassandra 3.10. I'm interested in moving over to Reaper for repairs, but in the meantime, I want to get nodetool repair working a little more gracefully.
What I'm noticing is that when I run a repair for the first time with the --full option after a large initial data load, the client reports that it is starting a repair job and then produces no output for not just minutes but a few hours. This causes SSH inactivity timeouts. I have tried running the repair with the --trace option, but that leads to the other extreme: a torrent of output, scarcely any of which I'll typically need.
As a literal solution to the SSH inactivity timeouts, I could extend the timeouts, or I could do some scripting jujitsu with StrictHostKeyChecking=no and a loop that prints arbitrary output until the command finishes. But even if the timeouts were no concern, the sheer unresponsiveness is apt to make an operator nervous. I'd like to think there's a Goldilocks way to run a full nodetool repair on a large dataset so that it's a bit more responsive without going all TMI.
Thoughts? Anyone else notice this?
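
For reference, the "loop that prints arbitrary output" workaround would look roughly like the sketch below. The function name run_with_heartbeat and the message format are made up for illustration, and it only keeps the session from going idle; it reports nothing about actual repair progress.

```shell
#!/bin/sh
# Hypothetical helper (not part of nodetool): run a long command in the
# background and print a heartbeat line every INTERVAL seconds until it exits.
# Usage: run_with_heartbeat INTERVAL COMMAND [ARGS...]
run_with_heartbeat() {
    hb_interval="$1"
    shift
    "$@" &                          # start the long-running command
    hb_pid=$!
    hb_start=$(date +%s)
    # kill -0 sends no signal; it only checks that the process still exists.
    while kill -0 "$hb_pid" 2>/dev/null; do
        sleep "$hb_interval"
        if kill -0 "$hb_pid" 2>/dev/null; then
            hb_now=$(date +%s)
            echo "[heartbeat] still running after $((hb_now - hb_start))s"
        fi
    done
    wait "$hb_pid"                  # return the command's own exit status
}

# Intended use on a node, with a heartbeat once a minute:
#   run_with_heartbeat 60 nodetool repair --full
```

The command's own output still goes to the terminal, and the function returns the command's exit status, so a failed repair is not masked by the heartbeat.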