Hi, this is an issue we have faced a couple of times now. Every once in a while OpsCenter throws an error saying that the repair service failed due to errors. In the logs we can see multiple lines like:

Repair task (<Node nodename='-5517036565151358111'>, (-6964720218971987043L, -6963882488374905088L), set([tables])) timed out after 3600 seconds.

Manually running "nodetool repair -pr" on that node just hangs and doesn't do anything. Once we restart DSE, the repair job starts fine. Any ideas? Thanks