When I first started using Cassandra, I had no idea that I should run "nodetool repair" regularly. I have 3 nodes with RF=3 and have not run a repair for months. The data size is 20 GB.
The problem is that when I start running nodetool repair now, it eats up all the disk I/O, and the server load climbs to 20+ and keeps increasing. Worst of all, the entire cluster slows down and cannot handle requests, so I have to stop the repair immediately because it makes my web service unavailable. The server has an Intel Xeon-Lynnfield 3470-Quadcore [2.93GHz] and 8 GB of memory, with a Western Digital WD RE3 WD1002FBYS SATA disk. I really have no idea what to do now, as I have already found some data loss. Any suggestions would be appreciated.