> I don't understand why it copied data twice. In worst case scenario it
> should copy everything (~90G)
Sadly no. Repair is currently peer-to-peer based (there is a ticket to fix it: https://issues.apache.org/jira/browse/CASSANDRA-3200, but that's not trivial). This means that you can end up with RF times the data after a repair. That should only happen in the worst case, since it implies everything is being repaired, but the triplication is still a problem, though a known one that is not easy to fix.

Is it possible that each time you ran repair, one of the nodes in the cluster was very out of sync with the other nodes? Maybe a node that had been down for a long time after a crash?

-- Sylvain