Hey all,

For legacy reasons we're running Cassandra 2.0.10 with RF=1; we're moving away from this setup as soon as possible. In the meantime, while adding a node we recently hit a "Stream failed" error (http://pastie.org/9725846). Cassandra restarted and apparently began streaming again from scratch, without first removing the failed stream's data.
With bootstrapping and the initial compactions finished, that node now appears to hold duplicate data: almost exactly 2x the expected disk usage. CQL returns correct results, but we depend on being able to read the SSTable files directly (which is also why we run RF=1). Would anyone have suggestions on a good way to resolve this?

Thanks,
Alain
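P.S. In case it helps clarify the symptom: my understanding is that a CQL read reconciles copies of a row by write timestamp, so the duplicates collapse at query time, while a direct scan over the SSTable files sees every copy. A toy sketch of that (made-up keys and timestamps, not our actual data):

```python
# Two "sstables" holding (key, value, write_timestamp) rows; the second is
# the re-streamed copy left behind by the restarted bootstrap (hypothetical).
sstable_a = [("k1", "v1", 100), ("k2", "v2", 100)]
sstable_b = [("k1", "v1", 100), ("k2", "v2", 100)]  # duplicate from the retry

# A CQL-style read reconciles per key, keeping the newest timestamp,
# so the duplicates collapse and results look correct:
merged = {}
for key, value, ts in sstable_a + sstable_b:
    if key not in merged or ts > merged[key][1]:
        merged[key] = (value, ts)
print(len(merged))  # 2 distinct rows seen via CQL

# A direct scan over the files sees every copy, hence ~2x on disk:
print(len(sstable_a) + len(sstable_b))  # 4 rows read from the raw files
```

That matches what we observe: correct query results, doubled raw data.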