Hi,

Using Cassandra 1.2.10, I am trying to load sstable data into a cluster of 6 machines. The machines use vnodes, and the keyspace is configured with NetworkTopologyStrategy (replication factor 3). The tables being loaded use LeveledCompactionStrategy. The sstable data was generated using SSTableSimpleUnsortedWriter. The smaller dataset, for one table, is ~100GB; the larger dataset, for another table, is ~500GB.
The data was loaded using:

    sstableloader --nodes ihz58,ihz59,ihz60,ihz61,ihz62,ihz63 --verbose ${sstable_dir}

The loader was run on a machine that is not part of the cluster.

After loading the data with sstableloader, I discovered that some rows were missing from Cassandra. I dumped the generated sstables using sstable2json and could see the missing rows in them, so the rows exist in the source data. Over time the list of missing rows shrank, but for several days now it has not changed. It is now more than a week since I first loaded the data.

I have tried flushing all the nodes, restarting all machines, and running a repair, but nothing changes the set of missing rows.

Is there anything I have done wrong here that could result in lost data?

Thanks,
Ross
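
P.S. For anyone who wants to reproduce the comparison, here is a minimal sketch of how a missing-row list can be computed. It is an illustration, not the exact script I ran. It assumes the sstable2json output is a JSON array of objects that each carry a "key" field, and that the set of keys actually present in the cluster has been collected separately (for example by querying each key). The function names and the sample data are hypothetical.

```python
import json


def keys_in_dump(dump_text):
    """Return the set of row keys found in one sstable2json dump.

    Assumption: the dump is a JSON array of objects, each with a "key" field.
    """
    rows = json.loads(dump_text)
    return {row["key"] for row in rows}


def missing_keys(expected, present):
    """Return the keys that are in the generated sstables but not in the cluster."""
    return sorted(set(expected) - set(present))


if __name__ == "__main__":
    # Hypothetical sample standing in for real sstable2json output.
    sample = '[{"key": "6b31", "columns": []}, {"key": "6b32", "columns": []}]'
    expected = keys_in_dump(sample)
    present = {"6b31"}  # hypothetical: keys that could be read back from the cluster
    print(missing_keys(expected, present))
```

Running the same comparison on different days is how I can tell that the list stopped shrinking.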