Hi, we're going to deploy a large Cassandra cluster at the PB level. Our scenario would be:
1. Lots of writes: about 150 writes/second on average, with about 300 KB per write.
2. Relatively very few reads.
3. Our data is never updated.
4. But we delete old data periodically to free space for new data.

We've learned that the compaction strategy will be an important point, because we've already run into 'no space' trouble with the size-tiered compaction strategy. We've read http://wiki.apache.org/cassandra/LargeDataSetConsiderations; is this enough, and is it up to date?

From our experience, changing any settings or schema while a large cluster is online and has been running for some time is really, really painful. So we're gathering more information and hoping for some practical suggestions before we set up the Cassandra cluster.

Thanks, and any help is greatly appreciated.
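For context, here is a back-of-the-envelope sizing sketch from the numbers above (150 writes/s, ~300 KB each). The replication factor of 3 is an assumption for illustration, not something stated in this post:

```python
# Rough ingest sizing for the workload described above.
# REPLICATION_FACTOR = 3 is an assumed value, not from the post.
WRITES_PER_SEC = 150
BYTES_PER_WRITE = 300 * 1024      # ~300 KB per write
REPLICATION_FACTOR = 3
SECONDS_PER_DAY = 86_400

raw_bytes_per_sec = WRITES_PER_SEC * BYTES_PER_WRITE
raw_tb_per_day = raw_bytes_per_sec * SECONDS_PER_DAY / 1e12
replicated_tb_per_day = raw_tb_per_day * REPLICATION_FACTOR

print(f"raw ingest:            {raw_bytes_per_sec / 1e6:.1f} MB/s")
print(f"raw per day:           {raw_tb_per_day:.2f} TB")
print(f"replicated per day:    {replicated_tb_per_day:.2f} TB")
print(f"raw per year:          {raw_tb_per_day * 365 / 1000:.2f} PB")
```

This works out to roughly 46 MB/s of raw ingest, about 4 TB/day before replication, and around 1.5 PB/year, which is consistent with the "PB level" figure above and explains why compaction headroom (size-tiered compaction can temporarily need up to double the data size on disk) matters so much here.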