Thank you, Aaron. We're planning on about 20 TB per node, though. Is that feasible?
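For context, here is a rough back-of-the-envelope sketch of what the numbers in this thread imply. Only the write rate and write size come from the original post. Everything else is an assumption made for illustration: "300K" is read as 300 KB, "PB level" as 1 PB of unreplicated data, the replication factor is taken as 3, and the streaming rate used for the bootstrap estimate is a placeholder of 200 Mbit/s. Treat the results as order-of-magnitude only.

```python
import math

WRITES_PER_SEC = 150        # from the original post
BYTES_PER_WRITE = 300_000   # "300K" read as 300 KB (assumption)
TARGET_RAW_BYTES = 1e15     # "PB level" read as 1 PB before replication (assumption)
REPLICATION_FACTOR = 3      # assumption, not stated in the thread
STREAM_MBIT_PER_SEC = 200   # assumed per-node streaming rate, placeholder only

SECONDS_PER_DAY = 86_400


def daily_ingest_tb():
    """Unreplicated data written per day, in TB."""
    return WRITES_PER_SEC * BYTES_PER_WRITE * SECONDS_PER_DAY / 1e12


def nodes_needed(tb_per_node):
    """Nodes required to hold the replicated data set at a given density."""
    total_bytes = TARGET_RAW_BYTES * REPLICATION_FACTOR
    return math.ceil(total_bytes / (tb_per_node * 1e12))


def bootstrap_days(tb_per_node):
    """Days to stream one full node's data at the assumed streaming rate."""
    seconds = tb_per_node * 1e12 * 8 / (STREAM_MBIT_PER_SEC * 1e6)
    return seconds / SECONDS_PER_DAY


if __name__ == "__main__":
    print(f"daily ingest: {daily_ingest_tb():.2f} TB/day")
    for density in (3, 20):
        print(f"{density} TB/node: {nodes_needed(density)} nodes, "
              f"~{bootstrap_days(density):.1f} days to bootstrap one node")
```

The trade-off is the one Aaron points at: a higher density per node cuts the node count, but the time to bootstrap (or repair) a single node grows in proportion to the data it holds.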
On Mon, May 12, 2014 at 4:33 PM, Aaron Morton <aa...@thelastpickle.com> wrote:

> We've learned that compaction strategy would be an important point cause
> we've ran into 'no space' trouble because of the 'sized tiered' compaction
> strategy.
>
> If you want to get the most out of the raw disk space LCS is the way to
> go, remember it uses approximately twice the disk IO.
>
> From our experience changing any settings/schema during a large cluster is
> on line and has been running for some time is really really a pain.
>
> Which parts in particular ?
>
> Updating the schema or config ? OpsCentre has a rolling restart feature
> which can be handy when chef / puppet is deploying the config changes.
> Schema / gossip can take a little to propagate with high number of nodes.
>
> On a modern version you should be able to run 2 to 3 TB per node, maybe
> higher. The biggest concerns are going to be repair (the changes in 2.1
> will help) and bootstrapping. I’d recommend testing a smaller cluster, say
> 12 nodes, with a high load per node 3TB.
>
> cheers
> Aaron
>
> -----------------
> Aaron Morton
> New Zealand
> @aaronmorton
>
> Co-Founder & Principal Consultant
> Apache Cassandra Consulting
> http://www.thelastpickle.com
>
> On 9/05/2014, at 12:09 pm, Yatong Zhang <bluefl...@gmail.com> wrote:
>
> Hi,
>
> We're going to deploy a large Cassandra cluster in PB level. Our scenario
> would be:
>
> 1. Lots of writes, about 150 writes/second at average, and about 300K size
> per write.
> 2. Relatively very small reads
> 3. Our data will be never updated
> 4. But we will delete old data periodically to free space for new data
>
> We've learned that compaction strategy would be an important point cause
> we've ran into 'no space' trouble because of the 'sized tiered' compaction
> strategy.
>
> We've read http://wiki.apache.org/cassandra/LargeDataSetConsiderations and is
> this enough or update-to-date? From our experience changing any
> settings/schema during a large cluster is on line and has been running for
> some time is really really a pain. So we're gathering more info and
> expecting some more practical suggestions before we set up the cassandra
> cluster.
>
> Thanks and any help is of great appreciation