I'd also say to consider what happens during maintenance and failure scenarios. Moving tens of TB around takes a lot longer than hundreds of GB.
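A quick back-of-envelope sketch of the difference (assuming a hypothetical ~100 MB/s of effective streaming throughput per node; real repair/bootstrap rates vary with disks, network, and compaction load):

```python
# Back-of-envelope: hours needed to stream a node's data at a fixed rate.
def stream_hours(data_bytes, rate_bytes_per_sec):
    """Return hours required to move data_bytes at rate_bytes_per_sec."""
    return data_bytes / rate_bytes_per_sec / 3600.0

GB = 1024 ** 3
TB = 1024 ** 4
RATE = 100 * 1024 ** 2  # assumed ~100 MB/s effective throughput

print(f"100 GB: {stream_hours(100 * GB, RATE):.1f} h")  # ~0.3 h
print(f"10 TB:  {stream_hours(10 * TB, RATE):.1f} h")   # ~29 h
```

So a node holding 100 GB can be rebuilt or rebalanced in well under an hour, while a 10 TB node is a day-plus operation even under this optimistic assumption.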
Cheers

-----------------
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 8 Jun 2011, at 06:40, AJ wrote:

> Thanks to everyone who responded thus far.
>
> On 6/7/2011 10:16 AM, Benjamin Coverston wrote:
> <snip>
>> Not to say that there aren't workloads where having many TB/Node doesn't
>> work, but if you're planning to read from the data you're writing you do
>> want to ensure that your working set is stored in memory.
>
> Thank you Ben. Can you elaborate some more on the above point? Are you
> referring to the OS's working set or the Cassandra caches? Why exactly do I
> need to ensure this?
>
> I am also wondering if there is any reason I should segregate my frequently
> written/read smallish data set (such as usage statistics) from my bulk,
> mostly read-only data set (static content) into separate CFs if the schema
> allows it. Would this be of any benefit?