Then from an IT standpoint, if I'm using an RF of 3, it stands to reason that running on RAID 0 makes sense, since RAID redundancy and RF achieve the same ends... it makes sense to stripe for speed and let Cassandra deal with redundancy, eh?

On Wed, Apr 7, 2010 at 4:07 PM, Benjamin Black <b...@b3k.us> wrote:

> On Wed, Apr 7, 2010 at 3:41 PM, banks <bankse...@gmail.com> wrote:
> >
> > 2. each cassandra node essentially has the same datastore as all nodes,
> > correct?
>
> No. The ReplicationFactor you set determines how many copies of a
> piece of data you want. If your number of nodes is higher than your
> RF, as is common, you will not have the same data on all nodes. The
> exact set of nodes to which data is replicated is determined by the
> row key, placement strategy, and node tokens.
>
> > So if I've got 3 terabytes of data and 3 cassandra nodes I'm
> > eating 9tb on the SAN? are there provisions for essentially sharding across
> > nodes... so that each node only handles a given keyrange, if so where is the
> > howto on that?
> >
>
> Sharding is a concept from databases that don't have native
> replication and so need a term to describe what they bolt on for the
> functionality. Distribution amongst nodes based on key ranges is how
> Cassandra always operates.
>
>
> b
>
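
For anyone following the thread, here is a rough sketch of the point above: the row key and the node tokens decide where a row lands, and RF decides how many nodes hold a copy. This is not Cassandra's code. The function names (token_for, build_ring, replicas_for, per_node_tb) are made up for illustration, MD5 is only a stand-in hash, and the placement rule (walk the ring clockwise from the owning token) is a simplified assumption about how a basic placement strategy behaves.

```python
import bisect
import hashlib

TOKEN_SPACE = 2 ** 128  # size of the hypothetical token ring


def token_for(key: str) -> int:
    """Map a row key to a position on the ring (MD5 is a stand-in hash)."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


def build_ring(node_names):
    """Assign evenly spaced tokens to nodes; returns a sorted list of (token, node)."""
    n = len(node_names)
    return [(i * TOKEN_SPACE // n, name) for i, name in enumerate(node_names)]


def replicas_for(key: str, ring, rf: int):
    """Pick the first node whose token is >= the key's token, then walk
    clockwise until rf distinct nodes are chosen (capped at the ring size)."""
    tokens = [t for t, _ in ring]
    start = bisect.bisect_left(tokens, token_for(key)) % len(ring)
    count = min(rf, len(ring))
    return [ring[(start + i) % len(ring)][1] for i in range(count)]


def per_node_tb(data_tb: float, rf: int, nodes: int) -> float:
    """Average storage per node, assuming rows spread evenly over the ring."""
    return data_tb * min(rf, nodes) / nodes


if __name__ == "__main__":
    ring3 = build_ring(["n1", "n2", "n3"])
    ring6 = build_ring(["n1", "n2", "n3", "n4", "n5", "n6"])
    # With 3 nodes and RF 3, every node holds every row.
    print(replicas_for("user:42", ring3, 3))
    # With 6 nodes and RF 3, each row lives on only 3 of the 6 nodes.
    print(replicas_for("user:42", ring6, 3))
    print(per_node_tb(3, 3, 3), per_node_tb(3, 3, 6))
```

Under these assumptions, 3 TB at RF 3 is 9 TB in total however many nodes there are. On 3 nodes each node carries the full 3 TB; on 6 nodes each carries about 1.5 TB, because each node only handles the key ranges it owns or replicates.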