Interesting idea. If it works out to dividing the entire load on the system by 6, then the effective load is still the same, and if we used SSDs for the commit volume we could get away with 1 commitlog SSD shared across the instances. Even if these 6 instances can only handle 80% of the load (compared to 1 instance on this machine), that might be acceptable. Could that help?
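To make the arithmetic concrete (numbers made up, back-of-envelope only):

    one instance, whole box:   W commitlog writes/sec -> dedicated log device sees W
    six instances, same load:  6 x (W/6) = W          -> one shared SSD still sees W

The catch with sharing one log device across instances is that six sequential append streams interleave into random-ish I/O, which is what hurts a spindle; an SSD has no seek penalty, which is why a single shared commitlog SSD could plausibly work here.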
I mean, the benefits of smaller Cassandra nodes do sound very enticing. Sure, we would probably have to throw more memory/CPU at it to get comparable to 1 instance on that box (or reduce the load), but it does look better than 6 boxes.

On Tue, Dec 7, 2010 at 10:00 PM, Jonathan Ellis <jbel...@gmail.com> wrote:
> The major downside is you're going to want to let each instance have
> its own dedicated commitlog spindle too, unless you just don't have
> many updates.
>
> On Tue, Dec 7, 2010 at 8:25 PM, Edward Capriolo <edlinuxg...@gmail.com>
> wrote:
> > I am quite ready to be stoned for this thread, but I have been thinking
> > about this for a while and I just wanted to bounce these ideas off some
> > gurus.
> >
> > Cassandra does allow multiple data directories, but as far as I can
> > tell no one runs in this configuration. This is something that is very
> > different between the HBase architecture and the Cassandra
> > architecture. HBase borrows the concept of JBOD configurations from
> > Hadoop. HBase has many smallish (~256 MB) regions managed
> > with ZooKeeper. Cassandra has a few (1 per node) large, node-sized
> > token ranges managed by Gossip consensus.
> >
> > Let's say a node has six 300 GB disks. You have the options of RAID5,
> > RAID6, RAID10, or RAID0. The problem I have found with these
> > configurations is that major compactions (or even large minor ones) can
> > take a long time. Even if your disk is not heavily utilized, this is a
> > lot of data to move through. Thus node joins take a long time. Node
> > moves take a long time.
> >
> > The idea behind "micrandra" is, on a 6-disk system, to run 6 instances
> > of Cassandra, one per disk. Use the RackAwareSnitch to make sure no
> > replicas live on the same node.
> >
> > The downsides:
> > 1) We would have to manage 6x the instances of Cassandra.
> > 2) We would have some overhead for each JVM.
> >
> > The upsides?
> > 1) A disk/instance failure only degrades the overall performance by
> > 1/6th (with RAID0 you lose the entire node; RAID5 still takes a hit
> > when down a disk).
> > 2) Moves and joins have less work to do.
> > 3) You can scale up a single node by adding a single disk to an
> > existing system (assuming the RAM and CPU load is light).
> > 4) OPP would be "easier" to balance out hot spots (maybe not this one
> > if you're not on OPP).
> >
> > What does everyone think? Does it ever make sense to run this way?
> >
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of Riptano, the source for professional Cassandra support
> http://riptano.com
>
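For concreteness, here is a rough sketch of what one of the six per-instance configs might look like (0.7-style cassandra.yaml key names; the cluster name, paths, and addresses below are all made up, and exact keys may differ in your version):

    # instance 1 of 6 -- each instance owns one data disk outright
    cluster_name: 'micrandra'
    listen_address: 10.0.0.11        # one IP alias per instance, since all six
    rpc_address: 10.0.0.11           # share the same storage/rpc ports
    data_file_directories:
        - /mnt/disk1/cassandra/data  # disk1 belongs to this instance only
    commitlog_directory: /mnt/ssd/commitlog/i1   # per Jonathan's point: a shared
                                                 # SSD, or a dedicated spindle each
    saved_caches_directory: /mnt/disk1/cassandra/saved_caches

Each instance would also need its own JMX port (in cassandra-env.sh), and the snitch/replication setup has to treat the six instances on one box as a single failure domain so replicas land on other machines, as Edward describes.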