Ryan, I agree with you on the hot spots. However, for physical disk performance, even the worst-case hot spot is no worse than RAID0: in a hot-spot scenario, maybe 90% of your reads go to one hard drive, but with RAID0, 100% of your reads will go to *all* hard drives.
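Just to make that arithmetic concrete, here's a toy model (the disk count, read count, and 90% hot fraction are made-up illustration values, and it adopts the worst-case assumption from this thread that a RAID0 random read touches a stripe on every disk):

```python
import random

def jbod_hotspot_ops(n_reads, n_disks, hot_fraction=0.9):
    """Per-disk I/O counts when one disk serves `hot_fraction` of reads (JBOD)."""
    ops = [0] * n_disks
    for _ in range(n_reads):
        # hot disk gets hot_fraction of reads, the rest spread over the others
        disk = 0 if random.random() < hot_fraction else random.randrange(1, n_disks)
        ops[disk] += 1
    return ops

def raid0_ops(n_reads, n_disks):
    """Per-disk I/O counts assuming every random read hits all disks (RAID0 worst case)."""
    return [n_reads] * n_disks

jbod = jbod_hotspot_ops(10_000, 4)
raid0 = raid0_ops(10_000, 4)
print("JBOD hot spot:", jbod)   # busiest disk sees ~90% of reads
print("RAID0:", raid0)          # every disk sees 100% of reads
```

Even the hot JBOD disk ends up with fewer I/Os than any single disk does under this RAID0 model, which is the point I was trying to make.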
But you're right, individual disks might waste up to 50% of your total disk space... I came to consider this idea because Hadoop DFS explicitly recommends separate disks. But the design is not exactly the same; they don't have to deal with very big files on the native FS layer.

-Roland

2010/4/26 Ryan King <r...@twitter.com>
> 2010/4/26 Roland Hänel <rol...@haenel.me>:
> > Hm... I understand that RAID0 would help to create a bigger pool for
> > compactions. However, it might impact read performance: if I have several
> > CF's (with their SSTables), random read requests for the CF files that are
> > on separate disks will behave nicely - however if it's RAID0 then a random
> > read on any file will create a random read on all of the hard disks.
> > Correct?
>
> Without RAID0 you will end up with hot spots (a compaction could end
> up putting a large SSTable on one disk, while the others have smaller
> SSTables). If you have many CFs this might average out, but it might
> not and there are no guarantees here. I'd recommend RAID0 unless you
> have reason to do something else.
>
> -ryan