Ryan, I agree with you on the hot spots; however, for physical disk
performance, even the worst-case hot spot is no worse than RAID0: in a
hot-spot scenario, perhaps 90% of your reads go to one hard drive, but
with RAID0, 100% of your reads will go to *all* hard drives.
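
The worst-case comparison above can be put into a toy seek-count model
(my own illustration, not from this thread; it assumes each read costs
one seek on every disk it touches, and that a RAID0 stripe is smaller
than a typical read, so each read hits all disks):

```python
# Toy model: seeks per disk under a hot spot vs. RAID0.
# Assumptions (illustrative, not from the thread): one seek per disk
# touched per read; RAID0 stripes smaller than a read, so every read
# touches every disk.

def seeks_per_disk_hotspot(total_reads, hot_fraction, num_disks):
    """Separate disks: the hot disk takes hot_fraction of all reads,
    the rest are spread evenly over the remaining disks."""
    hot = total_reads * hot_fraction
    cold = total_reads * (1 - hot_fraction) / (num_disks - 1)
    return hot, cold

def seeks_per_disk_raid0(total_reads, num_disks):
    """RAID0 with small stripes: every read touches every disk."""
    return total_reads

hot, cold = seeks_per_disk_hotspot(1000, 0.9, 4)  # hot disk: 900 seeks
raid0 = seeks_per_disk_raid0(1000, 4)             # every disk: 1000 seeks
```

Even the 90%-hot disk services fewer seeks than any single disk in the
RAID0 case, which is the point of the comparison.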

But you're right: with individual disks you might waste up to 50% of your
total disk space...

I came to consider this idea because Hadoop DFS explicitly recommends
separate disks. But the design is not exactly the same: they don't have
to deal with very big files at the native FS layer.
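
For reference, the Hadoop recommendation I mean is typically implemented
by listing one data directory per physical disk in hdfs-site.xml (the
paths here are only an example):

```xml
<!-- hdfs-site.xml: one data directory per physical disk (example paths) -->
<property>
  <name>dfs.data.dir</name>
  <value>/disk1/hdfs/data,/disk2/hdfs/data,/disk3/hdfs/data</value>
</property>
```

The DataNode then round-robins new blocks across those directories.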

-Roland



2010/4/26 Ryan King <r...@twitter.com>

> 2010/4/26 Roland Hänel <rol...@haenel.me>:
> > Hm... I understand that RAID0 would help to create a bigger pool for
> > compactions. However, it might impact read performance: if I have
> > several CFs (with their SSTables), random read requests for the CF
> > files that are on separate disks will behave nicely - however if it's
> > RAID0 then a random read on any file will create a random read on all
> > of the hard disks. Correct?
>
> Without RAID0 you will end up with hot spots (a compaction could end
> up putting a large SSTable on one disk, while the others have smaller
> SSTables). If you have many CFs this might average out, but it might
> not and there are no guarantees here. I'd recommend RAID0 unless you
> have reason to do something else.
>
> -ryan
>
