> As a counter argument though, anyone running a C* cluster on the Amazon cloud is going to be using SAN storage (or some kind of proprietary storage array) at the lowest layers...Amazon isn't going to have a bunch of JBOD running their cloud infrastructure. However, they've invested in the infrastructure to do it right.
This is certainly true when using EBS; however, it's generally not recommended to use EBS when running Cassandra. EBS has proven to be unreliable in the past, and it's a bit of a SPOF. Instead, it's recommended to use the "instance store" disks that come with most instances (handy chart here: http://www.ec2instances.info/). These are the rough equivalent of local disks (probably host-level RAID 10 storage, if I had to guess).

-Jared

On 22 February 2013 00:40, Michael Morris <michael.m.mor...@gmail.com> wrote:

> I'm running a 27 node cassandra cluster on SAN without issue. I will be
> perfectly clear though, the hosts are multi-homed to different
> switches/fabrics in the SAN, we have an _expensive_ EMC array, and other
> than a datacenter-wide power outage, there's no SPOF for the SAN. We use
> it because it's there, and it's already a sunk cost.
>
> I certainly would not go out of my way to purchase SAN infrastructure for
> a C* cluster, it just doesn't make sense (for all the reasons others have
> mentioned). Any more, you can load up a single 2U server with multi-TB
> worth of disk, so the aggregate storage capacity of your C* cluster could
> potentially be as much as a SAN you would purchase (and a lot less hassle
> too).
>
> As a counter argument though, anyone running a C* cluster on the Amazon
> cloud is going to be using SAN storage (or some kind of proprietary storage
> array) at the lowest layers...Amazon isn't going to have a bunch of JBOD
> running their cloud infrastructure. However, they've invested in the
> infrastructure to do it right.
>
> - Mike
>
>
> On Thu, Feb 21, 2013 at 6:08 PM, P. Taylor Goetz <ptgo...@gmail.com> wrote:
>
>> I shouldn't have used the word "spinning"... SSDs are a great option as
>> well.
>>
>> I also agree with all the "expensive SPOF" points others have made.
>>
>> Sent from my iPhone
>>
>> On Feb 21, 2013, at 6:56 PM, "P. Taylor Goetz" <ptgo...@gmail.com> wrote:
>>
>> Cassandra is designed to write and read data in a way that is optimized
>> for physical spinning disks.
>>
>> Running C* on a SAN introduces a layer of abstraction that, at best
>> negates those optimizations, and at worst introduces additional overhead.
>>
>> Sent from my iPhone
>>
>> On Feb 21, 2013, at 6:42 PM, Kanwar Sangha <kan...@mavenir.com> wrote:
>>
>> Ok. What would be the drawbacks :)
>>
>> From: Michael Kjellman [mailto:mkjell...@barracuda.com]
>> Sent: 21 February 2013 17:12
>> To: user@cassandra.apache.org
>> Subject: Re: Cassandra with SAN
>>
>> No, this is a really really bad idea and C* was not designed for this, in
>> fact, it was designed so you don't need to have a large expensive SAN.
>>
>> Don't be tempted by the shiny expensive SAN. :)
>>
>> If money is no object instead throw SSD's in your nodes and run 10G
>> between racks
>>
>> From: Kanwar Sangha <kan...@mavenir.com>
>> Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
>> Date: Thursday, February 21, 2013 2:56 PM
>> To: "user@cassandra.apache.org" <user@cassandra.apache.org>
>> Subject: Cassandra with SAN
>>
>> Hi – Is it a good idea to use Cassandra with SAN ? Say a SAN which
>> provides me 8 Petabytes of storage. Would I not be I/O bound irrespective
>> of the no of Cassandra machines and scaling by adding
>> machines won’t help ?
>>
>> Thanks
>> Kanwar