We run a cluster in EC2 and it's working very well for us.  The standard
seems to be m2.2xlarge instances with data living on the ephemeral drives
(which means it's local and fast) and backups going either to EBS or S3, or
just relying on cluster size and replication (we avoid that last option).
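
For anyone curious, a rough sketch of the snapshot-to-S3 flavour of backup
(keyspace name, bucket and data path are placeholders, and s3cmd is just
one of several S3 clients you could use):

    # flush memtables and take a point-in-time snapshot of one keyspace
    nodetool snapshot my_keyspace -t nightly

    # copy the snapshot directories off the ephemeral disk to S3
    # (the exact path layout under the data directory varies by version)
    s3cmd sync /var/lib/cassandra/data/my_keyspace/ \
        s3://my-backup-bucket/$(hostname)/nightly/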

Brian


On Sun, Aug 4, 2013 at 9:02 PM, Ben Bromhead <b...@instaclustr.com> wrote:

> If you want to get a rough idea of how things will perform, fire up YCSB (
> https://github.com/brianfrankcooper/YCSB/wiki) and run the workloads that
> most closely match what you expect yours to look like (run the test clients
> from a couple of beefy AWS spot instances for less than a dollar). As you
> are a new startup without any existing load/traffic patterns, benchmarking
> will be your best bet.
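>
> A rough sketch of what a run looks like (the binding name and options
> differ between YCSB versions, and the hosts and record counts here are
> made up; substitute your own nodes and workload):
>
>     bin/ycsb load cassandra-10 -P workloads/workloada \
>         -p hosts=10.0.0.1,10.0.0.2 -p recordcount=1000000
>     bin/ycsb run cassandra-10 -P workloads/workloada \
>         -p hosts=10.0.0.1,10.0.0.2 -p operationcount=1000000 -threads 50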
>
> Also, have a look at running Cassandra with SmartOS on Joyent. When you run
> SmartOS on Joyent, virtualisation is done using Solaris zones, an OS-based
> virtualisation, which is at least a quadrillion times better than KVM, Xen,
> etc.
>
> Ok maybe not that much… but it is pretty cool and has the following
> benefits:
>
> - No hardware emulation.
> - Shared kernel with the host (you don't have to waste precious memory
> running a guest OS).
> - ZFS :)
>
> Have a read of http://wiki.smartos.org/display/DOC/SmartOS+Virtualization for
> more info.
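>
> If you do go down that path, a minimal sketch of carving out a ZFS dataset
> for the Cassandra data directory (the pool name "zones" is the SmartOS
> default; the mountpoint and tunables are just examples to adjust for your
> workload):
>
>     zfs create -o mountpoint=/var/lib/cassandra zones/cassandra
>     zfs set compression=lz4 zones/cassandra   # or lzjb on older builds
>     zfs set recordsize=64k zones/cassandra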
>
> There are some downsides as well:
>
> - The version of Cassandra that comes with the SmartOS package management
> system is old and busted, so you will want to build from source (rough
> sketch below).
> - You will want to be technically confident in running on something a little
> outside the norm (SmartOS is based on Solaris).
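>
> Building from source is straightforward; a rough sketch (the release tag is
> just an example, and you will need a JDK and ant in the zone):
>
>     git clone https://github.com/apache/cassandra.git
>     cd cassandra
>     git checkout cassandra-1.2.8
>     ant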
>
> Just make sure you test and benchmark all your options; a few days of
> testing now will save you weeks of pain.
>
> Good luck!
>
> Ben Bromhead
> Instaclustr | www.instaclustr.com | 
> @instaclustr<http://twitter.com/instaclustr>
>
>
>
> On 05/08/2013, at 12:34 AM, David Schairer <dschai...@humbaba.net> wrote:
>
> Of course -- my point is simply that if you're looking for speed, SSD+KVM,
> especially in a shared-tenant situation, is unlikely to perform the way you
> want it to.  If you're building a pure proof of concept that never stresses
> the system, it doesn't matter, but if you plan an MVP with any sort of
> scale, you'll want a plan to move onto something more robust.
>
> I'll also say that it's really important (imho) to do even your dev work in
> a config with the same consistency conditions as your eventual production
> environment -- so make sure you're writing to both nodes and can hit cases
> where eventual-consistency delays kick in, or it'll come back to bite you
> later. I've seen this force people to redesign their whole data model when
> they don't plan for it initially.
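>
> As a minimal sketch of what that means in practice (keyspace name and
> levels are just examples; with RF=2, a read at ONE can legitimately miss a
> write that hasn't reached the replica you happen to hit yet):
>
>     cqlsh> CREATE KEYSPACE dev WITH replication =
>        ...   {'class': 'SimpleStrategy', 'replication_factor': 2};
>     cqlsh> CONSISTENCY ONE;    -- fast, but reads may lag recent writes
>     cqlsh> CONSISTENCY QUORUM; -- with RF=2 both replicas must respond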
>
> As I said, I haven't tested DO.  I've tested very similar configurations
> at other providers and they were all terrible under load -- and certainly
> took away most of the benefits of SSD once you stressed writes a bit.
> Xen+SSD, on modern kernels, should work better, but I didn't test it
> (Linode doesn't offer this, though, and they've had lots of other
> challenges of late).
>
> --DRS
>
> On Aug 3, 2013, at 11:40 PM, Ertio Lew <ertio...@gmail.com> wrote:
>
> @David:
> Like all other start-ups, we too cannot start with all dedicated servers
> for Cassandra. So right now we have no better choice except to use a VPS
> :), but we can definitely choose one from amongst a suitable set of VPS
> configurations. Since we are just starting out, could we initiate our
> cluster with 2 nodes (RF=2), each a KVM instance with 2GB RAM, 2 cores and
> a 30GB SSD? We won't be putting a very heavy load on Cassandra for the
> next few months, until we grow our user base. So this choice is mainly
> based on pricing vs. configuration, as well as Digital Ocean's good
> reputation in the community.
>
>
> On Sun, Aug 4, 2013 at 12:53 AM, David Schairer <dschai...@humbaba.net>
> wrote:
> I've run several lab configurations on Linodes; I wouldn't run Cassandra
> on any shared virtual platform for large-scale production, just because
> your IO performance is going to be really hard to predict.  Lots of people
> do, though -- it depends on your Cassandra loads and how consistent you
> need performance to be, as well as how much of your working set will fit
> into memory.  Remember that Linode significantly oversells their CPU as
> well.
>
> The release version of KVM, at least as of a few months ago, still doesn't
> support TRIM on SSD; that, plus the fact that you don't know how others
> will use SSDs or whether their file systems will keep the SSDs healthy,
> means that SSD performance on KVM is going to be highly unpredictable.  I
> have not tested DigitalOcean, but I did test several other KVM+SSD
> shared-tenant hosting providers aggressively for Cassandra a couple of
> months ago; they all failed badly.
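>
> (If you want to check what a given guest actually exposes, something like
> the following inside the VM gives a quick answer; the mount point is an
> example, and fstrim needs a filesystem and device that support discard:)
>
>     lsblk --discard    # non-zero DISC-GRAN/DISC-MAX means TRIM is exposed
>     sudo fstrim -v /   # errors out if discard doesn't reach the device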
>
> Your mileage will vary considerably based on what you need out of
> Cassandra, what your data patterns look like, and how you configure your
> system.  That said, I would use Xen before KVM for high-performance IO.
>
> I have not run Cassandra in any volume on Amazon -- lots of folks have,
> and may have recommendations (including SSD) there for where it falls on
> the price/performance curve.
>
> --DRS
>
> On Aug 3, 2013, at 11:33 AM, Ertio Lew <ertio...@gmail.com> wrote:
>
> I am building a cluster (initially starting with 2-3 nodes). I have come
> across two seemingly good options for hosting, Linode & Digital Ocean. The
> VPS configurations for both are listed below:
>
>
> Linode:-
> ------------------
> XEN Virtualization
> 2 GB RAM
> 8-core CPU (2x priority, 8-processor Xen instance)
> 96 GB Storage
>
>
> Digital Ocean:-
> -------------------------
> KVM Virtualization
> 2GB Memory
> 2 Cores
> 40GB *SSD* disk
> Digital Ocean's VPS is at half the price of the above-listed Linode VPS.
>
>
> Could you clarify which of these two VPS options would be better as
> Cassandra nodes?
>
