Re: Cassandra performance

Jeremy Hanna Fri, 17 Sep 2010 14:57:22 -0700

http://www.quora.com/Is-Cassandra-to-blame-for-Digg-v4s-technical-failures


On Sep 17, 2010, at 4:35 PM, Zhong Li wrote:

> This is my personal experience. MySQL is faster than Cassandra in most 
> normal use cases.
> 
> You should understand why you are choosing Cassandra instead of MySQL. If 
> one central MySQL server can handle your workload, MySQL is better than 
> Cassandra. BUT if you are overloading one MySQL server and want multiple 
> boxes, Cassandra can be a cheap solution. Cassandra provides a 
> fault-tolerant, decentralized, durable system and a rich data model. It 
> will not give you high performance; read performance in particular is poor.
> 
> Digg's attempt to use Cassandra failed. See
> http://techcrunch.com/2010/09/07/digg-struggles-vp-engineering-door/
> 
> This doesn't mean Cassandra is bad. To use Cassandra successfully, you need 
> to design carefully around your application and business model.
> 
> 
>   
> On Sep 15, 2010, at 12:06 PM, Wayne wrote:
> 
>> If MySQL is faster, then use it. I struggled to do side-by-side comparisons 
>> with MySQL for months until finally realizing the two are too different to 
>> compare side by side. MySQL is always faster out of the gate when you 
>> come at the problem thinking in terms of relational databases. Then add in 
>> replication factor, wider rows, databases that are 2-3 terabytes, tables 
>> with 3+ billion rows, and so on, and the picture changes. The NoSQL "noise" 
>> out there should be ignored, and a solution like Cassandra should be 
>> evaluated for what it brings to the table as a technology that can solve the 
>> problems of big data, not for how it does individual queries relative to 
>> MySQL. If a "normal" database works for you, use it!!
>> 
>> We have tested real loads using a 6 node cluster and consistently get 5 ms 
>> reads under load. That is 200 reads/second (1 thread). MySQL is 10x faster 
>> per read, but we also have wide rows, and in those 5 ms we get 6 months of 
>> many different time series, which in the end means Cassandra is 10x faster 
>> than MySQL (1 thread). By embracing wide rows we turn slower into faster. 
>> Add in multiple threads/processes and the ability of a 20 node cluster to 
>> support concurrent reads, and MySQL is left in the dust. Also, we don't have 
>> 300 GB compressed backup files, we can easily add new nodes and grow, we can 
>> actually add columns dynamically without the dreaded DDL deadlock nightmare 
>> in MySQL, and for once we have replication that just works.
>> 
>> 
>> On Wed, Sep 15, 2010 at 2:39 AM, Oleg Anastasyev <olega...@gmail.com> wrote:
>> Kamil Gorlo <kgs4242 <at> gmail.com> writes:
>> 
>> >
>> > So I've got more reads from single MySQL with 400GB of data than from
>> > 8 machines storing about 266GB. This doesn't look good. What am I
>> > doing wrong? :)
>> 
>> The worst case for Cassandra is random reads. You should ask yourself a
>> question: do you really have this kind of workload in production? If you
>> really do, that means Cassandra is not the right tool for the job. Some
>> product based on Berkeley DB should work better, e.g. Voldemort. A plain
>> old filesystem is also good for 100% random reads (if you don't need
>> backups, of course).
>> 
>> 
>
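
For anyone who wants to check the arithmetic in Wayne's message (5 ms per read, 200 reads/second on one thread, MySQL 10x faster per read), here is a rough sketch in Python. The 5 ms latency and the 10x factor come from the thread. The 0.5 ms MySQL latency is simply derived from that 10x, and the 100 data points per wide-row read is a hypothetical figure chosen only so that the net result matches the "10x faster" claim. None of these are measured values.

```python
def reads_per_second(latency_ms: float, threads: int = 1) -> float:
    """Throughput of synchronous readers that each wait latency_ms per read."""
    return threads * 1000.0 / latency_ms


def points_per_second(latency_ms: float, points_per_read: int, threads: int = 1) -> float:
    """Useful data points fetched per second when each read returns points_per_read."""
    return reads_per_second(latency_ms, threads) * points_per_read


# Figures quoted in the thread.
cassandra_latency_ms = 5.0
# Derived from "MySQL is 10x faster" per read; not a measurement.
mysql_latency_ms = cassandra_latency_ms / 10

# Hypothetical: one wide-row read returns 100 points, one MySQL query returns 1.
points_per_wide_row = 100
points_per_mysql_row = 1

cassandra_points = points_per_second(cassandra_latency_ms, points_per_wide_row)
mysql_points = points_per_second(mysql_latency_ms, points_per_mysql_row)

print(reads_per_second(cassandra_latency_ms))  # single-thread Cassandra reads/s
print(reads_per_second(mysql_latency_ms))      # single-thread MySQL reads/s
print(cassandra_points / mysql_points)         # net advantage per thread
```

The point is only that per-read latency and useful data per read multiply. Which side wins depends entirely on how much of each wide row the application actually needs.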
