This is my personal experience: MySQL is faster than Cassandra for most normal use cases.

You should understand why you are choosing Cassandra instead of MySQL. If one central MySQL server can handle your workload, MySQL is the better choice. But if you are overloading a single MySQL server and want to spread the load across multiple boxes, Cassandra can be a cheap solution: it provides fault tolerance, decentralization, durability and a rich data model. It will not give you high performance, though; read performance in particular is poor.

Digg failed to use Cassandra successfully. See:
http://techcrunch.com/2010/09/07/digg-struggles-vp-engineering-door/

This doesn't mean Cassandra is bad. To succeed with it, you need to design carefully around your application and business model.



On Sep 15, 2010, at 12:06 PM, Wayne wrote:

If MySQL is faster, then use it. I struggled to do side-by-side comparisons with MySQL for months until finally realizing they are too different to compare side by side. MySQL is always faster out of the gate when you come at the problem thinking in terms of relational databases. The picture changes once you add in replication factor, wider rows, databases that are 2-3 terabytes, tables with 3+ billion rows, etc. The NoSQL "noise" out there should be ignored, and a solution like Cassandra should be evaluated for what it brings to the table as a technology that can solve the problems of big data, not for how it does individual queries relative to MySQL. If a "normal" database works for you, use it!!

We have tested real loads using a 6-node cluster and consistently get 5 ms reads under load. That is 200 reads/second (1 thread). MySQL is 10x faster per read, but we also have wide rows, and in that 5 ms we get 6 months of lots of different time series data, which in the end means Cassandra is 10x faster than MySQL (1 thread). By embracing wide rows we turn slower into faster. Add in multiple threads/processes and the ability of a 20-node cluster to support concurrent reads, and MySQL is left in the dust. Also, we don't have 300 GB compressed backup files, we can easily add new nodes and grow, we can actually add columns dynamically without the dreaded DDL deadlock nightmare in MySQL, and for once we have replication that just works.
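
For anyone who wants to check the arithmetic above, here is a rough Python sketch. Only the 5 ms latency and the two "10x" ratios come from the numbers quoted; the function names are made up for illustration, the figure of about 100 MySQL reads per wide row is merely what those ratios imply (it was not measured), and the thread scaling is idealized (no contention).

```python
# Back-of-the-envelope check of the figures quoted above.
# Quoted inputs: 5 ms Cassandra reads, MySQL 10x faster per read,
# Cassandra 10x faster overall once wide rows are taken into account.
# Everything else is derived or assumed for illustration only.


def reads_per_second(latency_ms: float, threads: int = 1) -> float:
    """Throughput of synchronous readers that each wait latency_ms per read.

    Assumes perfect linear scaling across threads (idealized).
    """
    return threads * 1000.0 / latency_ms


cassandra_latency_ms = 5.0                     # quoted
mysql_latency_ms = cassandra_latency_ms / 10   # quoted ratio: 10x faster per read


def mysql_time_for_wide_row_ms(reads_needed: float) -> float:
    """Time for MySQL to fetch what one wide-row read returns, if that
    takes reads_needed separate reads (hypothetical parameter)."""
    return reads_needed * mysql_latency_ms


# For Cassandra to end up 10x faster overall, MySQL must need this many
# reads to return the same data as one wide-row read (implied, not measured).
implied_mysql_reads = 10 * cassandra_latency_ms / mysql_latency_ms

if __name__ == "__main__":
    print("Cassandra reads/s, 1 thread:", reads_per_second(cassandra_latency_ms))
    print("MySQL reads/s, 1 thread:", reads_per_second(mysql_latency_ms))
    print("Implied MySQL reads per wide row:", implied_mysql_reads)
    print("MySQL ms for equivalent data:",
          mysql_time_for_wide_row_ms(implied_mysql_reads))
```

The point of the sketch is only that per-read latency and useful data returned per read are different measures; the real ratio depends entirely on how the rows are modeled.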


On Wed, Sep 15, 2010 at 2:39 AM, Oleg Anastasyev <olega...@gmail.com> wrote:
Kamil Gorlo <kgs4242 <at> gmail.com> writes:

>
> So I've got more reads from single MySQL with 400GB of data than from
> 8 machines storing about 266GB. This doesn't look good. What am I
> doing wrong? :)

The worst case for Cassandra is random reads. You should ask yourself: do you really have this kind of workload in production? If you really do, then Cassandra is not the right tool for the job. Some product based on Berkeley DB should work better, e.g. Voldemort. A plain old filesystem is also good for 100% random reads (if you don't need backups, of course).


