Re: Cassandra performance

Zhong Li Fri, 17 Sep 2010 14:36:15 -0700

This is my personal experience. MySQL is faster than Cassandra on most normal use cases.

You should understand why you are choosing Cassandra instead of MySQL. If one central MySQL can handle your workload, MySQL is better than Cassandra. BUT if you are overloading one MySQL and want multiple boxes, Cassandra can be a cheap solution. Cassandra provides a fault-tolerant, decentralized, durable and rich data model. It will not give you high performance; read performance in particular is poor.


Digg failed to use Cassandra. You can check
http://techcrunch.com/2010/09/07/digg-struggles-vp-engineering-door/

This doesn't mean Cassandra is bad. You need to design carefully to use Cassandra successfully for your application and business model.




On Sep 15, 2010, at 12:06 PM, Wayne wrote:

If MySQL is faster then use it. I struggled to do side by side comparisons with MySQL for months until finally realizing they are too different to do side by side comparisons. MySQL is always faster out of the gate when you come at the problem thinking in terms of relational databases. Add in replication factor, wider rows, databases that are 2-3 terabytes, tables with 3+ billion rows, etc., and the comparison changes. The nosql "noise" out there should be ignored, and a solution like Cassandra should be evaluated for what it brings to the table in terms of a technology that can solve the problems of big data and not how it does individual queries relative to MySQL. If a "normal" database works for you use it!!

We have tested real loads using a 6 node cluster and consistently get 5ms reads under load. That is 200 reads/second (1 thread). MySQL is 10x faster per read, but we also have wide rows and in that 5ms get 6 months of lots of different time series data, which in the end means it is 10x faster than MySQL (1 thread). By embracing wide rows we turn slower into faster. Add in multiple threads/processes and the ability for a 20 node cluster to support concurrent reads and MySQL falls back in the dust. Also we don't have 300GB compressed backup files, we can easily add new nodes and grow, we can actually add columns dynamically without the dreaded DDL deadlock nightmare in MySQL, and for once we have replication that just works.
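
The arithmetic behind those numbers can be sketched roughly as follows. The 5ms latency and the "10x faster" per-read figure come from the message above; the count of 100 MySQL reads per wide row is only an illustrative assumption (it is the value at which the "10x faster overall" claim works out), and the function name is made up for this sketch.

```python
def reads_per_second(latency_ms: float, threads: int = 1) -> float:
    """Sequential reads per second for a given per-read latency."""
    return threads * 1000.0 / latency_ms


# Figures taken from the message above.
cassandra_latency_ms = 5.0
mysql_latency_ms = cassandra_latency_ms / 10  # "10x faster" per read

# ASSUMPTION (not from the thread): how many separate MySQL reads it takes
# to fetch the data that one Cassandra wide-row read returns.
mysql_reads_per_wide_row = 100

# Time to fetch the same 6 months of time series data on one thread.
cassandra_time_ms = cassandra_latency_ms
mysql_time_ms = mysql_latency_ms * mysql_reads_per_wide_row

print("Cassandra reads/s (1 thread):", reads_per_second(cassandra_latency_ms))
print("MySQL reads/s (1 thread):", reads_per_second(mysql_latency_ms))
print("MySQL time / Cassandra time for same data:",
      mysql_time_ms / cassandra_time_ms)
```

With a different assumed number of MySQL reads per wide row the ratio changes proportionally, so the result depends entirely on how the data is modeled.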

On Wed, Sep 15, 2010 at 2:39 AM, Oleg Anastasyev <olega...@gmail.com> wrote:
Kamil Gorlo <kgs4242 <at> gmail.com> writes:

>
> So I've got more reads from single MySQL with 400GB of data than from
> 8 machines storing about 266GB. This doesn't look good. What am I
> doing wrong? :)
The worst case for Cassandra is random reads. You should ask yourself a question: do you really have this kind of workload in production? If you really do, that means Cassandra is not the right tool for the job. Some product based on Berkeley DB should work better, e.g. Voldemort. Just a plain old filesystem is also good for 100% random reads (if you don't need to back up, of course).
