http://www.quora.com/Is-Cassandra-to-blame-for-Digg-v4s-technical-failures
On Sep 17, 2010, at 4:35 PM, Zhong Li wrote:

> This is my personal experience. MySQL is faster than Cassandra in most
> normal use cases.
>
> You should understand why you are choosing Cassandra instead of MySQL. If
> one central MySQL server can handle your workload, MySQL is better than
> Cassandra. BUT if you are overloading one MySQL server and want multiple
> boxes, Cassandra can be a cheap solution. Cassandra provides a
> fault-tolerant, decentralized, durable store with a rich data model. It
> will not give you high performance; read performance in particular is
> poor.
>
> Digg failed in its use of Cassandra. You can check
> http://techcrunch.com/2010/09/07/digg-struggles-vp-engineering-door/
>
> This doesn't mean Cassandra is bad. To succeed, you need to design
> carefully how Cassandra fits your application and business model.
>
> On Sep 15, 2010, at 12:06 PM, Wayne wrote:
>
>> If MySQL is faster then use it. I struggled to do side-by-side
>> comparisons with MySQL for months until finally realizing they are too
>> different to compare side by side. MySQL is always faster out of the
>> gate when you come at the problem thinking in terms of relational
>> databases. Add in replication factor, using wider rows, dealing with
>> databases that are 2-3 terabytes, tables with 3+ billion rows, etc.
>> The NoSQL "noise" out there should be ignored, and a solution like
>> Cassandra should be evaluated for what it brings to the table as a
>> technology that can solve the problems of big data, not for how it does
>> individual queries relative to MySQL. If a "normal" database works for
>> you, use it!!
>>
>> We have tested real loads using a 6-node cluster and consistently get
>> 5ms reads under load. That is 200 reads/second (1 thread). MySQL is 10x
>> faster, but then we also have wide rows, and in that 5ms we get 6 months
>> of lots of different time series data, which in the end means it is 10x
>> faster than MySQL (1 thread). By embracing wide rows we turn slower into
>> faster. Add in multiple threads/processes and the ability of a 20-node
>> cluster to support concurrent reads, and MySQL falls back in the dust.
>> Also, we don't have 300GB compressed backup files, we can easily add new
>> nodes and grow, we can actually add columns dynamically without the
>> dreaded DDL deadlock nightmare in MySQL, and for once we have
>> replication that just works.
>>
>> On Wed, Sep 15, 2010 at 2:39 AM, Oleg Anastasyev <olega...@gmail.com> wrote:
>> Kamil Gorlo <kgs4242 <at> gmail.com> writes:
>>
>> >
>> > So I've got more reads from single MySQL with 400GB of data than from
>> > 8 machines storing about 266GB. This doesn't look good. What am I
>> > doing wrong? :)
>>
>> The worst case for Cassandra is random reads. You should ask yourself a
>> question: do you really have this kind of workload in production? If you
>> really do, that means Cassandra is not the right tool for the job. Some
>> product based on Berkeley DB should work better, e.g. Voldemort. Just a
>> plain old filesystem is also good for 100% random reads (if you don't
>> need backups, of course).
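
For anyone who wants to check the single-thread arithmetic in Wayne's message, here is a rough sketch in Python. The 5ms read and the "10x faster" per-read figure come from the message. The ratio of 100 MySQL reads per wide-row read is my own assumption, picked because it is the value that makes the "10x faster in the end" claim work out; the thread does not state it. The function names are made up for illustration.

```python
# Figures quoted in Wayne's message.
CASSANDRA_READ_MS = 5.0                  # "consistently get 5ms reads under load"
MYSQL_READ_MS = CASSANDRA_READ_MS / 10   # "MySQL is 10x faster" per individual read

# Assumption for illustration only (not stated in the thread): how many MySQL
# reads it takes to return the data that one Cassandra wide-row read returns.
MYSQL_READS_PER_WIDE_ROW = 100


def reads_per_second(latency_ms: float) -> float:
    """Throughput of one synchronous client thread issuing reads back to back."""
    return 1000.0 / latency_ms


def fetch_time_ms(latency_ms: float, reads_needed: int) -> float:
    """Total time for one thread to issue `reads_needed` sequential reads."""
    return latency_ms * reads_needed


# Time for one thread to fetch the same data set from each system.
cassandra_ms = fetch_time_ms(CASSANDRA_READ_MS, 1)
mysql_ms = fetch_time_ms(MYSQL_READ_MS, MYSQL_READS_PER_WIDE_ROW)
speedup = mysql_ms / cassandra_ms

print(f"Cassandra: {reads_per_second(CASSANDRA_READ_MS):.0f} reads/s per thread")
print(f"MySQL:     {reads_per_second(MYSQL_READ_MS):.0f} reads/s per thread")
print(f"Same data: Cassandra {cassandra_ms:.1f} ms, MySQL {mysql_ms:.1f} ms "
      f"({speedup:.1f}x in Cassandra's favour)")
```

The result depends entirely on the assumed ratio. If one wide-row read replaces only 10 MySQL reads, the two systems break even, and below that MySQL stays ahead, which is consistent with Oleg's point about purely random reads.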