“Is Cassandra only for use cases with data load > 100TB and massive user 
counts?”

I wouldn’t make that extreme a statement! There are plenty of more moderate use 
cases for Cassandra. For example, a dozen nodes with 300 GB per node for just a 
few million users and their interactions and transactions.

I would say, as a rough rule of thumb, that a traditional RDBMS is great for 
up to low millions of rows, and Cassandra is clearly needed when you have more 
than a few hundred million rows. In between, it becomes a more subjective 
choice.

Tens of millions of rows can probably be dealt with effectively by an RDBMS, 
but... you’re starting to have to be careful and configure high-end systems and 
manage them carefully. 100 million rows? Sure, you could still do that on an 
RDBMS if you are motivated and put in the effort. For example, some relational 
databases may require manual partitioning when you have more than 25 million 
rows or so. And then you have to pay attention to query latency as well.

First big question: It may be 100 million rows today, but what growth rate do 
you anticipate?
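
To make the growth question concrete, here is a rough sketch in Python. The 
two threshold values are my own stand-ins for "low millions" and "a few 
hundred million", and the 50% yearly growth rate is purely hypothetical; plug 
in your own numbers.

```python
# Rule-of-thumb thresholds; the exact numbers are assumptions chosen to stand
# in for "low millions" and "a few hundred million" rows.
RDBMS_COMFORT_ROWS = 5_000_000        # assumed value for "low millions"
CASSANDRA_CLEAR_ROWS = 300_000_000    # assumed value for "a few hundred million"


def project_rows(current_rows: int, yearly_growth: float, years: int) -> list[int]:
    """Return the projected row count at the end of each year (compound growth)."""
    counts = []
    rows = float(current_rows)
    for _ in range(years):
        rows *= 1.0 + yearly_growth
        counts.append(round(rows))
    return counts


def classify(rows: int) -> str:
    """Map a row count onto the three zones of the rule of thumb."""
    if rows <= RDBMS_COMFORT_ROWS:
        return "RDBMS is comfortable"
    if rows > CASSANDRA_CLEAR_ROWS:
        return "Cassandra clearly needed"
    return "subjective choice"


if __name__ == "__main__":
    # Hypothetical scenario: 100 million rows today, growing 50% per year.
    for year, rows in enumerate(project_rows(100_000_000, 0.5, 4), start=1):
        print(f"year {year}: {rows:,} rows -> {classify(rows)}")
```

Under those assumed numbers, a data set that sits in the "subjective" zone 
today crosses into "clearly needed" territory within three years, which is 
why the growth rate matters as much as the current size.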

-- Jack Krupansky

From: Matthias Hübner 
Sent: Saturday, July 5, 2014 5:49 AM
To: user@cassandra.apache.org 
Subject: Re: Cassandra use cases/Strengths/Weakness

Hi,

I am a bit confused about whether Cassandra is a good choice for my use case, 
especially after reading this thread.


Is Cassandra only for use cases with data load > 100TB and massive user counts?

What about all the other features of Cassandra? Can't they be used to avoid 
the limitations of relational databases, even for smaller use cases?

What do you think about my use case:


I need to manage data for around 1000 retail stores to produce a delivery plan 
each day (including predictions several weeks into the future) to refill the 
stores. For each store I have to collect data about every single store item. A 
store has some 10 thousand items. This makes around 100 million items to 
manage. Each day I have to store some updates for every single store item. I 
also receive day-by-day sales predictions for all items. Every day I have to 
produce one or more delivery plans. Most data will replace old data, so it's 
not increasing that much. 

I thought I could handle the data load more easily with Cassandra than with 
MariaDB. I wouldn't have to care about locking; I could write all incoming data 
and merge it into my tables. And I could use aggregations, so I would be able 
to add together all the store-item data that I need to compute my delivery 
plans. Finally, I would be able to use commodity hardware and scale more 
easily.
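
To illustrate what I mean by "write all incoming data and merge", here is a 
toy sketch in Python. It is not Cassandra code: `StoreItemTable`, `Cell` and 
`upsert` are hypothetical names, and the model is a deliberate simplification 
of the way a write to an existing key replaces the older value, with the 
newest write timestamp winning and no lock taken.

```python
from dataclasses import dataclass


@dataclass
class Cell:
    value: dict
    write_ts: int  # write timestamp; the newest timestamp wins


class StoreItemTable:
    """Toy in-memory stand-in for a table keyed by (store_id, item_id)."""

    def __init__(self) -> None:
        self.rows: dict[tuple[int, int], Cell] = {}

    def upsert(self, store_id: int, item_id: int, value: dict, write_ts: int) -> None:
        key = (store_id, item_id)
        current = self.rows.get(key)
        # Keep the incoming write unless the stored one is strictly newer.
        if current is None or write_ts >= current.write_ts:
            self.rows[key] = Cell(value, write_ts)


table = StoreItemTable()
table.upsert(1, 42, {"stock": 10}, write_ts=100)
table.upsert(1, 42, {"stock": 7}, write_ts=200)   # replaces the older row
table.upsert(1, 42, {"stock": 99}, write_ts=150)  # late arrival, ignored
print(len(table.rows), table.rows[(1, 42)].value)
```

The point of the sketch is only that daily updates overwrite the previous 
state per store item, so the number of rows stays roughly constant.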




Have a nice weekend,

Matthias

2014-07-05 0:37 GMT+02:00 Jack Krupansky <j...@basetechnology.com>:

  Elasticsearch and Solr are “search platforms”, not “databases”. The best 
description for Cassandra, especially for a CTO, is its home page:
  http://cassandra.apache.org/
  Even if you have seen it before, please read it again. There is a lot packed 
into a few words.

  DataStax Enterprise (DSE) combines Cassandra, Hadoop and Spark for analytics, 
and tightly integrated Solr for rich search of the Cassandra data.

  The main, biggest benefit of Cassandra is that it is a master-free 
distributed real-time database designed for scale, including support for 
multiple data centers, so that it is ready for managing mission critical 
operational data, for applications that need low latency and high availability 
for real-time data access.

  And OpsCenter is great for managing a Cassandra or DSE cluster. I’m sure a 
CTO would appreciate it:
  http://www.datastax.com/what-we-offer/products-services/datastax-opscenter

  Here’s a feature comparison of some NoSQL databases:
  http://kkovacs.eu/cassandra-vs-mongodb-vs-couchdb-vs-redis

  -- Jack Krupansky

  From: Prem Yadav 
  Sent: Friday, July 4, 2014 10:37 AM
  To: user@cassandra.apache.org 
  Subject: Cassandra use cases/Strengths/Weakness

  Hi,
  I have seen in a lot of replies that Cassandra is not designed for this and 
that. I don't want to sound rude; I just need some info about this so that I 
can compare it to technologies like HBase, MongoDB, Elasticsearch, Solr, etc. 

  1) What is Cassandra designed for? Heavy writes, yes. But so is HBase, or 
Elasticsearch.
  What are the use cases that suit Cassandra?

  2) What kind of queries are best suited to Cassandra?
  I ask this because I have seen people asking about queries and getting 
replies that they are not suited to Cassandra. For example: queries where a 
large number of rows is requested and a timeout happens, or range queries, or 
aggregate queries.

  3) Where does Cassandra excel compared to other technologies?

  I have been working on Cassandra for some time. I know how it works and I 
like it very much. 
  We are moving towards building a big cluster. But at this point, I am not 
sure whether it's the right decision. 

  A lot of people in my company, including me, like Cassandra. But that has 
more to do with CQL than with the internals or the use cases. Until now, there 
have been small PoCs and people enjoyed them. But for a large-scale project, we 
are not so sure.

  Please guide us.
  Please note that the drawbacks of other technologies do not interest me; it's 
the strengths/weaknesses of Cassandra I am interested in.
  Thanks