I would answer your question this way:

1) Why should I choose C*?

a. linear scalability: throughput scales almost linearly with the number of
nodes

b. almost unbounded cluster size (there is no limit, or at least a very high
one, on the number of nodes you can have in a cluster)

c. operational simplicity due to the masterless architecture. Although this
feature is largely transparent to developers, it is a key selling point.
Having suffered through manually installing a Hadoop cluster, I love the
deployment simplicity of C*: only one process per node, no moving parts.

d. high availability. C* clearly trades consistency for availability, so you
can expect something like 99.99% uptime. A strong selling point for
critical businesses that need to be up all the time

e. support for multiple data centers out of the box. Again, on the operational
side, it's a great feature if you plan a worldwide deployment

That's all I can see for now

2) Why shouldn't I choose C*?

a. need for strong consistency most of the time. Although you can perform
all requests with consistency level ALL, it's clearly not the best use of
C*. You'll suffer from higher latency and reduced availability. Even the new
"lightweight transaction" feature is not meant to be used on a large scale

b. very complicated and frequently changing queries. Denormalizing is great
when you know ahead of time exactly how you'll query your data. Once that is
done, any new way of querying will require new code & new tables to support it

c. ridiculously small data load. I've seen people in prod using C* for only
200 GB because they want to be trendy and use bleeding-edge technologies.
They'd be better off using a classical RDBMS solution that fits their load
perfectly
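
To make point 2a more concrete, here is a minimal Python sketch of the usual replica arithmetic behind consistency levels. It is not driver code: the function names are made up for illustration, and it only models the ONE, QUORUM and ALL levels for a single data center with replication factor rf. It shows why ALL tolerates no replica failure, and why a read and a write level give strong consistency only when their replica counts add up to more than rf.

```python
def replicas_required(level: str, rf: int) -> int:
    """Replicas that must acknowledge a request at the given level."""
    if level == "ONE":
        return 1
    if level == "QUORUM":
        return rf // 2 + 1  # strict majority of the replicas
    if level == "ALL":
        return rf
    raise ValueError(f"unsupported consistency level: {level}")


def tolerated_failures(level: str, rf: int) -> int:
    """Replicas that can be down while the request still succeeds."""
    return rf - replicas_required(level, rf)


def is_strongly_consistent(write_level: str, read_level: str, rf: int) -> bool:
    """True when every read set overlaps every write set (R + W > RF)."""
    acks = replicas_required(write_level, rf) + replicas_required(read_level, rf)
    return acks > rf


rf = 3  # assumed replication factor, chosen only for the example
for level in ("ONE", "QUORUM", "ALL"):
    print(level, replicas_required(level, rf), tolerated_failures(level, rf))
```

With rf = 3, QUORUM reads plus QUORUM writes already overlap while still tolerating one replica down, which is why ALL is rarely worth its cost.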

Hope that helps

Duy Hai DOAN



On Fri, Jul 4, 2014 at 9:31 PM, Prem Yadav <ipremya...@gmail.com> wrote:

> Thanks Manoj. Great post for those who already have Cassandra in
> production.
> However it brings me back to my original post.
> All the points you have mentioned apply to any big data technology.
> Storage: all of them.
> Query: all of them. In fact, a lot of them perform better. I agree that the
> CQL structure is better, but Hive and Mongo are good too.
> Availability: many of them.
>
> So my question is basically for the Cassandra support people, e.g. DataStax
> or the developers:
> What makes Cassandra special?
> If I have to convince my CTO to spend a million dollars on a cluster and
> support, his first question would be: why Cassandra? Why not this or that?
>
> So I am still not sure what special thing Cassandra brings to the table.
>
> Sorry about the rant. But in the enterprise world, decisions are made
> by weighing stability, convincing managers, and whatnot. The chosen
> technology has to be stable for years. People should be
> convinced that the engineers are not going to do a lot of firefighting.
>
> Any inputs appreciated.
>
>
>
> On Fri, Jul 4, 2014 at 7:07 PM, Manoj Khangaonkar <khangaon...@gmail.com>
> wrote:
>
>> These are my personal opinions, based on a few months of using Cassandra.
>> Others may have different opinions.
>>
>>
>>
>> http://khangaonkar.blogspot.com/2014/06/apache-cassandra-things-to-consider.html
>>
>> regards
>>
>>
>>
>> On Fri, Jul 4, 2014 at 7:37 AM, Prem Yadav <ipremya...@gmail.com> wrote:
>>
>>> Hi,
>>> I have seen in a lot of replies that Cassandra is not designed for
>>> this and that. I don't want to sound rude, I just need some info about this
>>> so that I can compare it to technologies like HBase, Mongo, Elasticsearch,
>>> Solr, etc.
>>>
>>> 1) What is Cassandra designed for? Heavy writes, yes. But so is HBase, or
>>> Elasticsearch.
>>> What use cases suit Cassandra?
>>>
>>> 2) What kind of queries are best suited to Cassandra?
>>> I ask this because I have seen people asking about queries and getting
>>> replies that they are not suited to Cassandra. For example: queries where
>>> a large number of rows are requested and a timeout happens, or range
>>> queries, or aggregate queries.
>>>
>>> 3) Where does Cassandra excel compared to other technologies?
>>>
>>> I have been working on Cassandra for some time. I know how it works and I
>>> like it very much.
>>> We are moving towards building a big cluster. But at this point, I am
>>> not sure if it's the right decision.
>>>
>>> A lot of people in my company, including me, like Cassandra. But it has
>>> more to do with CQL than with the internals or the use cases. Until now,
>>> there have been small PoCs and people enjoyed them. But for a large-scale
>>> project, we are not so sure.
>>>
>>> Please guide us.
>>> Please note that the drawbacks of other technologies do not interest me;
>>> it's the strengths/weaknesses of Cassandra I am interested in.
>>> Thanks
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>
>>
>> --
>> http://khangaonkar.blogspot.com/
>>
>
>
