Duy,
if you are not already working for DataStax, they should hire you. :)

Great response. You have given me some good points to think about.  I will
do the rest of the research.

Thanks.





On Fri, Jul 4, 2014 at 10:10 PM, DuyHai Doan <doanduy...@gmail.com> wrote:

> I would answer your question this way:
>
> 1) Why should I choose C* ?
>
>  a. linear scalability: throughput scales "almost" linearly with the
> number of nodes
>
>  b. almost unbounded extensibility (there is no limit, or at least a very
> high limit, on the number of nodes you can have in a cluster)
>
>  c. operational simplicity due to the masterless architecture. Although
> this is largely invisible to developers, it is a key selling point.
> Having suffered through a manual Hadoop cluster installation, I have come
> to love the deployment simplicity of C*: only one process per node, no
> moving parts.
>
> d. high availability. C* clearly trades consistency for availability, so
> you can expect something like 99.99% uptime. A strong selling point for
> critical businesses that need to be up all the time
>
> e. support for multiple data centers out of the box. Again, on the
> operational side, it's a great feature if you plan a worldwide deployment
>
> That's all I can see for now
>
> 2) Why shouldn't I choose C* ?
>
> a. need for strong consistency most of the time. Although you can
> perform all requests with consistency level ALL, it's clearly not the best
> use of C*. You'll suffer from higher latency and reduced availability. Even
> the new "lightweight transaction" feature is not meant to be used on a
> large scale
>
> b. very complicated and changing queries. Denormalizing is great when you
> know ahead of time exactly how you'll query your data. Once that is done,
> any new way of querying will require new code and new tables to support it
>
> c. ridiculously small data load. I've seen people in prod using C* for
> only 200 GB because they want to be trendy and use bleeding-edge
> technologies. They'd be better off using a classical RDBMS solution that
> fits their load perfectly
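
To make point 2a concrete, here is a minimal Python sketch of the usual
replica arithmetic, assuming a single data center and only the ONE, QUORUM
and ALL consistency levels. The function names are made up for illustration
and are not part of Cassandra or any driver. It shows how many replica
failures each level tolerates and when a read/write pair is guaranteed to
overlap (R + W > RF).

```python
def quorum(rf: int) -> int:
    """Replicas needed for QUORUM: a strict majority of the replication factor."""
    return rf // 2 + 1


def required_replicas(level: str, rf: int) -> int:
    """Replicas that must answer for a request at the given consistency level."""
    table = {"ONE": 1, "QUORUM": quorum(rf), "ALL": rf}
    return table[level]


def tolerated_failures(level: str, rf: int) -> int:
    """Replicas that can be down while requests at this level still succeed."""
    return rf - required_replicas(level, rf)


def reads_see_latest_write(read_level: str, write_level: str, rf: int) -> bool:
    """True when read and write replica sets must overlap (R + W > RF)."""
    return required_replicas(read_level, rf) + required_replicas(write_level, rf) > rf


if __name__ == "__main__":
    rf = 3  # assumed replication factor, chosen only for the example
    for level in ("ONE", "QUORUM", "ALL"):
        print(level, "tolerates", tolerated_failures(level, rf), "replica(s) down")
    print("QUORUM/QUORUM overlaps:", reads_see_latest_write("QUORUM", "QUORUM", rf))
    print("ONE/ONE overlaps:", reads_see_latest_write("ONE", "ONE", rf))
```

With a replication factor of 3, ALL stops working as soon as one replica is
down, while QUORUM reads plus QUORUM writes still overlap and survive one
failure, which is why ALL is rarely the right default.
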
>
> Hope that helps
>
> Duy Hai DOAN
>
>
>
> On Fri, Jul 4, 2014 at 9:31 PM, Prem Yadav <ipremya...@gmail.com> wrote:
>
>> Thanks Manoj. Great post for those who already have Cassandra in
>> production.
>> However, it brings me back to my original post.
>> All the points you have mentioned apply to any big data technology.
>> Storage: all of them.
>> Query: all of them. In fact, a lot of them perform better. I agree that
>> CQL's structure is better, but Hive and Mongo are good too.
>> Availability: many of them.
>>
>> So my question is basically for Cassandra support people (e.g. DataStax)
>> or the developers:
>> What makes Cassandra special?
>> If I have to convince my CTO to spend a million dollars on a cluster and
>> support, his first question would be: why Cassandra? Why not this or that?
>>
>> So I am still not sure what special thing Cassandra brings to the table.
>>
>> Sorry about the rant. But in the enterprise world, decisions are made
>> by weighing stability, the need to convince managers, and so on. The
>> chosen technology has to be stable for years, and people need to be
>> convinced that the engineers are not going to do a lot of firefighting.
>>
>> Any input is appreciated.
>>
>>
>>
>> On Fri, Jul 4, 2014 at 7:07 PM, Manoj Khangaonkar <khangaon...@gmail.com>
>> wrote:
>>
>>> These are my personal opinions, based on a few months of using
>>> Cassandra. Others may have different opinions.
>>>
>>>
>>>
>>> http://khangaonkar.blogspot.com/2014/06/apache-cassandra-things-to-consider.html
>>>
>>> regards
>>>
>>>
>>>
>>> On Fri, Jul 4, 2014 at 7:37 AM, Prem Yadav <ipremya...@gmail.com> wrote:
>>>
>>>> Hi,
>>>> I have seen in a lot of replies that Cassandra is not designed for
>>>> this or that. I don't want to sound rude; I just need some information
>>>> about this so that I can compare it to technologies like HBase, Mongo,
>>>> Elasticsearch, Solr, etc.
>>>>
>>>> 1) What is Cassandra designed for? Heavy writes, yes. But so is HBase,
>>>> or Elasticsearch.
>>>> What are the use cases that suit Cassandra?
>>>>
>>>> 2) What kinds of queries are best suited to Cassandra?
>>>> I ask this because I have seen people asking about queries and getting
>>>> replies that they are not suited to Cassandra. For example: queries
>>>> where a large number of rows is requested and a timeout happens, range
>>>> queries, or aggregate queries.
>>>>
>>>> 3) Where does Cassandra excel compared to other technologies?
>>>>
>>>> I have been working with Cassandra for some time. I know how it works
>>>> and I like it very much.
>>>> We are moving towards building a big cluster. But at this point, I am
>>>> not sure if it's the right decision.
>>>>
>>>> A lot of people in my company, including me, like Cassandra. But that
>>>> has more to do with CQL than with the internals or the use cases. Until
>>>> now, there have only been small PoCs, and people enjoyed them. But for a
>>>> large-scale project, we are not so sure.
>>>>
>>>> Please guide us.
>>>> Please note that the drawbacks of other technologies do not interest
>>>> me; it's the strengths and weaknesses of Cassandra I am interested in.
>>>> Thanks
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>> --
>>> http://khangaonkar.blogspot.com/
>>>
>>
>>
>
