Re: Cassandra & usage

aaron morton Thu, 26 Jan 2012 12:36:53 -0800

Yes it is.

But it depends on what you want to select.


Cassandra does not have a complete query language like SQL in an RDBMS. You 
need to design your data model to support the queries you wish to make. 
Normally this means denormalising data so that queries are essentially reading 
from a materialized view.

So if you store users by user id, you can insert/update and select users by 
user id. 

If you want to select all users who live in Spain then you need to have a 
Secondary Index on the country column. Secondary indexes have overheads and 
limitations just like indexes in an RDBMS. So if this is a common query you may 
want to denormalise the data so users are stored by user id and by country. 
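
As a rough sketch of that denormalisation, here is the write path in plain 
Python. The dictionaries are hypothetical in-memory stand-ins for two column 
families, and names such as save_user and users_by_country are made up for 
illustration; this is not Cassandra client code.

```python
from collections import defaultdict

# Hypothetical in-memory stand-ins for two column families.
users_by_id = {}                      # row key: user id
users_by_country = defaultdict(dict)  # row key: country, column name: user id


def save_user(user_id, name, country):
    """Write the user to both views, moving it if the country changed."""
    old = users_by_id.get(user_id)
    if old is not None and old["country"] != country:
        # Remove the stale entry from the old country's row.
        users_by_country[old["country"]].pop(user_id, None)
    row = {"name": name, "country": country}
    users_by_id[user_id] = row
    users_by_country[country][user_id] = row


def users_in(country):
    """Read one row of the country view; no scan over all users."""
    return sorted(users_by_country.get(country, {}))


save_user("u1", "Ana", "Spain")
save_user("u2", "Luca", "Italy")
save_user("u3", "Pablo", "Spain")
save_user("u2", "Luca", "Spain")  # Luca moves, so both views are updated
```

The cost is on the write side: every insert or update touches both views, and 
the application is responsible for keeping them consistent.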

If one day you decide to select all users who are older than 30 but younger 
than 40 you have to do some extra work. You could add another index on the 
birthdate, but secondary index queries must have an equality clause so you 
cannot do birthdate > x and birthdate < y on its own. This is where other query 
tools such as Hive or Pig, running on Hadoop, come into play. 
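
To make the extra work concrete, the sketch below shows the kind of full scan 
a batch job ends up doing for an age range. It is plain Python over 
hypothetical sample rows, not Hive or Pig, and the helper names are 
assumptions made for the example.

```python
from datetime import date

# Hypothetical sample rows keyed by user id.
users = {
    "u1": {"name": "Ana", "birthdate": date(1975, 6, 1)},
    "u2": {"name": "Luca", "birthdate": date(1990, 3, 15)},
    "u3": {"name": "Pablo", "birthdate": date(1980, 1, 27)},
}


def age_on(birthdate, today):
    """Whole years between birthdate and today."""
    had_birthday = (today.month, today.day) >= (birthdate.month, birthdate.day)
    return today.year - birthdate.year - (0 if had_birthday else 1)


def users_aged_between(rows, low, high, today):
    """Full scan: visit every row and keep those with low < age < high."""
    return sorted(
        uid
        for uid, row in rows.items()
        if low < age_on(row["birthdate"], today) < high
    )


matches = users_aged_between(users, 30, 40, date(2012, 1, 26))
```

Every row is read to answer the question, which is fine for an occasional 
batch job but not for a query you run on every request.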

Hope that helps. 
 
-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 26/01/2012, at 4:13 PM, francesco.tangari....@gmail.com wrote:

> I don't get it. Suppose I have a data model with millions of rows and 
> suppose I want to perform some selects and some inserts. Is it not feasible 
> to use Cassandra for those reasons?
> 
> -- 
> francesco.tangari....@gmail.com
> Sent with Sparrow
> 
> On Wednesday, 25 January 2012, at 09:13, aaron morton wrote:
> 
>> Your data load is fine. 
>> 
>> It sounds like you will run into issues with the data model and 
>> functionality of cassandra. "Standard Analysis" in the RDBMS sense of 
>> throwing any ad-hoc query at the data and letting the query engine work it 
>> out is not possible without using HIVE/PIG or some other query language.
>> 
>> You will need to understand what sort of questions you want to ask of the 
>> data up front. The best way to learn this lesson is to put together a quick 
>> prototype and see how the data model works. 
>> 
>> Hope that helps. 
>> 
>> -----------------
>> Aaron Morton
>> Freelance Developer
>> @aaronmorton
>> http://www.thelastpickle.com
>> 
>> On 25/01/2012, at 8:09 PM, francesco.tangari....@gmail.com wrote:
>> 
>>> Can you give some example cases, please?
>>> 
>>> -- 
>>> francesco.tangari....@gmail.com
>>> Sent with Sparrow
>>> 
>>> On Wednesday, 25 January 2012, at 05:29, Gustavo Gustavo wrote:
>>> 
>>>> That's for sure not much.
>>>> Your RDBMS can probably hold the entire dataset in memory, and you can do 
>>>> all kinds of queries that you want. Cassandra is for some very specific 
>>>> use cases. 
>>>> If you really need a cluster, have you thought about MySQL Cluster? 
>>>> 
>>>> 2012/1/25 <francesco.tangari....@gmail.com>
>>>>> Standard analysis,  display or aggregate some rows
>>>>> or standard operations that i can do on a normal dbms
>>>>> 
>>>>> -- 
>>>>> francesco.tangari....@gmail.com
>>>>> Sent with Sparrow
>>>>> 
>>>>> On Wednesday, 25 January 2012, at 04:26, Maxim Potekhin wrote:
>>>>> 
>>>>>> You provide zero information on what you are planning to do with the 
>>>>>> data.
>>>>>> Thus, your question is impossible to answer.
>>>>>> 
>>>>>> 
>>>>>> On 1/24/2012 9:38 PM, francesco.tangari....@gmail.com wrote:
>>>>>>> 
>>>>>>> Do you think that for a standard project with 50,000,000 rows on 2-3 
>>>>>>> machines Cassandra is appropriate, 
>>>>>>> or should I use a normal DBMS?
>>>>>>> 
>>>>>>> -- 
>>>>>>> francesco.tangari....@gmail.com
>>>>>>> Sent with Sparrow
>>>>>>> 
>>>>>> 
>>>>> 
>>>> 
>>> 
>> 
>
