> later I will need to apply a filter to that data,
Sounds like a read query you should support by denormalising the data. 
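
A minimal sketch of what that could look like, using plain Python dicts as stand-ins for column families. This is not a Cassandra client API, and the names `sessions_by_country`, `write_session`, `sessions_in_country` and the `country` property are all hypothetical. The session is stored as a JSON blob, and at write time its id is also written to a second row keyed by the value you want to filter on, so the filter becomes a single row read:

```python
import json
from collections import defaultdict

# Stand-ins for two column families: row key -> {column name: column value}.
# sessions:            key = visitant id, column = session id, value = JSON blob
# sessions_by_country: key = "visitant id/country", column = session id, value = ""
sessions = defaultdict(dict)
sessions_by_country = defaultdict(dict)


def write_session(visitant_id, session_id, props):
    """Write the blob and, in the same step, the denormalised lookup row."""
    sessions[visitant_id][session_id] = json.dumps(props)
    lookup_key = f"{visitant_id}/{props['country']}"
    sessions_by_country[lookup_key][session_id] = ""


def sessions_in_country(visitant_id, country):
    """Answer the filter with one row read, then decode only the matching blobs."""
    session_ids = sessions_by_country.get(f"{visitant_id}/{country}", {})
    row = sessions[visitant_id]
    return {sid: json.loads(row[sid]) for sid in session_ids}


write_session("visitant-1", "s1", {"country": "PT", "browser": "firefox"})
write_session("visitant-1", "s2", {"country": "NZ", "browser": "chrome"})
write_session("visitant-1", "s3", {"country": "PT", "browser": "opera"})

print(sorted(sessions_in_country("visitant-1", "PT")))
```

The cost is an extra write per session and one lookup row per property you want to filter on; the benefit is that nothing has to be fetched and filtered on the application side.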

Cheers

-----------------
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 25/08/2011, at 10:50 PM, Helder Oliveira wrote:

> Hello,
> 
> thanks for your time.
> 
> I had suggested a Super CF (SCF), but I am still testing the system with
> standard CFs, checking the data flow (insert / select).
> 
> Storing the sub-data as JSON had already crossed my mind, but it is not an
> option because later I will need to apply a filter to that data, and if it is
> in JSON I need to fetch all of it and filter on the application side. Correct
> me if I am wrong.
> 
> Well, I will continue the tests with standard CFs; things are getting clearer
> for me now.
> 
> Thanks a lot, guys, for answering and spending time on some newbie questions :)
> 
> 
> On Aug 24, 2011, at 11:27 PM, aaron morton wrote:
> 
>> I normally suggest trying a model with standard CFs first, as there are some
>> downsides to super CFs. If you know there will only be a few sub columns
>> they are probably OK (see
>> http://wiki.apache.org/cassandra/CassandraLimitations). Your alternative
>> design is fine. Test it out and see what works for you.
>> 
>> Also (and I know not everyone agrees), depending on the use case it's OK to
>> blob data up. Cassandra does not *need* to know about the individual
>> properties of your entities. By that I mean there is no query planner
>> that can make better decisions about how to execute your query based on data
>> types and distributions, or about what types columns should have in
>> projections.
>> 
>> So an alternative here is to collapse VisitantSessions and Sessions into
>> one, and store the session data as a JSON (or similar) blob in the column
>> value. This works best if you do not need to concurrently update fields in
>> the entity: that is, if you write the session data once, if you *always*
>> update from a single thread / process, or if your data is designed to be
>> overwritten.
>> 
>> Cheers
>> 
>> 
>> -----------------
>> Aaron Morton
>> Freelance Cassandra Developer
>> @aaronmorton
>> http://www.thelastpickle.com
>> 
>> On 25/08/2011, at 1:15 AM, Helder Oliveira wrote:
>> 
>>> Thanks Indranath Ghosh for your tip!
>>> 
>>> I will continue here the question.
>>> 
>>> Aaron, I have read your suggestion and tried to design a model from it,
>>> and I have one question about it.
>>> 
>>> Let's forget for now the Requests and Events!
>>> 
>>> Just keep the Visitants and the Sessions.
>>> 
>>> My goal is, given a visitant, to get all the information about him, in
>>> this case all his past sessions, so I can build a profile from his history.
>>> 
>>> You have suggested 3 CF's
>>> 
>>> Visitants CF
>>> key: id
>>> cn: pn
>>> cv: pv
>>> 
>>> Visitant Sessions CF
>>> key: visitant id
>>> cn: session id
>>> cv: none
>>> 
>>> Sessions CF
>>> key: session id
>>> cn: pn
>>> cv: pv
>>> 
>>> When I need to know everything about one visitant, I need to query
>>> "Visitant Sessions CF" to get all the session keys, and then query
>>> "Sessions CF" for the properties of each key.
>>> 
>>> In this case, wouldn't a Super Column Family for "Sessions" be better?
>>> 
>>> I mean something like:
>>> 
>>> {
>>>   "sessions": {
>>> 
>>>     "visitant id 1": {
>>>       "session id 1": {
>>>         "p1": {"p1": "v1"},
>>>         "p2": {"p2": "v2"}
>>>       },
>>>       "session id 2": {
>>>         "p1": {"p1": "v1"}
>>>       }
>>>     },
>>>     
>>>     "visitant id 2": {
>>>       "session id 3": {
>>>         "p1": {"p1": "v1"},
>>>         "p2": {"p2": "v2"}
>>>       },
>>>       "session id 4": {
>>>         "p1": {"p1": "v1"}
>>>       }
>>>     }
>>>   }
>>> }
>>> 
>>> With this, I can get all sessions in the second query, instead of having
>>> them only after the third query.
>>> 
>>> Regarding your notes, a row in the Visitant CF will barely change after
>>> it is created; sessions will be added every time a known user comes back,
>>> creating a new session.
>>> 
>>> Thanks a lot for your help, guys, and I hope I was not saying crazy things :D
>>> 
>>> On Aug 22, 2011, at 11:23 PM, aaron morton wrote:
>>> 
>>>> Let's start with something quick and simple: all standard Column Families…
>>>> 
>>>> Visitant CF
>>>> key: id 
>>>> column name: property name
>>>> column value: property value 
>>>> 
>>>> Visitant Sessions CF
>>>> key: visitant id 
>>>> column name: session id
>>>> column value: none
>>>> 
>>>> Session CF
>>>> 
>>>> key: session_id
>>>> column_name: property name
>>>> column_value: property value 
>>>> 
>>>> key: session_id/requests
>>>> column_name: request_id
>>>> column_value: none
>>>> 
>>>> key: session_id/events
>>>> column_name: event_id
>>>> column_value: none
>>>> 
>>>> Requests CF
>>>> 
>>>> key: request_id
>>>> column_name: property name
>>>> column_value: property value
>>>> 
>>>> Event CF
>>>> 
>>>> key: event_id
>>>> column_name: property name
>>>> column_value: property value
>>>> 
>>>> 
>>>> Notes:
>>>> 
>>>> * assuming the Visitant CF is slowly changing, I kept it in its own CF.
>>>> * using compound keys to keep information related to sessions in the same
>>>> CF. These could be different CFs, or live in the Request or Event CF.
>>>> * the best model is the one that allows you to do your reads by getting 
>>>> one or a few rows from a single cf. 
>>>> * you could collapse the Request and Event CF's into one. 
>>>> 
>>>> If the event and request data is immutable (or there are no issues with
>>>> concurrent modifications) I would recommend this…
>>>> 
>>>> Request / Event CF:
>>>> 
>>>> key: session_id/events or session_id/requests
>>>> column_name: event_id or request_id
>>>> column_value: data
>>>> 
>>>> 
>>>> Start with the simple model and then make changes to better handle your 
>>>> read queries.
>>>> 
>>>> Have fun :)
>>>> 
>>>> 
>>>> 
>>>> -----------------
>>>> Aaron Morton
>>>> Freelance Cassandra Developer
>>>> @aaronmorton
>>>> http://www.thelastpickle.com
>>>> 
>>>> On 22/08/2011, at 11:13 PM, Helder Oliveira wrote:
>>>> 
>>>>> Hello all,
>>>>> 
>>>>> I have a SQL structure like this:
>>>>> 
>>>>> Visitant ( has several properties )
>>>>> Visitant has many Sessions
>>>>> Sessions ( has several properties )
>>>>> Sessions has many Requests ( has several properties )
>>>>> Sessions has many Events ( has several properties )
>>>>> 
>>>>> 
>>>>> I have read a lot and am still confused about how to model this in
>>>>> Cassandra. Can someone give me an idea?
>>>> 
>>> 
>> 
> 
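
For reference, the two read paths discussed in the quoted thread can be compared with a small sketch. It again uses plain Python dicts as stand-ins for column families, not a real client; the ids, property names and the helper names `read_model_a` / `read_model_b` are made up for illustration. Model A is the three-CF layout (only the two CFs involved in this read are shown), Model B collapses VisitantSessions and Sessions into one row per visitant with a JSON blob per session:

```python
import json

# Model A: standard CFs as in the quoted thread (row key -> columns).
visitant_sessions = {"visitant-1": {"s1": "", "s2": ""}}
sessions_cf = {
    "s1": {"p1": "v1", "p2": "v2"},
    "s2": {"p1": "v1"},
}

# Model B: one CF; the session id is the column name and the session data
# is a JSON blob in the column value.
collapsed = {
    "visitant-1": {
        "s1": json.dumps({"p1": "v1", "p2": "v2"}),
        "s2": json.dumps({"p1": "v1"}),
    }
}


def read_model_a(visitant_id):
    """One read for the session ids, then a second read for their properties."""
    session_ids = visitant_sessions.get(visitant_id, {})
    return {sid: dict(sessions_cf[sid]) for sid in session_ids}


def read_model_b(visitant_id):
    """A single row read returns every session for the visitant."""
    row = collapsed.get(visitant_id, {})
    return {sid: json.loads(blob) for sid, blob in row.items()}


# Both layouts return the same data; they differ in how many reads are needed
# and in whether individual properties can be updated independently.
assert read_model_a("visitant-1") == read_model_b("visitant-1")
```

Model B only suits data that is written once or always overwritten as a whole, as noted in the quoted message.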
