Hello,

thanks for your time.

I had suggested an SCF, but I am still testing the system with standard CFs, 
running some tests of the data flow (insert / select).

Storing the subdata as JSON had already crossed my mind, but it's not possible 
because later I will need to apply filters to that data, and if it is in JSON I 
need to fetch everything and filter on the programming side. Correct me if I am wrong.
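
To make the concern concrete, here is a minimal sketch of what filtering JSON 
blobs on the programming side looks like. It uses a plain Python dict as a 
stand-in for one row, not a real Cassandra client; the session ids and the 
"browser" / "country" properties are made up for illustration.

```python
import json

# Hypothetical in-memory stand-in for one row: one column per session,
# with the whole session stored as a JSON blob in the column value.
blob_row = {
    "session-1": json.dumps({"browser": "firefox", "country": "PT"}),
    "session-2": json.dumps({"browser": "chrome", "country": "PT"}),
    "session-3": json.dumps({"browser": "firefox", "country": "NZ"}),
}


def filter_sessions(row, prop, wanted):
    """Read every column, decode each blob, and filter in the client."""
    matches = []
    for session_id, raw in sorted(row.items()):
        session = json.loads(raw)  # the store only sees an opaque value
        if session.get(prop) == wanted:
            matches.append(session_id)
    return matches


firefox_sessions = filter_sessions(blob_row, "browser", "firefox")
print(firefox_sessions)
```

Every blob in the row is fetched and decoded before the filter is applied, which 
is the cost I am worried about.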

Well, I will continue the tests with CFs; things are getting clearer for me 
now.
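
For reference, the read path with the standard CFs discussed in this thread can 
be sketched roughly as follows. This is only an in-memory model using Python 
dicts (row key -> {column name: column value}), not real client code; the ids 
and values are invented, and the second step stands in for a single multi-key 
read against the Sessions CF.

```python
# Hypothetical in-memory stand-ins for two of the standard CFs.
visitant_sessions_cf = {
    "visitant-1": {"session-1": "", "session-2": ""},  # column value: none
    "visitant-2": {"session-3": ""},
}
sessions_cf = {
    "session-1": {"p1": "v1", "p2": "v2"},
    "session-2": {"p1": "v1"},
    "session-3": {"p1": "v1"},
}


def sessions_for(visitant_id):
    """Return {session id: properties} for one visitant in two reads."""
    # Read 1: one row from Visitant Sessions; the column names are session ids.
    session_ids = sorted(visitant_sessions_cf.get(visitant_id, {}))
    # Read 2: fetch the Sessions rows for those keys.
    return {sid: sessions_cf[sid] for sid in session_ids if sid in sessions_cf}


profile = sessions_for("visitant-1")
print(profile)
```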

Thanks a lot, guys, for answering and spending time on some newbie questions :)


On Aug 24, 2011, at 11:27 PM, aaron morton wrote:

> I normally suggest trying a model with standard CFs first, as there are some 
> downsides to super CFs. If you know there will only be a few sub columns, 
> they are probably OK (see 
> http://wiki.apache.org/cassandra/CassandraLimitations). Your alternative 
> design is fine. Test it out and see what works for you. 
> 
> Also (and I know not everyone agrees) depending on the use case it's ok to 
> blob data up. Cassandra does not *need* to know about the individual 
> properties of your entities. By that I mean there is not a query planner that 
> can make better decisions about how to execute your query based on data types 
> and distributions, or about what types columns should have in projections. 
> 
> So an alternative here is to collapse VisitantSessions and Sessions into one, 
> and store the session data as a JSON (or similar) blob in the column value. 
> This works best if you do not need to concurrently update fields in the 
> entity. So if you write the session data once, or if you *always* only update 
> from a single thread / process. Or if your data is designed to be 
> overwritten. 
> 
> Cheers
> 
> 
> -----------------
> Aaron Morton
> Freelance Cassandra Developer
> @aaronmorton
> http://www.thelastpickle.com
> 
> On 25/08/2011, at 1:15 AM, Helder Oliveira wrote:
> 
>> Thanks Indranath Ghosh for your tip!
>> 
>> I will continue the question here.
>> 
>> Aaron, I have read your suggestion and tried to design around it, and I 
>> have one question about it.
>> 
>> Let's forget for now the Requests and Events!
>> 
>> Just keep the Visitants and the Sessions.
>> 
>> My goal is, given a visitant, to get all the information about him, in this 
>> case all his past sessions, so I can create a profile using his past.
>> 
>> You have suggested 3 CF's
>> 
>> Visitants CF
>> key: id
>> cn: pn
>> cv: pv
>> 
>> Visitant Sessions CF
>> key: visitant id
>> cn: session id
>> cv: none
>> 
>> Sessions CF
>> key: session id
>> cn: pn
>> cv: pv
>> 
>> When I need to know everything about one visitant, I need to query "Visitant 
>> Sessions CF" to get all the keys, and then query "Sessions CF" for the 
>> properties of all those keys.
>> 
>> In this case, wouldn't applying a Super Column Family to the "Sessions" be better?
>> 
>> I mean something like:
>> 
>> {
>>   "sessions": {
>> 
>>     "visitant id 1": {
>>       "session id 1": {
>>         "p1": {"p1": "v1"},
>>         "p2": {"p2": "v2"}
>>       },
>>       "session id 2": {
>>         "p1": {"p1": "v1"}
>>       }
>>     },
>> 
>>     "visitant id 2": {
>>       "session id 3": {
>>         "p1": {"p1": "v1"},
>>         "p2": {"p2": "v2"}
>>       },
>>       "session id 4": {
>>         "p1": {"p1": "v1"}
>>       }
>>     }
>>   }
>> }
>> 
>> Using this, I can get all sessions in the second query, instead of having 
>> all sessions only at the third query.
>> 
>> Regarding your notes, the Visitant CF will be almost unchanged after the 
>> visitant is created; the sessions will be added every time a known 
>> user visits again, creating a new session.
>> 
>> Thanks a lot for your help guys, and I hope I was not saying crazy things :D
>> 
>> On Aug 22, 2011, at 11:23 PM, aaron morton wrote:
>> 
>>> Let's start with something quick and simple, all standard Column Families…
>>> 
>>> Visitant CF
>>> key: id 
>>> column name: property name
>>> column value: property value 
>>> 
>>> Visitant Sessions CF
>>> key: visitant id 
>>> column name: session id
>>> column value: none
>>> 
>>> Session CF
>>> 
>>> key: session_id
>>> column_name: property name 
>>> column_value: property value 
>>> 
>>> key: session_id/requests
>>> column_name: request_id
>>> column_value: none
>>> 
>>> key: session_id/events
>>> column_name: event_id
>>> column_value: none
>>> 
>>> Requests CF
>>> 
>>> key: request_id
>>> column_name: property name
>>> column_value: property value
>>> 
>>> Event CF
>>> 
>>> key: event_id
>>> column_name: property name
>>> column_value: property value
>>> 
>>> 
>>> Notes:
>>> 
>>> * assuming the Visitant CF is slowly changing, I kept it in its own CF.  
>>> * using compound keys to keep information related to sessions in the same 
>>> CF. These could be different CFs, or in the Request or Event CF. 
>>> * the best model is the one that allows you to do your reads by getting one 
>>> or a few rows from a single cf. 
>>> * you could collapse the Request and Event CF's into one. 
>>> 
>>> If the event and request data is immutable (or there are no issues with 
>>> concurrent modifications) I would recommend this…
>>> 
>>> Request / Event CF:
>>> 
>>> key: session_id/events or session_id/requests
>>> column_name: event_id or request_id
>>> column_value: data
>>> 
>>> 
>>> Start with the simple model and then make changes to better handle your 
>>> read queries.
>>> 
>>> Have fun :)
>>> 
>>> 
>>> 
>>> -----------------
>>> Aaron Morton
>>> Freelance Cassandra Developer
>>> @aaronmorton
>>> http://www.thelastpickle.com
>>> 
>>> On 22/08/2011, at 11:13 PM, Helder Oliveira wrote:
>>> 
>>>> Hello all,
>>>> 
>>>> I have a SQL structure like this:
>>>> 
>>>> Visitant ( has several properties )
>>>> Visitant has many Sessions
>>>> Sessions ( has several properties )
>>>> Sessions has many Requests ( has several properties )
>>>> Sessions has many Events ( has several properties )
>>>> 
>>>> 
>>>> I have read a lot and am still confused about how to put this on Cassandra. 
>>>> Can someone give me an idea?
>>> 
>> 
> 
