> later i will need to apply filter to that data,

Sounds like a read query you should support by denormalising the data.

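As a rough illustration of what denormalising for a filter could look like, here is a sketch only. Plain Python dicts stand in for column families. The `country` property, the `sessions_by_country` CF and the function names are all made up for this example, and none of it is a Cassandra client API. The idea is to write the session blob once per read query you need to serve, so the filtered read is a single row lookup.

```python
import json
from collections import defaultdict

# In-memory stand-ins for two column families: row key -> {column name: column value}.
# Both names are hypothetical and only illustrate the layout.
sessions_by_visitant = defaultdict(dict)  # key: visitant id, cn: session id, cv: JSON blob
sessions_by_country = defaultdict(dict)   # key: "visitant id/country", cn: session id, cv: JSON blob


def save_session(visitant_id, session_id, props):
    """Write the session blob once for each read query that has to be served."""
    blob = json.dumps(props, sort_keys=True)
    sessions_by_visitant[visitant_id][session_id] = blob
    # Denormalised copy, stored under the value we want to filter on later.
    index_key = f"{visitant_id}/{props['country']}"
    sessions_by_country[index_key][session_id] = blob


def sessions_for(visitant_id):
    """All sessions of a visitant: one row read."""
    row = sessions_by_visitant.get(visitant_id, {})
    return {sid: json.loads(blob) for sid, blob in row.items()}


def sessions_in_country(visitant_id, country):
    """Filtered read: one row read, no filtering on the programming side."""
    row = sessions_by_country.get(f"{visitant_id}/{country}", {})
    return {sid: json.loads(blob) for sid, blob in row.items()}


save_session("v1", "s1", {"country": "PT", "browser": "firefox"})
save_session("v1", "s2", {"country": "NZ", "browser": "chrome"})
```

The trade-off is extra writes and extra storage for every filter you want to support, which is usually acceptable when the session data is written once.
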
Cheers

-----------------
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 25/08/2011, at 10:50 PM, Helder Oliveira wrote:

> Hello,
>
> thanks for your time.
>
> I have suggested a SCF but i am still testing the system with CF, making some
> tests and testing the data flow (insert / select).
>
> Storing the sub data as JSON already came to my mind, but it's not possible
> because later i will need to apply filter to that data, and if it is in JSON
> i need to fetch it all and filter on the programming side. Correct me if i am
> wrong.
>
> Well, i will continue the tests with CF, things are getting clearer for me
> now.
>
> Thanks a lot guys for answering and spending time on some newbie questions :)
>
>
> On Aug 24, 2011, at 11:27 PM, aaron morton wrote:
>
>> I normally suggest trying a model with Standard CF's first as there are some
>> down sides to super CF's. If you know there will only be a few sub columns
>> they are probably OK (see
>> http://wiki.apache.org/cassandra/CassandraLimitations). Your alternative
>> design is fine. Test it out and see what works for you.
>>
>> Also (and I know not everyone agrees) depending on the use case it's ok to
>> blob data up. Cassandra does not *need* to know about the individual
>> properties of your entities. By that I mean there is not a query planner
>> that can make better decisions about how to execute your query based on data
>> types and distributions, or about what types columns should have in
>> projections.
>>
>> So an alternative here is to collapse VisitantSessions and Sessions into
>> one, and store the session data as a JSON (or similar) blob in the column
>> value. This works best if you do not need to concurrently update fields in
>> the entity. So if you write the session data once, or if you *always* only
>> update from a single thread / process. Or if your data is designed to be
>> overwritten.
>>
>> Cheers
>>
>>
>> -----------------
>> Aaron Morton
>> Freelance Cassandra Developer
>> @aaronmorton
>> http://www.thelastpickle.com
>>
>> On 25/08/2011, at 1:15 AM, Helder Oliveira wrote:
>>
>>> Thanks Indranath Ghosh for your tip!
>>>
>>> I will continue the question here.
>>>
>>> Aaron, i have read your suggestion and tried to design it, and
>>> i have one question regarding it.
>>>
>>> Let's forget for now the Requests and Events!
>>>
>>> Just keep the Visitants and the Sessions.
>>>
>>> My goal is, when i have a visitant, to get all information about him, in this
>>> case all his past sessions, so i can create a profile using his past.
>>>
>>> You have suggested 3 CF's
>>>
>>> Visitants CF
>>> key: id
>>> cn: pn
>>> cv: pv
>>>
>>> Visitant Sessions CF
>>> key: visitant id
>>> cn: session id
>>> cv: none
>>>
>>> Sessions CF
>>> key: session id
>>> cn: pn
>>> cv: pv
>>>
>>> When i need to know everything about one visitant, i need to query
>>> "Visitant Sessions CF" to get all keys, and then query "Sessions CF" for
>>> the properties of all those keys.
>>>
>>> In this case, wouldn't applying a Super Column Family to the "Sessions" be
>>> better?
>>>
>>> I mean something like:
>>>
>>> {
>>>   "sessions": {
>>>
>>>     "visitant id 1": {
>>>       "session id 1": {
>>>         "p1": {"p1": "v1"},
>>>         "p2": {"p2": "v2"}
>>>       },
>>>       "session id 2": {
>>>         "p1": {"p1": "v1"}
>>>       }
>>>     },
>>>
>>>     "visitant id 2": {
>>>       "session id 3": {
>>>         "p1": {"p1": "v1"},
>>>         "p2": {"p2": "v2"}
>>>       },
>>>       "session id 4": {
>>>         "p1": {"p1": "v1"}
>>>       }
>>>     }
>>>   }
>>> }
>>>
>>> Using this, i can get all sessions in the second query, instead of having
>>> all sessions only at the third query.
>>>
>>> Regarding your notes, the Visitant CF will be almost unchanged after
>>> its creation; the sessions will be added every time a known
>>> user visits again, creating a new session.
>>>
>>> Thanks a lot for your help guys, and i hope i was not saying crazy things :D
>>>
>>> On Aug 22, 2011, at 11:23 PM, aaron morton wrote:
>>>
>>>> Let's start with something quick and simple, all standard Column Families…
>>>>
>>>> Visitant CF
>>>> key: id
>>>> column name: property name
>>>> column value: property value
>>>>
>>>> Visitant Sessions CF
>>>> key: visitant id
>>>> column name: session id
>>>> column value: none
>>>>
>>>> Session CF
>>>>
>>>> key: session_id
>>>> column_name: property name
>>>> column_value: property value
>>>>
>>>> key: session_id/requests
>>>> column_name: request_id
>>>> column_value: none
>>>>
>>>> key: session_id/events
>>>> column_name: event_id
>>>> column_value: none
>>>>
>>>> Requests CF
>>>>
>>>> key: request_id
>>>> column_name: property name
>>>> column_value: property value
>>>>
>>>> Event CF
>>>>
>>>> key: event_id
>>>> column_name: property name
>>>> column_value: property value
>>>>
>>>>
>>>> Notes:
>>>>
>>>> * assuming the Visitant CF is slowly changing, I kept it in its own CF.
>>>> * using compound keys to keep information related to sessions in the same
>>>> CF. These could be different CF's, or in the Request or Event CF.
>>>> * the best model is the one that allows you to do your reads by getting
>>>> one or a few rows from a single CF.
>>>> * you could collapse the Request and Event CF's into one.
>>>>
>>>> If the event and request data is immutable (or there are no issues with
>>>> concurrent modifications) I would recommend this…
>>>>
>>>> Request / Event CF:
>>>>
>>>> key: session_id/events or session_id/requests
>>>> column_name: event_id or request_id
>>>> column_value: data
>>>>
>>>>
>>>> Start with the simple model and then make changes to better handle your
>>>> read queries.
>>>>
>>>> Have fun :)
>>>>
>>>>
>>>>
>>>> -----------------
>>>> Aaron Morton
>>>> Freelance Cassandra Developer
>>>> @aaronmorton
>>>> http://www.thelastpickle.com
>>>>
>>>> On 22/08/2011, at 11:13 PM, Helder Oliveira wrote:
>>>>
>>>>> Hello all,
>>>>>
>>>>> i have a SQL structure like this:
>>>>>
>>>>> Visitant ( has several properties )
>>>>> Visitant has many Sessions
>>>>> Sessions ( has several properties )
>>>>> Sessions has many Requests ( has several properties )
>>>>> Sessions has many Events ( has several properties )
>>>>>
>>>>>
>>>>> i have read a lot and am still confused about how to put this on cassandra, can
>>>>> someone give me an idea?
>>>>
>>>
>>
>