Hello, thanks for your time.
I had suggested an SCF, but I am still testing the system with standard CFs,
running some tests and checking the data flow (insert / select). Storing the
sub-data as JSON had already crossed my mind, but it's not possible because
later I will need to apply filters to that data, and if it is in JSON I need
to fetch everything and filter on the programming side. Correct me if I am
wrong.

Well, I will continue the tests with CFs; things are getting clearer for me
now.

Thanks a lot, guys, for answering and for spending time on some newbie
questions :)

On Aug 24, 2011, at 11:27 PM, aaron morton wrote:

> I normally suggest trying a model with Standard CF's first as there are some
> down sides to super CF's. If you know there will only be a few sub columns
> they are probably OK (see
> http://wiki.apache.org/cassandra/CassandraLimitations). Your alternative
> design is fine. Test it out and see what works for you.
>
> Also (and I know not everyone agrees) depending on the use case it's ok to
> blob data up. Cassandra does not *need* to know about the individual
> properties of your entities. By that I mean there is not a query planner that
> can make better decisions about how to execute your query based on data types
> and distributions, or what types columns should have in projections.
>
> So an alternative here is to collapse VisitantSessions and Sessions into one,
> and store the session data as a JSON (or similar) blob in the column value.
> This works best if you do not need to concurrently update fields in the
> entity. So if you write the session data once, or if you *always* only update
> from a single thread / process. Or if your data is designed to be
> overwritten.
>
> Cheers
>
>
> -----------------
> Aaron Morton
> Freelance Cassandra Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 25/08/2011, at 1:15 AM, Helder Oliveira wrote:
>
>> Thanks Indranath Ghosh for your tip!
>>
>> I will continue the question here.
>>
>> Aaron, I have read your suggestion and tried to design around it, and I
>> have one question regarding it.
>>
>> Let's forget for now the Requests and Events!
>>
>> Just keep the Visitants and the Sessions.
>>
>> My goal is, given a visitant, to get all the information about him, in this
>> case all his past sessions, so I can create a profile using his past.
>>
>> You have suggested 3 CF's
>>
>> Visitants CF
>> key: id
>> cn: pn
>> cv: pv
>>
>> Visitant Sessions CF
>> key: visitant id
>> cn: session id
>> cv: none
>>
>> Sessions CF
>> key: session id
>> cn: pn
>> cv: pv
>>
>> When I need to know everything about one visitant, I need to query "Visitant
>> Sessions CF" to get all keys, and then query "Sessions CF" for the
>> properties of all those keys.
>>
>> In this case, isn't applying a Super Column Family to the "Sessions" better?
>>
>> I mean something like:
>>
>> {
>>   "sessions": {
>>
>>     "visitant id 1": {
>>       "session id 1": {
>>         "p1": {"p1": "v1"},
>>         "p2": {"p2": "v2"}
>>       },
>>       "session id 2": {
>>         "p1": {"p1": "v1"}
>>       }
>>     },
>>
>>     "visitant id 2": {
>>       "session id 3": {
>>         "p1": {"p1": "v1"},
>>         "p2": {"p2": "v2"}
>>       },
>>       "session id 4": {
>>         "p1": {"p1": "v1"}
>>       }
>>     }
>>   }
>> }
>>
>> Using this, I can get all sessions in the second query, instead of having
>> all sessions only at the third query.
>>
>> Regarding your notes, the Visitant CF will be almost unchanged after its
>> creation; the sessions will be added every time a known user comes back,
>> creating a new session.
>>
>> Thanks a lot for your help guys, and I hope I was not saying crazy things :D
>>
>> On Aug 22, 2011, at 11:23 PM, aaron morton wrote:
>>
>>> Let's start with something quick and simple, all standard Column Families…
>>>
>>> Visitant CF
>>> key: id
>>> column name: property name
>>> column value: property value
>>>
>>> Visitant Sessions CF
>>> key: visitant id
>>> column name: session id
>>> column value: none
>>>
>>> Session CF
>>>
>>> key: session_id
>>> column_name: property name
>>> column_value: property value
>>>
>>> key: session_id/requests
>>> column_name: request_id
>>> column_value: none
>>>
>>> key: session_id/events
>>> column_name: event_id
>>> column_value: none
>>>
>>> Requests CF
>>>
>>> key: request_id
>>> column_name: property name
>>> column_value: property value
>>>
>>> Event CF
>>>
>>> key: event_id
>>> column_name: property name
>>> column_value: property value
>>>
>>>
>>> Notes:
>>>
>>> * assuming the Visitant CF is slowly changing I kept it in its own CF.
>>> * using compound keys to keep information related to sessions in the same
>>> CF. These could be different CF's, or in the Request or Event CF.
>>> * the best model is the one that allows you to do your reads by getting one
>>> or a few rows from a single CF.
>>> * you could collapse the Request and Event CF's into one.
>>>
>>> If the event and request data is immutable (or there are no issues with
>>> concurrent modifications) I would recommend this…
>>>
>>> Request / Event CF:
>>>
>>> key: session_id/events or session_id/requests
>>> column_name: event_id or request_id
>>> column_value: data
>>>
>>>
>>> Start with the simple model and then make changes to better handle your
>>> read queries.
>>>
>>> Have fun :)
>>>
>>>
>>>
>>> -----------------
>>> Aaron Morton
>>> Freelance Cassandra Developer
>>> @aaronmorton
>>> http://www.thelastpickle.com
>>>
>>> On 22/08/2011, at 11:13 PM, Helder Oliveira wrote:
>>>
>>>> Hello all,
>>>>
>>>> I have a SQL structure like this:
>>>>
>>>> Visitant ( has several properties )
>>>> Visitant has many Sessions
>>>> Sessions ( has several properties )
>>>> Sessions has many Requests ( has several properties )
>>>> Sessions has many Events ( has several properties )
>>>>
>>>>
>>>> I have read a lot and am still confused about how to put this on
>>>> Cassandra. Can someone give me an idea?
>>>
>>
>
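
For anyone reading this thread later, here is a rough sketch of the two read paths discussed above: the standard-CF design (an index row in "Visitant Sessions CF", then one row per session in "Sessions CF") and the collapsed design with a JSON blob per session. It uses plain Python dicts as stand-ins for column families and does not talk to Cassandra at all. The property names ("browser", "country"), the row keys and the function names are made up for illustration.

```python
import json

# Hypothetical in-memory stand-ins for the column families in this thread.
# Each CF is modelled as {row_key: {column_name: column_value}}.
visitant_sessions_cf = {
    "visitant-1": {"session-1": "", "session-2": ""},  # column value unused
}
sessions_cf = {
    "session-1": {"browser": "firefox", "country": "PT"},
    "session-2": {"browser": "chrome", "country": "PT"},
}


def sessions_for_visitant(visitant_id):
    """Standard-CF design: read the index row, then one row per session id."""
    session_ids = visitant_sessions_cf.get(visitant_id, {})
    return {sid: sessions_cf[sid] for sid in session_ids if sid in sessions_cf}


# Collapsed design: one row per visitant, column name = session id,
# column value = the whole session serialized as a JSON blob.
visitant_sessions_blob_cf = {
    "visitant-1": {sid: json.dumps(props) for sid, props in sessions_cf.items()},
}


def filter_blob_sessions(visitant_id, prop, value):
    """Fetch the whole row, decode every blob, then filter on the client."""
    row = visitant_sessions_blob_cf.get(visitant_id, {})
    decoded = {sid: json.loads(blob) for sid, blob in row.items()}
    return {sid: props for sid, props in decoded.items() if props.get(prop) == value}
```

The blob version gets all sessions of a visitant from a single row, but any filtering on session properties happens after everything has been fetched and decoded, which is the concern raised in the top reply.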