Thanks Aaron, I had already paid attention to these slides, and I just looked at them again.
I'm still in the dark about how to efficiently get the number of unique visitors between 2 arbitrary dates (arbitrary because they are chosen by the user). I can easily count them per hour, day, week, or month, but it's harder to produce this statistic between 2 unknown dates, as explained at the start of this thread. Am I missing a clue in these slides?

2012/1/19 aaron morton <aa...@thelastpickle.com>

> Some tips here from Matt Dennis on how to model time series data:
> http://www.slideshare.net/mattdennis/cassandra-nyc-2011-data-modeling
>
> Cheers
> -----------------
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 19/01/2012, at 10:30 PM, Alain RODRIGUEZ wrote:
>
> Hi, thanks for your answer, but I don't want to add another layer on top
> of Cassandra. I have also built all of my application without Countandra
> and I would like to continue this way.
>
> Furthermore, there is a Cassandra modeling problem here that I would like
> to solve, not just hide.
>
> Alain
>
> 2012/1/18 Lucas de Souza Santos <lucas...@gmail.com>
>
>> Why not http://www.countandra.org/ ?
>>
>> Lucas de Souza Santos (ldss)
>>
>> On Wed, Jan 18, 2012 at 3:23 PM, Alain RODRIGUEZ <arodr...@gmail.com> wrote:
>>
>>> I'm wondering how to model my CFs to store the number of unique
>>> visitors in a time period, so that I can query it fast.
>>>
>>> I thought of sharding them by day (row = 20120118, column = visitor_id,
>>> value = '') and performing a get_count. This works for unique visitors
>>> per day, per week, or per month, but it doesn't work if I want unique
>>> visitors between 2 specific dates, because 2 rows can share the same
>>> visitors (the same columns). I can have 1500 unique visitors today and
>>> 1000 unique visitors yesterday, but only 2000 unique visitors when
>>> aggregating these two days.
>>>
>>> I could fetch all the columns for these 2 rows and compute the union
>>> in my client language, but performance won't be good with big data.
>>>
>>> Has someone already thought about this modeling problem?
>>>
>>> Thanks for your help ;)
>>>
>>> Alain
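[Editor's note: to make the aggregation problem in the quoted question concrete, here is a minimal Python sketch. The per-day rows and visitor IDs are illustrative stand-ins for Cassandra rows, not actual client calls: each set below plays the role of one day-sharded row's column names. It shows why per-row counts cannot simply be summed, and why a client-side union over the date range is needed.]

```python
# Simulated day-sharded storage: row key = YYYYMMDD, columns = visitor ids.
# In the thread's model, each set stands for the column names of one row.
rows = {
    "20120117": {"v1", "v2", "v3", "v4"},   # 4 unique visitors yesterday
    "20120118": {"v3", "v4", "v5"},         # 3 unique visitors today
}

def uniques_between(start, end):
    """Count unique visitors between two dates (inclusive) by unioning
    the per-day rows client-side -- correct, but it pulls every column
    over the wire, which is the performance problem raised in the thread."""
    seen = set()
    for day, visitors in rows.items():
        if start <= day <= end:
            seen |= visitors       # deduplicate visitors seen on several days
    return len(seen)

# Summing per-day counts over-counts visitors present on both days:
print(sum(len(v) for v in rows.values()))        # 7 (wrong: v3, v4 counted twice)
print(uniques_between("20120117", "20120118"))   # 5 (correct unique count)
```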