Thanks Aaron, I had already gone through these slides and I just looked at
them again.

I'm still in the dark about how to efficiently get the number of unique
visitors between 2 arbitrary dates (arbitrary because the user chooses them).

I could easily count them per hour, day, week, month... But it's harder to
give this statistic between 2 dates that are not known in advance, because
the same visitor can show up in several periods, as explained at the start
of this thread.
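
To make the problem concrete, here is a minimal Python sketch of why the
per-day counts cannot simply be added up. The `daily_visitors` dictionary and
the function names are made up for illustration; they only stand in for the
day-sharded rows (row = day, columns = visitor ids) and do not use any
Cassandra client API.

```python
from datetime import date, timedelta

# Hypothetical in-memory stand-in for the day-sharded rows
# (row key = day, column names = visitor ids). Not a Cassandra API.
daily_visitors = {
    date(2012, 1, 17): {"alice", "bob", "carol"},
    date(2012, 1, 18): {"bob", "carol", "dave"},
}


def days_between(start, end):
    """Yield every day from start to end, inclusive."""
    current = start
    while current <= end:
        yield current
        current += timedelta(days=1)


def sum_of_daily_counts(rows, start, end):
    """Add up the per-day counts; a returning visitor is counted once per day."""
    return sum(len(rows.get(day, set())) for day in days_between(start, end))


def unique_visitors(rows, start, end):
    """Union the per-day sets so that each visitor is counted only once."""
    seen = set()
    for day in days_between(start, end):
        seen |= rows.get(day, set())
    return len(seen)


start, end = date(2012, 1, 17), date(2012, 1, 18)
print(sum_of_daily_counts(daily_visitors, start, end))  # 6 (bob and carol counted twice)
print(unique_visitors(daily_visitors, start, end))      # 4
```

The union gives the right answer, but it needs every visitor id of every day
in the range on the client side, which is exactly the cost I would like to
avoid with big data.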

Am I missing a clue in these slides?

2012/1/19 aaron morton <aa...@thelastpickle.com>

> Some tips here from Matt Dennis on how to model time series data
> http://www.slideshare.net/mattdennis/cassandra-nyc-2011-data-modeling
>
> Cheers
>  -----------------
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 19/01/2012, at 10:30 PM, Alain RODRIGUEZ wrote:
>
> Hi, thanks for your answer, but I don't want to add another layer on top of
> Cassandra. I have also built my whole application without Countandra and I
> would like to continue this way.
>
> Furthermore there is a Cassandra modeling problem that I would like to
> solve, and not just hide.
>
> Alain
>
> 2012/1/18 Lucas de Souza Santos <lucas...@gmail.com>
>
>> Why not http://www.countandra.org/
>>
>>
>> Lucas de Souza Santos (ldss)
>>
>>
>>
>> On Wed, Jan 18, 2012 at 3:23 PM, Alain RODRIGUEZ <arodr...@gmail.com> wrote:
>>
>>> I'm wondering how to model my CFs to store the number of unique
>>> visitors in a time period so that I can query it quickly.
>>>
>>> I thought of sharding them by day (row = 20120118, column = visitor_id,
>>> value = '') and performing a getcount. This would work to get unique visitors
>>> per day, but it wouldn't work if I want to get unique visitors between
>>> 2 specific dates, because 2 rows can share the same visitors (same
>>> columns). I can have 1500 unique visitors today and 1000 unique visitors
>>> yesterday, but only 2000 unique visitors when aggregating these two days.
>>>
>>> I could get all the columns for these 2 rows and perform an intersect
>>> in my client language, but performance won't be good with big data.
>>>
>>> Has anyone already thought about how to model this?
>>>
>>> Thanks for your help ;)
>>>
>>> Alain
>>>
>>
>>
>
>
