I'm wondering how to model my CFs to store the number of unique visitors
in a time period so that I can query it quickly.

I thought of sharding them by day (row = 20120118, column = visitor_id,
value = '') and performing a getcount. This would work to get unique
visitors per day, per week or per month, but it wouldn't work if I want to
get unique visitors between 2 arbitrary dates, because 2 rows can share the
same visitors (same columns). I can have 1500 unique visitors today and
1000 unique visitors yesterday, but only 2000 unique visitors when
aggregating these two days.
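
To make the problem concrete, here is a small sketch in Python. The visitor
IDs are made up and chosen only to reproduce the numbers above; plain sets
stand in for the columns of each daily row.

```python
# Hypothetical visitor IDs, picked only to match the figures in this post.
yesterday = set(range(0, 1000))    # 1000 unique visitors
today = set(range(500, 2000))      # 1500 unique visitors, 500 seen yesterday too

# Adding the per-row counts double-counts the returning visitors.
sum_of_daily_counts = len(yesterday) + len(today)

# What I actually want: distinct visitors over both days.
unique_over_both_days = len(yesterday | today)

print(sum_of_daily_counts)      # 2500
print(unique_over_both_days)    # 2000
```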

I could fetch all the columns of these 2 rows and merge them on the client
side (a set union of the visitor IDs), but that won't perform well on large
data sets.
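
For reference, this is roughly what I mean by doing it on the client: a
minimal sketch, assuming one row per day keyed as YYYYMMDD. The names
row_keys and unique_visitors are mine, and a plain dict stands in for the
column family, so no real client call is shown.

```python
from datetime import date, timedelta

def row_keys(start, end):
    """Yield one row key per day (YYYYMMDD), both ends inclusive."""
    day = start
    while day <= end:
        yield day.strftime("%Y%m%d")
        day += timedelta(days=1)

def unique_visitors(rows, start, end):
    """Count distinct visitor_id column names across the daily rows.

    `rows` is a dict standing in for the column family:
    row key -> iterable of visitor_id column names.
    """
    seen = set()
    for key in row_keys(start, end):
        # Every column of every row in the range has to be pulled over.
        seen.update(rows.get(key, ()))
    return len(seen)

# Toy data: bob and carol visit on both days.
rows = {
    "20120117": {"alice", "bob", "carol"},
    "20120118": {"bob", "carol", "dave"},
}
count = unique_visitors(rows, date(2012, 1, 17), date(2012, 1, 18))
print(count)
```

The cost is the issue: every visitor_id in the range has to be read and held
in memory on the client, so it grows with both the number of days and the
number of visitors per day.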

Has anyone already worked out a data model for this?

Thanks for your help ;)

Alain
