Hi again. Did you receive my mail? This is the first time I have used this mailing list.
If you received it, has anybody faced this problem?

It looks like this subject is going to be discussed at the Cassandra NYC
meeting:
http://www.datastax.com/2011/11/joe-stein-of-medialets-to-speak-at-cassandra-nyc

Any idea what they are going to say about it, or do I have to wait? Will
the video recording of the conference be public?

Thanks,

Alain

2011/11/4 Alain RODRIGUEZ <arodr...@gmail.com>

> Hi all,
>
> I started this thread in the phpCassa Google group, but I think its place
> is here.
>
> Here is my first post:
>
> "I was wondering about a specific point of Cassandra modeling.
>
> If I need to know the number of connections to my website using each
> browser, every hour, I can do:
>
> Row key: $browser, column key: date('YmdH', $timestamp), value: counter.
>
> I can increment this counter on every visit; this should work. The point
> is that I want to be able to render the results using many statistics as
> filters.
>
> I mean, I will have information such as browser, browser version, screen
> resolution, OS, OS version, location... And I want to allow users to get
> data (number of views) filtered as much as they want.
>
> For example, if I want to know how many people visited my website with
> Safari, on Windows, and from New York, every hour, I can store:
>
> Row key: $browser:$os:$localization, column key: date('YmdH',
> $timestamp), value: counter.
>
> This can't be the best solution because, according to combinatorics, I
> will have to store n! counters to be able to store data with all filters.
> If I have 10 filters I will increment 3 628 800 counters.
>
> That's not the right solution, for sure. How am I supposed to model this
> to be able to read data with any filter I want?
>
> Thanks,
>
> Alain"
>
> And here is the first answer given (thanks to Tyler Hobbs):
>
> "Technically, the number of potential different counters would be the
> cardinality of each field multiplied together. (Since one of the fields
> holds a time, this number would continue to grow.) However, in practice
> you'll have far fewer than this number of counters, because not every
> possible combination of these will happen.
>
> > That's not the right solution, for sure. How am I supposed to model
> > this to be able to read data with any filter I want?
>
> It's a reasonable solution if you want to be able to drill down and filter
> by any attribute. If you want to be able to filter based on all of these
> attributes, you have to store that information about every request in one
> way or another."
>
> I know it's a non-trivial problem, but I'm sure that some people have
> already faced this problem before me.
>
> I'll allow users to filter however they want, choosing dimensions with
> checkboxes. They will be able to combine dimensions and ask for any
> combination.
>
> So, with this solution, I will have to store every event n times, with n =
> number of possible combinations.
>
> I saw this yesterday: http://t.co/EXL6yAO8 (thanks to Dave Gardner).
> This company seems to do something equivalent to the idea described in my
> first post...
>
> Any experience to share with this kind of problem?
>
> Thank you,
>
> Alain
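
To make the counting in the quoted post concrete, here is a minimal Python sketch of the "one counter per filter combination" idea. It is not phpCassa code and does not talk to Cassandra: a plain `Counter` stands in for the counter columns, and the `name=value` key format, the `DIMENSIONS` tuple and the function names are hypothetical choices made for illustration. One point it suggests: if the fields are always joined in one fixed order, an event touches one key per non-empty subset of its dimensions, which is 2^n - 1 keys (1023 for 10 dimensions) rather than n!.

```python
import time
from collections import Counter
from itertools import combinations

# Hypothetical, fixed dimension order: each subset then maps to exactly one key.
DIMENSIONS = ("browser", "os", "localization")


def row_keys(event):
    """Return one row key per non-empty subset of the event's dimensions."""
    keys = []
    for size in range(1, len(DIMENSIONS) + 1):
        for subset in combinations(DIMENSIONS, size):
            keys.append(":".join(f"{name}={event[name]}" for name in subset))
    return keys


def column_key(timestamp):
    """Hourly bucket in the style of PHP's date('YmdH', $timestamp).

    Using UTC here is an assumption; PHP's date() uses the configured time zone.
    """
    return time.strftime("%Y%m%d%H", time.gmtime(timestamp))


def record_visit(counters, event, timestamp):
    """Increment every (row key, column key) counter touched by one visit."""
    column = column_key(timestamp)
    for key in row_keys(event):
        counters[(key, column)] += 1


# In-memory stand-in for the counter column family.
counters = Counter()
visit = {"browser": "Safari", "os": "Windows", "localization": "New York"}
record_visit(counters, visit, 0)
```

With this layout, "Safari, Windows, New York, every hour" is a read of the single row `browser=Safari:os=Windows:localization=New York`. As Tyler Hobbs notes in the quoted reply, rows only come into existence for combinations of values that actually occur.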