Sorry, I made a mistake in topics-seen! When you insert, it should be:

topics-seen[topic:TopicX:timestampN] = {TimeUUID3:whatever}
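
To make the corrected layout concrete, here is a minimal in-memory sketch in Python. It does not talk to Cassandra: the two dictionaries only stand in for the active-topics and topics-seen column families, and the helper names (bucket, record_visit, most_viewed) and the 15-minute bucket size are assumptions made for illustration.

```python
import uuid
from collections import defaultdict

# In-memory stand-ins for the two column families: row key -> {column name: value}.
active_topics = defaultdict(dict)  # column names are topic names (UTF8Type)
topics_seen = defaultdict(dict)    # column names are TimeUUIDs, one per visit


def bucket(ts, minutes=15):
    """Truncate a Unix timestamp to the start of its time bucket (timestampN)."""
    size = minutes * 60
    return int(ts) - int(ts) % size


def record_visit(topic, ts):
    """Insert one visit into both column families."""
    n = bucket(ts)
    # active-topics[topics:timestampN] = {TopicX: whatever}
    active_topics[f"topics:{n}"][topic] = ""
    # topics-seen[topic:TopicX:timestampN] = {TimeUUID: whatever}
    topics_seen[f"topic:{topic}:{n}"][uuid.uuid1()] = ""


def most_viewed(ts, limit=10):
    """Slice the active topics for the bucket, then count columns per topic."""
    n = bucket(ts)
    topics = active_topics.get(f"topics:{n}", {})
    counts = {t: len(topics_seen.get(f"topic:{t}:{n}", {})) for t in topics}
    return sorted(counts.items(), key=lambda kv: kv[1], reverse=True)[:limit]
```

Calling record_visit for Topic1, Topic2 and Topic1 again inside the same bucket leaves two TimeUUID columns in the Topic1 row and one in the Topic2 row, so most_viewed ranks Topic1 first.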
Sorry about that,
Victor

2011/5/18 openvictor Open <openvic...@gmail.com>

> I guess you can use the same system. You need two CFs for that, and I think
> it's better to use 0.8 because it supports counters:
>
> One CF with UTF8Type called active-topics and one CF with UUIDType called
> topics-seen. Then, using the same principle:
>
> For each timestampN, for each visit to Topic1, Topic2, Topic1,
> you create a TimeUUID and you insert:
>
> active-topics[topics:timestampN] = {Topic1:whateveryouwant}
> and:
> topics-seen[topic:Topic1]={TimeUUID1:whatever}
>
> active-topics[topics:timestampN] = {Topic2:whateveryouwant}
> and:
> topics-seen[topic:Topic2]={TimeUUID2:whatever}
>
> active-topics[topics:timestampN] = {Topic1:whateveryouwant}
> and:
> topics-seen[topic:Topic1]={TimeUUID3:whatever}
>
> Then, when you want to query, you first query all the topics (slice) in
> active-topics for topics:timestampN, and then you get all counts in the
> topics-seen CF for all topics in active-topics.
>
> Not so simple... By the way, it adds overhead compared to a simple counter
> solution, but I think it is far more elegant. This is just my opinion.
>
> Victor
>
> 2011/5/18 Aditya Narayan <ady...@gmail.com>
>
>> Thanks Victor!
>>
>> Aren't there any good ways of doing this with Cassandra alone?
>>
>> On Wed, May 18, 2011 at 11:41 PM, openvictor Open
>> <openvic...@gmail.com> wrote:
>>
>>> Have you thought about using another kind of database, one which supports
>>> volatile content, for example?
>>>
>>> I am currently thinking about doing something similar. The best and
>>> simplest option that I can think of at the moment is Redis. In Redis you
>>> have the option of querying keys with wildcards. Your problem can be solved by
>>> just inserting a UUID into Redis for a certain amount of time (the best is
>>> to tailor this amount of time as an inverse function of the number of keys
>>> existing in Redis).
>>>
>>> *With Redis*
>>> What I would do: I cut time into pieces of X minutes (15 minutes,
>>> for example) by truncating a timestamp. Let timestampN be the timestamp for
>>> the period of time ([N,N+15]), and let Topic1 and Topic2 be two topics. Then:
>>>
>>> One or more people will view Topic1, then Topic2, then Topic1 again in
>>> this period of 15 minutes
>>> (HINCRBY <http://redis.io/commands/hincrby> is the increment):
>>>
>>> HINCRBY topics:Topic1:timestampN viewcount 1
>>> HINCRBY topics:Topic2:timestampN viewcount 1
>>> HINCRBY topics:Topic1:timestampN viewcount 1
>>>
>>> Then you just query in the following way:
>>>
>>> KEYS <http://redis.io/commands/keys> topics:*:timestampN
>>>
>>> * is the wildcard; you read viewcount from each matching key with HGET,
>>> order by viewcount, and you have what you are asking for!
>>> This is a simplified version of what you should do, but personally I
>>> really like the combination of Cassandra and Redis.
>>>
>>> Victor
>>>
>>> 2011/5/18 Aditya Narayan <ady...@gmail.com>
>>>
>>>> I would arrange the memtable flush period so that the time
>>>> period for which these most viewed discussions are generated equals the
>>>> memtable flush period, so that the entire row of most viewed discussions
>>>> on a topic is in one or at most two memtables/SSTables.
>>>> This would also help minimize the number of versions of the same column
>>>> spread across parts of the row in different SSTables.
>>>>
>>>> On Wed, May 18, 2011 at 11:04 PM, Aditya Narayan <ady...@gmail.com> wrote:
>>>>
>>>>> *************
>>>>> For a discussions forum, I need to show a page of most viewed
>>>>> discussions.
>>>>>
>>>>> To implement this, I maintain a count of views of a discussion, and
>>>>> when this view count passes a certain threshold, the
>>>>> discussion ID is added to a row of most viewed discussions.
>>>>>
>>>>> This row of most viewed discussions contains columns with integer names
>>>>> and values containing serialized lists of the IDs of all discussions whose
>>>>> view count equals the integer name of this column.
>>>>>
>>>>> Thus, if the view count of a discussion increases, I'll need to move its
>>>>> ID from the serialized list in one column to the serialized list in another
>>>>> column whose name represents the updated view count of that discussion.
>>>>>
>>>>> Thus I can get the most viewed discussions by getting the appropriate
>>>>> number of columns from one end of this integer-sorted row.
>>>>>
>>>>> ************
>>>>>
>>>>> I wanted to get feedback from you all, to know if this is a good
>>>>> design.
>>>>>
>>>>> Thanks
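
For the Redis variant quoted above (time-bucketed keys incremented with HINCRBY), here is a minimal in-memory sketch in Python. It does not connect to Redis: the store dictionary and the hincrby and keys functions only imitate the corresponding commands, and record_view, most_viewed and the 15-minute bucket are hypothetical names and values chosen for illustration. Note that in real Redis, MGET does not expand wildcards, so the sketch imitates a KEYS pattern match followed by a per-key read of viewcount.

```python
from collections import defaultdict
from fnmatch import fnmatchcase

# In-memory stand-in for Redis hashes: key -> {field: integer value}.
store = defaultdict(lambda: defaultdict(int))


def bucket(ts, minutes=15):
    """Truncate a Unix timestamp to the start of its time bucket (timestampN)."""
    size = minutes * 60
    return int(ts) - int(ts) % size


def hincrby(key, field, amount=1):
    """Imitates HINCRBY: increment a hash field and return the new value."""
    store[key][field] += amount
    return store[key][field]


def keys(pattern):
    """Imitates KEYS: return every key matching a glob-style pattern."""
    return [k for k in store if fnmatchcase(k, pattern)]


def record_view(topic, ts):
    """HINCRBY topics:TopicX:timestampN viewcount 1"""
    hincrby(f"topics:{topic}:{bucket(ts)}", "viewcount", 1)


def most_viewed(ts, limit=10):
    """Find all topic keys for the bucket, read viewcount, sort descending."""
    n = bucket(ts)
    prefix, suffix = "topics:", f":{n}"
    counts = []
    for key in keys(f"{prefix}*{suffix}"):
        topic = key[len(prefix):-len(suffix)]
        counts.append((topic, store[key]["viewcount"]))
    counts.sort(key=lambda kv: kv[1], reverse=True)
    return counts[:limit]
```

Recording views for Topic1, Topic2 and Topic1 again within one bucket and then calling most_viewed for that bucket ranks Topic1 ahead of Topic2. On a real server, giving each bucketed key an expiry would keep old buckets from piling up, which fits the "volatile content" idea in the quoted message.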