On Thu, Mar 18, 2010 at 4:08 PM, Muhammed Nasrullah <nasrul...@gmail.com> wrote:
> Hello folks,
>
> Twissandra <http://twissandra.com/> (Twitter clone example for Cassandra)
> has a public page where every public update/tweet is stored in a column
> family under the key !public! like so:
>
> Userline = {
>     '!public!': {
>         # timestamp of tweet: tweet id
>         1267414247561777: '7561a442-24e2-11df-8924-001ff3591711',
>         1267414277402340: 'f0c8d718-24e2-11df-8924-001ff3591711',
>         1267414305866969: 'f9e6d804-24e2-11df-8924-001ff3591711',
>         1267414319522925: '02ccb5ec-24e3-11df-8924-001ff3591711',
>     },
> }
>
> My question is, because this is the public timeline, it will get a lot of
> updates and because this is a single row keyed by '!public!', this won't fit
> in memory eventually. Is there a better way to model this? The problem is
> that the data needs to be retrieved in reverse chronological order,
> something which cannot be done while getting a range of keys without knowing
> the start and finish keys in advance.

The rows could be named and partitioned by date/time, which can be known in
advance. For example, '!public!20100318' could contain the public timeline
for that day.

-Brandon
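
A rough sketch of the day-bucketing idea, in Python, with a plain dict
standing in for the Userline column family. The helper names (bucket_key,
add_tweet, recent_tweets), the UTC day boundary, and the max_days cutoff are
assumptions made for illustration. None of this is Twissandra's code or a
Cassandra client API. The point is only that the row keys can be computed
from the date, so the reader can walk them newest-first without knowing the
stored keys in advance.

```python
from datetime import datetime, timedelta, timezone

PUBLIC_PREFIX = '!public!'  # key prefix from the Userline example above


def bucket_key(ts_micros):
    """Return the per-day row key for a tweet timestamp in microseconds.

    Assumes timestamps are microseconds since the Unix epoch and that
    days are cut at UTC midnight (both are assumptions of this sketch).
    """
    day = datetime.fromtimestamp(ts_micros // 1_000_000, tz=timezone.utc)
    return PUBLIC_PREFIX + day.strftime('%Y%m%d')


def add_tweet(userline, ts_micros, tweet_id):
    """Store the tweet id under the row for its day."""
    userline.setdefault(bucket_key(ts_micros), {})[ts_micros] = tweet_id


def recent_tweets(userline, now, limit, max_days=30):
    """Collect up to `limit` tweet ids, newest first.

    Walks the day rows backwards from `now`; each row key is derived
    from the date, so no key-range scan is needed.
    """
    results = []
    for offset in range(max_days):
        day = now - timedelta(days=offset)
        row = userline.get(PUBLIC_PREFIX + day.strftime('%Y%m%d'), {})
        # Within a row, columns are read in descending timestamp order.
        for ts in sorted(row, reverse=True):
            results.append(row[ts])
            if len(results) == limit:
                return results
    return results


userline = {}
add_tweet(userline, 1267414247561777, '7561a442-24e2-11df-8924-001ff3591711')
add_tweet(userline, 1267414319522925, '02ccb5ec-24e3-11df-8924-001ff3591711')
latest = recent_tweets(userline, datetime(2010, 3, 2, tzinfo=timezone.utc), 1)
```

With one row per day no single row grows without bound, and a page of the
timeline usually touches only the most recent row or two. A busier timeline
could use a finer bucket (per hour, say) with the same approach.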