http://twitter.com/jromeh/status/12295736793
-----Original Message-----
From: "Mike Gallamore" <mike.e.gallam...@googlemail.com>
Sent: Friday, April 16, 2010 3:46pm
To: user@cassandra.apache.org
Subject: Re: Regarding Cassandra Scalability

Also people with 1M followers tend to have "public" tweets, which means really I think it would be the same as subscribing to an RSS feed or whatever. You aren't getting a local copy because you will "always" have access to the tweet, as will everyone else. Also tweets don't change AFAIK, so no point in having redundant copies.

On 04/16/2010 01:42 PM, Peter Chang wrote:
> Yeah. I wasn't sure if Cassandra was optimized for binary data,
> especially since any site of that size will use a CDN. Interesting
> read though.
>
> I think 1K per tweet is off by an order of magnitude considering they
> only allow 140 characters. Regardless, the number of users with >1MM
> is probably a handful. Also I'm guessing they purge data after a
> certain window (like 30 days for example).
>
> Sent from my iPhone
>
>
> On Apr 16, 2010, at 12:02 PM, gabriele renzi <rff....@gmail.com> wrote:
>
>> On Fri, Apr 16, 2010 at 6:41 PM, Peter Chang <pete...@gmail.com> wrote:
>>
>>> FB also does pics and movies so 1MB is way off depending on where
>>> they manage such binary data.
>>>
>> apparently not in cassandra
>> http://www.facebook.com/note.php?note_id=76191543919
>>
>>
>>> I do agree that 1MB of text alone is a lot of text
>>> which is more relevant in the case of Twitter. The only large thing
>>> you leave out is denormalization. Every tweet you write is likely
>>> denormalized across your followers to allow for quick read access.
>>>
>> .. but considering many users have _millions_ of followers, this may
>> be quite a bit more data. Assuming 1k per tweet, this would mean one
>> from @aplusk (4.7M followers) would take more than 4 gigabytes of
>> data. Assuming ten tweets a day, in one month he'd produce one TB.
>>
>> I'd say they only store references (increasing number lists can also
>> be encoded very cleverly), or in some other way I'm not smart enough
>> to think of.
>>
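
The fan-out estimate in the quoted message can be checked with a few lines of arithmetic. What follows is a minimal Python sketch, not a description of what Twitter or Cassandra actually does. The follower count, the 1k tweet size, and the ten tweets a day come from the thread; the 30-day month, the 8-byte ID size, and every function name are assumptions made for illustration. The second half sketches one common way an increasing list of IDs can be stored compactly (delta encoding followed by a base-128 variable-length integer); the thread does not say which encoding was meant.

```python
# Figures quoted in the thread
FOLLOWERS = 4_700_000    # @aplusk follower count
TWEET_BYTES = 1_000      # "assuming 1k per tweet" (decimal kilobyte)
TWEETS_PER_DAY = 10

# Assumptions, not from the thread
DAYS_PER_MONTH = 30
ID_BYTES = 8             # one 64-bit tweet ID stored per follower timeline


def fanout_bytes(followers: int, bytes_per_entry: int) -> int:
    """Bytes written when one tweet is fanned out to every follower."""
    return followers * bytes_per_entry


def monthly_bytes(bytes_per_tweet: int) -> int:
    """Bytes written per month at TWEETS_PER_DAY tweets a day."""
    return bytes_per_tweet * TWEETS_PER_DAY * DAYS_PER_MONTH


def encode_varint(n: int) -> bytes:
    """Encode a non-negative int in base 128, low 7 bits first."""
    out = bytearray()
    while True:
        low7 = n & 0x7F
        n >>= 7
        if n:
            out.append(low7 | 0x80)  # high bit set: more bytes follow
        else:
            out.append(low7)
            return bytes(out)


def encode_increasing(ids: list[int]) -> bytes:
    """Store a non-decreasing ID list as varint-encoded gaps."""
    out = bytearray()
    prev = 0
    for value in ids:
        if value < prev:
            raise ValueError("ids must be non-negative and non-decreasing")
        out += encode_varint(value - prev)
        prev = value
    return bytes(out)


def decode_increasing(data: bytes) -> list[int]:
    """Inverse of encode_increasing."""
    ids, prev, gap, shift = [], 0, 0, 0
    for byte in data:
        gap |= (byte & 0x7F) << shift
        if byte & 0x80:
            shift += 7               # continuation byte
        else:
            prev += gap              # gap complete: rebuild the ID
            ids.append(prev)
            gap, shift = 0, 0
    return ids


if __name__ == "__main__":
    full_copy = fanout_bytes(FOLLOWERS, TWEET_BYTES)
    reference = fanout_bytes(FOLLOWERS, ID_BYTES)
    print("full copy per tweet :", full_copy, "bytes")
    print("full copy per month :", monthly_bytes(full_copy), "bytes")
    print("reference per tweet :", reference, "bytes")
    print("reference per month :", monthly_bytes(reference), "bytes")
```

Under these assumptions, a full copy per follower works out to 4.7 GB per tweet and about 1.41 TB per month, while an 8-byte reference per follower is 37.6 MB per tweet. Closely spaced IDs shrink further once stored as gaps, since each small gap fits in a single byte.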