On Fri, Apr 16, 2010 at 6:41 PM, Peter Chang <pete...@gmail.com> wrote:
> FB also does pics and movies so 1MB is way off depending on where they
> manage such binary data.

Apparently not in Cassandra: http://www.facebook.com/note.php?note_id=76191543919

> I do agree that 1MB of text alone is a lot of text
> which is more relevant in the case of Twitter. The only large thing you
> leave out is denormalization. Every tweet you write is likely denormalized
> across your followers to allow for quick read access.

... but considering that many users have _millions_ of followers, this may be quite a bit more data. Assuming 1k per tweet, a single tweet from @aplusk (4.7M followers) would take more than 4 gigabytes if it were copied to every follower. At ten tweets a day, he'd produce more than one TB in a month (roughly 1.4 TB).

I'd say they only store references (increasing number lists can also be encoded very cleverly), or do it in some other way I'm not smart enough to think of.
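
To make the numbers above concrete, here is a rough Python sketch of the arithmetic, plus one possible way to encode an increasing list of ids compactly. Everything in it is an assumption for illustration: the 8-byte reference size, the function names, and the delta-plus-varint scheme are mine, and nothing here is known to be what Twitter actually does.

```python
# Back-of-the-envelope sketch. Sizes come from the thread or are assumed.
TWEET_BYTES = 1_000        # "1k per tweet" (taken as 1000 bytes)
REF_BYTES = 8              # assumed size of a plain tweet-id reference
FOLLOWERS = 4_700_000      # @aplusk follower count quoted in the thread
TWEETS_PER_DAY = 10
DAYS_PER_MONTH = 30


def fanout_bytes(payload_bytes, followers):
    """Bytes written if one payload is copied once per follower."""
    return payload_bytes * followers


def encode_varint(n):
    """Encode a non-negative int, 7 bits per byte, high bit = 'more follows'."""
    out = bytearray()
    while True:
        low = n & 0x7F
        n >>= 7
        if n:
            out.append(low | 0x80)
        else:
            out.append(low)
            return bytes(out)


def encode_increasing(ids):
    """Store each id as the gap to the previous one, varint-encoded."""
    out = bytearray()
    prev = 0
    for i in ids:
        out += encode_varint(i - prev)
        prev = i
    return bytes(out)


def decode_increasing(data):
    """Inverse of encode_increasing."""
    ids, prev, cur, shift = [], 0, 0, 0
    for b in data:
        cur |= (b & 0x7F) << shift
        if b & 0x80:
            shift += 7          # more bytes belong to this gap
        else:
            prev += cur         # gap complete: rebuild the absolute id
            ids.append(prev)
            cur, shift = 0, 0
    return ids


full_copy_per_tweet = fanout_bytes(TWEET_BYTES, FOLLOWERS)
full_copy_per_month = full_copy_per_tweet * TWEETS_PER_DAY * DAYS_PER_MONTH
reference_per_tweet = fanout_bytes(REF_BYTES, FOLLOWERS)

print(full_copy_per_tweet / 1e9, "GB per tweet when the text is copied")
print(full_copy_per_month / 1e12, "TB per month when the text is copied")
print(reference_per_tweet / 1e6, "MB per tweet when only an id is copied")

timeline = [1000, 1003, 1010, 5000]   # made-up increasing tweet ids
packed = encode_increasing(timeline)
print(len(packed), "bytes packed vs", len(timeline) * REF_BYTES, "bytes raw")
```

Under these assumptions, copying only an id per follower cuts the per-tweet write by the ratio of tweet size to reference size, and gap encoding shrinks each stored list further when the ids are close together.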