This is really interesting… I can track them down, but there are a number of references to Cassandra HAVING vector clocks. If it doesn't actually use them, that would explain why I can't find out how much memory they are using :-P

"Cassandra: The Definitive Guide", which I was reading the other night, says that they were introduced in 0.7 but that they're still figuring out what to do with them:

http://books.google.com/books?id=MKGSbCbEdg0C&pg=PA50&lpg=PA50&dq=Cassandra's+clock+was+introduced+in+version+0.7,+but+its+fate+is+uncertain&source=bl&ots=XoQz3tFa1C&sig=Lhdu5j1xRcTPmP4-YQONhxzfRTU&hl=en&ei=MzdVTurWEJTSiAKU5vXoDA&sa=X&oi=book_result&ct=result&resnum=1&ved=0CBkQ6AEwAA#v=onepage&q&f=false

So… are 'timestamps' pruned? Even this mechanism seems like it will dominate the amount of memory used in Cassandra. I could see many installs requiring 2-3x more memory to run Cassandra unless there is a pruning mechanism or some way to minimize their use.

Kevin

On Wed, Aug 24, 2011 at 9:05 AM, Ryan King <r...@twitter.com> wrote:

> On Tue, Aug 23, 2011 at 7:58 PM, Kevin Burton <bur...@spinn3r.com> wrote:
>
>> I had a thread going the other day about vector clock memory usage and
>> that it is a series of (clock id, clock):ts and the ability to prune old
>> entries … I'm specifically curious here how often old entries are pruned.
>>
>> If you're storing small columns within cassandra. Say just an integer.
>> The vector clock overhead could easily use up far more data than is
>> actually in your database.
>>
>> However, if they are pruned, then this shouldn't really be a problem.
>>
>> How much memory is this wasting?
>>
>
> I think there is some confusion here: cassandra doesn't use vector clocks.
>
> -ryan
>
>
>> Thoughts?
>>
>>
>> Jonathan Ellis jbel...@gmail.com to user, Aug 19:
>> The problem with naive last write wins is that writes don't always
>> arrive at each replica in the same order. So no, that's a
>> non-starter.
>>
>> Vector clocks are a series of (client id, clock) entries, and usually
>> a timestamp so you can prune old entries. Obviously implementations
>> can vary, but to pick a specific example, Voldemort [1] uses 2 bytes
>> per client id, a variable number (at least one) of bytes for the
>> clock, and 8 bytes for the timestamp.
>>
>> [1]
>> https://github.com/voldemort/voldemort/blob/master/src/java/voldemort/versioning/VectorClock.java
>>
>

--

Founder/CEO Spinn3r.com

Location: *San Francisco, CA*
Skype: *burtonator*

Skype-in: *(415) 871-0687*
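
For a rough sense of the sizing question raised in the thread, here is a back-of-the-envelope sketch using the byte counts quoted for Voldemort (2 bytes per client id, at least 1 byte per clock, 8 bytes for the timestamp). The function names, the assumption of one timestamp per vector clock, and the 4-byte integer column are all made up for illustration; this is not Voldemort's or Cassandra's actual serialization code, and a real format would likely add header or length bytes on top.

```python
def vector_clock_bytes(num_entries, clock_bytes=1, id_bytes=2, timestamp_bytes=8):
    """Approximate size in bytes of one vector clock.

    Assumptions (not taken from any real implementation):
    - every (client id, clock) entry costs id_bytes + clock_bytes
    - there is a single timestamp for the whole clock
    - no header or length-prefix bytes are counted
    """
    if num_entries < 0:
        raise ValueError("num_entries must be non-negative")
    if clock_bytes < 1:
        raise ValueError("a clock needs at least one byte")
    return num_entries * (id_bytes + clock_bytes) + timestamp_bytes


def overhead_ratio(value_bytes, num_entries, clock_bytes=1):
    """Vector clock bytes divided by the bytes of the stored value."""
    if value_bytes <= 0:
        raise ValueError("value_bytes must be positive")
    return vector_clock_bytes(num_entries, clock_bytes) / value_bytes


if __name__ == "__main__":
    # Hypothetical case: a 4-byte integer column written by 1, 3, or 10 clients.
    for entries in (1, 3, 10):
        size = vector_clock_bytes(entries)
        ratio = overhead_ratio(4, entries)
        print(f"{entries} entries: {size} bytes of clock, {ratio:.2f}x the value size")
```

Under these assumptions the clock is already larger than a small integer value with a single entry, and the gap grows with every additional writer unless old entries are pruned.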