When that book was written (it was finalized about a year ago), vector clocks were planned. They were removed in August or September of last year, and 0.7 was released in January without them. The ticket for vector clocks is here, and you can see the reasoning for not using them at the bottom: https://issues.apache.org/jira/browse/CASSANDRA-580
On Aug 24, 2011, at 12:41 PM, Kevin Burton wrote:

> This is really interesting… I can track it down but there are a number of
> references to Cassandra HAVING vector clocks … which would make sense that I
> can't find out how much memory they are using :-P
>
> "Cassandra: The Definitive Guide" … which I was reading the other night says
> that they were introduced in 0.7 but that they're still figuring out what to
> do with them:
>
> http://books.google.com/books?id=MKGSbCbEdg0C&pg=PA50&lpg=PA50&dq=Cassandra's+clock+was+introduced+in+version+0.7,+but+its+fate+is+uncertain&source=bl&ots=XoQz3tFa1C&sig=Lhdu5j1xRcTPmP4-YQONhxzfRTU&hl=en&ei=MzdVTurWEJTSiAKU5vXoDA&sa=X&oi=book_result&ct=result&resnum=1&ved=0CBkQ6AEwAA#v=onepage&q&f=false
>
> … so… are 'timestamps' pruned?
>
> Even this mechanism seems like it will dominate the amount of memory used in
> Cassandra. I could see many installs requiring 2-3x more memory to run
> Cassandra unless there is a pruning mechanism or some way to minimize their
> use.
>
> Kevin
>
> On Wed, Aug 24, 2011 at 9:05 AM, Ryan King <r...@twitter.com> wrote:
> On Tue, Aug 23, 2011 at 7:58 PM, Kevin Burton <bur...@spinn3r.com> wrote:
> I had a thread going the other day about vector clock memory usage and that
> it is a series of (clock id, clock):ts and the ability to prune old entries …
> I'm specifically curious here how often old entries are pruned.
>
> If you're storing small columns within cassandra. Say just an integer. The
> vector clock overhead could easily use up far more data than is actually in
> your database.
>
> However, if they are pruned, then this shouldn't really be a problem.
>
> How much memory is this wasting?
>
> I think there is some confusion here– cassandra doesn't use vector clocks.
>
> -ryan
>
> Thoughts?
>
> Jonathan Ellis jbel...@gmail.com to user, Aug 19 (4 days ago):
> The problem with naive last write wins is that writes don't always
> arrive at each replica in the same order. So no, that's a
> non-starter.
>
> Vector clocks are a series of (client id, clock) entries, and usually
> a timestamp so you can prune old entries. Obviously implementations
> can vary, but to pick a specific example, Voldemort [1] uses 2 bytes
> per client id, a variable number (at least one) of bytes for the
> clock, and 8 bytes for the timestamp.
>
> [1]
> https://github.com/voldemort/voldemort/blob/master/src/java/voldemort/versioning/VectorClock.java
>
> --
> Founder/CEO Spinn3r.com
>
> Location: San Francisco, CA
> Skype: burtonator
> Skype-in: (415) 871-0687
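
For a rough sense of the overhead discussed in the quoted thread, here is a small Python sketch that plugs in the Voldemort figures quoted above (2 bytes per client id, at least one byte per clock, 8 bytes for the timestamp). It is only an estimate under stated assumptions: the function names are made up for illustration, the timestamp is assumed to be stored once per clock rather than once per entry, each counter is assumed to take the fewest whole bytes that hold it, and any header bytes are ignored. It says nothing about Cassandra itself, which does not use vector clocks.

```python
from typing import Mapping

# Figures quoted in the thread for Voldemort's vector clocks.
CLIENT_ID_BYTES = 2
TIMESTAMP_BYTES = 8  # assumption: one timestamp per clock, not per entry


def clock_bytes(counter: int) -> int:
    """Fewest whole bytes that can hold the counter (never less than one)."""
    return max(1, (counter.bit_length() + 7) // 8)


def vector_clock_bytes(entries: Mapping[int, int]) -> int:
    """Estimated size of a clock given {client_id: counter} entries."""
    per_entry = sum(CLIENT_ID_BYTES + clock_bytes(c) for c in entries.values())
    return per_entry + TIMESTAMP_BYTES


# Hypothetical example: three clients have written to the same value.
example = {1: 5, 2: 300, 3: 1}
estimate = vector_clock_bytes(example)
print(estimate)
```

With those three entries the estimate comes to 18 bytes, against 4 bytes for a 32-bit integer column, which is the kind of ratio the original question was worried about.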