> I'm very fresh to Cassandra and just read some relevant documentations.
> It seems each time when a client wants to insert data to Cassandra cluster,
> the client also need to assign a timestamp. Then Cassandra will keep the
> timestamp and it will be used to determine which copy is the latest and
> should be returned based on CL level when client issues a query, right?

Yes (the same reconciliation logic is also used elsewhere, e.g. in
anti-entropy (nodetool repair)).

> My question is if we have many clients, should all the clients be time
> synchronized? Is it the clients responsibility? If the clients does not time
> synchronized, the Cassandra might returned wrong row?

Yes, yes, sort of. If clients are out of sync with respect to time, the wrong
version of the data may end up getting stored by Cassandra. However, the
clock on the client doing the *read* does not affect what it sees.

Note, however, that the need for clock synchronization is often less of a
problem than it might first appear. If you have strong synchronization
requirements such that you cannot afford to have races at all, you will need
a separate synchronization mechanism anyway (or you will need to change the
data model to handle it). Clocks should be synchronized, yes, but no matter
how well the clocks are synchronized, that alone will never give you the
measurable ability to control "who wins" in the event of a race.

(I don't know about plans for vector clock support; I haven't heard much
about it lately. I'll let someone else respond to that.)

--
/ Peter Schuller
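
To illustrate the timestamp reconciliation discussed above, here is a minimal
sketch in Python. It is not Cassandra's actual code: the names Cell and
reconcile are hypothetical, the timestamps are arbitrary microsecond values,
and the tie-break on value is an assumption made only so the sketch is
deterministic. It shows how a write from a client whose clock runs behind can
lose to an earlier write, regardless of who reads the data afterwards.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    """One stored version of a column (hypothetical model for illustration)."""
    value: str
    timestamp: int  # supplied by the writing client, in microseconds


def reconcile(a: Cell, b: Cell) -> Cell:
    """Last-write-wins: keep the version with the higher client timestamp.

    Ties are broken on the value so the result does not depend on argument
    order (an assumption of this sketch).
    """
    return max(a, b, key=lambda c: (c.timestamp, c.value))


REAL_TIME_US = 1_000_000_000   # arbitrary "true" time of the first write
SKEW_US = 5_000_000            # client B's clock runs 5 seconds behind

# Client A writes first, with an accurate clock.
write_a = Cell("from A", REAL_TIME_US)

# Client B writes 2 seconds later in real time, but stamps the write
# with its skewed clock, so the timestamp is lower than A's.
write_b = Cell("from B", REAL_TIME_US + 2_000_000 - SKEW_US)

winner = reconcile(write_a, write_b)
print(winner.value)  # prints "from A": the later write lost the race
```

Note that nothing about the reading client appears in reconcile; only the
timestamps attached at write time decide which version survives.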