>The book is wrong, at least by current versions of Cassandra (I'm
>basing that on the quote you pasted, I don't know the context).
To be sure that I didn't misunderstand (English is not my mother tongue), here is what the entire "repair paragraph" says:

Basic Maintenance

There are a few tasks that you’ll need to perform before or after more impactful tasks. For example, it makes sense to take a snapshot only after you’ve performed a flush. So in this section we look at some of these basic maintenance tasks: repair, snapshot, and cleanup.

Repair

Running nodetool repair causes Cassandra to execute a major compaction. A Merkle tree of the data on the target node is computed, and the Merkle tree is compared with those of other replicas. This step makes sure that any data that might be out of sync with other nodes isn’t forgotten.

During a major compaction (see “Compaction” in the Glossary), the server initiates a TreeRequest/TreeResponse conversation to exchange Merkle trees with neighboring nodes. The Merkle tree is a hash representing the data in that column family. If the trees from the different nodes don’t match, they have to be reconciled (or “repaired”) in order to determine the latest data values they should all be set to. This tree comparison validation is the responsibility of the org.apache.cassandra.service.AntiEntropyService class. AntiEntropyService implements the Singleton pattern and defines the static Differencer class as well, which is used to compare two trees. If it finds any differences, it launches a repair for the ranges that don’t agree.

So although Cassandra takes care of such matters automatically on occasion, you can run it yourself as well.

>
>nodetool repair must be scheduled by the operator to run regularly.
>The name "repair" is a bit unfortunate; it is not meant to imply that
>it only needs to run when something is "wrong".
>
>--
>/ Peter Schuller
>
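
For reference, the tree comparison described in the book excerpt can be sketched roughly as below. This is only a toy illustration of the general Merkle tree idea, not Cassandra's AntiEntropyService code: the function names (range_index, leaf_hashes, build_tree, differing_ranges), the use of SHA-256, and the fixed number of ranges are all assumptions made for the example.

```python
import hashlib


def range_index(key, num_ranges):
    """Map a row key to a bucket (a stand-in for a token range)."""
    digest = hashlib.sha256(key.encode()).digest()
    return int.from_bytes(digest, "big") % num_ranges


def leaf_hashes(rows, num_ranges):
    """Hash every row into the leaf that covers its range."""
    leaves = [hashlib.sha256() for _ in range(num_ranges)]
    for key in sorted(rows):  # sorted so both replicas hash in the same order
        leaves[range_index(key, num_ranges)].update(f"{key}={rows[key]};".encode())
    return [leaf.digest() for leaf in leaves]


def build_tree(leaves):
    """Return the tree as a list of levels: leaves first, root last.

    Assumes len(leaves) is a power of two.
    """
    levels = [leaves]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append(
            [hashlib.sha256(prev[i] + prev[i + 1]).digest()
             for i in range(0, len(prev), 2)]
        )
    return levels


def differing_ranges(tree_a, tree_b):
    """Walk down from the root, descending only where hashes disagree."""
    pending = [(len(tree_a) - 1, 0)]  # (level, index), starting at the root
    mismatched = []
    while pending:
        level, i = pending.pop()
        if tree_a[level][i] == tree_b[level][i]:
            continue  # whole subtree is in sync, nothing to repair here
        if level == 0:
            mismatched.append(i)
        else:
            pending.extend([(level - 1, 2 * i), (level - 1, 2 * i + 1)])
    return sorted(mismatched)


# Two hypothetical replicas that disagree on a single row.
replica_a = {"k1": "v1", "k2": "v2", "k3": "v3"}
replica_b = dict(replica_a, k2="stale")

tree_a = build_tree(leaf_hashes(replica_a, 4))
tree_b = build_tree(leaf_hashes(replica_b, 4))
diff = differing_ranges(tree_a, tree_b)
print("ranges needing repair:", diff)
```

The point of the tree shape is that replicas whose root hashes match need no further comparison, and when they differ only the mismatching ranges have to be exchanged rather than the whole data set.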