On Fri, Jul 1, 2011 at 3:16 AM, Sylvain Lebresne <sylv...@datastax.com> wrote:
> To make it clear what the problem is, this is not a repair problem. This is
> a gossip problem. Gossip is reporting that the remote node is a 0.7 node
> and repair is just saying "I cannot use that node because repair has changed
> and the 0.7 node will not know how to answer me correctly", which is the
> correct behavior if the node happens to be a 0.7 node.

Technically, this is not part of gossip (no state is being gossiped for
this, although we do maintain this state in the Gossiper class), but your
analysis of the problem is correct. The problem is that on an upgrade via
rolling restart, the existing nodes still remember the new ones as being
old, so they mimic the old version, thus propagating the old version
around.

> Hence, I'm kind of baffled that dropping a keyspace and recreating it fixed
> anything. Unless as part of "removed the keyspace", you've deleted the
> system tables, in which case that could have triggered something.

I don't see how this could help either, since the version is bound in
Gossiper and set by IncomingTcpConnection.

I've created https://issues.apache.org/jira/browse/CASSANDRA-2860 to get
this resolved.

--Brandon
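
To illustrate the mimicry described above, here is a minimal sketch in Java. It is a toy model, not the actual Cassandra code: the class VersionMimicSketch, the Node type, and every method and constant in it are hypothetical names made up for this example. It assumes that a node speaks the lower of its own version and the version it remembers for a peer, that the receiving side records whatever version arrives on the incoming connection, and that a restart clears the restarted node's in-memory table.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only; none of these names exist in Cassandra.
final class VersionMimicSketch {
    static final int OLD = 1; // stands in for the 0.7 messaging version
    static final int NEW = 2; // stands in for the 0.8 messaging version

    static final class Node {
        final String name;
        int ownVersion;
        // Per-peer version table, kept only in memory (assumption).
        final Map<String, Integer> peerVersions = new HashMap<>();

        Node(String name, int ownVersion) {
            this.name = name;
            this.ownVersion = ownVersion;
        }

        // Version this node uses when talking to the peer: it mimics the
        // peer's remembered version if that is older than its own.
        int versionFor(Node peer) {
            int remembered = peerVersions.getOrDefault(peer.name, ownVersion);
            return Math.min(ownVersion, remembered);
        }

        // Open a connection; the receiver records the version it sees.
        void connectTo(Node peer) {
            peer.peerVersions.put(name, versionFor(peer));
        }

        // Upgrade in place; the in-memory table is lost on restart.
        void upgradeAndRestart() {
            ownVersion = NEW;
            peerVersions.clear();
        }
    }

    // Rolling upgrade of a two-node cluster, one node at a time.
    static Node[] rollingUpgrade() {
        Node a = new Node("A", OLD);
        Node b = new Node("B", OLD);
        a.connectTo(b);
        b.connectTo(a);

        a.upgradeAndRestart(); // B still remembers A as old
        b.connectTo(a);        // so A learns "B is old"
        a.connectTo(b);        // and A mimics the old version

        b.upgradeAndRestart(); // A still remembers B as old
        a.connectTo(b);        // so B learns "A is old"
        b.connectTo(a);        // and B mimics the old version
        return new Node[] {a, b};
    }

    public static void main(String[] args) {
        Node[] nodes = rollingUpgrade();
        System.out.println("A speaks v" + nodes[0].versionFor(nodes[1]) + " to B, "
                + "B speaks v" + nodes[1].versionFor(nodes[0]) + " to A");
    }
}
```

In this model both nodes end up running the new version while still addressing each other with the old one, and further connections never correct it, because each side only ever sees the mimicked version from the other.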