Hi All, sorry for taking so long to answer. I was away from the internet.

> >> Héctor, when you say "I have upgraded all my cluster to 0.8.1", from
> >> which version was
> >> that: 0.7.something or 0.8.0 ?

0.7.6-2 to 0.8.1

> > This is the same behavior I reported in 2768 as Aaron referenced ...
> > What was suggested for us was to do the following:
> >
> > - Shut down the entire ring
> > - When you bring up each node, do a nodetool repair

That's exactly what I ended up doing. Repair now works. I tried to do a
rolling restart with 2818 applied, but it did not work.

> > However, in the issue reported, it was unable to be reproduced ... I'd
> > be curious to know how Hector's keyspace is defined. Ours at the time
> > was RF=3 and using Ec2 snitch...

Nothing special: default snitch, RF=3.

I think this should be prioritized, as having to restart the whole
cluster is a bit extreme. We don't have separate DCs, so I had to incur
downtime, which costs money, and a little bit of grief.

On Fri, 01-07-2011 at 10:16 +0200, Sylvain Lebresne wrote:
> To make it clear what the problem is, this is not a repair problem. This is
> a gossip problem. Gossip is reporting that the remote node is a 0.7 node
> and repair is just saying "I cannot use that node because repair has changed
> and the 0.7 node will not know how to answer me correctly", which is the
> correct behavior if the node happens to be a 0.7 node.
>
> Hence, I'm kind of baffled that dropping a keyspace and recreating it fixed
> anything. Unless as part of "removed the keyspace", you've deleted the
> system tables, in which case that could have triggered something.
>
> --
> Sylvain
>
> On Fri, Jul 1, 2011 at 9:33 AM, Sasha Dolgy <sdo...@gmail.com> wrote:
> > This is the same behavior I reported in 2768 as Aaron referenced ...
> > What was suggested for us was to do the following:
> >
> > - Shut down the entire ring
> > - When you bring up each node, do a nodetool repair
> >
> > That didn't immediately resolve the problems. In the end, I backed up
> > all the data, removed the keyspace and created a new one. That seemed
> > to have solved our problems.
> > That was from 0.7.6-2 to 0.8.0
> >
> > However, in the issue reported, it was unable to be reproduced ... I'd
> > be curious to know how Hector's keyspace is defined. Ours at the time
> > was RF=3 and using Ec2 snitch...
> >
> > -sd
> >
> > On Fri, Jul 1, 2011 at 9:22 AM, Sylvain Lebresne <sylv...@datastax.com>
> > wrote:
> >> Héctor, when you say "I have upgraded all my cluster to 0.8.1", from
> >> which version was
> >> that: 0.7.something or 0.8.0 ?
> >>
> >> If this was 0.8.0, did you run successful repair on 0.8.0 previous to
> >> the upgrade ?
> >