I have an update on this. I hit the same split-ring problem again, this time during a rolling upgrade from 1.1.4 to 1.1.6, and found an easier workaround than modifying the configs and restarting. When I explicitly specified the same token on the command line with "-Dcassandra.replace_token=<token>" while bringing up the new node, the problem did not occur. Everything worked smoothly.
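For anyone scripting this workaround, here is a minimal sketch of how such a flag reaches the JVM: "-Dcassandra.replace_token=..." is an ordinary Java system property, so the process can read it back with System.getProperty. Only the property name comes from this thread. The class, the helper methods, and the token value are made up for illustration and are not Cassandra's own code.

```java
public class ReplaceTokenFlag {
    // Property name as it appears on the command line in this thread.
    static final String PROPERTY = "cassandra.replace_token";

    // Hypothetical helper: build the JVM argument for a given token string.
    public static String jvmArg(String token) {
        return "-D" + PROPERTY + "=" + token.trim();
    }

    // Hypothetical helper: return the token passed via -D, or null if the
    // flag is absent or empty.
    public static String readToken() {
        String raw = System.getProperty(PROPERTY);
        if (raw == null || raw.trim().isEmpty()) {
            return null;
        }
        return raw.trim();
    }

    public static void main(String[] args) {
        // Example token only; use the token the node already owns in the ring.
        String token = "85070591730234615865843651857942052864";
        System.out.println(jvmArg(token));

        // Simulate having started the JVM with the flag, then read it back.
        System.setProperty(PROPERTY, token);
        System.out.println(readToken());
    }
}
```

The point of the sketch is only that the token is supplied at launch time rather than by editing cassandra.yaml on every node, which is what makes this workaround lighter than the earlier one.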

Ron

On Oct 10, 2012, at 12:38 PM, Ron Siemens wrote:

>
> I witnessed the same behavior as reported by Edward and James.
>
> Removing the host from its own seed list does not solve the problem.
> Removing it from config of all nodes and restarting each, then restarting the
> failed node worked.
>
> Ron
>
> On Sep 12, 2012, at 4:42 PM, Edward Sargisson wrote:
>
>> I'm reposting my colleague's reply to Rob to the list (with James'
>> permission) in case others are interested.
>>
>> I'll add to James' post below to say I don't believe we saw the message that
>> that slice of code would have printed.
>>
>> "
>> Hey Rob,
>>
>> Ed's AWOL right now and I'm not on u@c.a.o, but I can tell you that when
>> I removed the downed seed node from its own list of seed nodes in
>> cassandra.yaml that it didn't join the existing ring nor did it get any
>> schemas or data from the existing ring; it felt like timeouts were
>> happening. (IANA Cassandra wizard, so excuse my terminology impedance.)
>>
>> Changing the machine's hostname and giving it a new IP, it behaved as
>> expected; joining the ring, syncing both schema and associated data.
>>
>> Downed node is 1.1.4, the rest of the ring is 1.1.2.
>>
>> I'm in a situation where I can revert the IP/hostname change and retry
>> the scenario as needed if you've got any ideas.
>>
>> HTH,
>>
>> JAmes"
>>
>> Cheers,
>> Edward
>>
>> On 12-09-12 03:53 PM, Rob Coli wrote:
>>> On Tue, Sep 11, 2012 at 4:21 PM, Edward Sargisson
>>> <edward.sargis...@globalrelay.net> wrote:
>>>> If the downed node is a seed node then neither of the replace a dead node
>>>> procedures work (-Dcassandra.replace_token and taking initial_token-1). The
>>>> ring remains split.
>>>> [...]
>>>> In other words, if the host name is on the seeds list then it appears that
>>>> the rest of the ring refuses to bootstrap it.
>>> Close, but not exactly...
>>>
>>> "./src/java/org/apache/cassandra/service/StorageService.java" line 559 of
>>> 3090
>>> "
>>> if (DatabaseDescriptor.isAutoBootstrap()
>>>     && DatabaseDescriptor.getSeeds().contains(FBUtilities.getBroadcastAddress())
>>>     && !SystemTable.isBootstrapped())
>>>     logger_.info("This node will not auto bootstrap because it is configured to be a seed node.");
>>> "
>>>
>>> getSeeds asks your seed provider for a list of seeds. If you are using
>>> the SimpleSeedProvider, this basically turns the list from "seeds" in
>>> cassandra.yaml on the local node into a list of hosts.
>>>
>>> So it isn't that the other nodes have this node in their seed list..
>>> it's that the node you are replacing has itself in its own seed list,
>>> and shouldn't. I understand that it can be tricky in conf management
>>> tools to make seed nodes' seed lists not contain themselves, but I
>>> believe it is currently necessary in this case.
>>>
>>> FWIW, it's unclear to me (and Aaron Morton, whose curiousity was
>>> apparently equally piqued and is looking into it further..) why
>>> exactly seed nodes shouldn't bootstrap. It's possible that they only
>>> shouldn't bootstrap without being in "hibernate" mode, and that the
>>> code just hasn't been re-written post replace_token/hibernate to say
>>> that it's ok for seed nodes to bootstrap as long as they hibernate...
>>>
>>> =Rob
>>>
>>
>> --
>> Edward Sargisson
>> senior java developer
>> Global Relay
>>
>> edward.sargis...@globalrelay.net
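
To make the seed check quoted from StorageService.java easier to reason about, here is a self-contained sketch of the same three-part condition. It is a model under stated assumptions, not Cassandra's code: the class and method names are hypothetical, and the inputs that Cassandra takes from DatabaseDescriptor, FBUtilities and SystemTable are passed in as plain parameters.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class SeedBootstrapCheck {
    // Mirrors the quoted condition: auto bootstrap is enabled, the node's own
    // address appears in its *local* seed list, and it has not bootstrapped yet.
    // When this is true, the quoted code logs that the node will not auto bootstrap.
    public static boolean skipsAutoBootstrap(boolean autoBootstrap,
                                             Set<InetAddress> localSeeds,
                                             InetAddress self,
                                             boolean alreadyBootstrapped) {
        return autoBootstrap && localSeeds.contains(self) && !alreadyBootstrapped;
    }

    // Hypothetical helper: build an IPv4 address from octets without a DNS lookup.
    public static InetAddress ip(int a, int b, int c, int d) {
        try {
            return InetAddress.getByAddress(new byte[] {(byte) a, (byte) b, (byte) c, (byte) d});
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        InetAddress self = ip(10, 0, 0, 1);   // the node being replaced (example address)
        InetAddress other = ip(10, 0, 0, 2);  // another seed (example address)

        // Node lists itself as a seed: the check fires.
        Set<InetAddress> withSelf = new HashSet<InetAddress>(Arrays.asList(self, other));
        System.out.println(skipsAutoBootstrap(true, withSelf, self, false));    // prints true

        // Node removed from its own seed list: the check does not fire.
        Set<InetAddress> withoutSelf = new HashSet<InetAddress>(Arrays.asList(other));
        System.out.println(skipsAutoBootstrap(true, withoutSelf, self, false)); // prints false
    }
}
```

The sketch only covers the local check Rob describes. It says nothing about why, in the reports above, removing the node from its own seed list was still not enough to heal the ring.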