Hi Aaron,

I repeated the whole procedure:
1. kill the cassandra instance on 1.27
2. rm the data/system/Migrations-g-*
3. rm the data/system/Schema-g-*
4. bin/cassandra to start cassandra again (the steps are sketched as shell commands just below)
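In shell terms, steps 1-4 were roughly the following. The pkill pattern and the log path are assumptions based on a default 0.8 layout; adjust them for your install:

    # 1. stop the node on 1.27 (assumes the JVM command line contains CassandraDaemon)
    pkill -f CassandraDaemon

    # 2-3. remove the locally stored schema and migration sstables
    #      (only the system keyspace copies of the schema; user keyspace data is untouched)
    rm data/system/Migrations-g-*
    rm data/system/Schema-g-*

    # 4. restart; the node should re-request migrations from the rest of the cluster
    bin/cassandra

    # watch the migrations being replayed, then re-check the ring
    grep "Applying migration" /var/log/cassandra/system.log   # log path varies by log4j config
    bin/nodetool -h192.168.1.27 -p8090 ring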
Now the migration seems to have stopped, and I do not find any errors in system.log yet. The ring looks good:

[root@yun-phy2 apache-cassandra-0.8.1]# bin/nodetool -h192.168.1.27 -p8090 ring
Address         DC          Rack        Status State   Load      Owns    Token
                                                                         127605887595351923798765477786913079296
192.168.1.28    datacenter1 rack1       Up     Normal  8.38 GB   25.00%  1
192.168.1.25    datacenter1 rack1       Up     Normal  8.54 GB   34.01%  57856537434773737201679995572503935972
192.168.1.27    datacenter1 rack1       Up     Normal  1.78 GB   24.28%  99165710459060760249270263771474737125
192.168.1.9     datacenter1 rack1       Up     Normal  8.75 GB   16.72%  127605887595351923798765477786913079296

But the schema is still not correct:

Cluster Information:
   Snitch: org.apache.cassandra.locator.SimpleSnitch
   Partitioner: org.apache.cassandra.dht.RandomPartitioner
   Schema versions:
        75eece10-bf48-11e0-0000-4d205df954a7: [192.168.1.28, 192.168.1.9, 192.168.1.25]
        5a54ebd0-bd90-11e0-0000-9510c23fceff: [192.168.1.27]

The version 5a54ebd0-bd90-11e0-0000-9510c23fceff is the same as last time... And in the log, the last Migration.java entry is:

INFO [MigrationStage:1] 2011-08-08 11:41:30,293 Migration.java (line 116) Applying migration 5a54ebd0-bd90-11e0-0000-9510c23fceff Add keyspace: SimpleDB_4E38DAA64894A9146100000500000000 rep strategy: SimpleStrategy{} durable_writes: true

Could you explain this? If I change the token assigned to 1.27 to another one, will it help?

Thanks.

--
Dikang Gu
0086 - 18611140205

On Sunday, August 7, 2011 at 4:14 PM, aaron morton wrote:

> Did you check the logs on 1.27 for errors?
>
> Could you be seeing this?
> https://issues.apache.org/jira/browse/CASSANDRA-2867
>
> Cheers
>
> -----------------
> Aaron Morton
> Freelance Cassandra Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 7 Aug 2011, at 16:24, Dikang Gu wrote:
>
> > I restarted both nodes, deleted the schema* and migration* files, and restarted them.
> >
> > The current cluster looks like this:
> >
> > [default@unknown] describe cluster;
> > Cluster Information:
> >    Snitch: org.apache.cassandra.locator.SimpleSnitch
> >    Partitioner: org.apache.cassandra.dht.RandomPartitioner
> >    Schema versions:
> >         75eece10-bf48-11e0-0000-4d205df954a7: [192.168.1.28, 192.168.1.9, 192.168.1.25]
> >         5a54ebd0-bd90-11e0-0000-9510c23fceff: [192.168.1.27]
> >
> > 1.28 looks good, but 1.27 still cannot reach schema agreement...
> >
> > I have tried several times, even deleting all the data on 1.27 and rejoining it as a new node, but it is still unhappy.
> >
> > And the ring looks like this:
> >
> > Address         DC          Rack        Status State    Load      Owns    Token
> >                                                                           127605887595351923798765477786913079296
> > 192.168.1.28    datacenter1 rack1       Up     Normal   8.38 GB   25.00%  1
> > 192.168.1.25    datacenter1 rack1       Up     Normal   8.55 GB   34.01%  57856537434773737201679995572503935972
> > 192.168.1.27    datacenter1 rack1       Up     Joining  1.81 GB   24.28%  99165710459060760249270263771474737125
> > 192.168.1.9     datacenter1 rack1       Up     Normal   8.75 GB   16.72%  127605887595351923798765477786913079296
> >
> > 1.27 seems unable to join the cluster; it just hangs there...
> >
> > Any suggestions?
> >
> > Thanks.
> >
> > On Sun, Aug 7, 2011 at 10:01 AM, aaron morton <aa...@thelastpickle.com> wrote:
> > > After the restart, what was in the logs on the 1.27 machine from the Migration.java logger? Some of the messages will start with "Applying migration".
> > >
> > > You should have shut down both of the nodes, then deleted the schema* and migration* system sstables, then restarted one of them and watched to see if it got to schema agreement.
> > >
> > > Cheers
> > >
> > > -----------------
> > > Aaron Morton
> > > Freelance Cassandra Developer
> > > @aaronmorton
> > > http://www.thelastpickle.com
> > >
> > > On 6 Aug 2011, at 22:56, Dikang Gu wrote:
> > >
> > > > I have tried this, but the schema still does not agree across the cluster:
> > > >
> > > > [default@unknown] describe cluster;
> > > > Cluster Information:
> > > >    Snitch: org.apache.cassandra.locator.SimpleSnitch
> > > >    Partitioner: org.apache.cassandra.dht.RandomPartitioner
> > > >    Schema versions:
> > > >         UNREACHABLE: [192.168.1.28]
> > > >         75eece10-bf48-11e0-0000-4d205df954a7: [192.168.1.9, 192.168.1.25]
> > > >         5a54ebd0-bd90-11e0-0000-9510c23fceff: [192.168.1.27]
> > > >
> > > > Any other suggestions to solve this?
> > > >
> > > > I have some production data saved in this Cassandra cluster, so I cannot afford data loss...
> > > >
> > > > Thanks.
> > > >
> > > > On Fri, Aug 5, 2011 at 8:55 PM, Benoit Perroud <ben...@noisette.ch> wrote:
> > > > > Based on http://wiki.apache.org/cassandra/FAQ#schema_disagreement,
> > > > > 75eece10-bf48-11e0-0000-4d205df954a7 owns the majority, so shut down
> > > > > 192.168.1.28 and 192.168.1.27 and remove the schema* and migration*
> > > > > sstables from both.
> > > > >
> > > > > 2011/8/5 Dikang Gu <dikan...@gmail.com>:
> > > > > > [default@unknown] describe cluster;
> > > > > > Cluster Information:
> > > > > >    Snitch: org.apache.cassandra.locator.SimpleSnitch
> > > > > >    Partitioner: org.apache.cassandra.dht.RandomPartitioner
> > > > > >    Schema versions:
> > > > > >         743fe590-bf48-11e0-0000-4d205df954a7: [192.168.1.28]
> > > > > >         75eece10-bf48-11e0-0000-4d205df954a7: [192.168.1.9, 192.168.1.25]
> > > > > >         06da9aa0-bda8-11e0-0000-9510c23fceff: [192.168.1.27]
> > > > > >
> > > > > > three different schema versions in the cluster...
> > > > > >
> > > > > > --
> > > > > > Dikang Gu
> > > > > > 0086 - 18611140205
> > > >
> > > > --
> > > > Dikang Gu
> > > > 0086 - 18611140205
> >
> > --
> > Dikang Gu
> > 0086 - 18611140205
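P.S. For anyone else hitting this: the "describe cluster;" check used throughout the thread can also be run non-interactively, which makes it easy to poll for schema agreement while a node replays migrations. A minimal sketch, assuming the cassandra-cli shipped with 0.8, the default Thrift port 9160, and a throwaway statement file (the file path is just an example):

    # all live nodes should report a single schema version once migrations apply;
    # run it against the lagging node (here 1.27) to confirm it has caught up
    echo "describe cluster;" > /tmp/describe-cluster.txt
    bin/cassandra-cli --host 192.168.1.27 --port 9160 -f /tmp/describe-cluster.txt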