um. There has got to be something stopping the migration from completing. Turn the logging up to DEBUG before starting and look for messages from MigrationManager.java

Provide all the log messages from Migration.java on the 1.27 node

Cheers

-----------------
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 8 Aug 2011, at 15:52, Dikang Gu wrote:

> Hi Aaron,
>
> I repeat the whole procedure:
>
> 1. kill the cassandra instance on 1.27.
> 2. rm the data/system/Migrations-g-*
> 3. rm the data/system/Schema-g-*
> 4. bin/cassandra to start the cassandra.
>
> Now, the migration seems stop and I do not find any error in the system.log yet.
>
> The ring looks good:
> [root@yun-phy2 apache-cassandra-0.8.1]# bin/nodetool -h192.168.1.27 -p8090 ring
> Address         DC          Rack   Status  State   Load     Owns    Token
>                                                                     127605887595351923798765477786913079296
> 192.168.1.28    datacenter1 rack1  Up      Normal  8.38 GB  25.00%  1
> 192.168.1.25    datacenter1 rack1  Up      Normal  8.54 GB  34.01%  57856537434773737201679995572503935972
> 192.168.1.27    datacenter1 rack1  Up      Normal  1.78 GB  24.28%  99165710459060760249270263771474737125
> 192.168.1.9     datacenter1 rack1  Up      Normal  8.75 GB  16.72%  127605887595351923798765477786913079296
>
> But the schema still does not correct:
> Cluster Information:
>    Snitch: org.apache.cassandra.locator.SimpleSnitch
>    Partitioner: org.apache.cassandra.dht.RandomPartitioner
>    Schema versions:
>         75eece10-bf48-11e0-0000-4d205df954a7: [192.168.1.28, 192.168.1.9, 192.168.1.25]
>         5a54ebd0-bd90-11e0-0000-9510c23fceff: [192.168.1.27]
>
> The 5a54ebd0-bd90-11e0-0000-9510c23fceff is same as last time…
>
> And in the log, the last Migration.java log is:
> INFO [MigrationStage:1] 2011-08-08 11:41:30,293 Migration.java (line 116) Applying migration 5a54ebd0-bd90-11e0-0000-9510c23fceff Add keyspace: SimpleDB_4E38DAA64894A9146100000500000000rep strategy:SimpleStrategy{}durable_writes: true
>
> Could you explain this?
>
> If I change the token given to 1.27 to another one, will it help?
>
> Thanks.
>
> --
> Dikang Gu
> 0086 - 18611140205
> On Sunday, August 7, 2011 at 4:14 PM, aaron morton wrote:
>
>> did you check the logs in 1.27 for errors ?
>>
>> Could you be seeing this ?
>> https://issues.apache.org/jira/browse/CASSANDRA-2867
>>
>> Cheers
>>
>> -----------------
>> Aaron Morton
>> Freelance Cassandra Developer
>> @aaronmorton
>> http://www.thelastpickle.com
>>
>> On 7 Aug 2011, at 16:24, Dikang Gu wrote:
>>
>>> I restarted both nodes, deleted the schema* and migration* and restarted them.
>>>
>>> The current cluster looks like this:
>>> [default@unknown] describe cluster;
>>> Cluster Information:
>>>    Snitch: org.apache.cassandra.locator.SimpleSnitch
>>>    Partitioner: org.apache.cassandra.dht.RandomPartitioner
>>>    Schema versions:
>>>         75eece10-bf48-11e0-0000-4d205df954a7: [192.168.1.28, 192.168.1.9, 192.168.1.25]
>>>         5a54ebd0-bd90-11e0-0000-9510c23fceff: [192.168.1.27]
>>>
>>> the 1.28 looks good, and the 1.27 still can not get the schema agreement...
>>>
>>> I have tried several times, even deleted all the data on 1.27 and rejoined it as a new node, but it is still unhappy.
>>>
>>> And the ring looks like this:
>>>
>>> Address         DC          Rack   Status  State    Load     Owns    Token
>>>                                                                      127605887595351923798765477786913079296
>>> 192.168.1.28    datacenter1 rack1  Up      Normal   8.38 GB  25.00%  1
>>> 192.168.1.25    datacenter1 rack1  Up      Normal   8.55 GB  34.01%  57856537434773737201679995572503935972
>>> 192.168.1.27    datacenter1 rack1  Up      Joining  1.81 GB  24.28%  99165710459060760249270263771474737125
>>> 192.168.1.9     datacenter1 rack1  Up      Normal   8.75 GB  16.72%  127605887595351923798765477786913079296
>>>
>>> The 1.27 seems unable to join the cluster, and it just hangs there...
>>>
>>> Any suggestions?
>>>
>>> Thanks.
>>>
>>>
>>> On Sun, Aug 7, 2011 at 10:01 AM, aaron morton <aa...@thelastpickle.com> wrote:
>>> After the restart, what was in the logs for the 1.27 machine from the Migration.java logger ?
>>> Some of the messages will start with "Applying migration"
>>>
>>> You should have shut down both of the nodes, then deleted the schema* and migration* system sstables, then restarted one of them and watched to see if it got to schema agreement.
>>>
>>> Cheers
>>>
>>> -----------------
>>> Aaron Morton
>>> Freelance Cassandra Developer
>>> @aaronmorton
>>> http://www.thelastpickle.com
>>>
>>> On 6 Aug 2011, at 22:56, Dikang Gu wrote:
>>>
>>>> I have tried this, but the schema still does not agree in the cluster:
>>>>
>>>> [default@unknown] describe cluster;
>>>> Cluster Information:
>>>>    Snitch: org.apache.cassandra.locator.SimpleSnitch
>>>>    Partitioner: org.apache.cassandra.dht.RandomPartitioner
>>>>    Schema versions:
>>>>         UNREACHABLE: [192.168.1.28]
>>>>         75eece10-bf48-11e0-0000-4d205df954a7: [192.168.1.9, 192.168.1.25]
>>>>         5a54ebd0-bd90-11e0-0000-9510c23fceff: [192.168.1.27]
>>>>
>>>> Any other suggestions to solve this?
>>>>
>>>> Because I have some production data saved in the cassandra cluster, so I can not afford data lost...
>>>>
>>>> Thanks.
>>>>
>>>> On Fri, Aug 5, 2011 at 8:55 PM, Benoit Perroud <ben...@noisette.ch> wrote:
>>>>> Based on http://wiki.apache.org/cassandra/FAQ#schema_disagreement,
>>>>> 75eece10-bf48-11e0-0000-4d205df954a7 own the majority, so shutdown and
>>>>> remove the schema* and migration* sstables from both 192.168.1.28 and
>>>>> 192.168.1.27
>>>>>
>>>>>
>>>>> 2011/8/5 Dikang Gu <dikan...@gmail.com>:
>>>>> > [default@unknown] describe cluster;
>>>>> > Cluster Information:
>>>>> >    Snitch: org.apache.cassandra.locator.SimpleSnitch
>>>>> >    Partitioner: org.apache.cassandra.dht.RandomPartitioner
>>>>> >    Schema versions:
>>>>> >         743fe590-bf48-11e0-0000-4d205df954a7: [192.168.1.28]
>>>>> >         75eece10-bf48-11e0-0000-4d205df954a7: [192.168.1.9, 192.168.1.25]
>>>>> >         06da9aa0-bda8-11e0-0000-9510c23fceff: [192.168.1.27]
>>>>> >
>>>>> > three different schema versions in the cluster...
>>>>> > --
>>>>> > Dikang Gu
>>>>> > 0086 - 18611140205
>>>>> >
>>>>
>>>> --
>>>> Dikang Gu
>>>> 0086 - 18611140205
>>>>
>>>
>>> --
>>> Dikang Gu
>>> 0086 - 18611140205
>>>
>>
>
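
For anyone following the FAQ step quoted above (find the schema version that owns the majority, then clean up the nodes that disagree), here is a rough sketch of how that check could be scripted. It is not from the thread: the function names `parse_schema_versions` and `nodes_off_majority` are made up for illustration, and it assumes the input is the plain text printed by `describe cluster;` in cassandra-cli, in the layout shown in the messages above. Unreachable nodes are skipped, since their schema version is unknown.

```python
import re

# Matches "<uuid>: [addr, addr, ...]" or "UNREACHABLE: [...]"; the address
# list may wrap onto the next line, as it does in the pasted output.
ENTRY = re.compile(
    r"([0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12}|UNREACHABLE):\s*\[([^\]]*)\]"
)


def parse_schema_versions(describe_output):
    """Map each schema version (or 'UNREACHABLE') to its list of node addresses."""
    versions = {}
    for version, nodes in ENTRY.findall(describe_output):
        versions[version] = [n.strip() for n in nodes.split(",") if n.strip()]
    return versions


def nodes_off_majority(versions):
    """Return (majority_version, sorted nodes reporting any other version)."""
    reachable = {v: n for v, n in versions.items() if v != "UNREACHABLE"}
    # The version held by the most nodes; on a tie the first one listed wins.
    majority = max(reachable, key=lambda v: len(reachable[v]))
    minority = sorted(
        node
        for version, nodes in reachable.items()
        if version != majority
        for node in nodes
    )
    return majority, minority


# Sample input: the first `describe cluster;` output from this thread.
SAMPLE = """\
Cluster Information:
   Snitch: org.apache.cassandra.locator.SimpleSnitch
   Partitioner: org.apache.cassandra.dht.RandomPartitioner
   Schema versions:
        743fe590-bf48-11e0-0000-4d205df954a7: [192.168.1.28]
        75eece10-bf48-11e0-0000-4d205df954a7: [192.168.1.9,
        192.168.1.25]
        06da9aa0-bda8-11e0-0000-9510c23fceff: [192.168.1.27]
"""

if __name__ == "__main__":
    majority, minority = nodes_off_majority(parse_schema_versions(SAMPLE))
    print("majority schema:", majority)
    print("nodes that disagree:", ", ".join(minority))
```

The script only reports which nodes disagree; stopping those nodes and removing their schema* and migration* sstables is still a manual step, as described in the messages above.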