Hi Aaron,

I set the log level to DEBUG, and I see a lot of forceFlush debug messages in the log:
DEBUG [StreamStage:1] 2011-08-10 11:31:56,345 ColumnFamilyStore.java (line 725) forceFlush requested but everything is clean
DEBUG [StreamStage:1] 2011-08-10 11:31:56,345 ColumnFamilyStore.java (line 725) forceFlush requested but everything is clean
DEBUG [StreamStage:1] 2011-08-10 11:31:56,345 ColumnFamilyStore.java (line 725) forceFlush requested but everything is clean
DEBUG [StreamStage:1] 2011-08-10 11:31:56,345 ColumnFamilyStore.java (line 725) forceFlush requested but everything is clean

What does this mean?

Thanks.

--
Dikang Gu
0086 - 18611140205

On Wednesday, August 10, 2011 at 6:42 AM, aaron morton wrote:
> um. There has got to be something stopping the migration from completing.
>
> Turn the logging up to DEBUG before starting, and look for messages from
> MigrationManager.java.
>
> Provide all the log messages from Migration.java on the 1.27 node.
>
> Cheers
>
> -----------------
> Aaron Morton
> Freelance Cassandra Developer
> @aaronmorton
> http://www.thelastpickle.com
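For reference, turning the logging up to DEBUG on a 0.8.x node is done in conf/log4j-server.properties. A minimal sketch follows; the per-package logger names are assumptions based on where MigrationManager.java and Migration.java sit in the 0.8 source tree, so verify them against your install before relying on them:

  # conf/log4j-server.properties
  # Simplest option: raise everything to DEBUG (very noisy):
  log4j.rootLogger=DEBUG,stdout,R
  # Narrower option: raise only the migration-related loggers
  # (package names are assumptions -- check your source tree):
  log4j.logger.org.apache.cassandra.service.MigrationManager=DEBUG
  log4j.logger.org.apache.cassandra.db.migration=DEBUG

With either setting in place, the messages aaron asks for can be pulled out of system.log by searching for "Migration".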
> On 8 Aug 2011, at 15:52, Dikang Gu wrote:
> > Hi Aaron,
> >
> > I repeated the whole procedure:
> >
> > 1. kill the cassandra instance on 1.27.
> > 2. rm the data/system/Migrations-g-*
> > 3. rm the data/system/Schema-g-*
> > 4. bin/cassandra to start cassandra.
> >
> > Now the migration seems to have stopped, and I do not find any errors in the system.log yet.
> >
> > The ring looks good:
> >
> > [root@yun-phy2 apache-cassandra-0.8.1]# bin/nodetool -h192.168.1.27 -p8090 ring
> > Address         DC          Rack   Status State   Load     Owns    Token
> >                                                                    127605887595351923798765477786913079296
> > 192.168.1.28    datacenter1 rack1  Up     Normal  8.38 GB  25.00%  1
> > 192.168.1.25    datacenter1 rack1  Up     Normal  8.54 GB  34.01%  57856537434773737201679995572503935972
> > 192.168.1.27    datacenter1 rack1  Up     Normal  1.78 GB  24.28%  99165710459060760249270263771474737125
> > 192.168.1.9     datacenter1 rack1  Up     Normal  8.75 GB  16.72%  127605887595351923798765477786913079296
> >
> > But the schema is still not correct:
> >
> > Cluster Information:
> >    Snitch: org.apache.cassandra.locator.SimpleSnitch
> >    Partitioner: org.apache.cassandra.dht.RandomPartitioner
> >    Schema versions:
> >       75eece10-bf48-11e0-0000-4d205df954a7: [192.168.1.28, 192.168.1.9, 192.168.1.25]
> >       5a54ebd0-bd90-11e0-0000-9510c23fceff: [192.168.1.27]
> >
> > The 5a54ebd0-bd90-11e0-0000-9510c23fceff is the same as last time…
> >
> > And in the log, the last Migration.java entry is:
> >
> > INFO [MigrationStage:1] 2011-08-08 11:41:30,293 Migration.java (line 116) Applying migration 5a54ebd0-bd90-11e0-0000-9510c23fceff Add keyspace: SimpleDB_4E38DAA64894A9146100000500000000rep strategy:SimpleStrategy{}durable_writes: true
> >
> > Could you explain this?
> >
> > If I change the token given to 1.27 to another one, will it help?
> >
> > Thanks.
> >
> > --
> > Dikang Gu
> > 0086 - 18611140205
> >
> > On Sunday, August 7, 2011 at 4:14 PM, aaron morton wrote:
> > > Did you check the logs on 1.27 for errors?
> > >
> > > Could you be seeing this?
> > > https://issues.apache.org/jira/browse/CASSANDRA-2867
> > >
> > > Cheers
> > >
> > > -----------------
> > > Aaron Morton
> > > Freelance Cassandra Developer
> > > @aaronmorton
> > > http://www.thelastpickle.com
> > >
> > > On 7 Aug 2011, at 16:24, Dikang Gu wrote:
> > > > I shut down both nodes, deleted the schema* and migration* sstables, and restarted them.
> > > >
> > > > The current cluster looks like this:
> > > >
> > > > [default@unknown] describe cluster;
> > > > Cluster Information:
> > > >    Snitch: org.apache.cassandra.locator.SimpleSnitch
> > > >    Partitioner: org.apache.cassandra.dht.RandomPartitioner
> > > >    Schema versions:
> > > >       75eece10-bf48-11e0-0000-4d205df954a7: [192.168.1.28, 192.168.1.9, 192.168.1.25]
> > > >       5a54ebd0-bd90-11e0-0000-9510c23fceff: [192.168.1.27]
> > > >
> > > > 1.28 looks good, but 1.27 still cannot reach schema agreement...
> > > >
> > > > I have tried several times, even deleting all the data on 1.27 and rejoining it as a new node, but it is still unhappy.
> > > >
> > > > And the ring looks like this:
> > > >
> > > > Address         DC          Rack   Status State    Load     Owns    Token
> > > >                                                                     127605887595351923798765477786913079296
> > > > 192.168.1.28    datacenter1 rack1  Up     Normal   8.38 GB  25.00%  1
> > > > 192.168.1.25    datacenter1 rack1  Up     Normal   8.55 GB  34.01%  57856537434773737201679995572503935972
> > > > 192.168.1.27    datacenter1 rack1  Up     Joining  1.81 GB  24.28%  99165710459060760249270263771474737125
> > > > 192.168.1.9     datacenter1 rack1  Up     Normal   8.75 GB  16.72%  127605887595351923798765477786913079296
> > > >
> > > > 1.27 seems unable to join the cluster; it just hangs there...
> > > >
> > > > Any suggestions?
> > > >
> > > > Thanks.
> > > >
> > > > On Sun, Aug 7, 2011 at 10:01 AM, aaron morton <aa...@thelastpickle.com> wrote:
> > > > > After the restart, what was in the logs on the 1.27 machine from the
> > > > > Migration.java logger? Some of the messages will start with
> > > > > "Applying migration".
> > > > >
> > > > > You should have shut down both of the nodes, then deleted the schema*
> > > > > and migration* system sstables, then restarted one of them and
> > > > > watched to see if it got to schema agreement.
> > > > >
> > > > > Cheers
> > > > >
> > > > > -----------------
> > > > > Aaron Morton
> > > > > Freelance Cassandra Developer
> > > > > @aaronmorton
> > > > > http://www.thelastpickle.com
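A sketch of the reset aaron describes there, consolidating the commands already used earlier in this thread. <data_dir> stands for whatever data directory cassandra.yaml points at (/var/lib/cassandra/data by default on many installs), and the -g- sstable naming matches the 0.8 files shown above:

  # Run on each node holding a minority schema version, one node at a time:
  kill <cassandra-pid>                  # stop the node first
  rm <data_dir>/system/Schema-g-*       # delete the schema sstables
  rm <data_dir>/system/Migrations-g-*   # delete the migration sstables
  bin/cassandra                         # restart; the node should replay migrations from its peers
  # Then confirm a single schema version from cassandra-cli:
  #   describe cluster;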
> > > > > On 6 Aug 2011, at 22:56, Dikang Gu wrote:
> > > > > > I have tried this, but the schema still does not agree in the cluster:
> > > > > >
> > > > > > [default@unknown] describe cluster;
> > > > > > Cluster Information:
> > > > > >    Snitch: org.apache.cassandra.locator.SimpleSnitch
> > > > > >    Partitioner: org.apache.cassandra.dht.RandomPartitioner
> > > > > >    Schema versions:
> > > > > >       UNREACHABLE: [192.168.1.28]
> > > > > >       75eece10-bf48-11e0-0000-4d205df954a7: [192.168.1.9, 192.168.1.25]
> > > > > >       5a54ebd0-bd90-11e0-0000-9510c23fceff: [192.168.1.27]
> > > > > >
> > > > > > Any other suggestions to solve this?
> > > > > >
> > > > > > I have some production data saved in the Cassandra cluster, so I cannot afford data loss...
> > > > > >
> > > > > > Thanks.
> > > > > >
> > > > > > On Fri, Aug 5, 2011 at 8:55 PM, Benoit Perroud <ben...@noisette.ch> wrote:
> > > > > > > Based on http://wiki.apache.org/cassandra/FAQ#schema_disagreement,
> > > > > > > 75eece10-bf48-11e0-0000-4d205df954a7 owns the majority, so shut down and
> > > > > > > remove the schema* and migration* sstables from both 192.168.1.28 and
> > > > > > > 192.168.1.27.
> > > > > > >
> > > > > > > 2011/8/5 Dikang Gu <dikan...@gmail.com>:
> > > > > > > > [default@unknown] describe cluster;
> > > > > > > > Cluster Information:
> > > > > > > >    Snitch: org.apache.cassandra.locator.SimpleSnitch
> > > > > > > >    Partitioner: org.apache.cassandra.dht.RandomPartitioner
> > > > > > > >    Schema versions:
> > > > > > > >       743fe590-bf48-11e0-0000-4d205df954a7: [192.168.1.28]
> > > > > > > >       75eece10-bf48-11e0-0000-4d205df954a7: [192.168.1.9, 192.168.1.25]
> > > > > > > >       06da9aa0-bda8-11e0-0000-9510c23fceff: [192.168.1.27]
> > > > > > > >
> > > > > > > > Three different schema versions in the cluster...
> > > > > > > >
> > > > > > > > --
> > > > > > > > Dikang Gu
> > > > > > > > 0086 - 18611140205
> > > > > >
> > > > > > --
> > > > > > Dikang Gu
> > > > > > 0086 - 18611140205
> > > >
> > > > --
> > > > Dikang Gu
> > > > 0086 - 18611140205
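If the reset from the FAQ is repeated on the minority nodes, one way to watch a restarted node catch up is to follow the Migration.java messages mentioned above. The log location is an assumption; adjust it for your install:

  # Watch migrations being replayed on the restarted node:
  tail -f /var/log/cassandra/system.log | grep 'Applying migration'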