The log file shows the following. I'm not sure what 'Couldn't find cfId=1000' means (Google just returned useless results):

INFO [main] 2011-08-18 07:23:17,688 DatabaseDescriptor.java (line 453) Found table data in data directories. Consider using JMX to call org.apache.cassandra.service.StorageService.loadSchemaFromYaml().
INFO [main] 2011-08-18 07:23:17,705 CommitLogSegment.java (line 50) Creating new commitlog segment /cassandra/commitlog/CommitLog-1313670197705.log
INFO [main] 2011-08-18 07:23:17,716 CommitLog.java (line 155) Replaying /cassandra/commitlog/CommitLog-1313670030512.log
INFO [main] 2011-08-18 07:23:17,734 CommitLog.java (line 314) Finished reading /cassandra/commitlog/CommitLog-1313670030512.log
INFO [main] 2011-08-18 07:23:17,744 CommitLog.java (line 163) Log replay complete
INFO [main] 2011-08-18 07:23:17,756 StorageService.java (line 364) Cassandra version: 0.7.4
INFO [main] 2011-08-18 07:23:17,756 StorageService.java (line 365) Thrift API version: 19.4.0
INFO [main] 2011-08-18 07:23:17,756 StorageService.java (line 378) Loading persisted ring state
INFO [main] 2011-08-18 07:23:17,766 StorageService.java (line 414) Starting up server gossip
INFO [main] 2011-08-18 07:23:17,771 ColumnFamilyStore.java (line 1048) Enqueuing flush of Memtable-LocationInfo@832310230(29 bytes, 1 operations)
INFO [FlushWriter:1] 2011-08-18 07:23:17,772 Memtable.java (line 157) Writing Memtable-LocationInfo@832310230(29 bytes, 1 operations)
INFO [FlushWriter:1] 2011-08-18 07:23:17,822 Memtable.java (line 164) Completed flushing /cassandra/data/system/LocationInfo-f-66-Data.db (80 bytes)
INFO [CompactionExecutor:1] 2011-08-18 07:23:17,823 CompactionManager.java (line 396) Compacting [SSTableReader(path='/cassandra/data/system/LocationInfo-f-63-Data.db'),SSTableReader(path='/cassandra/data/system/LocationInfo-f-64-Data.db'),SSTableReader(path='/cassandra/data/system/LocationInfo-f-65-Data.db'),SSTableReader(path='/cassandra/data/system/LocationInfo-f-66-Data.db')]
INFO [main] 2011-08-18 07:23:17,853 StorageService.java (line 478) Using saved token 113427455640312821154458202477256070484
INFO [main] 2011-08-18 07:23:17,854 ColumnFamilyStore.java (line 1048) Enqueuing flush of Memtable-LocationInfo@18895884(53 bytes, 2 operations)
INFO [FlushWriter:1] 2011-08-18 07:23:17,854 Memtable.java (line 157) Writing Memtable-LocationInfo@18895884(53 bytes, 2 operations)
ERROR [MutationStage:28] 2011-08-18 07:23:18,246 RowMutationVerbHandler.java (line 86) Error in row mutation
org.apache.cassandra.db.UnserializableColumnFamilyException: Couldn't find cfId=1000
    at org.apache.cassandra.db.ColumnFamilySerializer.deserialize(ColumnFamilySerializer.java:117)
    at org.apache.cassandra.db.RowMutation$RowMutationSerializer.deserialize(RowMutation.java:380)
    at org.apache.cassandra.db.RowMutationVerbHandler.doVerb(RowMutationVerbHandler.java:50)
    at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:72)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
    at java.lang.Thread.run(Thread.java:636)
INFO [GossipStage:1] 2011-08-18 07:23:18,255 Gossiper.java (line 623) Node /node1 has restarted, now UP again
ERROR [ReadStage:1] 2011-08-18 07:23:18,254 DebuggableThreadPoolExecutor.java (line 103) Error in ThreadPoolExecutor
java.lang.IllegalArgumentException: Unknown ColumnFamily prjcache in keyspace prjkeyspace
    at org.apache.cassandra.config.DatabaseDescriptor.getComparator(DatabaseDescriptor.java:966)
    at org.apache.cassandra.db.ColumnFamily.getComparatorFor(ColumnFamily.java:388)
    at org.apache.cassandra.db.ReadCommand.getComparator(ReadCommand.java:93)
    at org.apache.cassandra.db.SliceByNamesReadCommand.<init>(SliceByNamesReadCommand.java:44)
    at org.apache.cassandra.db.SliceByNamesReadCommandSerializer.deserialize(SliceByNamesReadCommand.java:110)
    at org.apache.cassandra.db.ReadCommandSerializer.deserialize(ReadCommand.java:122)
    at org.apache.cassandra.db.ReadVerbHandler.doVerb(ReadVerbHandler.java:67)

On Fri, Aug 19, 2011 at 5:44 AM, aaron morton
<aa...@thelastpickle.com> wrote:

> Look in the logs to find out why the migration did not get to node2.
>
> Otherwise yes you can drop those files.
>
> Cheers
>
> -----------------
> Aaron Morton
> Freelance Cassandra Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 18/08/2011, at 11:25 PM, Yan Chunlu wrote:
>
> just found out that the schema change made via cassandra-cli didn't
> reach node2, and node2 became unreachable....
>
> I did as this document says:
> http://wiki.apache.org/cassandra/FAQ#schema_disagreement
>
> but after that I just got two schema versions:
>
> ddcada52-c96a-11e0-99af-3bd951658d61: [node1, node3]
> 2127b2ef-6998-11e0-b45b-3bd951658d61: [node2]
>
> is it enough to delete the Schema* && Migrations* sstables and restart the node?
>
> On Thu, Aug 18, 2011 at 5:13 PM, Yan Chunlu <springri...@gmail.com> wrote:
>
>> thanks a lot for all the help! I have gone through the steps and
>> successfully brought up the node2 :)
>>
>> On Thu, Aug 18, 2011 at 10:51 AM, Boris Yen <yulin...@gmail.com> wrote:
>> > Because the file only preserves the "key" of records, not the whole record.
>> > Records for those saved keys will be loaded into cassandra during the startup
>> > of cassandra.
>> >
>> > On Wed, Aug 17, 2011 at 5:52 PM, Yan Chunlu <springri...@gmail.com> wrote:
>> >>
>> >> but the data size in the saved_cache is relatively small:
>> >>
>> >> will that cause the load problem?
>> >>
>> >> ls -lh /cassandra/saved_caches/
>> >> total 32M
>> >> -rw-r--r-- 1 cass cass 2.9M 2011-08-12 19:53 cass-CommentSortsCache-KeyCache
>> >> -rw-r--r-- 1 cass cass 2.9M 2011-08-17 04:29 cass-CommentSortsCache-RowCache
>> >> -rw-r--r-- 1 cass cass 2.7M 2011-08-12 18:50 cass-CommentVote-KeyCache
>> >> -rw-r--r-- 1 cass cass 140K 2011-08-12 19:53 cass-device_images-KeyCache
>> >> -rw-r--r-- 1 cass cass 33K 2011-08-12 18:51 cass-Hide-KeyCache
>> >> -rw-r--r-- 1 cass cass 4.6M 2011-08-12 19:53 cass-images-KeyCache
>> >> -rw-r--r-- 1 cass cass 2.6M 2011-08-12 19:53 cass-LinksByUrl-KeyCache
>> >> -rw-r--r-- 1 cass cass 2.5M 2011-08-12 18:50 cass-LinkVote-KeyCache
>> >> -rw-r--r-- 1 cass cass 7.5M 2011-08-12 18:50 cass-cache-KeyCache
>> >> -rw-r--r-- 1 cass cass 3.7M 2011-08-12 21:51 cass-cache-RowCache
>> >> -rw-r--r-- 1 cass cass 1.8M 2011-08-12 18:51 cass-Save-KeyCache
>> >> -rw-r--r-- 1 cass cass 111K 2011-08-12 19:50 cass-SavesByAccount-KeyCache
>> >> -rw-r--r-- 1 cass cass 864 2011-08-12 19:49 cass-VotesByDay-KeyCache
>> >> -rw-r--r-- 1 cass cass 249K 2011-08-12 19:49 cass-VotesByLink-KeyCache
>> >> -rw-r--r-- 1 cass cass 28 2011-08-14 12:50 system-HintsColumnFamily-KeyCache
>> >> -rw-r--r-- 1 cass cass 5 2011-08-14 12:50 system-LocationInfo-KeyCache
>> >> -rw-r--r-- 1 cass cass 54 2011-08-13 13:30 system-Migrations-KeyCache
>> >> -rw-r--r-- 1 cass cass 76 2011-08-13 13:30 system-Schema-KeyCache
>> >>
>> >> On Wed, Aug 17, 2011 at 4:31 PM, aaron morton <aa...@thelastpickle.com> wrote:
>> >> > If you have a node that cannot start up due to issues loading the saved
>> >> > cache delete the files in the saved_cache directory before starting it.
>> >> >
>> >> > The settings to save the row and key cache are per CF. You can change
>> >> > them with an update column family statement via the CLI when attached to any
>> >> > node.
>> >> > You may then want to check the saved_caches directory and delete any
>> >> > files that are left (not sure if they are automatically deleted).
>> >> >
>> >> > I would recommend:
>> >> > - stop node 2
>> >> > - delete its saved_cache
>> >> > - make the schema change via another node
>> >> > - startup node 2
>> >> >
>> >> > Cheers
>> >> >
>> >> > -----------------
>> >> > Aaron Morton
>> >> > Freelance Cassandra Developer
>> >> > @aaronmorton
>> >> > http://www.thelastpickle.com
>> >> >
>> >> > On 17/08/2011, at 2:59 PM, Yan Chunlu wrote:
>> >> >
>> >> >> does this need to be cluster wide? or could I just modify the caches
>> >> >> on one node? I could not connect to the node with
>> >> >> cassandra-cli, it says "connection refused"
>> >> >>
>> >> >> [default@unknown] connect node2/9160;
>> >> >> Exception connecting to node2/9160. Reason: Connection refused.
>> >> >>
>> >> >> so if I change the cache size via other nodes, how would node2 be
>> >> >> notified of the change? would killing cassandra and starting it again
>> >> >> make it update the schema?
>> >> >>
>> >> >> On Wed, Aug 17, 2011 at 5:59 AM, Teijo Holzer <thol...@wetafx.co.nz> wrote:
>> >> >>> Hi,
>> >> >>>
>> >> >>> yes, we saw exactly the same messages. We got rid of these by doing the
>> >> >>> following:
>> >> >>>
>> >> >>> * Set all row & key caches in your CFs to 0 via cassandra-cli
>> >> >>> * Kill Cassandra
>> >> >>> * Remove all files in the saved_caches directory
>> >> >>> * Start Cassandra
>> >> >>> * Slowly bring back row & key caches (if desired, we left them off)
>> >> >>>
>> >> >>> Cheers,
>> >> >>>
>> >> >>> T.
>> >> >>>
>> >> >>> On 16/08/11 23:35, Yan Chunlu wrote:
>> >> >>>>
>> >> >>>> I saw a lot of SliceQueryFilter messages after changing the log level to DEBUG.
>> >> >>>> I just thought even bringing up a new node would be faster than starting the old one.....
>> >> >>>> it is weird
>> >> >>>>
>> >> >>>> DEBUG [main] 2011-08-16 06:32:49,213 SliceQueryFilter.java (line 123) collecting 0 of 2147483647: 76616c7565:false:225@1313068845474382
>> >> >>>> DEBUG [main] 2011-08-16 06:32:49,245 SliceQueryFilter.java (line 123) collecting 0 of 2147483647: 76616c7565:false:453@1310999270198313
>> >> >>>> DEBUG [main] 2011-08-16 06:32:49,251 SliceQueryFilter.java (line 123) collecting 0 of 2147483647: 76616c7565:false:26@1313199902088827
>> >> >>>> DEBUG [main] 2011-08-16 06:32:49,576 SliceQueryFilter.java (line 123) collecting 0 of 2147483647: 76616c7565:false:157@1313097239332314
>> >> >>>> DEBUG [main] 2011-08-16 06:32:50,674 SliceQueryFilter.java (line 123) collecting 0 of 2147483647: 76616c7565:false:41729@1313190821826229
>> >> >>>> DEBUG [main] 2011-08-16 06:32:50,811 SliceQueryFilter.java (line 123) collecting 0 of 2147483647: 76616c7565:false:6@1313174157301203
>> >> >>>> DEBUG [main] 2011-08-16 06:32:50,867 SliceQueryFilter.java (line 123) collecting 0 of 2147483647: 76616c7565:false:98@1312011362250907
>> >> >>>> DEBUG [main] 2011-08-16 06:32:50,881 SliceQueryFilter.java (line 123) collecting 0 of 2147483647: 76616c7565:false:42@1313201711997005
>> >> >>>> DEBUG [main] 2011-08-16 06:32:50,910 SliceQueryFilter.java (line 123) collecting 0 of 2147483647: 76616c7565:false:96@1312939986190155
>> >> >>>> DEBUG [main] 2011-08-16 06:32:50,954 SliceQueryFilter.java (line 123) collecting 0 of 2147483647: 76616c7565:false:621@1313192538616112
>> >> >>>>
>> >> >>>> On Tue, Aug 16, 2011 at 7:32 PM, Yan Chunlu <springri...@gmail.com> wrote:
>> >> >>>>
>> >> >>>> but it seems the row cache is cluster wide, how will the change
>> >> >>>> of row cache affect the read speed?
>> >> >>>>
>> >> >>>> On Mon, Aug 15, 2011 at 7:33 AM, Jonathan Ellis <jbel...@gmail.com> wrote:
>> >> >>>>
>> >> >>>> Or leave row cache enabled but disable cache saving (and remove the
>> >> >>>> one already on disk).
>> >> >>>>
>> >> >>>> On Sun, Aug 14, 2011 at 5:05 PM, aaron morton <aa...@thelastpickle.com> wrote:
>> >> >>>> > INFO [main] 2011-08-14 09:24:52,198 ColumnFamilyStore.java (line 547)
>> >> >>>> > completed loading (1744370 ms; 200000 keys) row cache for COMMENT
>> >> >>>> >
>> >> >>>> > It's taking 29 minutes to load 200,000 rows in the row cache. That's a
>> >> >>>> > pretty big row cache, I would suggest reducing or disabling it.
>> >> >>>> > Background: http://www.datastax.com/dev/blog/maximizing-cache-benefit-with-cassandra
>> >> >>>> >
>> >> >>>> > and server can not afford the load then crashed. after come back, node 3 can
>> >> >>>> > not return for more than 96 hours
>> >> >>>> >
>> >> >>>> > Crashed how?
>> >> >>>> > You may be seeing https://issues.apache.org/jira/browse/CASSANDRA-2280
>> >> >>>> > Watch nodetool compactionstats to see when the Merkle tree build finishes
>> >> >>>> > and nodetool netstats to see which CFs are streaming.
>> >> >>>> > Cheers
>> >> >>>> > -----------------
>> >> >>>> > Aaron Morton
>> >> >>>> > Freelance Cassandra Developer
>> >> >>>> > @aaronmorton
>> >> >>>> > http://www.thelastpickle.com
>> >> >>>> > On 15 Aug 2011, at 04:23, Yan Chunlu wrote:
>> >> >>>> >
>> >> >>>> > I have 3 nodes and RF=3. When I was repairing node3, it seems a lot of data
>> >> >>>> > was generated, and the server could not handle the load and crashed.
>> >> >>>> > After coming back, node 3 has not rejoined for more than 96 hours.
>> >> >>>> >
>> >> >>>> > for 34GB data, node 2 could restart and be back online within 1 hour.
>> >> >>>> >
>> >> >>>> > I am not sure what's wrong with node3. Should I restart node 3 again?
>> >> >>>> > thanks!
>> >> >>>> >
>> >> >>>> > Address  Status  State   Load       Owns    Token
>> >> >>>> >                                             113427455640312821154458202477256070484
>> >> >>>> > node1    Up      Normal  34.11 GB   33.33%  0
>> >> >>>> > node2    Up      Normal  31.44 GB   33.33%  56713727820156410577229101238628035242
>> >> >>>> > node3    Down    Normal  177.55 GB  33.33%  113427455640312821154458202477256070484
>> >> >>>> >
>> >> >>>> > the log shows it is still going on, not sure why it is so slow:
>> >> >>>> >
>> >> >>>> > INFO [main] 2011-08-14 08:55:47,734 SSTableReader.java (line 154) Opening /cassandra/data/COMMENT
>> >> >>>> > INFO [main] 2011-08-14 08:55:47,828 ColumnFamilyStore.java (line 275) reading saved cache /cassandra/saved_caches/COMMENT-RowCache
>> >> >>>> > INFO [main] 2011-08-14 09:24:52,198 ColumnFamilyStore.java (line 547) completed loading (1744370 ms; 200000 keys) row cache for COMMENT
>> >> >>>> > INFO [main] 2011-08-14 09:24:52,299 ColumnFamilyStore.java (line 275) reading saved cache /cassandra/saved_caches/COMMENT-RowCache
>> >> >>>> > INFO [CompactionExecutor:1] 2011-08-14 10:24:55,480 CacheWriter.java (line 96) Saved COMMENT-RowCache (200000 items) in 2535 ms
>> >> >>>> >
>> >> >>>>
>> >> >>>> --
>> >> >>>> Jonathan Ellis
>> >> >>>> Project Chair, Apache Cassandra
>> >> >>>> co-founder of DataStax, the source for professional Cassandra support
>> >> >>>> http://www.datastax.com
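
Both errors in the log at the top of this message name a column family that the receiving node apparently does not have in its schema, which would be consistent with the schema change not having reached node2. As a rough aid for spotting them in a long log, here is a minimal Python sketch that scans log lines for those two messages. The function name schema_errors and the sample list are illustrative assumptions; the regular expressions are based only on the two messages quoted in this thread, not on any Cassandra API.

```python
import re

# Patterns taken from the two error messages quoted in this thread.
CFID_RE = re.compile(r"Couldn't find cfId=(\d+)")
UNKNOWN_CF_RE = re.compile(r"Unknown ColumnFamily (\S+) in keyspace (\S+)")


def schema_errors(lines):
    """Return (missing cfIds, unknown (keyspace, column family) pairs).

    Hypothetical helper: it only matches the message text, it does not
    talk to Cassandra.
    """
    cf_ids, unknown = set(), set()
    for line in lines:
        m = CFID_RE.search(line)
        if m:
            cf_ids.add(int(m.group(1)))
        m = UNKNOWN_CF_RE.search(line)
        if m:
            # Message order is "<column family> in keyspace <keyspace>".
            unknown.add((m.group(2), m.group(1)))
    return cf_ids, unknown


# Sample lines copied from the log above; a real run would pass an open
# log file instead (any iterable of strings works).
sample = [
    "org.apache.cassandra.db.UnserializableColumnFamilyException: Couldn't find cfId=1000",
    "java.lang.IllegalArgumentException: Unknown ColumnFamily prjcache in keyspace prjkeyspace",
    "INFO [main] 2011-08-18 07:23:17,744 CommitLog.java (line 163) Log replay complete",
]
cf_ids, unknown = schema_errors(sample)
print("missing cfIds:", sorted(cf_ids))
print("unknown column families:", sorted(unknown))
```

If this reports anything on one node but not on the others, comparing schema versions across the nodes (as in the FAQ entry linked in the quoted thread) is probably the next thing to check.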