The log file shows the following. I'm not sure what 'Couldn't find cfId=1000'
means (Google returned nothing useful):
INFO [main] 2011-08-18 07:23:17,688 DatabaseDescriptor.java (line 453) Found table data in data directories. Consider using JMX to call org.apache.cassandra.service.StorageService.loadSchemaFromYaml().
INFO [main] 2011-08-18 07:23:17,705 CommitLogSegment.java (line 50) Creating new commitlog segment /cassandra/commitlog/CommitLog-1313670197705.log
INFO [main] 2011-08-18 07:23:17,716 CommitLog.java (line 155) Replaying /cassandra/commitlog/CommitLog-1313670030512.log
INFO [main] 2011-08-18 07:23:17,734 CommitLog.java (line 314) Finished reading /cassandra/commitlog/CommitLog-1313670030512.log
INFO [main] 2011-08-18 07:23:17,744 CommitLog.java (line 163) Log replay complete
INFO [main] 2011-08-18 07:23:17,756 StorageService.java (line 364) Cassandra version: 0.7.4
INFO [main] 2011-08-18 07:23:17,756 StorageService.java (line 365) Thrift API version: 19.4.0
INFO [main] 2011-08-18 07:23:17,756 StorageService.java (line 378) Loading persisted ring state
INFO [main] 2011-08-18 07:23:17,766 StorageService.java (line 414) Starting up server gossip
INFO [main] 2011-08-18 07:23:17,771 ColumnFamilyStore.java (line 1048) Enqueuing flush of Memtable-LocationInfo@832310230(29 bytes, 1 operations)
INFO [FlushWriter:1] 2011-08-18 07:23:17,772 Memtable.java (line 157) Writing Memtable-LocationInfo@832310230(29 bytes, 1 operations)
INFO [FlushWriter:1] 2011-08-18 07:23:17,822 Memtable.java (line 164) Completed flushing /cassandra/data/system/LocationInfo-f-66-Data.db (80 bytes)
INFO [CompactionExecutor:1] 2011-08-18 07:23:17,823 CompactionManager.java (line 396) Compacting [SSTableReader(path='/cassandra/data/system/LocationInfo-f-63-Data.db'),SSTableReader(path='/cassandra/data/system/LocationInfo-f-64-Data.db'),SSTableReader(path='/cassandra/data/system/LocationInfo-f-65-Data.db'),SSTableReader(path='/cassandra/data/system/LocationInfo-f-66-Data.db')]
INFO [main] 2011-08-18 07:23:17,853 StorageService.java (line 478) Using saved token 113427455640312821154458202477256070484
INFO [main] 2011-08-18 07:23:17,854 ColumnFamilyStore.java (line 1048) Enqueuing flush of Memtable-LocationInfo@18895884(53 bytes, 2 operations)
INFO [FlushWriter:1] 2011-08-18 07:23:17,854 Memtable.java (line 157) Writing Memtable-LocationInfo@18895884(53 bytes, 2 operations)
ERROR [MutationStage:28] 2011-08-18 07:23:18,246 RowMutationVerbHandler.java (line 86) Error in row mutation
org.apache.cassandra.db.UnserializableColumnFamilyException: Couldn't find cfId=1000
    at org.apache.cassandra.db.ColumnFamilySerializer.deserialize(ColumnFamilySerializer.java:117)
    at org.apache.cassandra.db.RowMutation$RowMutationSerializer.deserialize(RowMutation.java:380)
    at org.apache.cassandra.db.RowMutationVerbHandler.doVerb(RowMutationVerbHandler.java:50)
    at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:72)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
    at java.lang.Thread.run(Thread.java:636)
INFO [GossipStage:1] 2011-08-18 07:23:18,255 Gossiper.java (line 623) Node /node1 has restarted, now UP again
ERROR [ReadStage:1] 2011-08-18 07:23:18,254 DebuggableThreadPoolExecutor.java (line 103) Error in ThreadPoolExecutor
java.lang.IllegalArgumentException: Unknown ColumnFamily prjcache in keyspace prjkeyspace
    at org.apache.cassandra.config.DatabaseDescriptor.getComparator(DatabaseDescriptor.java:966)
    at org.apache.cassandra.db.ColumnFamily.getComparatorFor(ColumnFamily.java:388)
    at org.apache.cassandra.db.ReadCommand.getComparator(ReadCommand.java:93)
    at org.apache.cassandra.db.SliceByNamesReadCommand.<init>(SliceByNamesReadCommand.java:44)
    at org.apache.cassandra.db.SliceByNamesReadCommandSerializer.deserialize(SliceByNamesReadCommand.java:110)
    at org.apache.cassandra.db.ReadCommandSerializer.deserialize(ReadCommand.java:122)
    at org.apache.cassandra.db.ReadVerbHandler.doVerb(ReadVerbHandler.java:67)
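
A minimal sketch for pulling only the ERROR entries (plus the first line of each exception) out of a log like the one above. It assumes each entry header sits on a single line in the form `LEVEL [thread] timestamp File.java (line N) message`, as in the on-disk log file. The names ENTRY, find_errors and SAMPLE are made up for illustration and are not part of Cassandra.

```python
import re

# Entry header, assumed to be on one line:
# LEVEL [thread] YYYY-MM-DD HH:MM:SS,mmm File.java (line N) message
ENTRY = re.compile(
    r"^\s*(?P<level>[A-Z]+) \[(?P<thread>[^\]]+)\] "
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"(?P<source>\S+) \(line (?P<line>\d+)\) (?P<msg>.*)$"
)


def find_errors(lines):
    """Return (timestamp, thread, message, exception) for each ERROR entry."""
    errors = []
    current = None  # the ERROR entry still waiting for its exception line
    for raw in lines:
        line = raw.rstrip("\n")
        m = ENTRY.match(line)
        if m:
            current = None
            if m.group("level") == "ERROR":
                current = [m.group("ts"), m.group("thread"), m.group("msg"), None]
                errors.append(current)
        elif current is not None and current[3] is None and line.strip():
            # First non-header line after an ERROR header: the exception itself.
            current[3] = line.strip()
    return [tuple(e) for e in errors]


# Three entries copied from the log above, with the mail wrapping undone.
SAMPLE = """\
 INFO [main] 2011-08-18 07:23:17,744 CommitLog.java (line 163) Log replay complete
ERROR [MutationStage:28] 2011-08-18 07:23:18,246 RowMutationVerbHandler.java (line 86) Error in row mutation
org.apache.cassandra.db.UnserializableColumnFamilyException: Couldn't find cfId=1000
\tat org.apache.cassandra.db.ColumnFamilySerializer.deserialize(ColumnFamilySerializer.java:117)
ERROR [ReadStage:1] 2011-08-18 07:23:18,254 DebuggableThreadPoolExecutor.java (line 103) Error in ThreadPoolExecutor
java.lang.IllegalArgumentException: Unknown ColumnFamily prjcache in keyspace prjkeyspace
"""

for ts, thread, msg, exc in find_errors(SAMPLE.splitlines()):
    print(ts, thread, msg, "->", exc)
```

Both errors reported this way point at a column family that this node does not know about, which fits a schema that has not reached the node.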
On Fri, Aug 19, 2011 at 5:44 AM, aaron morton <[email protected]> wrote:
> Look in the logs to find out why the migration did not get to node2.
>
> Otherwise yes you can drop those files.
>
> Cheers
>
> -----------------
> Aaron Morton
> Freelance Cassandra Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 18/08/2011, at 11:25 PM, Yan Chunlu wrote:
>
> I just found out that a schema change made via cassandra-cli didn't
> reach node2, and node2 became unreachable.
>
> I followed this document:
> http://wiki.apache.org/cassandra/FAQ#schema_disagreement
>
> but after that I still had two schema versions:
>
> ddcada52-c96a-11e0-99af-3bd951658d61: [node1, node3]
> 2127b2ef-6998-11e0-b45b-3bd951658d61: [node2]
>
> Is it enough to delete the Schema* and Migrations* sstables and restart the node?
>
>
>
> On Thu, Aug 18, 2011 at 5:13 PM, Yan Chunlu <[email protected]> wrote:
>
>> Thanks a lot for all the help! I went through the steps and
>> successfully brought up node2 :)
>>
>>
>> On Thu, Aug 18, 2011 at 10:51 AM, Boris Yen <[email protected]> wrote:
>> > Because the file only preserves the "key" of each record, not the whole
>> > record. The records for those saved keys are loaded into Cassandra
>> > during startup.
>> >
>> > On Wed, Aug 17, 2011 at 5:52 PM, Yan Chunlu <[email protected]> wrote:
>> >>
>> >> But the files in saved_caches are relatively small. Would that still
>> >> cause the load problem?
>> >>
>> >> ls -lh /cassandra/saved_caches/
>> >> total 32M
>> >> -rw-r--r-- 1 cass cass 2.9M 2011-08-12 19:53 cass-CommentSortsCache-KeyCache
>> >> -rw-r--r-- 1 cass cass 2.9M 2011-08-17 04:29 cass-CommentSortsCache-RowCache
>> >> -rw-r--r-- 1 cass cass 2.7M 2011-08-12 18:50 cass-CommentVote-KeyCache
>> >> -rw-r--r-- 1 cass cass 140K 2011-08-12 19:53 cass-device_images-KeyCache
>> >> -rw-r--r-- 1 cass cass 33K 2011-08-12 18:51 cass-Hide-KeyCache
>> >> -rw-r--r-- 1 cass cass 4.6M 2011-08-12 19:53 cass-images-KeyCache
>> >> -rw-r--r-- 1 cass cass 2.6M 2011-08-12 19:53 cass-LinksByUrl-KeyCache
>> >> -rw-r--r-- 1 cass cass 2.5M 2011-08-12 18:50 cass-LinkVote-KeyCache
>> >> -rw-r--r-- 1 cass cass 7.5M 2011-08-12 18:50 cass-cache-KeyCache
>> >> -rw-r--r-- 1 cass cass 3.7M 2011-08-12 21:51 cass-cache-RowCache
>> >> -rw-r--r-- 1 cass cass 1.8M 2011-08-12 18:51 cass-Save-KeyCache
>> >> -rw-r--r-- 1 cass cass 111K 2011-08-12 19:50 cass-SavesByAccount-KeyCache
>> >> -rw-r--r-- 1 cass cass 864 2011-08-12 19:49 cass-VotesByDay-KeyCache
>> >> -rw-r--r-- 1 cass cass 249K 2011-08-12 19:49 cass-VotesByLink-KeyCache
>> >> -rw-r--r-- 1 cass cass 28 2011-08-14 12:50 system-HintsColumnFamily-KeyCache
>> >> -rw-r--r-- 1 cass cass 5 2011-08-14 12:50 system-LocationInfo-KeyCache
>> >> -rw-r--r-- 1 cass cass 54 2011-08-13 13:30 system-Migrations-KeyCache
>> >> -rw-r--r-- 1 cass cass 76 2011-08-13 13:30 system-Schema-KeyCache
>> >>
>> >> On Wed, Aug 17, 2011 at 4:31 PM, aaron morton <[email protected]> wrote:
>> >> > If you have a node that cannot start up due to issues loading the
>> >> > saved cache, delete the files in the saved_caches directory before
>> >> > starting it.
>> >> >
>> >> > The settings to save the row and key cache are per CF. You can change
>> >> > them with an update column family statement via the CLI when attached
>> >> > to any node. You may then want to check the saved_caches directory and
>> >> > delete any files that are left (not sure if they are automatically
>> >> > deleted).
>> >> >
>> >> > I would recommend:
>> >> > - stop node 2
>> >> > - delete its saved_caches
>> >> > - make the schema change via another node
>> >> > - start up node 2
>> >> >
>> >> > Cheers
>> >> >
>> >> > -----------------
>> >> > Aaron Morton
>> >> > Freelance Cassandra Developer
>> >> > @aaronmorton
>> >> > http://www.thelastpickle.com
>> >> >
>> >> > On 17/08/2011, at 2:59 PM, Yan Chunlu wrote:
>> >> >
>> >> >> Does this need to be cluster-wide, or could I just modify the caches
>> >> >> on one node? I cannot connect to that node with cassandra-cli; it
>> >> >> says "connection refused":
>> >> >>
>> >> >> [default@unknown] connect node2/9160;
>> >> >> Exception connecting to node2/9160. Reason: Connection refused.
>> >> >>
>> >> >> So if I change the cache size via the other nodes, how will node2 be
>> >> >> notified of the change? Would killing Cassandra and starting it again
>> >> >> make it pick up the schema update?
>> >> >>
>> >> >> On Wed, Aug 17, 2011 at 5:59 AM, Teijo Holzer <[email protected]> wrote:
>> >> >>> Hi,
>> >> >>>
>> >> >>> yes, we saw exactly the same messages. We got rid of these by doing
>> >> >>> the following:
>> >> >>>
>> >> >>> * Set all row & key caches in your CFs to 0 via cassandra-cli
>> >> >>> * Kill Cassandra
>> >> >>> * Remove all files in the saved_caches directory
>> >> >>> * Start Cassandra
>> >> >>> * Slowly bring back row & key caches (if desired, we left them off)
>> >> >>>
>> >> >>> Cheers,
>> >> >>>
>> >> >>> T.
>> >> >>>
>> >> >>> On 16/08/11 23:35, Yan Chunlu wrote:
>> >> >>>>
>> >> >>>> I saw a lot of SliceQueryFilter messages after changing the log
>> >> >>>> level to DEBUG. I thought even bringing up a new node would be
>> >> >>>> faster than starting the old one... it is weird.
>> >> >>>>
>> >> >>>> DEBUG [main] 2011-08-16 06:32:49,213 SliceQueryFilter.java (line 123) collecting 0 of 2147483647: 76616c7565:false:225@1313068845474382
>> >> >>>> DEBUG [main] 2011-08-16 06:32:49,245 SliceQueryFilter.java (line 123) collecting 0 of 2147483647: 76616c7565:false:453@1310999270198313
>> >> >>>> DEBUG [main] 2011-08-16 06:32:49,251 SliceQueryFilter.java (line 123) collecting 0 of 2147483647: 76616c7565:false:26@1313199902088827
>> >> >>>> DEBUG [main] 2011-08-16 06:32:49,576 SliceQueryFilter.java (line 123) collecting 0 of 2147483647: 76616c7565:false:157@1313097239332314
>> >> >>>> DEBUG [main] 2011-08-16 06:32:50,674 SliceQueryFilter.java (line 123) collecting 0 of 2147483647: 76616c7565:false:41729@1313190821826229
>> >> >>>> DEBUG [main] 2011-08-16 06:32:50,811 SliceQueryFilter.java (line 123) collecting 0 of 2147483647: 76616c7565:false:6@1313174157301203
>> >> >>>> DEBUG [main] 2011-08-16 06:32:50,867 SliceQueryFilter.java (line 123) collecting 0 of 2147483647: 76616c7565:false:98@1312011362250907
>> >> >>>> DEBUG [main] 2011-08-16 06:32:50,881 SliceQueryFilter.java (line 123) collecting 0 of 2147483647: 76616c7565:false:42@1313201711997005
>> >> >>>> DEBUG [main] 2011-08-16 06:32:50,910 SliceQueryFilter.java (line 123) collecting 0 of 2147483647: 76616c7565:false:96@1312939986190155
>> >> >>>> DEBUG [main] 2011-08-16 06:32:50,954 SliceQueryFilter.java (line 123) collecting 0 of 2147483647: 76616c7565:false:621@1313192538616112
>> >> >>>>
>> >> >>>>
>> >> >>>>
>> >> >>>> On Tue, Aug 16, 2011 at 7:32 PM, Yan Chunlu <[email protected]
>> >> >>>> <mailto:[email protected]>> wrote:
>> >> >>>>
>> >> >>>> But it seems the row cache is cluster-wide. How will changing the
>> >> >>>> row cache affect read speed?
>> >> >>>>
>> >> >>>>
>> >> >>>> On Mon, Aug 15, 2011 at 7:33 AM, Jonathan Ellis <[email protected]
>> >> >>>> <mailto:[email protected]>> wrote:
>> >> >>>>
>> >> >>>> Or leave row cache enabled but disable cache saving (and
>> >> >>>> remove the one already on disk).
>> >> >>>>
>> >> >>>> On Sun, Aug 14, 2011 at 5:05 PM, aaron morton <[email protected]
>> >> >>>> <mailto:[email protected]>> wrote:
>> >> >>>> > INFO [main] 2011-08-14 09:24:52,198 ColumnFamilyStore.java (line 547) completed loading (1744370 ms; 200000 keys) row cache for COMMENT
>> >> >>>> >
>> >> >>>> > It's taking 29 minutes to load 200,000 rows in the row cache.
>> >> >>>> > That's a pretty big row cache, I would suggest reducing or
>> >> >>>> > disabling it.
>> >> >>>> > Background:
>> >> >>>> > http://www.datastax.com/dev/blog/maximizing-cache-benefit-with-cassandra
>> >> >>>> >
>> >> >>>> > and server can not afford the load then crashed. after come
>> >> >>>> > back, node 3 can not return for more than 96 hours
>> >> >>>> >
>> >> >>>> > Crashed how?
>> >> >>>> > You may be seeing
>> >> >>>> > https://issues.apache.org/jira/browse/CASSANDRA-2280
>> >> >>>> > Watch nodetool compactionstats to see when the Merkle tree build
>> >> >>>> > finishes and nodetool netstats to see which CFs are streaming.
>> >> >>>> > Cheers
>> >> >>>> > -----------------
>> >> >>>> > Aaron Morton
>> >> >>>> > Freelance Cassandra Developer
>> >> >>>> > @aaronmorton
>> >> >>>> > http://www.thelastpickle.com
>> >> >>>> > On 15 Aug 2011, at 04:23, Yan Chunlu wrote:
>> >> >>>> >
>> >> >>>> >
>> >> >>>> > I have 3 nodes with RF=3. While I was repairing node3, a lot of
>> >> >>>> > data seemed to be generated, and the server could not handle the
>> >> >>>> > load and crashed. Since coming back, node 3 has not returned for
>> >> >>>> > more than 96 hours.
>> >> >>>> >
>> >> >>>> > For 34GB of data, node 2 could restart and be back online within
>> >> >>>> > 1 hour.
>> >> >>>> >
>> >> >>>> > I am not sure what's wrong with node3. Should I restart node 3
>> >> >>>> > again? Thanks!
>> >> >>>> >
>> >> >>>> > Address  Status  State   Load       Owns    Token
>> >> >>>> >                                             113427455640312821154458202477256070484
>> >> >>>> > node1    Up      Normal  34.11 GB   33.33%  0
>> >> >>>> > node2    Up      Normal  31.44 GB   33.33%  56713727820156410577229101238628035242
>> >> >>>> > node3    Down    Normal  177.55 GB  33.33%  113427455640312821154458202477256070484
>> >> >>>> >
>> >> >>>> >
>> >> >>>> > The log shows it is still going on; I am not sure why it is so
>> >> >>>> > slow:
>> >> >>>> >
>> >> >>>> > INFO [main] 2011-08-14 08:55:47,734 SSTableReader.java (line 154) Opening /cassandra/data/COMMENT
>> >> >>>> > INFO [main] 2011-08-14 08:55:47,828 ColumnFamilyStore.java (line 275) reading saved cache /cassandra/saved_caches/COMMENT-RowCache
>> >> >>>> > INFO [main] 2011-08-14 09:24:52,198 ColumnFamilyStore.java (line 547) completed loading (1744370 ms; 200000 keys) row cache for COMMENT
>> >> >>>> > INFO [main] 2011-08-14 09:24:52,299 ColumnFamilyStore.java (line 275) reading saved cache /cassandra/saved_caches/COMMENT-RowCache
>> >> >>>> > INFO [CompactionExecutor:1] 2011-08-14 10:24:55,480 CacheWriter.java (line 96) Saved COMMENT-RowCache (200000 items) in 2535 ms
>> >> >>>> >
>> >> >>>> >
>> >> >>>> >
>> >> >>>> >
>> >> >>>> >
>> >> >>>> >
>> >> >>>>
>> >> >>>>
>> >> >>>>
>> >> >>>> --
>> >> >>>> Jonathan Ellis
>> >> >>>> Project Chair, Apache Cassandra
>> >> >>>> co-founder of DataStax, the source for professional
>> Cassandra
>> >> >>>> support
>> >> >>>> http://www.datastax.com
>> >> >>>>
>> >> >>>>
>> >> >>>>
>> >> >>>
>> >> >>>
>> >> >
>> >> >
>> >
>> >
>>
>>
>
>
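
As a quick check on the row cache timing quoted above in the thread (1744370 ms for 200000 keys), here is a small sketch that turns that log line into minutes and milliseconds per key. The regex is an assumption based on the single line quoted here and may not match other Cassandra versions; LOADED and cache_load_stats are hypothetical names.

```python
import re

# Wording taken from the quoted log line; treat the pattern as an assumption.
LOADED = re.compile(
    r"completed loading \((\d+) ms; (\d+) keys\) row cache for (\S+)"
)


def cache_load_stats(line):
    """Return load time in minutes and ms per key, or None if no match."""
    m = LOADED.search(line)
    if m is None:
        return None
    millis, keys, cf = int(m.group(1)), int(m.group(2)), m.group(3)
    return {
        "cf": cf,
        "minutes": millis / 60000,    # 60000 ms per minute
        "ms_per_key": millis / keys,  # average cost of loading one row
    }


line = (
    "INFO [main] 2011-08-14 09:24:52,198 ColumnFamilyStore.java (line 547) "
    "completed loading (1744370 ms; 200000 keys) row cache for COMMENT"
)
stats = cache_load_stats(line)
print(f"{stats['cf']}: {stats['minutes']:.2f} min, {stats['ms_per_key']:.2f} ms/key")
```

For the quoted line this works out to roughly 29 minutes in total, or a little under 9 ms per cached row, which matches the "29 minutes" figure in the reply.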