Cassandra 1.0 hangs during GC
This is a cluster of 2 nodes, each having 8 GB of operating memory, replication_factor=2. Write/read pressure is quite low and almost never exceeds 10/second.

From time to time (2-3 times a month) I see GC activity in the logs, and during this time Cassandra stops responding to requests, which results in a timeout in the upper-layer application. The total time of unavailability can be over 5 minutes (as in the following case). What can I do about this? Will it become much worse when my cluster grows?

INFO [GossipTasks:1] 2012-07-16 13:10:50,055 Gossiper.java (line 736) InetAddress /10.220.50.9 is now dead.
INFO [ScheduledTasks:1] 2012-07-16 13:10:50,056 GCInspector.java (line 123) GC for ParNew: 391383 ms for 1 collections, 2025808488 used; max is 8464105472
INFO [ScheduledTasks:1] 2012-07-16 13:10:50,067 StatusLogger.java (line 50) Pool Name                 Active   Pending   Blocked
INFO [ScheduledTasks:1] 2012-07-16 13:10:50,071 StatusLogger.java (line 65) ReadStage                      0         0         0
INFO [ScheduledTasks:1] 2012-07-16 13:10:50,071 StatusLogger.java (line 65) RequestResponseStage           0         0         0
INFO [ScheduledTasks:1] 2012-07-16 13:10:50,072 StatusLogger.java (line 65) ReadRepairStage                0         0         0
INFO [ScheduledTasks:1] 2012-07-16 13:10:50,072 StatusLogger.java (line 65) MutationStage                  0         0         0
INFO [ScheduledTasks:1] 2012-07-16 13:10:50,073 StatusLogger.java (line 65) ReplicateOnWriteStage          0         0         0
INFO [GossipStage:1] 2012-07-16 13:10:50,074 Gossiper.java (line 722) InetAddress /10.220.50.9 is now UP
INFO [ScheduledTasks:1] 2012-07-16 13:10:50,075 StatusLogger.java (line 65) GossipStage                    1        59         0
INFO [ScheduledTasks:1] 2012-07-16 13:10:50,075 StatusLogger.java (line 65) AntiEntropyStage               0         0         0
INFO [ScheduledTasks:1] 2012-07-16 13:10:50,076 StatusLogger.java (line 65) MigrationStage                 0         0         0
INFO [ScheduledTasks:1] 2012-07-16 13:10:50,076 StatusLogger.java (line 65) StreamStage                    0         0         0
INFO [ScheduledTasks:1] 2012-07-16 13:10:50,077 StatusLogger.java (line 65) MemtablePostFlusher            0         0         0
INFO [ScheduledTasks:1] 2012-07-16 13:10:50,077 StatusLogger.java (line 65) FlushWriter                    0         0         0
INFO [ScheduledTasks:1] 2012-07-16 13:10:50,077 StatusLogger.java (line 65) MiscStage                      0         0         0
INFO [ScheduledTasks:1] 2012-07-16 13:10:50,078 StatusLogger.java (line 65) InternalResponseStage          0         0         0
INFO [ScheduledTasks:1] 2012-07-16 13:10:50,078 StatusLogger.java (line 65) HintedHandoff                  0         0         0
INFO [ScheduledTasks:1] 2012-07-16 13:10:50,079 StatusLogger.java (line 69) CompactionManager            n/a         0
INFO [ScheduledTasks:1] 2012-07-16 13:10:50,079 StatusLogger.java (line 81) MessagingService             n/a       0,0
INFO [ScheduledTasks:1] 2012-07-16 13:10:50,080 StatusLogger.java (line 85) ColumnFamily                Memtable ops,data   Row cache size/cap   Key cache size/cap
INFO [ScheduledTasks:1] 2012-07-16 13:10:50,080 StatusLogger.java (line 88) Keyspace1.PSS               36712,343842617     0/0                  97995/100
INFO [ScheduledTasks:1] 2012-07-16 13:10:50,085 StatusLogger.java (line 88) Keyspace1.Standard1         128679,162567721    0/0                  0/100
INFO [ScheduledTasks:1] 2012-07-16 13:10:50,085 StatusLogger.java (line 88) system.NodeIdInfo           0,0                 0/0                  0/1
INFO [ScheduledTasks:1] 2012-07-16 13:10:50,086 StatusLogger.java (line 88) system.IndexInfo            0,0                 0/0                  0/1
INFO [ScheduledTasks:1] 2012-07-16 13:10:50,086 StatusLogger.java (line 88) system.LocationInfo         0,0                 0/0                  1/2
INFO [ScheduledTasks:1] 2012-07-16 13:10:50,086 StatusLogger.java (line 88) system.Migrations           0,0                 0/0                  3/3
INFO [ScheduledTasks:1] 2012-07-16 13:10:50,087 StatusLogger.java (line 88) system.HintsColumnFamily    1,528               0/0                  0/1
INFO [ScheduledTasks:1] 2012-07-16 13:10:50,087 StatusLogger.java (line 88) system.Schema               0,0                 0/0                  2/3
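For reference, a rough way to pull the long pauses out of system.log is a small script along these lines (a minimal sketch that assumes nothing beyond the GCInspector line format shown above; adjust the path and threshold to taste):

    import re
    import sys

    # Matches lines like:
    # INFO [ScheduledTasks:1] 2012-07-16 13:10:50,056 GCInspector.java (line 123)
    #   GC for ParNew: 391383 ms for 1 collections, 2025808488 used; max is 8464105472
    GC_LINE = re.compile(
        r'(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d+ GCInspector.*GC for (\w+): (\d+) ms')

    THRESHOLD_MS = 1000  # report anything slower than one second

    for line in open(sys.argv[1]):   # e.g. /var/log/cassandra/system.log
        m = GC_LINE.search(line)
        if m and int(m.group(3)) > THRESHOLD_MS:
            print('%s  %s pause: %s ms' % (m.group(1), m.group(2), m.group(3)))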
Re: Cassandra 1.0 hangs during GC
On the 21st I migrated to Cassandra 1.1.2 but see no improvement.

cat /var/log/cassandra/Earth1.log | grep "GC for"
INFO [ScheduledTasks:1] 2012-05-22 17:42:48,445 GCInspector.java (line 123) GC for ParNew: 345 ms for 1 collections, 82451888 used; max is 8464105472
INFO [ScheduledTasks:1] 2012-05-23 02:47:13,911 GCInspector.java (line 123) GC for ParNew: 312 ms for 1 collections, 110617416 used; max is 8464105472
INFO [ScheduledTasks:1] 2012-05-23 11:57:54,317 GCInspector.java (line 123) GC for ParNew: 298 ms for 1 collections, 98161920 used; max is 8464105472
INFO [ScheduledTasks:1] 2012-07-02 08:52:37,019 GCInspector.java (line 123) GC for ParNew: 196886 ms for 1 collections, 2310058496 used; max is 8464105472
INFO [ScheduledTasks:1] 2012-07-16 17:41:25,940 GCInspector.java (line 123) GC for ParNew: 200146 ms for 1 collections, 2345987088 used; max is 8464105472
=== Migrated from 1.0.0 to 1.1.2
INFO [ScheduledTasks:1] 2012-07-21 09:05:08,280 GCInspector.java (line 122) GC for ParNew: 282 ms for 1 collections, 466406864 used; max is 8464105472
INFO [ScheduledTasks:1] 2012-07-21 12:38:43,132 GCInspector.java (line 122) GC for ParNew: 233 ms for 1 collections, 405269504 used; max is 8464105472
INFO [ScheduledTasks:1] 2012-07-22 02:29:09,596 GCInspector.java (line 122) GC for ParNew: 253 ms for 1 collections, 389700768 used; max is 8464105472
INFO [ScheduledTasks:1] 2012-07-22 17:45:46,357 GCInspector.java (line 122) GC for ParNew: 57391 ms for 1 collections, 400083984 used; max is 8464105472

Memory and yaml memory-related settings are default.
I do not do deletes.
I have 2 CFs and no secondary indexes.

LiveRatios:
INFO [pool-1-thread-1] 2012-06-09 02:36:07,759 Memtable.java (line 177) CFS(Keyspace='Keyspace1', ColumnFamily='PSS') liveRatio is 1.0 (just-counted was 1.0). calculation took 85ms for 6236 columns
INFO [MemoryMeter:1] 2012-07-21 09:04:47,614 Memtable.java (line 213) CFS(Keyspace='Keyspace1', ColumnFamily='Standard1') liveRatio is 1.0 (just-counted was 1.0). calculation took 8ms for 1 columns
INFO [MemoryMeter:1] 2012-07-21 09:04:51,012 Memtable.java (line 213) CFS(Keyspace='Keyspace1', ColumnFamily='PSS') liveRatio is 1.0 (just-counted was 1.0). calculation took 99ms for 1094 columns
INFO [MemoryMeter:1] 2012-07-21 09:04:51,331 Memtable.java (line 213) CFS(Keyspace='Keyspace1', ColumnFamily='Standard1') liveRatio is 1.0 (just-counted was 1.0). calculation took 80ms for 242 columns
INFO [MemoryMeter:1] 2012-07-21 09:04:51,856 Memtable.java (line 213) CFS(Keyspace='Keyspace1', ColumnFamily='PSS') liveRatio is 1.0 (just-counted was 1.0). calculation took 505ms for 2678 columns
INFO [MemoryMeter:1] 2012-07-21 09:04:52,881 Memtable.java (line 213) CFS(Keyspace='Keyspace1', ColumnFamily='PSS') liveRatio is 1.0 (just-counted was 1.0). calculation took 776ms for 5236 columns
INFO [MemoryMeter:1] 2012-07-21 09:04:52,945 Memtable.java (line 213) CFS(Keyspace='Keyspace1', ColumnFamily='Standard1') liveRatio is 1.0 (just-counted was 1.0). calculation took 64ms for 389 columns
INFO [MemoryMeter:1] 2012-07-21 09:04:55,162 Memtable.java (line 213) CFS(Keyspace='Keyspace1', ColumnFamily='PSS') liveRatio is 1.0 (just-counted was 1.0). calculation took 1378ms for 8948 columns
INFO [MemoryMeter:1] 2012-07-21 09:04:55,304 Memtable.java (line 213) CFS(Keyspace='Keyspace1', ColumnFamily='Standard1') liveRatio is 1.0 (just-counted was 1.0). calculation took 140ms for 1082 columns
INFO [MemoryMeter:1] 2012-07-21 09:05:08,439 Memtable.java (line 213) CFS(Keyspace='Keyspace1', ColumnFamily='Standard1') liveRatio is 2.5763038186160894 (just-counted was 2.5763038186160894). calculation took 8796ms for 102193 columns

18.07.2012, 07:51, "aaron morton" :
> Assuming all the memory and yaml settings default that does not sound right. The first thought would be the memory meter not counting correctly...
> Do you do a lot of deletes ?
> Do you have a lot of CF's and/or secondary indexes ?
> Can you see log lines about the "liveRatio" for your cf's ?
> I would upgrade to 1.0.10 before getting too carried away though.
> Cheers
>
> -
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 17/07/2012, at 8:14 PM, Nikolay Kоvshov wrote:
>
>> This is a cluster of 2 nodes, each having 8 GB of operating memory, replication_factor=2.
>> Write/read pressure is quite low and almost never exceeds 10/second.
>>
>> From time to time (2-3 times a month) I see GC activity in the logs and during this time cassandra stops responding to requests which results in a timeout in the upper-layer applicat
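As a quick back-of-envelope check on what those liveRatios imply, assuming liveRatio is the ratio of heap bytes to serialized bytes for a memtable (the figures below come from the StatusLogger and Memtable lines quoted in this thread; the formula itself is my assumption):

    # Rough memtable heap footprint: the "data" half of StatusLogger's
    # "Memtable ops,data" multiplied by the reported liveRatio.
    pss = (343842617, 1.0)                         # Keyspace1.PSS: serialized bytes, liveRatio
    standard1 = (162567721, 2.5763038186160894)    # Keyspace1.Standard1

    for name, (serialized, ratio) in [('PSS', pss), ('Standard1', standard1)]:
        print('%-10s ~%d MB of heap' % (name, serialized * ratio / 1024 / 1024))

Either way that is a few hundred MB, nowhere near the 8 GB heap, which is why the long pauses look so strange.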
nodetool move causes summary load to grow
I have a testing cluster running Cassandra 1.1.2 with default memory and cache settings, 1 CF, 1 KS, RF = 2.

This is the empty cluster:

10.111.1.141  datacenter1  rack1  Up  Normal  43.04 KB   100.00%  0
10.111.1.142  datacenter1  rack1  Up  Normal  26.95 KB   100.00%  85070591730234615865843651857942052864

Then I fill it with 200 MB of random data:

10.111.1.141  datacenter1  rack1  Up  Normal  153.34 MB  100.00%  0
10.111.1.142  datacenter1  rack1  Up  Normal  76.94 MB   100.00%  85070591730234615865843651857942052864

The second token is what your python script calculates for equal token distribution, and the keys were random, so this still looks reasonable.

Now I start moving nodes. I want them to change places:

nodetool -h d-st-n2 move 1
nodetool -h d-st-n2 cleanup

Here I expect the second node to have a load of almost 0, but that does not happen:

10.111.1.141  datacenter1  rack1  Up  Normal  195.53 MB  100.00%  0
10.111.1.142  datacenter1  rack1  Up  Normal  249.82 MB  100.00%  1

nodetool -h d-st-n1 move 85070591730234615865843651857942052864
nodetool -h d-st-n1 cleanup

The nodes did not change places, but the load grows bigger:

10.111.1.142  datacenter1  rack1  Up  Normal  271.2 MB   100.00%  1
10.111.1.141  datacenter1  rack1  Up  Normal  195.53 MB  100.00%  85070591730234615865843651857942052864

nodetool -h d-st-n1 move z
nodetool -h d-st-n2 move a
nodetool -h d-st-n1 move 0
nodetool -h d-st-n2 move 85070591730234615865843651857942052864
nodetool -h d-st-n1 cleanup && nodetool -h d-st-n2 cleanup

Here I come back to the original layout, but now it is much bigger:

10.111.1.141  datacenter1  rack1  Up  Normal  353.09 MB  100.00%  0
10.111.1.142  datacenter1  rack1  Up  Normal  195.58 MB  100.00%  85070591730234615865843651857942052864

Could you help me understand the effect of nodetool move? I see no logic in these results.
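For what it's worth, the second token above is exactly the evenly spaced value for a two-node ring; a minimal sketch of that calculation, assuming RandomPartitioner's 0..2**127 token range:

    # Evenly spaced initial tokens for a RandomPartitioner ring
    # (token range assumed to be 0 .. 2**127).
    def balanced_tokens(node_count):
        return [i * (2 ** 127) // node_count for i in range(node_count)]

    print(balanced_tokens(2))
    # -> [0, 85070591730234615865843651857942052864], matching the ring output above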
Re: Cassandra 1.0 hangs during GC
48 GB of RAM on that machine; swap is not used. I will disable swap altogether just in case.
I have 4 Cassandra processes (parts of 4 different clusters), each allocated 8 GB and using about 4 of them.

> java -version
java version "1.7.0"
Java(TM) SE Runtime Environment (build 1.7.0-b147)
Java HotSpot(TM) 64-Bit Server VM (build 21.0-b17, mixed mode)

23.07.2012, 20:12, "Joost van de Wijgerd" :
> How much memory do you have on the machine? Seems like you have 8G
> reserved for the Cassandra java process. If this is all the memory on
> the machine you might be swapping. Also, which JVM do you use?
>
> kind regards
>
> Joost
>
> On Mon, Jul 23, 2012 at 10:07 AM, Nikolay Kоvshov wrote:
>
>> On the 21st I migrated to Cassandra 1.1.2 but see no improvement.
>>
>> cat /var/log/cassandra/Earth1.log | grep "GC for"
>> INFO [ScheduledTasks:1] 2012-05-22 17:42:48,445 GCInspector.java (line 123) GC for ParNew: 345 ms for 1 collections, 82451888 used; max is 8464105472
>> INFO [ScheduledTasks:1] 2012-05-23 02:47:13,911 GCInspector.java (line 123) GC for ParNew: 312 ms for 1 collections, 110617416 used; max is 8464105472
>> INFO [ScheduledTasks:1] 2012-05-23 11:57:54,317 GCInspector.java (line 123) GC for ParNew: 298 ms for 1 collections, 98161920 used; max is 8464105472
>> INFO [ScheduledTasks:1] 2012-07-02 08:52:37,019 GCInspector.java (line 123) GC for ParNew: 196886 ms for 1 collections, 2310058496 used; max is 8464105472
>> INFO [ScheduledTasks:1] 2012-07-16 17:41:25,940 GCInspector.java (line 123) GC for ParNew: 200146 ms for 1 collections, 2345987088 used; max is 8464105472
>> === Migrated from 1.0.0 to 1.1.2
>> INFO [ScheduledTasks:1] 2012-07-21 09:05:08,280 GCInspector.java (line 122) GC for ParNew: 282 ms for 1 collections, 466406864 used; max is 8464105472
>> INFO [ScheduledTasks:1] 2012-07-21 12:38:43,132 GCInspector.java (line 122) GC for ParNew: 233 ms for 1 collections, 405269504 used; max is 8464105472
>> INFO [ScheduledTasks:1] 2012-07-22 02:29:09,596 GCInspector.java (line 122) GC for ParNew: 253 ms for 1 collections, 389700768 used; max is 8464105472
>> INFO [ScheduledTasks:1] 2012-07-22 17:45:46,357 GCInspector.java (line 122) GC for ParNew: 57391 ms for 1 collections, 400083984 used; max is 8464105472
>>
>> Memory and yaml memory-related settings are default
>> I do not do deletes
>> I have 2 CF's and no secondary indexes
>>
>> LiveRatio's:
>> INFO [pool-1-thread-1] 2012-06-09 02:36:07,759 Memtable.java (line 177) CFS(Keyspace='Keyspace1', ColumnFamily='PSS') liveRatio is 1.0 (just-counted was 1.0). calculation took 85ms for 6236 columns
>> INFO [MemoryMeter:1] 2012-07-21 09:04:47,614 Memtable.java (line 213) CFS(Keyspace='Keyspace1', ColumnFamily='Standard1') liveRatio is 1.0 (just-counted was 1.0). calculation took 8ms for 1 columns
>> INFO [MemoryMeter:1] 2012-07-21 09:04:51,012 Memtable.java (line 213) CFS(Keyspace='Keyspace1', ColumnFamily='PSS') liveRatio is 1.0 (just-counted was 1.0). calculation took 99ms for 1094 columns
>> INFO [MemoryMeter:1] 2012-07-21 09:04:51,331 Memtable.java (line 213) CFS(Keyspace='Keyspace1', ColumnFamily='Standard1') liveRatio is 1.0 (just-counted was 1.0). calculation took 80ms for 242 columns
>> INFO [MemoryMeter:1] 2012-07-21 09:04:51,856 Memtable.java (line 213) CFS(Keyspace='Keyspace1', ColumnFamily='PSS') liveRatio is 1.0 (just-counted was 1.0). calculation took 505ms for 2678 columns
>> INFO [MemoryMeter:1] 2012-07-21 09:04:52,881 Memtable.java (line 213) CFS(Keyspace='Keyspace1', ColumnFamily='PSS') liveRatio is 1.0 (just-counted was 1.0). calculation took 776ms for 5236 columns
>> INFO [MemoryMeter:1] 2012-07-21 09:04:52,945 Memtable.java (line 213) CFS(Keyspace='Keyspace1', ColumnFamily='Standard1') liveRatio is 1.0 (just-counted was 1.0). calculation took 64ms for 389 columns
>> INFO [MemoryMeter:1] 2012-07-21 09:04:55,162 Memtable.java (line 213) CFS(Keyspace='Keyspace1', ColumnFamily='PSS') liveRatio is 1.0 (just-counted was 1.0). calculation took 1378ms for 8948 columns
>> INFO [MemoryMeter:1] 2012-07-21 09:04:55,304 Memtable.java (line 213) CFS(Keyspace='Keyspace1', ColumnFamily='Standard1') liveRatio is 1.0 (just-counted was 1.0). calculation took 140ms for 1082 columns
>> INFO [MemoryMeter:1] 2012-07-21 09:
Re: Cassandra 1.0 hangs during GC
I started running sar only recently, after your advice, and have not caught any huge GCs on that server yet. At 08:14 there was a GC lasting 4.5 seconds; that's not five minutes, of course, but it is still quite an unpleasant value. I'm still waiting for big GC pauses and will provide the corresponding sar logs.

07:25:01 PM  pgpgin/s  pgpgout/s   fault/s  majflt/s   pgfree/s  pgscank/s  pgscand/s  pgsteal/s    %vmeff
07:35:01 PM      0.00       4.34      9.02      0.00      13.61       0.00       0.00       0.00      0.00
07:45:01 PM      0.00       5.17     20.47      0.00      25.77       0.00       0.00       0.00      0.00
07:55:01 PM      0.00       4.66      8.63      0.00      18.69       0.00       0.00       0.00      0.00
08:05:01 PM      0.00       8.11      8.84      0.00      14.37       0.00       0.00       0.00      0.00
08:15:01 PM      0.00       5.19     21.65      0.00      25.94       0.00       0.00       0.00      0.00

24.07.2012, 10:22, "Wojciech Meler" :
> Can you provide output from sar command for the time period when long
> GC occurred ?
>
> Regards,
> Wojciech Meler
Re: Cassandra 1.0 hangs during GC
Sun Java 6 didn't help at all.

Sar shows no special activity at the times of the long GCs.

And I have remembered that I do a lot of get_range_slices requests, each time on a range of 100, in case this is important.

Still my GC times are growing incredibly:

INFO [ScheduledTasks:1] 2012-07-25 12:17:04,029 GCInspector.java (line 122) GC for ParNew: 497253 ms for 1 collections, 114036128 used; max is 8464105472
INFO [ScheduledTasks:1] 2012-07-25 13:26:21,282 GCInspector.java (line 122) GC for ParNew: 149508 ms for 1 collections, 87015440 used; max is 8464105472
INFO [ScheduledTasks:1] 2012-07-25 18:39:44,654 GCInspector.java (line 122) GC for ParNew: 83584 ms for 1 collections, 54368032 used; max is 8464105472
INFO [ScheduledTasks:1] 2012-07-25 22:45:42,448 GCInspector.java (line 122) GC for ParNew: 48454 ms for 1 collections, 44072488 used; max is 8464105472

24.07.2012, 13:07, "Joost van de Wijgerd" :
> You are better off using Sun Java 6 to run Cassandra. In the past
> there were issues reported on 7. Can you try running it on Sun Java 6?
>
> kind regards
>
> Joost
>
> On Tue, Jul 24, 2012 at 10:04 AM, Nikolay Kоvshov wrote:
>
>> 48 GB of RAM on that machine; swap is not used. I will disable swap altogether just in case.
>> I have 4 Cassandra processes (parts of 4 different clusters), each allocated 8 GB and using about 4 of them.
>>> java -version
>> java version "1.7.0"
>> Java(TM) SE Runtime Environment (build 1.7.0-b147)
>> Java HotSpot(TM) 64-Bit Server VM (build 21.0-b17, mixed mode)
>>
>> 23.07.2012, 20:12, "Joost van de Wijgerd" :
>>> How much memory do you have on the machine? Seems like you have 8G
>>> reserved for the Cassandra java process. If this is all the memory on
>>> the machine you might be swapping. Also, which JVM do you use?
>>>
>>> kind regards
>>>
>>> Joost
>>>
>>> On Mon, Jul 23, 2012 at 10:07 AM, Nikolay Kоvshov wrote:
>>>> On the 21st I migrated to Cassandra 1.1.2 but see no improvement.
>>>>
>>>> cat /var/log/cassandra/Earth1.log | grep "GC for"
>>>> INFO [ScheduledTasks:1] 2012-05-22 17:42:48,445 GCInspector.java (line 123) GC for ParNew: 345 ms for 1 collections, 82451888 used; max is 8464105472
>>>> INFO [ScheduledTasks:1] 2012-05-23 02:47:13,911 GCInspector.java (line 123) GC for ParNew: 312 ms for 1 collections, 110617416 used; max is 8464105472
>>>> INFO [ScheduledTasks:1] 2012-05-23 11:57:54,317 GCInspector.java (line 123) GC for ParNew: 298 ms for 1 collections, 98161920 used; max is 8464105472
>>>> INFO [ScheduledTasks:1] 2012-07-02 08:52:37,019 GCInspector.java (line 123) GC for ParNew: 196886 ms for 1 collections, 2310058496 used; max is 8464105472
>>>> INFO [ScheduledTasks:1] 2012-07-16 17:41:25,940 GCInspector.java (line 123) GC for ParNew: 200146 ms for 1 collections, 2345987088 used; max is 8464105472
>>>> === Migrated from 1.0.0 to 1.1.2
>>>> INFO [ScheduledTasks:1] 2012-07-21 09:05:08,280 GCInspector.java (line 122) GC for ParNew: 282 ms for 1 collections, 466406864 used; max is 8464105472
>>>> INFO [ScheduledTasks:1] 2012-07-21 12:38:43,132 GCInspector.java (line 122) GC for ParNew: 233 ms for 1 collections, 405269504 used; max is 8464105472
>>>> INFO [ScheduledTasks:1] 2012-07-22 02:29:09,596 GCInspector.java (line 122) GC for ParNew: 253 ms for 1 collections, 389700768 used; max is 8464105472
>>>> INFO [ScheduledTasks:1] 2012-07-22 17:45:46,357 GCInspector.java (line 122) GC for ParNew: 57391 ms for 1 collections, 400083984 used; max is 8464105472
>>>>
>>>> Memory and yaml memory-related settings are default
>>>> I do not do deletes
>>>> I have 2 CF's and no secondary indexes
>>>>
>>>> LiveRatio's:
>>>> INFO [pool-1-thread-1] 2012-06-09 02:36:07,759 Memtable.java (line 177) CFS(Keyspace='Keyspace1', ColumnFamily='PSS') liveRatio is 1.0 (just-counted was 1.0). calculation took 85ms for 6236 columns
>>>> INFO [MemoryMeter:1] 2012-07-21 09:04:47,614 Memtable.java (line 213) CFS(Keyspace='Keyspace1', ColumnFamily='Standard1') liveRatio is 1.0 (just-counted was 1.0). calculation took 8ms for 1 c
Re: Cassandra 1.0 hangs during GC
There is one peculiar thing about this problem. When I copy the cluster to two other servers (much weaker servers in terms of CPU and RAM) and run Cassandra there, I see no massive CPU load and no huge GC times. So I suppose it has something to do with the hardware or the software installed on the machines.

I plan to check software and hardware issues step by step to find the difference between the servers that brings big problems to one of the (otherwise identical) clusters.

What I plan to compare between the 'bad' cluster and the 'good' cluster:

- Configs, schemas, data etc: same
- java version: same
- RAM and CPU: 'bad' cluster has more
- Ubuntu version
- Networking
- What else???

26.07.2012, 10:25, "Nikolay Kоvshov" :
> Sun Java 6 didn't help at all.
>
> Sar shows no special activity at the times of the long GCs.
>
> And I have remembered that I do a lot of get_range_slices requests, each time on a range of 100, in case this is important.
>
> Still my GC times are growing incredibly:
>
> INFO [ScheduledTasks:1] 2012-07-25 12:17:04,029 GCInspector.java (line 122) GC for ParNew: 497253 ms for 1 collections, 114036128 used; max is 8464105472
> INFO [ScheduledTasks:1] 2012-07-25 13:26:21,282 GCInspector.java (line 122) GC for ParNew: 149508 ms for 1 collections, 87015440 used; max is 8464105472
> INFO [ScheduledTasks:1] 2012-07-25 18:39:44,654 GCInspector.java (line 122) GC for ParNew: 83584 ms for 1 collections, 54368032 used; max is 8464105472
> INFO [ScheduledTasks:1] 2012-07-25 22:45:42,448 GCInspector.java (line 122) GC for ParNew: 48454 ms for 1 collections, 44072488 used; max is 8464105472
>
> 24.07.2012, 13:07, "Joost van de Wijgerd" :
>
>> You are better off using Sun Java 6 to run Cassandra. In the past
>> there were issues reported on 7. Can you try running it on Sun Java 6?
>>
>> kind regards
>>
>> Joost
>>
>> On Tue, Jul 24, 2012 at 10:04 AM, Nikolay Kоvshov wrote:
>>> 48 GB of RAM on that machine; swap is not used. I will disable swap altogether just in case.
>>> I have 4 Cassandra processes (parts of 4 different clusters), each allocated 8 GB and using about 4 of them.
>>>> java -version
>>> java version "1.7.0"
>>> Java(TM) SE Runtime Environment (build 1.7.0-b147)
>>> Java HotSpot(TM) 64-Bit Server VM (build 21.0-b17, mixed mode)
>>>
>>> 23.07.2012, 20:12, "Joost van de Wijgerd" :
>>>> How much memory do you have on the machine? Seems like you have 8G
>>>> reserved for the Cassandra java process. If this is all the memory on
>>>> the machine you might be swapping. Also, which JVM do you use?
>>>>
>>>> kind regards
>>>>
>>>> Joost
>>>>
>>>> On Mon, Jul 23, 2012 at 10:07 AM, Nikolay Kоvshov wrote:
>>>>> On the 21st I migrated to Cassandra 1.1.2 but see no improvement.
>>>>>
>>>>> cat /var/log/cassandra/Earth1.log | grep "GC for"
>>>>> INFO [ScheduledTasks:1] 2012-05-22 17:42:48,445 GCInspector.java (line 123) GC for ParNew: 345 ms for 1 collections, 82451888 used; max is 8464105472
>>>>> INFO [ScheduledTasks:1] 2012-05-23 02:47:13,911 GCInspector.java (line 123) GC for ParNew: 312 ms for 1 collections, 110617416 used; max is 8464105472
>>>>> INFO [ScheduledTasks:1] 2012-05-23 11:57:54,317 GCInspector.java (line 123) GC for ParNew: 298 ms for 1 collections, 98161920 used; max is 8464105472
>>>>> INFO [ScheduledTasks:1] 2012-07-02 08:52:37,019 GCInspector.java (line 123) GC for ParNew: 196886 ms for 1 collections, 2310058496 used; max is 8464105472
>>>>> INFO [ScheduledTasks:1] 2012-07-16 17:41:25,940 GCInspector.java (line 123) GC for ParNew: 200146 ms for 1 collections, 2345987088 used; max is 8464105472
>>>>> === Migrated from 1.0.0 to 1.1.2
>>>>> INFO [ScheduledTasks:1] 2012-07-21 09:05:08,280 GCInspector.java (line 122) GC for ParNew: 282 ms for 1 collections, 466406864 used; max is 8464105472
>>>>> INFO [ScheduledTasks:1] 2012-07-21 12:38:43,132 GCInspector.java (line 122) GC for ParNew: 233 ms for 1 collections, 405269504 used; max is 8464105472
>>>>> INFO [ScheduledTasks:1] 2012-07-22 02:29:09,596 GCInspector.java (line
Re: Cassandra 1.0 hangs during GC
- JNA is not installed on both machines

30.07.2012, 14:44, "Mateusz Korniak" :
> On Monday 30 of July 2012, Nikolay Kоvshov wrote:
>> What I plan to compare between 'bad' cluster and 'good' cluster:
>>
>> - Configs, schemas, data etc: same
>> - java version : same
>> - RAM and CPU : 'bad' cluster has more
>> - Ubuntu version
>> - Networking
>> - What else???
>
> - JNA. If it's working and swapping is disabled for cassandra binary.
> - IO subsystem: kernel, io schedulers, performance of IO, HDDs performance/errors.
>
> --
> Mateusz Korniak
Re: Cassandra 1.0 hangs during GC
You mean using swap memory? I have a total of 48 GB of RAM, Cassandra has never used more than 2 GB, and swap is disabled.

But as I have few clues, I can give this a try. Is there any fresh instruction on running Cassandra with JNA?

30.07.2012, 16:01, "Mateusz Korniak" :
> On Monday 30 of July 2012, Nikolay Kоvshov wrote:
>
>> - JNA is not installed on both machines
>
> So your GC times may be strongly [1] affected by swapping.
> IIRC, also snapshotting is more expensive and may trigger more swapping.
> I would start with turning JNA mlockall on [2].
>
> [1]:
> Not sure if up to numbers you presented ( many seconds)...
>
> [2]:
> INFO [main] 2012-07-27 12:18:27,135 CLibrary.java (line 109) JNA mlockall successful
>
> --
> Mateusz Korniak
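For anyone checking the same two things, here is a small sketch that looks for the mlockall confirmation line and for swap usage. The log string is the one quoted above; the log path is an assumption, so adjust it for your install:

    # Check whether JNA mlockall succeeded and whether any swap is configured/used.
    LOG = '/var/log/cassandra/system.log'   # assumed default location

    mlocked = any('JNA mlockall successful' in line for line in open(LOG))
    print('JNA mlockall successful: %s' % mlocked)

    meminfo = dict(line.split(':', 1) for line in open('/proc/meminfo'))
    swap_total_kb = int(meminfo['SwapTotal'].split()[0])
    swap_free_kb = int(meminfo['SwapFree'].split()[0])
    print('Swap total: %d kB, in use: %d kB' % (swap_total_kb, swap_total_kb - swap_free_kb))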
Re: Cassandra 1.0 hangs during GC
And the final solution: http://unbxd.com/blog/2012/07/java-and-ksoftirqd-100-cpu-due-to-leap-second/

The machines were hit by the July leap-second bug (java and ksoftirqd spinning at 100% CPU). Doing

$ date -s "`date`"

solved the problem.

30.07.2012, 16:09, "Nikolay Kоvshov" :
> You mean using swap memory? I have a total of 48 GB of RAM, Cassandra has never used more than 2 GB, and swap is disabled.
>
> But as I have few clues, I can give this a try. Is there any fresh instruction on running Cassandra with JNA?
>
> 30.07.2012, 16:01, "Mateusz Korniak" :
>> On Monday 30 of July 2012, Nikolay Kоvshov wrote:
>>> - JNA is not installed on both machines
>> So your GC times may be strongly [1] affected by swapping.
>> IIRC, also snapshotting is more expensive and may trigger more swapping.
>> I would start with turning JNA mlockall on [2].
>>
>> [1]:
>> Not sure if up to numbers you presented ( many seconds)...
>>
>> [2]:
>> INFO [main] 2012-07-27 12:18:27,135 CLibrary.java (line 109) JNA mlockall successful
>>
>> --
>> Mateusz Korniak
How to bring cluster to consistency
Hello everybody,

I have a very simple cluster of 2 servers: replication_factor = 2, consistency level of reads and writes = 1.

10.111.1.141  datacenter1  rack1  Up  Normal  1.5 TB   100.00%  vjpigMzv4KkX3x7z
10.111.1.142  datacenter1  rack1  Up  Normal  1.41 TB  0.00%    聶jpigMzv4KkX3x7z

Note the sizes, please.

Say server1's Cassandra dies and I restart it later. Hinted_handoff = enabled, auto_bootstrap = true.

During that time server2 received reads and writes. I want the changes to be copied to server1 when it rejoins the cluster. As I have replication_factor 2, I suppose each piece of data should be stored on both servers. Auto-bootstrapping doesn't seem to work that way - the changed data doesn't migrate.

I run nodetool repair and it is always killed by OOM. What else can I do to bring the cluster to consistency?

Thank you in advance
Re: How to bring cluster to consistency
Sorry if this is the 4th copy of this letter, but cassandra.apache.org constantly tells me that my message looks like spam...

> 2/ both of your nodes seem to be using the same token? The output indicates that 100% of your key range is assigned to 10.111.1.141 (and therefore 10.111.1.142 holds replicas only)

Well, I didn't assign anything. I just filled the nodes with data; it is Cassandra itself that assigned that. I am trying to perform nodetool move now. Still, I didn't understand from the wiki what this means (keys assigned to servers). When both servers are up I can write to 1 and read from 2, or write to 2 and read from 1, and everything works perfectly.

> 3/ maybe repair is being affected by above, but in my experience it can be sensitive

I believe it is affected by the data size. At least some estimate of how much time and memory it could take would be useful.

13.02.2012, 17:19, "Dominic Williams" :
> Hi Nikolay, some points that may be useful:
> 1/ auto_bootstrap = true is used for telling a new node to join the ring (the cluster). It has nothing to do with hinted handoff
> 2/ both of your nodes seem to be using the same token? The output indicates that 100% of your key range is assigned to 10.111.1.141 (and therefore 10.111.1.142 holds replicas only)
> 3/ maybe repair is being affected by above, but in my experience it can be sensitive
>
> On 13 February 2012 13:06, Nikolay Kоvshov wrote:
>> Hello everybody
>>
>> I have a very simple cluster containing 2 servers. Replication_factor = 2, Consistency_level of reads and writes = 1
>>
>> 10.111.1.141  datacenter1  rack1  Up  Normal  1.5 TB   100.00%  vjpigMzv4KkX3x7z
>> 10.111.1.142  datacenter1  rack1  Up  Normal  1.41 TB  0.00%    聶jpigMzv4KkX3x7z
>>
>> Note the size please.
>>
>> Say, server1 cassandra dies and I restart it later. Hinted_handoff = enabled, auto_bootstrap = true
>>
>> During that time server2 received reads and writes. I want changes to be copied to server1 when it joins the cluster. As I have replication_factor 2, I suppose each data piece should be stored on both servers. Auto_bootstrapping doesn't seem to work that way - changed data doesn't migrate.
>>
>> I run nodetool repair and it is always killed by OOM. What else can I do to bring cluster to consistency?
>>
>> Thank you in advance
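On the "keys assigned to servers" question, as I understand the ring: each node owns the token range from the previous node's token (exclusive) up to its own token (inclusive), wrapping around, and a key lands on the node that owns the key's token (MD5 of the key under RandomPartitioner, the key itself under an order-preserving partitioner). A small sketch of that walk, with made-up integer tokens for a balanced two-node ring, just to illustrate the rule:

    # Which node is the primary replica for a given key token?
    # Ring rule (as I understand it): a node owns (previous_token, its_token],
    # wrapping around at the top of the token space.
    def primary_replica(key_token, node_tokens):
        ring = sorted(node_tokens)
        for t in ring:
            if key_token <= t:
                return t
        return ring[0]              # wrapped past the largest token

    tokens = [0, 2 ** 126]          # example: a balanced two-node RandomPartitioner ring
    print(primary_replica(1, tokens))             # -> 2 ** 126
    print(primary_replica(2 ** 127 - 5, tokens))  # -> 0 (wraps around)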
OOM on heavy write load
I am using Cassandra 0.7.0 with the following settings:

binary_memtable_throughput_in_mb: 64
in_memory_compaction_limit_in_mb: 64
keys_cached: 1 million
rows_cached: 0

RAM for Cassandra: 2 GB

I run a very simple test:

1 node with 4 HDDs (1 HDD - commitlog and caches, 3 HDDs - data)
1 KS => 1 CF => 1 Column

I insert data (random key 64 bytes + value 64 bytes) at the maximum possible speed, trying to hit disk i/o, calculate the speed and make sure Cassandra stays alive. Unfortunately, it doesn't. After several hundred million inserts Cassandra always goes down with OOM. Getting it up again doesn't help - after inserting some new data it goes down again. By this time Cassandra goes to swap and has a lot of tasks pending. I am not inserting anything now and the tasks slowly disappear, but it will take weeks to finish all of them.

compaction type: Minor
column family: Standard1
bytes compacted: 3661003227
bytes total in progress: 4176296448
pending tasks: 630

So, what am I (or Cassandra) doing wrong? I don't want Cassandra to crash, with no means of repair, under heavy load.
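For context, the load generator is essentially the loop below - a rough sketch, assuming a Thrift client such as pycassa and the Keyspace1/Standard1 schema; the host and column name are placeholders:

    import binascii
    import os

    import pycassa

    # Assumed connection details; adjust host/port and schema names as needed.
    pool = pycassa.ConnectionPool('Keyspace1', server_list=['localhost:9160'])
    cf = pycassa.ColumnFamily(pool, 'Standard1')

    def random_hex(n_bytes):
        # n_bytes of randomness rendered as 2*n_bytes ASCII hex characters
        return binascii.hexlify(os.urandom(n_bytes)).decode()

    while True:
        key = random_hex(32)      # 64-character random key
        value = random_hex(32)    # 64-character random value
        cf.insert(key, {'value': value})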
Re: OOM on heavy write load
I assume that if I turn off swap it will just die earlier, no? What is the mechanism of dying?

From the link you provided:

# Row cache is too large, or is caching large rows
My row_cache is 0.

# The memtable sizes are too large for the amount of heap allocated to the JVM
Is my memtable size too large? I have made it small enough to surely fit the "magical formula".

Trying to analyze heap dumps gives me the following:

In one case the diagram has 3 Memtables of about 64 MB each + 72 MB "Thread" + 700 MB "Unreachable objects".

Suspected threats:
7 instances of "org.apache.cassandra.db.Memtable", loaded by "sun.misc.Launcher$AppClassLoader @ 0x7f29f4992d68" occupy 456,292,912 (48.36%) bytes.
25,211 instances of "org.apache.cassandra.io.sstable.SSTableReader", loaded by "sun.misc.Launcher$AppClassLoader @ 0x7f29f4992d68" occupy 294,908,984 (31.26%) bytes.
72 instances of "java.lang.Thread", loaded by "<system class loader>" occupy 143,632,624 (15.22%) bytes.

In other cases the memory analyzer hangs trying to parse the 2 GB dump.

22.04.2011, 17:26, "Jonathan Ellis" :
> (0) turn off swap
> (1) http://www.datastax.com/docs/0.7/troubleshooting/index#nodes-are-dying-with-oom-errors
>
> On Fri, Apr 22, 2011 at 8:00 AM, Nikolay Kоvshov wrote:
>> I am using Cassandra 0.7.0 with following settings
>>
>> binary_memtable_throughput_in_mb: 64
>> in_memory_compaction_limit_in_mb: 64
>> keys_cached 1 million
>> rows_cached 0
>>
>> RAM for Cassandra 2 GB
>>
>> I run very simple test
>>
>> 1 Node with 4 HDDs (1 HDD - commitlog and caches, 3 HDDs - data)
>> 1 KS => 1 CF => 1 Column
>>
>> I insert data (random key 64 bytes + value 64 bytes) at a maximum possible speed, trying to hit disk i/o, calculate speed and make sure Cassandra stays alive. It doesn't, unfortunately.
>> After several hundred million inserts Cassandra always goes down with OOM. Getting it up again doesn't help - after inserting some new data it goes down again. By this time Cassandra goes to swap and has a lot of tasks pending. I am not inserting anything now and tasks slowly disappear, but it will take weeks to do all of them.
>>
>> compaction type: Minor
>> column family: Standard1
>> bytes compacted: 3661003227
>> bytes total in progress: 4176296448
>> pending tasks: 630
>>
>> So, what am I (or Cassandra) doing wrong? I dont want to get Cassandra crashed without means of repair on heavy load circumstances.
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of DataStax, the source for professional Cassandra support
> http://www.datastax.com
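Regarding the "magical formula": the 0.7-era rule of thumb, as I remember it, was roughly memtable_throughput_in_mb x 3 x number of hot CFs, plus about 1 GB for internal structures, plus whatever the caches need. Treat the numbers below as that heuristic only, not an exact model; the per-key cache cost is an assumption:

    # Back-of-envelope heap requirement for the setup in this thread,
    # using the old "throughput * 3 * hot CFs + 1 GB + caches" heuristic.
    memtable_throughput_mb = 64
    hot_column_families = 1
    key_cache_entries = 1000000
    bytes_per_cached_key = 100          # assumed rough average; keys here are 64 bytes

    memtables_mb = memtable_throughput_mb * 3 * hot_column_families
    key_cache_mb = key_cache_entries * bytes_per_cached_key / 1024.0 / 1024.0
    total_mb = memtables_mb + 1024 + key_cache_mb

    print('~%d MB of heap by this heuristic' % total_mb)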
Re: OOM on heavy write load
I have set quite low memory consumption (see my configuration in the first message) and give Cassandra 2.7 GB of memory. I cache 1M of 64-byte keys + 64 MB memtables. I believe the overhead can't be 500% or so?

memtable operations in millions = default 0.3

I see very strange behaviour now. If I fill Cassandra with, say, 100 million rows of 64 B key + 64 B data and after that start inserting 64 B key + 64 KB data, the compaction queue immediately grows to hundreds and Cassandra slows down extremely (it makes around 30-50 operations/sec), then starts to give timeout errors. But if I insert 64 B key + 64 KB data from the very beginning, Cassandra works fine and makes around 300 operations/sec even when the database exceeds 2-3 TB.

The behaviour is quite complex and I cannot predict the effect of my actions. What consequences will I have if I turn off compaction? Can I read somewhere about what compaction is and why it depends so heavily on what, and in which order, I write into Cassandra?

26.04.2011, 00:08, "Shu Zhang" :
> the way I measure actual memtable row sizes is this
>
> write X rows into a cassandra node
> trigger GC
> record heap usage
> trigger compaction and GC
> record heap savings and divide by X for actual cassandra memtable row size in memory
>
> Similar process to measure per-key/per-row cache sizes for your data. To understand your memtable row overhead size, you can do the above exercise with very different data sizes.
>
> Also, you probably know this, but when setting your memory usage ceiling or heap size, make sure to leave a few hundred MBs for GC.
>
> From: Shu Zhang [szh...@mediosystems.com]
> Sent: Monday, April 25, 2011 12:55 PM
> To: user@cassandra.apache.org
> Subject: RE: OOM on heavy write load
>
> How large are your rows? binary_memtable_throughput_in_mb only tracks size of data, but there is an overhead associated with each row on the order of magnitude of a few KBs. If your row data sizes are really small then the overhead dominates the memory usage and binary_memtable_throughput_in_mb ends up not limiting your memory usage the way you'd expect. It's a good idea to specify memtable_operations_in_millions in that case. If you're not sure how big your data is compared to memtable overhead, it's a good idea to specify both parameters to effectively put in a memory ceiling no matter which dominates your actual memory usage.
>
> It could also be that your key cache is too big, you should measure your key sizes and make sure you have enough memory to cache 1m keys (along with your memtables). Finally if you have multiple keyspaces (for multiple applications) on your cluster, they all share the total available heap, so you have to account for that.
>
> There's no measure in cassandra to guard against OOM, you must configure nodes such that the max memory usage on each node, that is the max size all your caches and memtables can potentially grow to, is less than your heap size.
>
> From: Nikolay Kоvshov [nkovs...@yandex.ru]
> Sent: Monday, April 25, 2011 5:21 AM
> To: user@cassandra.apache.org
> Subject: Re: OOM on heavy write load
>
> I assume if I turn off swap it will just die earlier, no ? What is the mechanism of dying ?
>
> From the link you provided
>
> # Row cache is too large, or is caching large rows
> my row_cache is 0
>
> # The memtable sizes are too large for the amount of heap allocated to the JVM
> Is my memtable size too large ? I have made it less to surely fit the "magical formula"
>
> Trying to analyze heap dumps gives me the following:
>
> In one case diagram has 3 Memtables about 64 Mb each + 72 Mb "Thread" + 700 Mb "Unreachable objects"
>
> suspected threats:
> 7 instances of "org.apache.cassandra.db.Memtable", loaded by "sun.misc.Launcher$AppClassLoader @ 0x7f29f4992d68" occupy 456,292,912 (48.36%) bytes.
> 25,211 instances of "org.apache.cassandra.io.sstable.SSTableReader", loaded by "sun.misc.Launcher$AppClassLoader @ 0x7f29f4992d68" occupy 294,908,984 (31.26%) bytes.
> 72 instances of "java.lang.Thread", loaded by "<system class loader>" occupy 143,632,624 (15.22%) bytes.
>
> In other cases memory analyzer hangs trying to parse 2Gb dump
>
> 22.04.2011, 17:26, "Jonathan Ellis" :
>> (0) turn off swap
>> (1) http://www.datastax.com/docs/0.7/troubleshooting/index#nodes-are-dying-with-oom-errors
>>
>> On Fri, Apr 22, 2011 at 8:00 AM, Nikolay Kоvshov wrote:
>>> I am using Cassandra 0.7.0