Cassandra 1.0 hangs during GC

2012-07-17 Thread Nikolay Kоvshov

This is a cluster of 2 nodes, each with 8 GB of RAM, and replication_factor = 2.
Write/read load is quite low and almost never exceeds 10 requests/second.

From time to time (2-3 times a month) I see GC activity in the logs, and during 
this time Cassandra stops responding to requests, which results in timeouts in 
the upper-layer application. The total time of unavailability can exceed 5 
minutes (as in the following case).

What can I do about this? Will it become much worse as my cluster grows?

INFO [GossipTasks:1] 2012-07-16 13:10:50,055 Gossiper.java (line 736) 
InetAddress /10.220.50.9 is now dead.
 INFO [ScheduledTasks:1] 2012-07-16 13:10:50,056 GCInspector.java (line 123) GC 
for ParNew: 391383 ms for 1 collections, 2025808488 used; max is 8464105472
 INFO [ScheduledTasks:1] 2012-07-16 13:10:50,067 StatusLogger.java (line 50) 
Pool Name                    Active   Pending   Blocked
 INFO [ScheduledTasks:1] 2012-07-16 13:10:50,071 StatusLogger.java (line 65) 
ReadStage                         0         0         0
 INFO [ScheduledTasks:1] 2012-07-16 13:10:50,071 StatusLogger.java (line 65) 
RequestResponseStage              0         0         0
 INFO [ScheduledTasks:1] 2012-07-16 13:10:50,072 StatusLogger.java (line 65) 
ReadRepairStage                   0         0         0
 INFO [ScheduledTasks:1] 2012-07-16 13:10:50,072 StatusLogger.java (line 65) 
MutationStage                     0         0         0
 INFO [ScheduledTasks:1] 2012-07-16 13:10:50,073 StatusLogger.java (line 65) 
ReplicateOnWriteStage             0         0         0
 INFO [GossipStage:1] 2012-07-16 13:10:50,074 Gossiper.java (line 722) 
InetAddress /10.220.50.9 is now UP
 INFO [ScheduledTasks:1] 2012-07-16 13:10:50,075 StatusLogger.java (line 65) 
GossipStage                       1        59         0
 INFO [ScheduledTasks:1] 2012-07-16 13:10:50,075 StatusLogger.java (line 65) 
AntiEntropyStage                  0         0         0
 INFO [ScheduledTasks:1] 2012-07-16 13:10:50,076 StatusLogger.java (line 65) 
MigrationStage                    0         0         0
 INFO [ScheduledTasks:1] 2012-07-16 13:10:50,076 StatusLogger.java (line 65) 
StreamStage                       0         0         0
 INFO [ScheduledTasks:1] 2012-07-16 13:10:50,077 StatusLogger.java (line 65) 
MemtablePostFlusher               0         0         0
 INFO [ScheduledTasks:1] 2012-07-16 13:10:50,077 StatusLogger.java (line 65) 
FlushWriter                       0         0         0
 INFO [ScheduledTasks:1] 2012-07-16 13:10:50,077 StatusLogger.java (line 65) 
MiscStage                         0         0         0
 INFO [ScheduledTasks:1] 2012-07-16 13:10:50,078 StatusLogger.java (line 65) 
InternalResponseStage             0         0         0
 INFO [ScheduledTasks:1] 2012-07-16 13:10:50,078 StatusLogger.java (line 65) 
HintedHandoff                     0         0         0
 INFO [ScheduledTasks:1] 2012-07-16 13:10:50,079 StatusLogger.java (line 69) 
CompactionManager               n/a         0
 INFO [ScheduledTasks:1] 2012-07-16 13:10:50,079 StatusLogger.java (line 81) 
MessagingService                n/a       0,0
 INFO [ScheduledTasks:1] 2012-07-16 13:10:50,080 StatusLogger.java (line 85) 
ColumnFamily                Memtable ops,data  Row cache size/cap  Key cache 
size/cap
 INFO [ScheduledTasks:1] 2012-07-16 13:10:50,080 StatusLogger.java (line 88) 
Keyspace1.PSS                 36712,343842617                 0/0       
97995/100
 INFO [ScheduledTasks:1] 2012-07-16 13:10:50,085 StatusLogger.java (line 88) 
Keyspace1.Standard1          128679,162567721                 0/0           
0/100
 INFO [ScheduledTasks:1] 2012-07-16 13:10:50,085 StatusLogger.java (line 88) 
system.NodeIdInfo                         0,0                 0/0               
  0/1
 INFO [ScheduledTasks:1] 2012-07-16 13:10:50,086 StatusLogger.java (line 88) 
system.IndexInfo                          0,0                 0/0               
  0/1
 INFO [ScheduledTasks:1] 2012-07-16 13:10:50,086 StatusLogger.java (line 88) 
system.LocationInfo                       0,0                 0/0               
  1/2
 INFO [ScheduledTasks:1] 2012-07-16 13:10:50,086 StatusLogger.java (line 88) 
system.Migrations                         0,0                 0/0               
  3/3
 INFO [ScheduledTasks:1] 2012-07-16 13:10:50,087 StatusLogger.java (line 88) 
system.HintsColumnFamily                1,528                 0/0               
  0/1
 INFO [ScheduledTasks:1] 2012-07-16 13:10:50,087 StatusLogger.java (line 88) 
system.Schema                             0,0                 0/0               
  2/3


Re: Cassandra 1.0 hangs during GC

2012-07-23 Thread Nikolay Kоvshov
On the 21st I migrated to Cassandra 1.1.2 but see no improvement.

cat /var/log/cassandra/Earth1.log | grep "GC for"
INFO [ScheduledTasks:1] 2012-05-22 17:42:48,445 GCInspector.java (line 123) GC 
for ParNew: 345 ms for 1 collections, 82451888 used; max is 8464105472
INFO [ScheduledTasks:1] 2012-05-23 02:47:13,911 GCInspector.java (line 123) GC 
for ParNew: 312 ms for 1 collections, 110617416 used; max is 8464105472
INFO [ScheduledTasks:1] 2012-05-23 11:57:54,317 GCInspector.java (line 123) GC 
for ParNew: 298 ms for 1 collections, 98161920 used; max is 8464105472
INFO [ScheduledTasks:1] 2012-07-02 08:52:37,019 GCInspector.java (line 123) GC 
for ParNew: 196886 ms for 1 collections, 2310058496 used; max is 8464105472
INFO [ScheduledTasks:1] 2012-07-16 17:41:25,940 GCInspector.java (line 123) GC 
for ParNew: 200146 ms for 1 collections, 2345987088 used; max is 8464105472
=== Migrated from 1.0.0 to 1.1.2
INFO [ScheduledTasks:1] 2012-07-21 09:05:08,280 GCInspector.java (line 122) GC 
for ParNew: 282 ms for 1 collections, 466406864 used; max is 8464105472
INFO [ScheduledTasks:1] 2012-07-21 12:38:43,132 GCInspector.java (line 122) GC 
for ParNew: 233 ms for 1 collections, 405269504 used; max is 8464105472
INFO [ScheduledTasks:1] 2012-07-22 02:29:09,596 GCInspector.java (line 122) GC 
for ParNew: 253 ms for 1 collections, 389700768 used; max is 8464105472
INFO [ScheduledTasks:1] 2012-07-22 17:45:46,357 GCInspector.java (line 122) GC 
for ParNew: 57391 ms for 1 collections, 400083984 used; max is 8464105472
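The `grep "GC for"` filter above can be extended with a short script that pulls 
the pause duration out of each GCInspector line and flags anything above a 
threshold. A minimal sketch, assuming the log format matches the excerpts shown:

```python
import re

# Matches GCInspector lines such as:
#   INFO [ScheduledTasks:1] 2012-07-02 08:52:37,019 GCInspector.java (line 123)
#   GC for ParNew: 196886 ms for 1 collections, ...
GC_LINE = re.compile(r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}).*GC for (\w+): (\d+) ms")

def long_pauses(lines, threshold_ms=10000):
    """Yield (timestamp, collector, pause_ms) for pauses over the threshold."""
    for line in lines:
        m = GC_LINE.search(line)
        if m and int(m.group(3)) >= threshold_ms:
            yield m.group(1), m.group(2), int(m.group(3))

# Example: long_pauses(open("/var/log/cassandra/Earth1.log"))
```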

Memory and yaml memory-related settings are default 
I do not do deletes
I have 2 CF's and no secondary indexes

LiveRatio's:
 INFO [pool-1-thread-1] 2012-06-09 02:36:07,759 Memtable.java (line 177) 
CFS(Keyspace='Keyspace1', ColumnFamily='PSS') liveRatio is 1.0 (just-counted 
was 1.0).  calculation took 85ms for 6236 columns
 INFO [MemoryMeter:1] 2012-07-21 09:04:47,614 Memtable.java (line 213) 
CFS(Keyspace='Keyspace1', ColumnFamily='Standard1') liveRatio is 1.0 
(just-counted was 1.0).  calculation took 8ms for 1 columns
 INFO [MemoryMeter:1] 2012-07-21 09:04:51,012 Memtable.java (line 213) 
CFS(Keyspace='Keyspace1', ColumnFamily='PSS') liveRatio is 1.0 (just-counted 
was 1.0).  calculation took 99ms for 1094 columns
 INFO [MemoryMeter:1] 2012-07-21 09:04:51,331 Memtable.java (line 213) 
CFS(Keyspace='Keyspace1', ColumnFamily='Standard1') liveRatio is 1.0 
(just-counted was 1.0).  calculation took 80ms for 242 columns
 INFO [MemoryMeter:1] 2012-07-21 09:04:51,856 Memtable.java (line 213) 
CFS(Keyspace='Keyspace1', ColumnFamily='PSS') liveRatio is 1.0 (just-counted 
was 1.0).  calculation took 505ms for 2678 columns
 INFO [MemoryMeter:1] 2012-07-21 09:04:52,881 Memtable.java (line 213) 
CFS(Keyspace='Keyspace1', ColumnFamily='PSS') liveRatio is 1.0 (just-counted 
was 1.0).  calculation took 776ms for 5236 columns
 INFO [MemoryMeter:1] 2012-07-21 09:04:52,945 Memtable.java (line 213) 
CFS(Keyspace='Keyspace1', ColumnFamily='Standard1') liveRatio is 1.0 
(just-counted was 1.0).  calculation took 64ms for 389 columns
 INFO [MemoryMeter:1] 2012-07-21 09:04:55,162 Memtable.java (line 213) 
CFS(Keyspace='Keyspace1', ColumnFamily='PSS') liveRatio is 1.0 (just-counted 
was 1.0).  calculation took 1378ms for 8948 columns
 INFO [MemoryMeter:1] 2012-07-21 09:04:55,304 Memtable.java (line 213) 
CFS(Keyspace='Keyspace1', ColumnFamily='Standard1') liveRatio is 1.0 
(just-counted was 1.0).  calculation took 140ms for 1082 columns
 INFO [MemoryMeter:1] 2012-07-21 09:05:08,439 Memtable.java (line 213) 
CFS(Keyspace='Keyspace1', ColumnFamily='Standard1') liveRatio is 
2.5763038186160894 (just-counted was 2.5763038186160894).  calculation took 
8796ms for 102193 columns

18.07.2012, 07:51, "aaron morton" :
> Assuming all the memory and yaml settings default that does not sound 
> right. The first thought would be the memory meter not counting correctly...
> Do you do a lot of deletes ?
> Do you have a lot of CF's and/or secondary indexes ?
> Can you see log lines about the "liveRatio" for your cf's ?
> I would upgrade to 1.0.10 before getting too carried away though.
> Cheers
>
> -
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 17/07/2012, at 8:14 PM, Nikolay Kоvshov wrote:
>
>> This is a cluster of 2 nodes, each having 8G of operating memory, 
>> replicationfactor=2
>> Write/read pressure is quite low and almost never exceeds 10/second
>>
>> From time to time (2-3 times in a month) I see GC activity in logs and for 
>> this time cassandra stops responding to requests which results in a timeout 
>> in upper-layer applicat

nodetool move causes summary load to grow

2012-07-23 Thread Nikolay Kоvshov
I have a testing cluster cassandra 1.1.2 with default memory and cache 
settings, 1 CF, 1 KS, RF = 2

This is an empty cluster

10.111.1.141  datacenter1  rack1  Up  Normal  43.04 KB  100.00%  0
10.111.1.142  datacenter1  rack1  Up  Normal  26.95 KB  100.00%  85070591730234615865843651857942052864

Then I fill it with 200 MB of random data

10.111.1.141  datacenter1  rack1  Up  Normal  153.34 MB  100.00%  0
10.111.1.142  datacenter1  rack1  Up  Normal  76.94 MB   100.00%  85070591730234615865843651857942052864

The second token is what the Python token-generation script calculates for an 
even token distribution, and since the keys were random, this load split still 
looks reasonable.

Now I start moving nodes. I want them to swap places:

nodetool -h d-st-n2 move 1
nodetool -h d-st-n2 cleanup
Here I expect the second node to have a load of almost 0, but this does not 
happen:

10.111.1.141  datacenter1  rack1  Up  Normal  195.53 MB  100.00%  0
10.111.1.142  datacenter1  rack1  Up  Normal  249.82 MB  100.00%  1

nodetool -h d-st-n1 move 85070591730234615865843651857942052864
nodetool -h d-st-n1 cleanup

The nodes did not change places as I expected, and the data load keeps growing:

10.111.1.142  datacenter1  rack1  Up  Normal  271.2 MB   100.00%  1
10.111.1.141  datacenter1  rack1  Up  Normal  195.53 MB  100.00%  85070591730234615865843651857942052864

nodetool -h d-st-n1 move z
nodetool -h d-st-n2 move a
nodetool -h d-st-n1 move 0
nodetool -h d-st-n2 move 85070591730234615865843651857942052864
nodetool -h d-st-n1 cleanup && nodetool -h d-st-n2 cleanup

Here I arrive back at the original token layout, but the load is now much bigger:

10.111.1.141  datacenter1  rack1  Up  Normal  353.09 MB  100.00%  0
10.111.1.142  datacenter1  rack1  Up  Normal  195.58 MB  100.00%  85070591730234615865843651857942052864

Could you help me understand the effect of nodetool move ? I see no logic in 
these results.
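For reference, the balanced tokens produced by the Python script mentioned 
earlier come from evenly dividing the RandomPartitioner's 2^127 token space. 
A minimal sketch (not the original script, just the standard calculation):

```python
def balanced_tokens(node_count, ring_size=2**127):
    """Evenly spaced initial tokens for a RandomPartitioner ring."""
    return [i * ring_size // node_count for i in range(node_count)]

# For 2 nodes this yields 0 and 2**126, which matches the second
# token shown in the ring output above.
```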


Re: Cassandra 1.0 hangs during GC

2012-07-24 Thread Nikolay Kоvshov
There are 48 GB of RAM on that machine, and swap is not used. I will disable swap 
entirely just in case.
I have 4 Cassandra processes (parts of 4 different clusters), each allocated 8 
GB and using 4 of them.

>java -version
java version "1.7.0"
Java(TM) SE Runtime Environment (build 1.7.0-b147)
Java HotSpot(TM) 64-Bit Server VM (build 21.0-b17, mixed mode)




23.07.2012, 20:12, "Joost van de Wijgerd" :
> How much memory do you have on the machine? It seems you have 8G
> reserved for the Cassandra java process; if this is all the memory on
> the machine you might be swapping. Also, which JVM do you use?
>
> kind regards
>
> Joost
>
> On Mon, Jul 23, 2012 at 10:07 AM, Nikolay Kоvshov  wrote:
>
>>   21th I have mirgated to cassandra 1.1.2 but see no improvement
>>
>>  cat /var/log/cassandra/Earth1.log | grep "GC for"
>>  INFO [ScheduledTasks:1] 2012-05-22 17:42:48,445 GCInspector.java (line 123) 
>> GC for ParNew: 345 ms for 1 collections, 82451888 used; max is 8464105472
>>  INFO [ScheduledTasks:1] 2012-05-23 02:47:13,911 GCInspector.java (line 123) 
>> GC for ParNew: 312 ms for 1 collections, 110617416 used; max is 8464105472
>>  INFO [ScheduledTasks:1] 2012-05-23 11:57:54,317 GCInspector.java (line 123) 
>> GC for ParNew: 298 ms for 1 collections, 98161920 used; max is 8464105472
>>  INFO [ScheduledTasks:1] 2012-07-02 08:52:37,019 GCInspector.java (line 123) 
>> GC for ParNew: 196886 ms for 1 collections, 2310058496 used; max is 
>> 8464105472
>>  INFO [ScheduledTasks:1] 2012-07-16 17:41:25,940 GCInspector.java (line 123) 
>> GC for ParNew: 200146 ms for 1 collections, 2345987088 used; max is 
>> 8464105472
>>  === Migrated from 1.0.0 to 1.1.2
>>  INFO [ScheduledTasks:1] 2012-07-21 09:05:08,280 GCInspector.java (line 122) 
>> GC for ParNew: 282 ms for 1 collections, 466406864 used; max is 8464105472
>>  INFO [ScheduledTasks:1] 2012-07-21 12:38:43,132 GCInspector.java (line 122) 
>> GC for ParNew: 233 ms for 1 collections, 405269504 used; max is 8464105472
>>  INFO [ScheduledTasks:1] 2012-07-22 02:29:09,596 GCInspector.java (line 122) 
>> GC for ParNew: 253 ms for 1 collections, 389700768 used; max is 8464105472
>>  INFO [ScheduledTasks:1] 2012-07-22 17:45:46,357 GCInspector.java (line 122) 
>> GC for ParNew: 57391 ms for 1 collections, 400083984 used; max is 8464105472
>>
>>  Memory and yaml memory-related settings are default
>>  I do not do deletes
>>  I have 2 CF's and no secondary indexes
>>
>>  LiveRatio's:
>>   INFO [pool-1-thread-1] 2012-06-09 02:36:07,759 Memtable.java (line 177) 
>> CFS(Keyspace='Keyspace1', ColumnFamily='PSS') liveRatio is 1.0 (just-counted 
>> was 1.0).  calculation took 85ms for 6236 columns
>>   INFO [MemoryMeter:1] 2012-07-21 09:04:47,614 Memtable.java (line 213) 
>> CFS(Keyspace='Keyspace1', ColumnFamily='Standard1') liveRatio is 1.0 
>> (just-counted was 1.0).  calculation took 8ms for 1 columns
>>   INFO [MemoryMeter:1] 2012-07-21 09:04:51,012 Memtable.java (line 213) 
>> CFS(Keyspace='Keyspace1', ColumnFamily='PSS') liveRatio is 1.0 (just-counted 
>> was 1.0).  calculation took 99ms for 1094 columns
>>   INFO [MemoryMeter:1] 2012-07-21 09:04:51,331 Memtable.java (line 213) 
>> CFS(Keyspace='Keyspace1', ColumnFamily='Standard1') liveRatio is 1.0 
>> (just-counted was 1.0).  calculation took 80ms for 242 columns
>>   INFO [MemoryMeter:1] 2012-07-21 09:04:51,856 Memtable.java (line 213) 
>> CFS(Keyspace='Keyspace1', ColumnFamily='PSS') liveRatio is 1.0 (just-counted 
>> was 1.0).  calculation took 505ms for 2678 columns
>>   INFO [MemoryMeter:1] 2012-07-21 09:04:52,881 Memtable.java (line 213) 
>> CFS(Keyspace='Keyspace1', ColumnFamily='PSS') liveRatio is 1.0 (just-counted 
>> was 1.0).  calculation took 776ms for 5236 columns
>>   INFO [MemoryMeter:1] 2012-07-21 09:04:52,945 Memtable.java (line 213) 
>> CFS(Keyspace='Keyspace1', ColumnFamily='Standard1') liveRatio is 1.0 
>> (just-counted was 1.0).  calculation took 64ms for 389 columns
>>   INFO [MemoryMeter:1] 2012-07-21 09:04:55,162 Memtable.java (line 213) 
>> CFS(Keyspace='Keyspace1', ColumnFamily='PSS') liveRatio is 1.0 (just-counted 
>> was 1.0).  calculation took 1378ms for 8948 columns
>>   INFO [MemoryMeter:1] 2012-07-21 09:04:55,304 Memtable.java (line 213) 
>> CFS(Keyspace='Keyspace1', ColumnFamily='Standard1') liveRatio is 1.0 
>> (just-counted was 1.0).  calculation took 140ms for 1082 columns
>>   INFO [MemoryMeter:1] 2012-07-21 09:

Re: Cassandra 1.0 hangs during GC

2012-07-24 Thread Nikolay Kоvshov

I started running sar only recently, after your advice, and have not seen any 
huge GCs on that server since.

At 08:14 there was a GC lasting 4.5 seconds; that's not five minutes, of course, 
but still quite an unpleasant value.

I'm still waiting for big GC pauses and will provide the corresponding sar logs.

07:25:01 PM  pgpgin/s  pgpgout/s  fault/s  majflt/s  pgfree/s  pgscank/s  pgscand/s  pgsteal/s  %vmeff
07:35:01 PM  0.00      4.34       9.02     0.00      13.61     0.00       0.00       0.00       0.00
07:45:01 PM  0.00      5.17       20.47    0.00      25.77     0.00       0.00       0.00       0.00
07:55:01 PM  0.00      4.66       8.63     0.00      18.69     0.00       0.00       0.00       0.00
08:05:01 PM  0.00      8.11       8.84     0.00      14.37     0.00       0.00       0.00       0.00
08:15:01 PM  0.00      5.19       21.65    0.00      25.94     0.00       0.00       0.00       0.00


24.07.2012, 10:22, "Wojciech Meler" :
> Can you provide output from sar command for the time period when long
> GC occurred ?
>
> Regards,
> Wojciech Meler


Fwd: Cassandra 1.0 hangs during GC

2012-07-26 Thread Nikolay Kоvshov
Sun Java 6 didn't help at all.

Sar shows no special activity during the long GC times.

I also remembered that I do a lot of get_range_slices requests, each over a 
range of 100 rows, in case that could be important.

Still, my GC times are growing incredibly long:

 INFO [ScheduledTasks:1] 2012-07-25 12:17:04,029 GCInspector.java (line 122) GC 
for ParNew: 497253 ms for 1 collections, 114036128 used; max is 8464105472
 INFO [ScheduledTasks:1] 2012-07-25 13:26:21,282 GCInspector.java (line 122) GC 
for ParNew: 149508 ms for 1 collections, 87015440 used; max is 8464105472
 INFO [ScheduledTasks:1] 2012-07-25 18:39:44,654 GCInspector.java (line 122) GC 
for ParNew: 83584 ms for 1 collections, 54368032 used; max is 8464105472
 INFO [ScheduledTasks:1] 2012-07-25 22:45:42,448 GCInspector.java (line 122) GC 
for ParNew: 48454 ms for 1 collections, 44072488 used; max is 8464105472
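The get_range_slices calls mentioned above are typically paged by reusing the 
last key of one batch as the start key of the next; since ranges are 
start-inclusive, the repeated row is skipped. A minimal sketch of that 
client-side loop, where `fetch_page` is a hypothetical stand-in for the real 
Thrift call (and the sketch assumes ordered keys for clarity, whereas 
RandomPartitioner returns rows in token order):

```python
def iterate_all_rows(fetch_page, page_size=100):
    """Page through all rows the way repeated get_range_slices calls do.

    fetch_page(start_key, count) is a hypothetical stand-in for the real
    Thrift get_range_slices call: it returns up to `count` (key, row)
    pairs starting at start_key inclusive.
    """
    start = ''
    while True:
        batch = fetch_page(start, page_size)
        if start:
            # Every batch after the first repeats the previous batch's
            # last key, because ranges are start-inclusive; skip it.
            batch = batch[1:]
        if not batch:
            return
        for key, row in batch:
            yield key, row
        start = key  # last key seen becomes the next start key
```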

24.07.2012, 13:07, "Joost van de Wijgerd" :

>  You are better off using Sun Java 6 to run Cassandra. In the past
>  there were issues reported on 7. Can you try running it on Sun Java 6?
>
>  kind regards
>
>  Joost
>
>  On Tue, Jul 24, 2012 at 10:04 AM, Nikolay Kоvshov  wrote:
>>   48 G of Ram on that machine, swap is not used. I will disable swap at all 
>> just in case
>>   I have 4 cassandra processes (parts of 4 different clusters), each 
>> allocated 8 GB and using 4 of them
>>>  java -version
>>   java version "1.7.0"
>>   Java(TM) SE Runtime Environment (build 1.7.0-b147)
>>   Java HotSpot(TM) 64-Bit Server VM (build 21.0-b17, mixed mode)
>>
>>   23.07.2012, 20:12, "Joost van de Wijgerd" :
>>>   Howmuch memory do you have on the machine. Seems like you have 8G
>>>   reserved for the Cassandra java process, If this is all the memory on
>>>   the machine you might be swapping. Also which jvm do you use?
>>>
>>>   kind regards
>>>
>>>   Joost
>>>
>>>   On Mon, Jul 23, 2012 at 10:07 AM, Nikolay Kоvshov  
>>> wrote:
>>>> 21th I have mirgated to cassandra 1.1.2 but see no improvement
>>>>
>>>>    cat /var/log/cassandra/Earth1.log | grep "GC for"
>>>>    INFO [ScheduledTasks:1] 2012-05-22 17:42:48,445 GCInspector.java (line 
>>>> 123) GC for ParNew: 345 ms for 1 collections, 82451888 used; max is 
>>>> 8464105472
>>>>    INFO [ScheduledTasks:1] 2012-05-23 02:47:13,911 GCInspector.java (line 
>>>> 123) GC for ParNew: 312 ms for 1 collections, 110617416 used; max is 
>>>> 8464105472
>>>>    INFO [ScheduledTasks:1] 2012-05-23 11:57:54,317 GCInspector.java (line 
>>>> 123) GC for ParNew: 298 ms for 1 collections, 98161920 used; max is 
>>>> 8464105472
>>>>    INFO [ScheduledTasks:1] 2012-07-02 08:52:37,019 GCInspector.java (line 
>>>> 123) GC for ParNew: 196886 ms for 1 collections, 2310058496 used; max is 
>>>> 8464105472
>>>>    INFO [ScheduledTasks:1] 2012-07-16 17:41:25,940 GCInspector.java (line 
>>>> 123) GC for ParNew: 200146 ms for 1 collections, 2345987088 used; max is 
>>>> 8464105472
>>>>    === Migrated from 1.0.0 to 1.1.2
>>>>    INFO [ScheduledTasks:1] 2012-07-21 09:05:08,280 GCInspector.java (line 
>>>> 122) GC for ParNew: 282 ms for 1 collections, 466406864 used; max is 
>>>> 8464105472
>>>>    INFO [ScheduledTasks:1] 2012-07-21 12:38:43,132 GCInspector.java (line 
>>>> 122) GC for ParNew: 233 ms for 1 collections, 405269504 used; max is 
>>>> 8464105472
>>>>    INFO [ScheduledTasks:1] 2012-07-22 02:29:09,596 GCInspector.java (line 
>>>> 122) GC for ParNew: 253 ms for 1 collections, 389700768 used; max is 
>>>> 8464105472
>>>>    INFO [ScheduledTasks:1] 2012-07-22 17:45:46,357 GCInspector.java (line 
>>>> 122) GC for ParNew: 57391 ms for 1 collections, 400083984 used; max is 
>>>> 8464105472
>>>>
>>>>    Memory and yaml memory-related settings are default
>>>>    I do not do deletes
>>>>    I have 2 CF's and no secondary indexes
>>>>
>>>>    LiveRatio's:
>>>> INFO [pool-1-thread-1] 2012-06-09 02:36:07,759 Memtable.java (line 
>>>> 177) CFS(Keyspace='Keyspace1', ColumnFamily='PSS') liveRatio is 1.0 
>>>> (just-counted was 1.0).  calculation took 85ms for 6236 columns
>>>> INFO [MemoryMeter:1] 2012-07-21 09:04:47,614 Memtable.java (line 213) 
>>>> CFS(Keyspace='Keyspace1', ColumnFamily='Standard1') liveRatio is 1.0 
>>>> (just-counted was 1.0).  calc


Re: Cassandra 1.0 hangs during GC

2012-07-30 Thread Nikolay Kоvshov
There is one peculiar thing about this problem

When I copy the cluster to 2 other servers (much weaker in terms of CPU and RAM) 
and run Cassandra there, I see no massive CPU load and no huge GC times. So I 
suspect it has something to do with the hardware or software installed on the 
machines. I plan to check software and hardware step by step to find the 
difference between servers that causes big problems on one of the (otherwise 
identical) clusters.

What I plan to compare between 'bad' cluster and 'good' cluster:

- Configs, schemas, data etc: same
- java version : same
- RAM and CPU : 'bad' cluster has more
- Ubuntu version
- Networking
- What else??? 







26.07.2012, 10:25, "Nikolay Kоvshov" :
> Sun Java 6 didn't help it at all
>
> Sar shows no special activity on the long GC times
>
> And I have remembered that I do a lot of get_range_slices requests, each time 
> on a range of 100, if this could be important.
>
> Still my GC times are growing incredible
>
>  INFO [ScheduledTasks:1] 2012-07-25 12:17:04,029 GCInspector.java (line 122) 
> GC for ParNew: 497253 ms for 1 collections, 114036128 used; max is 8464105472
>  INFO [ScheduledTasks:1] 2012-07-25 13:26:21,282 GCInspector.java (line 122) 
> GC for ParNew: 149508 ms for 1 collections, 87015440 used; max is 8464105472
>  INFO [ScheduledTasks:1] 2012-07-25 18:39:44,654 GCInspector.java (line 122) 
> GC for ParNew: 83584 ms for 1 collections, 54368032 used; max is 8464105472
>  INFO [ScheduledTasks:1] 2012-07-25 22:45:42,448 GCInspector.java (line 122) 
> GC for ParNew: 48454 ms for 1 collections, 44072488 used; max is 8464105472
>
> 24.07.2012, 13:07, "Joost van de Wijgerd" :
>
>>  You are better off using Sun Java 6 to run Cassandra. In the past
>>  there were issues reported on 7. Can you try running it on Sun Java 6?
>>
>>  kind regards
>>
>>  Joost
>>
>>  On Tue, Jul 24, 2012 at 10:04 AM, Nikolay Kоvshov  
>> wrote:
>>>   48 G of Ram on that machine, swap is not used. I will disable swap at all 
>>> just in case
>>>   I have 4 cassandra processes (parts of 4 different clusters), each 
>>> allocated 8 GB and using 4 of them
>>>>  java -version
>>>   java version "1.7.0"
>>>   Java(TM) SE Runtime Environment (build 1.7.0-b147)
>>>   Java HotSpot(TM) 64-Bit Server VM (build 21.0-b17, mixed mode)
>>>
>>>   23.07.2012, 20:12, "Joost van de Wijgerd" :
>>>>   Howmuch memory do you have on the machine. Seems like you have 8G
>>>>   reserved for the Cassandra java process, If this is all the memory on
>>>>   the machine you might be swapping. Also which jvm do you use?
>>>>
>>>>   kind regards
>>>>
>>>>   Joost
>>>>
>>>>   On Mon, Jul 23, 2012 at 10:07 AM, Nikolay Kоvshov  
>>>> wrote:
>>>>> 21th I have mirgated to cassandra 1.1.2 but see no improvement
>>>>>
>>>>>    cat /var/log/cassandra/Earth1.log | grep "GC for"
>>>>>    INFO [ScheduledTasks:1] 2012-05-22 17:42:48,445 GCInspector.java (line 
>>>>> 123) GC for ParNew: 345 ms for 1 collections, 82451888 used; max is 
>>>>> 8464105472
>>>>>    INFO [ScheduledTasks:1] 2012-05-23 02:47:13,911 GCInspector.java (line 
>>>>> 123) GC for ParNew: 312 ms for 1 collections, 110617416 used; max is 
>>>>> 8464105472
>>>>>    INFO [ScheduledTasks:1] 2012-05-23 11:57:54,317 GCInspector.java (line 
>>>>> 123) GC for ParNew: 298 ms for 1 collections, 98161920 used; max is 
>>>>> 8464105472
>>>>>    INFO [ScheduledTasks:1] 2012-07-02 08:52:37,019 GCInspector.java (line 
>>>>> 123) GC for ParNew: 196886 ms for 1 collections, 2310058496 used; max is 
>>>>> 8464105472
>>>>>    INFO [ScheduledTasks:1] 2012-07-16 17:41:25,940 GCInspector.java (line 
>>>>> 123) GC for ParNew: 200146 ms for 1 collections, 2345987088 used; max is 
>>>>> 8464105472
>>>>>    === Migrated from 1.0.0 to 1.1.2
>>>>>    INFO [ScheduledTasks:1] 2012-07-21 09:05:08,280 GCInspector.java (line 
>>>>> 122) GC for ParNew: 282 ms for 1 collections, 466406864 used; max is 
>>>>> 8464105472
>>>>>    INFO [ScheduledTasks:1] 2012-07-21 12:38:43,132 GCInspector.java (line 
>>>>> 122) GC for ParNew: 233 ms for 1 collections, 405269504 used; max is 
>>>>> 8464105472
>>>>>    INFO [ScheduledTasks:1] 2012-07-22 02:29:09,596 GCInspector.java (line

Re: Cassandra 1.0 hangs during GC

2012-07-30 Thread Nikolay Kоvshov

- JNA is not installed on either machine

30.07.2012, 14:44, "Mateusz Korniak" :
> On Monday 30 of July 2012, Nikolay Kоvshov wrote:
>
>>  What I plan to compare between 'bad' cluster and 'good' cluster:
>>
>>  - Configs, schemas, data etc: same
>>  - java version : same
>>  - RAM and CPU : 'bad' cluster has more
>>  - Ubuntu version
>>  - Networking
>>  - What else???
>
> - JNA. If it's working and swapping is disabled for cassandra binary.
> - IO subsystem. kernel, io schedulers, performance of IO, HDDs
> performance/errors.
>
> --
> Mateusz Korniak


Re: Cassandra 1.0 hangs during GC

2012-07-30 Thread Nikolay Kоvshov
You mean swap memory? I have a total of 48 GB of RAM, Cassandra never used more 
than 2 GB, and swap is disabled.

But as I have few other clues, I can give this a try. Are there any fresh 
instructions on running Cassandra with JNA?

30.07.2012, 16:01, "Mateusz Korniak" :
> On Monday 30 of July 2012, Nikolay Kоvshov wrote:
>
>>  -  JNA is not installed on both machines
>
> So your GC times may be strongly [1] affected by swapping.
> IIRC, also snapshotting is more expensive and may trigger more swapping.
> I would start with turning JNA mlockall on [2].
>
> [1]:
> Not sure if up to numbers you presented ( many seconds)...
>
> [2]:
> INFO [main] 2012-07-27 12:18:27,135 CLibrary.java (line 109) JNA mlockall
> successful
>
> --
> Mateusz Korniak


Re: Cassandra 1.0 hangs during GC

2012-08-01 Thread Nikolay Kоvshov
And the final solution:

http://unbxd.com/blog/2012/07/java-and-ksoftirqd-100-cpu-due-to-leap-second/

Running $ date -s "`date`" solved the problem; the hangs were caused by the July 
2012 Linux leap-second bug, not by Cassandra itself.


30.07.2012, 16:09, "Nikolay Kоvshov" :

>  You mean using swap memory? I have total of 48G of RAM and Cassandra never 
> used more than 2G, swap is disabled.
>
>  But as I have little clues, I can give this a try. Is there any fresh 
> instruction on running Cassandra with JNA ?
>
>  30.07.2012, 16:01, "Mateusz Korniak" :
>>   On Monday 30 of July 2012, Nikolay Kоvshov wrote:
>>>    -  JNA is not installed on both machines
>>   So your GC times may be strongly [1] affected by swapping.
>>   IIRC, also snapshotting is more expensive and may trigger more swapping.
>>   I would start with turning JNA mlockall on [2].
>>
>>   [1]:
>>   Not sure if up to numbers you presented ( many seconds)...
>>
>>   [2]:
>>   INFO [main] 2012-07-27 12:18:27,135 CLibrary.java (line 109) JNA mlockall
>>   successful
>>
>>   --
>>   Mateusz Korniak


How to bring cluster to consistency

2012-02-13 Thread Nikolay Kоvshov
Hello everybody

I have a very simple cluster containing 2 servers. Replication_factor = 2, 
Consistency_level of reads and writes = 1

10.111.1.141datacenter1 rack1   Up Normal  1.5 TB  100.00% 
vjpigMzv4KkX3x7z
10.111.1.142datacenter1 rack1   Up Normal  1.41 TB 0.00%   
聶jpigMzv4KkX3x7z

Note the size please.

Say, server1's Cassandra dies and I restart it later. hinted_handoff = enabled, 
auto_bootstrap = true.

During that time server2 received reads and writes. I want those changes to be 
copied to server1 when it rejoins the cluster. As I have replication_factor = 
2, I assume each piece of data should be stored on both servers. 
Auto-bootstrapping doesn't seem to work that way - the changed data doesn't 
migrate.
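The failure mode described above can be sketched as a toy model (plain Python with made-up names, not Cassandra code): with replication_factor = 2 but consistency_level = 1, a write succeeds as long as any single replica takes it, so a replica that was down misses updates and serves stale reads until something like repair copies the newer data over.

```python
# Toy model of RF=2 / CL=1: writes go to both replicas, but succeed
# even when only one is up, so a restarted replica can be stale.
class Replica:
    def __init__(self):
        self.data = {}
        self.up = True

def write_cl1(replicas, key, value):
    # RF=2: the write is sent to every replica, but CL=1 means it
    # succeeds as long as at least one live replica stores it.
    acks = 0
    for r in replicas:
        if r.up:
            r.data[key] = value
            acks += 1
    assert acks >= 1, "write failed: no live replica"

def read_cl1(replica, key):
    return replica.data.get(key)   # CL=1: ask a single replica only

a, b = Replica(), Replica()
write_cl1([a, b], "x", 1)
a.up = False                        # server1 dies
write_cl1([a, b], "x", 2)           # server2 keeps accepting writes
a.up = True                         # server1 restarts: nothing migrates
print(read_cl1(a, "x"), read_cl1(b, "x"))   # 1 2  (inconsistent)

a.data.update(b.data)               # crude stand-in for nodetool repair
print(read_cl1(a, "x"), read_cl1(b, "x"))   # 2 2  (consistent again)
```

The merge at the end is only a stand-in; real repair streams merkle-tree diffs between replicas, but the before/after states are the point.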

I run nodetool repair and it is always killed by OOM. What else can I do to 
bring the cluster to consistency?

Thank you in advance


Re: How to bring cluster to consistency

2012-02-13 Thread Nikolay Kоvshov
Sorry if this is the 4th copy of this letter, but cassandra.apache.org keeps 
telling me that my message looks like spam...

> 2/ both of your nodes seem to be using the same token? The output indicates 
> that 100% of your key range is assigned to 10.111.1.141 (and 
> therefore 10.111.1.142 holds replicas only)

Well, I didn't assign anything. I just filled the nodes with data; it was 
Cassandra itself that assigned the tokens. I am trying to perform nodetool move 
now. Still, I didn't understand from the wiki what that means (keys assigned to 
servers). When both servers are up I can write to 1 and read from 2, or write 
to 2 and read from 1, and everything works perfectly.

> 3/ maybe repair is being affected by above, but in my experience it can be 
> sensitive

I believe it is affected by the data size. At the least, some estimate of how 
much time and memory it could take would be useful.

13.02.2012, 17:19, "Dominic Williams" :
> Hi Nikolay,Some points that may be useful:
> 1/ auto_bootstrap = true is used for telling a new node to join the ring (the 
> cluster). It has nothing to do with hinted handoff
> 2/ both of your nodes seem to be using the same token? The output indicates 
> that 100% of your key range is assigned to 10.111.1.141 (and 
> therefore 10.111.1.142 holds replicas only)
> 3/ maybe repair is being affected by above, but in my experience it can be 
> sensitive
>
> On 13 February 2012 13:06, Nikolay Kоvshov  wrote:
>> Hello everybody
>>
>> I have a very simple cluster containing 2 servers. Replication_factor = 2, 
>> Consistency_level of reads and writes = 1
>>
>> 10.111.1.141    datacenter1 rack1       Up     Normal  1.5 TB          
>> 100.00% vjpigMzv4KkX3x7z
>> 10.111.1.142    datacenter1 rack1       Up     Normal  1.41 TB         0.00% 
>>   聶jpigMzv4KkX3x7z
>>
>> Note the size please.
>>
>> Say, server1 cassandra dies and I restart it later. Hinted_handoff = 
>> enabled, auto_bootstrap = true
>>
>> During that time server2 received reads and writes. I want changes to be 
>> copied to server1 when it joins the cluster. As I have replication_factor 2, 
>> I suppose each data piece should be stored on both servers. 
>> Auto_bootstrapping doesn't seem to work that way - changed data doesn't 
>> migrate.
>>
>> I run nodetool repair and it is always killed by OOM. What else can I do to 
>> bring cluster to consistency?
>>
>> Thank you in advance


OOM on heavy write load

2011-04-22 Thread Nikolay Kоvshov
I am using Cassandra 0.7.0 with the following settings

binary_memtable_throughput_in_mb: 64
in_memory_compaction_limit_in_mb: 64
keys_cached 1 million
rows_cached 0

RAM for Cassandra 2 GB

I run very simple test

1 Node with 4 HDDs (1 HDD - commitlog and caches, 3 HDDs - data)
1 KS => 1 CF => 1 Column

I insert data (random 64-byte key + 64-byte value) at the maximum possible 
speed, trying to saturate disk I/O, measuring throughput and making sure 
Cassandra stays alive. It doesn't, unfortunately.
After several hundred million inserts Cassandra always goes down with OOM. 
Bringing it up again doesn't help - after inserting some new data it goes down 
again. By this time Cassandra has gone into swap and has a lot of pending 
tasks. I am not inserting anything now and the tasks slowly disappear, but it 
will take weeks to work through all of them.

compaction type: Minor
column family: Standard1
bytes compacted: 3661003227
bytes total in progress: 4176296448
pending tasks: 630

So, what am I (or Cassandra) doing wrong? I don't want Cassandra to crash 
beyond repair under heavy write load.


Re: OOM on heavy write load

2011-04-25 Thread Nikolay Kоvshov

I assume that if I turn off swap it will just die earlier, no? What is the 
mechanism of dying?

From the link you provided

# Row cache is too large, or is caching large rows
my row_cache is 0

# The memtable sizes are too large for the amount of heap allocated to the JVM
Is my memtable size too large? I have made it smaller to be sure it fits the 
"magical formula".

Trying to analyze heap dumps gives me the following:

In one case the diagram shows 3 Memtables of about 64 MB each + 72 MB of 
"Thread" + 700 MB of "Unreachable objects"

leak suspects:
7 instances of "org.apache.cassandra.db.Memtable", loaded by 
"sun.misc.Launcher$AppClassLoader @ 0x7f29f4992d68" occupy 456,292,912 (48.36%) 
bytes.
25,211 instances of "org.apache.cassandra.io.sstable.SSTableReader", loaded by 
"sun.misc.Launcher$AppClassLoader @ 0x7f29f4992d68" occupy 294,908,984 (31.26%) 
bytes.
72 instances of "java.lang.Thread", loaded by "<system class loader>" occupy 
143,632,624 (15.22%) bytes. 


In other cases the memory analyzer hangs trying to parse the 2 GB dump

 

22.04.2011, 17:26, "Jonathan Ellis" :

>  (0) turn off swap
>  (1) 
> http://www.datastax.com/docs/0.7/troubleshooting/index#nodes-are-dying-with-oom-errors
>
>  On Fri, Apr 22, 2011 at 8:00 AM, Nikolay Kоvshov wrote:
>>   I am using Cassandra 0.7.0 with following settings
>>
>>   binary_memtable_throughput_in_mb: 64
>>   in_memory_compaction_limit_in_mb: 64
>>   keys_cached 1 million
>>   rows_cached 0
>>
>>   RAM for Cassandra 2 GB
>>
>>   I run very simple test
>>
>>   1 Node with 4 HDDs (1 HDD - commitlog and caches, 3 HDDs - data)
>>   1 KS => 1 CF => 1 Column
>>
>>   I insert data (random key 64 bytes + value 64 bytes) at a maximum possible 
>> speed, trying to hit disk i/o, calculate speed and make sure Cassandra stays 
>> alive. It doesn't, unfortunately.
>>   After several hundreds millions of inserts Cassandra always does down by 
>> OOM. Getting it up again doesn't help - after inserting some new data it 
>> goes down again. By this time Cassandra goes to swap and has a lot of tasks 
>> pending. I am not inserting anything now and tasks slwly disappear, but 
>> it will take her weeks to do all of them.
>>
>>   compaction type: Minor
>>   column family: Standard1
>>   bytes compacted: 3661003227
>>   bytes total in progress: 4176296448
>>   pending tasks: 630
>>
>>   So, what am I (or Cassandra) doing wrong ? I dont want to get Cassandra 
>> crashed without means of repair on heavy load circumstances.
>  --
>  Jonathan Ellis
>  Project Chair, Apache Cassandra
>  co-founder of DataStax, the source for professional Cassandra support
>  http://www.datastax.com


Re: OOM on heavy write load

2011-04-27 Thread Nikolay Kоvshov
I have configured quite low memory consumption (see my configuration in the 
first message) and give Cassandra 2.7 GB of memory. 
I cache 1M 64-byte keys + 64 MB of memtables. I believe the overhead can't be 
500% or so?

memtable_operations_in_millions = 0.3 (the default)

I now see very strange behaviour.

If I fill Cassandra with, say, 100 million rows of 64 B key + 64 B value, and 
after that start inserting 64 B key + 64 KB value, the compaction queue 
immediately grows to hundreds and Cassandra slows down dramatically (it manages 
around 30-50 operations/sec), then starts returning timeout errors.

But if I insert 64 B key + 64 KB value from the very beginning, Cassandra works 
fine and makes around 300 operations/sec even when the database exceeds 2-3 TB.

The behaviour is quite complex and I cannot predict the effect of my actions. 
What consequences will I face if I turn off compaction? Can I read somewhere 
about what compaction is and why it depends so heavily on what, and in which 
order, I write into Cassandra?

26.04.2011, 00:08, "Shu Zhang" :
> the way I measure actual memtable row sizes is this
>
> write X rows into a cassandra node
> trigger GC
> record heap usage
> trigger compaction and GC
> record heap savings and divide by X for actual cassandra memtable row size in 
> memory
>
> Similar process to measure per-key/per-row cache sizes for your data. To 
> understand your memtable row overhead size, you can do the above exercise 
> with very different data sizes.
>
> Also, you probably know this, but when setting your memory usage ceiling or 
> heap size, make sure to leave a few hundred MBs for GC.
> 
> From: Shu Zhang [szh...@mediosystems.com]
> Sent: Monday, April 25, 2011 12:55 PM
> To: user@cassandra.apache.org
> Subject: RE: OOM on heavy write load
>
> How large are your rows? binary_memtable_throughput_in_mb only tracks the 
> size of the data, but there is an overhead associated with each row on the 
> order of a few KB. If your row data sizes are really small, the overhead 
> dominates the memory usage and binary_memtable_throughput_in_mb ends up not 
> limiting your memory usage the way you'd expect. It's a good idea to specify 
> memtable_operations_in_millions in that case. If you're not sure how big your 
> data is compared to the memtable overhead, it's a good idea to specify both 
> parameters, to effectively put in a memory ceiling no matter which dominates 
> your actual memory usage.
>
> It could also be that your key cache is too big, you should measure your key 
> sizes and make sure you have enough memory to cache 1m keys (along with your 
> memtables). Finally if you have multiple keyspaces (for multiple 
> applications) on your cluster, they all share the total available heap, so 
> you have to account for that.
>
> There's no mechanism in Cassandra to guard against OOM; you must configure 
> nodes such that the maximum memory usage on each node - that is, the maximum 
> size all your caches and memtables can grow to - is less than your heap size.
> 
> From: Nikolay Kоvshov [nkovs...@yandex.ru]
> Sent: Monday, April 25, 2011 5:21 AM
> To: user@cassandra.apache.org
> Subject: Re: OOM on heavy write load
>
> I assume if I turn off swap it will just die earlier, no ? What is the 
> mechanism of dying ?
>
> From the link you provided
>
> # Row cache is too large, or is caching large rows
> my row_cache is 0
>
> # The memtable sizes are too large for the amount of heap allocated to the JVM
> Is my memtable size too large ? I have made it less to surely fit the 
> "magical formula"
>
> Trying to analyze heap dumps gives me the following:
>
> In one case diagram has 3 Memtables about 64 Mb each + 72 Mb "Thread" + 700 
> Mb "Unreachable objects"
>
> suspected threats:
> 7 instances of "org.apache.cassandra.db.Memtable", loaded by 
> "sun.misc.Launcher$AppClassLoader @ 0x7f29f4992d68" occupy 456,292,912 
> (48.36%) bytes.
> 25,211 instances of "org.apache.cassandra.io.sstable.SSTableReader", loaded 
> by "sun.misc.Launcher$AppClassLoader @ 0x7f29f4992d68" occupy 294,908,984 
> (31.26%) byte
> 72 instances of "java.lang.Thread", loaded by "" occupy 
> 143,632,624 (15.22%) bytes.
>
> In other cases memory analyzer hangs trying to parse 2Gb dump
>
> 22.04.2011, 17:26, "Jonathan Ellis" ;;:
>
>>   (0) turn off swap
>>   (1) 
>> http://www.datastax.com/docs/0.7/troubleshooting/index#nodes-are-dying-with-oom-errors
>>
>>   On Fri, Apr 22, 2011 at 8:00 AM, Nikolay Kоvshov wrote:
>>>    I am using Cassandra 0.7.0