It depends on a lot of things: schema size, caches, workload, etc. 

If you are just starting out I would recommend using a machine with 8GB or 
16GB of total RAM. By default Cassandra will take about 2GB or 4GB 
(respectively) of that for the JVM heap.
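
For reference, here is a sketch of how the startup script derives those 
defaults. The formula follows the comments in the 1.0.x-era cassandra-env.sh 
(heap = max(min(1/2 RAM, 1024MB), min(1/4 RAM, 8192MB)); young generation = 
min(100MB per core, 1/4 of the heap)); the exact constants can differ between 
releases, so check the comments in your own copy of the script:

```shell
# Sketch of the default heap calculation from cassandra-env.sh
# (1.0.x-era constants; verify against your own copy of the script).
system_memory_in_mb=16384   # hypothetical: a 16GB machine
system_cpu_cores=4          # hypothetical core count

# MAX_HEAP_SIZE = max(min(1/2 RAM, 1024MB), min(1/4 RAM, 8192MB))
half_ram=$(( system_memory_in_mb / 2 ))
quarter_ram=$(( system_memory_in_mb / 4 ))
capped_half=$(( half_ram < 1024 ? half_ram : 1024 ))
capped_quarter=$(( quarter_ram < 8192 ? quarter_ram : 8192 ))
max_heap_mb=$(( capped_half > capped_quarter ? capped_half : capped_quarter ))

# HEAP_NEWSIZE = min(100MB per core, 1/4 of the heap)
per_core_mb=$(( system_cpu_cores * 100 ))
quarter_heap_mb=$(( max_heap_mb / 4 ))
heap_newsize_mb=$(( per_core_mb < quarter_heap_mb ? per_core_mb : quarter_heap_mb ))

echo "MAX_HEAP_SIZE=${max_heap_mb}M HEAP_NEWSIZE=${heap_newsize_mb}M"
# prints: MAX_HEAP_SIZE=4096M HEAP_NEWSIZE=400M
```

Plugging in 3760MB and 1 core (the instances discussed below) gives roughly a 
1024MB heap and a 100MB young generation, which is small for a write-heavy 
load.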

Once you have a feel for how things work you should be able to estimate the 
resources your application will need. 

Hope that helps. 

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 13/04/2012, at 2:19 AM, Vasileios Vlachos wrote:

> Hello Aaron,
> 
> Thank you for getting back to me.
> 
> I will change to m1.large first to see how long it takes the Cassandra 
> node to die (if at all). If I am still not happy I will try more memory. I 
> just want to test it step by step and see what the differences are. I will 
> also change the cassandra-env file back to the defaults.
> 
> Is there an absolute minimum requirement for Cassandra in terms of memory? I 
> might be wrong, but from my understanding we shouldn't have any problems 
> given the amount of data we store per day (currently approximately 2-2.5G / 
> day).
> 
> Thank you in advance,
> 
> Bill
> 
> 
> On Wed, Apr 11, 2012 at 7:33 PM, aaron morton <aa...@thelastpickle.com> wrote:
>> 'system_memory_in_mb' (3760) and the 'system_cpu_cores' (1) according to our 
>> nodes' specification. We also changed the 'MAX_HEAP_SIZE' to 2G and the 
>> 'HEAP_NEWSIZE' to 200M (we think the second is related to the Garbage 
>> Collection). 
> It's best to leave the default settings unless you know what you are doing 
> here. 
> 
>> In case you find this useful, swap is off and unevictable memory seems to be 
>> very high on all 3 servers (2.3GB, we usually observe the amount of 
>> unevictable memory on other Linux servers of around 0-16KB)
> Cassandra locks the java memory so it cannot be swapped out. 
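
(You can see that locking directly on Linux; a quick check, assuming the 
Cassandra JVM can be found with pgrep:)

```shell
# System-wide unevictable (mlocked) memory; Cassandra's locked heap
# shows up here.
grep Unevictable /proc/meminfo

# Per-process view: VmLck reports how much memory the process has locked.
# The pgrep pattern is an assumption -- adjust it to match your setup.
pid=$(pgrep -f CassandraDaemon | head -n1)
if [ -n "$pid" ]; then
    grep VmLck "/proc/$pid/status"
else
    echo "no Cassandra process found"
fi
```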
> 
>> The problem is that the node we hit from our thrift interface dies regularly 
>> (approximately after we store 2-2.5G of data). Error message: 
>> OutOfMemoryError: Java Heap Space and according to the log it in fact used 
>> all of the allocated memory.
> The easiest solution will be to use a larger EC2 instance. 
> 
> People normally use an m1.xlarge with 16GB of RAM (you could also try an 
> m1.large).
> 
> If you are still experimenting I would suggest using the larger instances so 
> you can make some progress. Once you have a feel for how things work you can 
> then try to match the instances to your budget.
> 
> Hope that helps. 
> 
> -----------------
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
> 
> On 11/04/2012, at 1:54 AM, Vasileios Vlachos wrote:
> 
>> Hello,
>> 
>> We have been experimenting a bit with Cassandra lately (version 1.0.7) and 
>> we seem to have some problems with memory. We use EC2 as our test 
>> environment and we have three nodes with 3.7G of memory and 1 core @ 
>> 2.4GHz, all running Ubuntu Server 11.10. 
>> 
>> The problem is that the node we hit from our thrift interface dies regularly 
>> (approximately after we store 2-2.5G of data). Error message: 
>> OutOfMemoryError: Java Heap Space and according to the log it in fact used 
>> all of the allocated memory.
>> 
>> The nodes are under relatively constant load and store about 2000-4000 row 
>> keys a minute, which are batched through the Thrift interface in 10-30 row 
>> keys at once (with about 50 columns each). The number of reads is very low, 
>> at around 1000-2000 a day, and each only requests the data of a single row 
>> key. There is currently only one column family in use.
>> 
>> The initial thought was that something was wrong in the cassandra-env.sh 
>> file. So, we specified the variables 'system_memory_in_mb' (3760) and the 
>> 'system_cpu_cores' (1) according to our nodes' specification. We also 
>> changed the 'MAX_HEAP_SIZE' to 2G and the 'HEAP_NEWSIZE' to 200M (we think 
>> the second is related to garbage collection). Unfortunately, that did not 
>> solve the issue and the node we hit via Thrift keeps dying regularly.
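
(For concreteness, those overrides would look like this in cassandra-env.sh; 
the values are the ones described above, and setting MAX_HEAP_SIZE and 
HEAP_NEWSIZE together bypasses the script's automatic calculation:)

```shell
# Manual overrides in cassandra-env.sh, using the values from this thread.
# MAX_HEAP_SIZE and HEAP_NEWSIZE must be set together (or not at all);
# setting them bypasses the automatic heap calculation.
system_memory_in_mb="3760"
system_cpu_cores="1"
MAX_HEAP_SIZE="2G"
HEAP_NEWSIZE="200M"
```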
>> 
>> In case you find this useful: swap is off and unevictable memory seems to 
>> be very high on all 3 servers (2.3GB; we usually observe around 0-16KB of 
>> unevictable memory on other Linux servers). We are not quite sure how the 
>> unevictable memory ties into Cassandra, it's just something we observed 
>> while looking into the problem. The CPU is pretty much idle the entire 
>> time. The heap memory is clearly being reduced once in a while according 
>> to nodetool, but obviously grows over the limit as time goes by.
>> 
>> Any ideas? Thanks in advance.
>> 
>> Bill
> 
> 
