Here: we have 1.5 TB per node running smoothly, with index_interval: 1024, an 8 GB JVM heap and default bloom filters. The only problem we have is that our 2 TB SSDs are almost full, and C* starts crashing: Cassandra considers there is no more space available even though 500 GB is still free (you're not supposed to use more than ~50% of the disk anyway, since compaction can temporarily need as much free space as the data it rewrites).

All operations (bootstrap, repair, cleanup, ...) are of course slower at these data sizes. Yet I read on the DataStax website that the recommended MAX size is around 300-500 GB per node for C* < 1.2.x and 3-5 TB after that (under certain conditions, mainly by taking advantage of off-heap bloom filters, caches, etc.). Vnodes should also help reduce the time needed for some of these operations.

Hope that helps somehow.
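For reference, on the versions discussed here index_interval is a global setting in cassandra.yaml and takes effect on restart; roughly:

    # cassandra.yaml - sample every Nth primary-index entry into the
    # in-memory index summary. Larger values mean less heap used per
    # SSTable at the cost of a slightly longer index scan per read.
    # Default is 128; 1024 is just the value we run with, not a
    # recommendation.
    index_interval: 1024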
2013/10/3 Michał Michalski <mich...@opera.com>:

> Currently we have 480-520 GB of data per node, so it's not even close to
> 1 TB, but I'd bet that reaching 700-800 GB shouldn't be a problem in
> terms of "everyday performance" - heap usage is quite low, no GC issues
> etc. (To give you a comparison: when working on 1.1 with ~300-400 GB per
> node we had a huge problem with bloom filters and heap space, so we had
> to bump the heap to 12-16 GB; on 1.2 it's not an issue anymore.)
>
> However, our main concern is the time we'd need to rebuild a broken
> node, so we are going to extend the cluster soon to avoid such problems
> and keep our nodes about 50% smaller.
>
> M.
>
> On 03.10.2013 15:02, srmore wrote:
>
>> Thanks Mohit and Michał,
>> That's what I thought. I have tried all the other avenues, so I will
>> give ParNew a try. With 1.0.xx I have issues when data sizes go up;
>> hopefully that will not be the case with 1.2.
>>
>> Just curious, has anyone tried 1.2 with a large data set, around 1 TB?
>>
>> Thanks!
>>
>> On Thu, Oct 3, 2013 at 7:20 AM, Michał Michalski <mich...@opera.com>
>> wrote:
>>
>>> I was experimenting with 128 vs. 512 some time ago and I was unable
>>> to see any difference in terms of performance. I'd probably have
>>> checked 1024 too, but we migrated to 1.2 and heap space was not an
>>> issue anymore.
>>>
>>> M.
>>>
>>> On 02.10.2013 16:32, srmore wrote:
>>>
>>>> I changed my index_interval from 128 to 512; does it make sense to
>>>> increase it beyond this?
>>>>
>>>> On Wed, Oct 2, 2013 at 9:30 AM, cem <cayiro...@gmail.com> wrote:
>>>>
>>>>> Have a look at index_interval.
>>>>>
>>>>> Cem.
>>>>>
>>>>> On Wed, Oct 2, 2013 at 2:25 PM, srmore <comom...@gmail.com> wrote:
>>>>>
>>>>>> The version of Cassandra I am using is 1.0.11; we are migrating to
>>>>>> 1.2.x, though. We had tuned bloom filters (fp chance of 0.1), and
>>>>>> AFAIK making it lower than this won't matter (see the sketch after
>>>>>> the thread).
>>>>>>
>>>>>> Thanks!
>>>>>>
>>>>>> On Tue, Oct 1, 2013 at 11:54 PM, Mohit Anchlia
>>>>>> <mohitanch...@gmail.com> wrote:
>>>>>>
>>>>>>> Which Cassandra version are you on? Essentially heap size is a
>>>>>>> function of the number of keys/metadata. In Cassandra 1.2 a lot
>>>>>>> of the metadata, like bloom filters, was moved off heap.
>>>>>>>
>>>>>>> On Tue, Oct 1, 2013 at 9:34 PM, srmore <comom...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Does anyone know what would roughly be the heap size for
>>>>>>>> Cassandra with 1 TB of data? We started with about 200 GB, and
>>>>>>>> now on one of the nodes we are already at 1 TB. We were using
>>>>>>>> 8 GB of heap and that served us well up until we reached 700 GB,
>>>>>>>> where we started seeing failures and nodes flipping up and down.
>>>>>>>>
>>>>>>>> With 1 TB of data the node refuses to come back up due to lack
>>>>>>>> of memory; needless to say, repairs and compactions take a lot
>>>>>>>> of time. We upped the heap from 8 GB to 12 GB and suddenly
>>>>>>>> everything started moving rapidly,
>>>>>>>> i.e. the repair tasks and the compaction tasks. But soon (in
>>>>>>>> about 9-10 hrs) we started seeing the same symptoms as with
>>>>>>>> 8 GB.
>>>>>>>>
>>>>>>>> So my question is: how do I determine the optimal heap size for
>>>>>>>> around 1 TB of data?
>>>>>>>>
>>>>>>>> Following are some of my JVM settings:
>>>>>>>>
>>>>>>>> -Xms8G
>>>>>>>> -Xmx8G
>>>>>>>> -Xmn800m
>>>>>>>> -XX:NewSize=1200M
>>>>>>>> -XX:MaxTenuringThreshold=2
>>>>>>>> -XX:SurvivorRatio=4
>>>>>>>>
>>>>>>>> Thanks!
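On the bloom filter tuning mentioned mid-thread: bloom_filter_fp_chance is set per column family, and existing SSTables only pick up a new value once they are rewritten (compaction, scrub or upgradesstables). A minimal sketch with a hypothetical keyspace/table name; the CQL3 form assumes the planned move to 1.2:

    -- CQL3, Cassandra 1.2+ ('myks.mytable' is a placeholder).
    -- Raising fp_chance shrinks the filters (memory) at the cost of
    -- more false-positive SSTable reads; 0.1 is the value from this
    -- thread.
    ALTER TABLE myks.mytable WITH bloom_filter_fp_chance = 0.1;

    -- On 1.0.x the same change goes through cassandra-cli:
    --   UPDATE COLUMN FAMILY mytable WITH bloom_filter_fp_chance = 0.1;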
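One note on the JVM flags above: -Xmn and -XX:NewSize conflict (-Xmn is shorthand for setting both NewSize and MaxNewSize), so only one of 800m/1200M actually wins. The usual place to size all of this is conf/cassandra-env.sh rather than raw flags; a minimal sketch using the numbers from this thread, not recommendations:

    # conf/cassandra-env.sh
    # Total heap, passed to the JVM as -Xms/-Xmx. Past ~8 GB, CMS pause
    # times tend to grow, which fits the 12 GB experience described
    # above.
    MAX_HEAP_SIZE="8G"
    # Young generation, passed as -Xmn. The stock file suggests roughly
    # 100 MB per physical CPU core.
    HEAP_NEWSIZE="800M"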