Here: we have 1.5 TB per node running smoothly, with index_interval: 1024, an 8 GB JVM heap and default bloom filters. The only problem we have is that our 2 TB SSDs are almost full, and C* starts crashing: Cassandra considers there is no more space available even though 500 GB is still free (you're not supposed to use more than ~50% of the disk anyway, since compaction can temporarily need as much free space as the data it rewrites).

All operations (bootstrap, repair, cleanup, ...) are of course slower at these data sizes. Yet I read on the DataStax website that the recommended MAX size is around 300-500 GB per node for C* < 1.2.x and 3-5 TB after that (under certain conditions, mainly by taking advantage of off-heap bloom filters, caches, etc.). Vnodes should also help reduce the time needed for some of these operations.

Hope that helps somehow.
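For reference, on the versions discussed here index_interval is a global setting in cassandra.yaml and takes effect on restart; roughly:

    # cassandra.yaml - sample every Nth primary-index entry into the
    # in-memory index summary. Larger values mean less heap used per
    # SSTable at the cost of a slightly longer index scan per read.
    # Default is 128; 1024 is just the value we run with, not a
    # recommendation.
    index_interval: 1024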
2013/10/3 Michał Michalski <mich...@opera.com>:

> Currently we have 480-520 GB of data per node, so it's not even close to
> 1 TB, but I'd bet that reaching 700-800 GB shouldn't be a problem in
> terms of "everyday performance" - heap usage is quite low, no GC issues
> etc. (To give you a comparison: when working on 1.1 with ~300-400 GB per
> node we had a huge problem with bloom filters and heap space, so we had
> to bump the heap to 12-16 GB; on 1.2 it's not an issue anymore.)
>
> However, our main concern is the time we'd need to rebuild a broken
> node, so we are going to extend the cluster soon to avoid such problems
> and keep our nodes about 50% smaller.
>
> M.
>
> On 03.10.2013 15:02, srmore wrote:
>
>> Thanks Mohit and Michał,
>> That's what I thought. I have tried all the other avenues, so I will
>> give ParNew a try. With 1.0.xx I have issues when data sizes go up;
>> hopefully that will not be the case with 1.2.
>>
>> Just curious, has anyone tried 1.2 with a large data set, around 1 TB?
>>
>> Thanks!
>>
>> On Thu, Oct 3, 2013 at 7:20 AM, Michał Michalski <mich...@opera.com>
>> wrote:
>>
>>> I was experimenting with 128 vs. 512 some time ago and I was unable
>>> to see any difference in terms of performance. I'd probably have
>>> checked 1024 too, but we migrated to 1.2 and heap space was not an
>>> issue anymore.
>>>
>>> M.
>>>
>>> On 02.10.2013 16:32, srmore wrote:
>>>
>>>> I changed my index_interval from 128 to 512; does it make sense to
>>>> increase it beyond this?
>>>>
>>>> On Wed, Oct 2, 2013 at 9:30 AM, cem <cayiro...@gmail.com> wrote:
>>>>
>>>>> Have a look at index_interval.
>>>>>
>>>>> Cem.
>>>>>
>>>>> On Wed, Oct 2, 2013 at 2:25 PM, srmore <comom...@gmail.com> wrote:
>>>>>
>>>>>> The version of Cassandra I am using is 1.0.11; we are migrating to
>>>>>> 1.2.x, though. We had tuned bloom filters (fp chance of 0.1), and
>>>>>> AFAIK making it lower than this won't matter (see the sketch after
>>>>>> the thread).
>>>>>>
>>>>>> Thanks!
>>>>>>
>>>>>> On Tue, Oct 1, 2013 at 11:54 PM, Mohit Anchlia
>>>>>> <mohitanch...@gmail.com> wrote:
>>>>>>
>>>>>>> Which Cassandra version are you on? Essentially heap size is a
>>>>>>> function of the number of keys/metadata. In Cassandra 1.2 a lot
>>>>>>> of the metadata, like bloom filters, was moved off heap.
>>>>>>>
>>>>>>> On Tue, Oct 1, 2013 at 9:34 PM, srmore <comom...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Does anyone know what would roughly be the heap size for
>>>>>>>> Cassandra with 1 TB of data? We started with about 200 GB, and
>>>>>>>> now on one of the nodes we are already at 1 TB. We were using
>>>>>>>> 8 GB of heap and that served us well up until we reached 700 GB,
>>>>>>>> where we started seeing failures and nodes flipping up and down.
>>>>>>>>
>>>>>>>> With 1 TB of data the node refuses to come back up due to lack
>>>>>>>> of memory; needless to say, repairs and compactions take a lot
>>>>>>>> of time. We upped the heap from 8 GB to 12 GB and suddenly
>>>>>>>> everything started moving rapidly,
>>>>>>>> i.e. the repair tasks and the compaction tasks. But soon (in
>>>>>>>> about 9-10 hrs) we started seeing the same symptoms as with
>>>>>>>> 8 GB.
>>>>>>>>
>>>>>>>> So my question is: how do I determine the optimal heap size for
>>>>>>>> around 1 TB of data?
>>>>>>>>
>>>>>>>> Following are some of my JVM settings:
>>>>>>>>
>>>>>>>> -Xms8G
>>>>>>>> -Xmx8G
>>>>>>>> -Xmn800m
>>>>>>>> -XX:NewSize=1200M
>>>>>>>> -XX:MaxTenuringThreshold=2
>>>>>>>> -XX:SurvivorRatio=4
>>>>>>>>
>>>>>>>> Thanks!
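On the bloom filter tuning mentioned mid-thread: bloom_filter_fp_chance is set per column family, and existing SSTables only pick up a new value once they are rewritten (compaction, scrub or upgradesstables). A minimal sketch with a hypothetical keyspace/table name; the CQL3 form assumes the planned move to 1.2:

    -- CQL3, Cassandra 1.2+ ('myks.mytable' is a placeholder).
    -- Raising fp_chance shrinks the filters (memory) at the cost of
    -- more false-positive SSTable reads; 0.1 is the value from this
    -- thread.
    ALTER TABLE myks.mytable WITH bloom_filter_fp_chance = 0.1;

    -- On 1.0.x the same change goes through cassandra-cli:
    --   UPDATE COLUMN FAMILY mytable WITH bloom_filter_fp_chance = 0.1;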
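One note on the JVM flags above: -Xmn and -XX:NewSize conflict (-Xmn is shorthand for setting both NewSize and MaxNewSize), so only one of 800m/1200M actually wins. The usual place to size all of this is conf/cassandra-env.sh rather than raw flags; a minimal sketch using the numbers from this thread, not recommendations:

    # conf/cassandra-env.sh
    # Total heap, passed to the JVM as -Xms/-Xmx. Past ~8 GB, CMS pause
    # times tend to grow, which fits the 12 GB experience described
    # above.
    MAX_HEAP_SIZE="8G"
    # Young generation, passed as -Xmn. The stock file suggests roughly
    # 100 MB per physical CPU core.
    HEAP_NEWSIZE="800M"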