Right now the biggest SSTable I have is 210 GB on a 3 TB disk, total disk consumed is around 50% on all nodes, and I am using STCS. Read and write query latency is under 15 ms. Full repair time is long, but I am sure that will be taken care of when I switch to incremental repairs. I am hitting the 50% disk issue. I recently ran cleanup, and backups aren't taking much space.
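To sanity-check how much headroom that leaves, here is a minimal back-of-the-envelope sketch in Python. It only illustrates the reasoning in Alain's reply below, not anything measured: it assumes the usual STCS rule of thumb that a compaction can temporarily need about as much free space as the SSTables it rewrites, and the count of four concurrent large compactions is just a pessimistic guess.

# Rough STCS headroom check; the sizes are the ones quoted in this message,
# the worst-case assumptions are mine.
disk_total_gb = 3000            # 3 TB disk
disk_used_fraction = 0.50       # ~50% consumed on all nodes
biggest_sstable_gb = 210        # largest SSTable on the node
concurrent_big_compactions = 4  # pessimistic guess, cf. Alain's example below

free_gb = disk_total_gb * (1 - disk_used_fraction)
worst_case_overhead_gb = biggest_sstable_gb * concurrent_big_compactions

print("free: %.0f GB, worst-case compaction overhead: %d GB"
      % (free_gb, worst_case_overhead_gb))
print("headroom looks OK" if free_gb > worst_case_overhead_gb
      else "risk of filling the disk")

With these numbers that is roughly 1500 GB free against ~840 GB of worst-case compaction overhead, so the 50% mark is not an immediate emergency.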
On Thu, Apr 14, 2016 at 8:06 PM, Jack Krupansky <jack.krupan...@gmail.com> wrote:

> The four criteria I would suggest for evaluating node size:
>
> 1. Query latency.
> 2. Query throughput/load.
> 3. Repair time - worst case, full repair, what you can least afford if it happens at the worst time.
> 4. Expected growth over the next six to 18 months - you don't want to be scrambling with latency, throughput, and repair problems when you bump into a wall on capacity. 20% to 30% is a fair number.
>
> Alas, it is very difficult to determine how much spare capacity you have, other than an artificial, synthetic load test: try 30% more clients and queries with 30% more (synthetic) data and see what happens to query latency, total throughput, and repair time. Run such a test periodically (monthly) to get a heads-up when load is getting closer to a wall.
>
> Incremental repair is great to streamline and optimize your day-to-day operations, but focus attention on replacement of down nodes during times of stress.
>
> -- Jack Krupansky
>
> On Thu, Apr 14, 2016 at 10:14 AM, Alain RODRIGUEZ <arodr...@gmail.com> wrote:
>
>>> Would adding nodes be the right way to start if I want to get the data per node down
>>
>> Yes, if everything else is fine, the last and always available option to reduce the disk size per node is to add new nodes. Sometimes it is the first option considered, as it is relatively quick and quite straightforward.
>>
>> Again, 50% of free disk space is not a hard limit. To give you a rough idea, if the biggest SSTable is 100 GB and you still have 400 GB free, you will probably be good to go, except if 4 compactions of 100 GB trigger at the same time, filling up the disk.
>>
>> Now is a good time to think of a plan to handle the growth, but don't worry if data reaches 60%, it will probably not be a big deal.
>>
>> You can make sure that:
>>
>> - There are no snapshots, heap dumps or data not related to C* taking up space
>> - The tombstone ratio of the biggest SSTables is not too high (are tombstones correctly evicted?)
>> - You are using compression (if you want to)
>>
>> Consider:
>>
>> - Adding TTLs to data you don't want to keep forever, and shortening TTLs as much as allowed.
>> - Migrating to C* 3.0+ to take advantage of the new storage engine.
>>
>> C*heers,
>> -----------------------
>> Alain Rodriguez - al...@thelastpickle.com
>> France
>>
>> The Last Pickle - Apache Cassandra Consulting
>> http://www.thelastpickle.com
>>
>> 2016-04-14 15:41 GMT+02:00 Aiman Parvaiz <ai...@flipagram.com>:
>>
>>> Thanks for the response Alain. I am using STCS and would like to take some action, as we will be hitting 50% disk space pretty soon. Would adding nodes be the right way to start if I want to get the data per node down? Otherwise, can you or someone on the list please suggest the right way to go about it.
>>>
>>> Thanks
>>>
>>> Sent from my iPhone
>>>
>>> On Apr 14, 2016, at 5:17 PM, Alain RODRIGUEZ <arodr...@gmail.com> wrote:
>>>
>>> Hi,
>>>
>>>> I seek advice on data size per node. Each of my nodes has close to 1 TB of data. I am not seeing any issues as of now but wanted to run it by you guys if this data size is pushing the limits in any manner and if I should be working on reducing data size per node.
>>>
>>> There is no real limit to the data size other than 50% of the machine's disk space using STCS and 80% if you are using LCS.
>>> Those are 'soft' limits, as it mainly depends on your biggest SSTable size and the number of concurrent compactions, but to stay out of trouble it is better to keep things under control, below the limits mentioned above.
>>>
>>>> I will be migrating to incremental repairs shortly and full repair as of now takes 20 hr/node. I am not seeing any issues with the nodes for now.
>>>
>>> As you noticed, you need to keep in mind that the larger the dataset is, the longer operations will take: repairs, but also bootstrapping, replacing or removing a node, or any operation that requires streaming or reading data. Repair time can indeed be mitigated by using incremental repairs.
>>>
>>>> I am running a 9 node C* 2.1.12 cluster.
>>>
>>> It should be quite safe to give incremental repair a try, as many bugs have been fixed in this version:
>>>
>>> FIX 2.1.12 - A lot of sstables using range repairs due to anticompaction - incremental only
>>> https://issues.apache.org/jira/browse/CASSANDRA-10422
>>>
>>> FIX 2.1.12 - repair hang when replica is down - incremental only
>>> https://issues.apache.org/jira/browse/CASSANDRA-10288
>>>
>>> If you are using DTCS, be aware of https://issues.apache.org/jira/browse/CASSANDRA-11113
>>>
>>> If using LCS, watch the SSTable and pending compaction counts closely.
>>>
>>> As a general comment, I would say that Cassandra has evolved to be able to handle huge datasets (off-heap memory structures + increased heap sizes with G1GC, JBOD, vnodes, ...). Today Cassandra works just fine with big datasets. I have seen clusters with 4+ TB nodes and others using a few GB per node. It all depends on your requirements and your machines' specs. If fast operations are absolutely necessary, keep it small. If you want to use the entire disk space (50/80% of total disk space max), go ahead as long as other resources are fine (CPU, memory, disk throughput, ...).
>>>
>>> C*heers,
>>> -----------------------
>>> Alain Rodriguez - al...@thelastpickle.com
>>> France
>>>
>>> The Last Pickle - Apache Cassandra Consulting
>>> http://www.thelastpickle.com
>>>
>>> 2016-04-14 10:57 GMT+02:00 Aiman Parvaiz <ai...@flipagram.com>:
>>>
>>>> Hi all,
>>>> I am running a 9 node C* 2.1.12 cluster. I seek advice on data size per node. Each of my nodes has close to 1 TB of data. I am not seeing any issues as of now but wanted to run it by you guys if this data size is pushing the limits in any manner and if I should be working on reducing data size per node. I will be migrating to incremental repairs shortly and full repair as of now takes 20 hr/node. I am not seeing any issues with the nodes for now.
>>>>
>>>> Thanks

--
*Aiman Parvaiz*
Lead Systems Architect
ai...@flipagram.com
cell: 213-300-6377
http://flipagram.com/apz
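As a footnote on the TTL suggestion in Alain's reply above, here is a minimal sketch of the two ways to set one, using the DataStax Python driver. The contact point, keyspace, table and the 30-day TTL are hypothetical placeholders, not anything from this thread:

import uuid
from cassandra.cluster import Cluster

cluster = Cluster(['127.0.0.1'])          # assumed contact point
session = cluster.connect('my_keyspace')  # hypothetical keyspace

# Per-write TTL: this row expires (and is eventually compacted away) after 30 days.
session.execute(
    "INSERT INTO events (id, payload) VALUES (%s, %s) USING TTL 2592000",
    (uuid.uuid4(), 'example payload'))

# Table-level default TTL, applied to writes that don't set one explicitly.
session.execute("ALTER TABLE events WITH default_time_to_live = 2592000")

cluster.shutdown()

Note that shortening a TTL only affects newly written data; existing rows keep the TTL they were written with.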