Ah, that's clear then. SSD usage imposes a different bias in terms of costs ;-)

On Tue, Nov 25, 2014 at 9:48 PM, Nikolai Grigoriev <ngrigor...@gmail.com> wrote:
> Andrei,
>
> Oh, yes, I have scanned the top of your previous email but overlooked
> the last part.
>
> I am using SSDs so I prefer to put in extra work to keep my system
> performing and save expensive disk space. So far I've been able to size
> the system more or less correctly so these LCS limitations do not cause
> too much trouble. But I do keep the CF "sharding" option as a backup -
> for me it will be relatively easy to implement it.
>
> On Tue, Nov 25, 2014 at 1:25 PM, Andrei Ivanov <aiva...@iponweb.net> wrote:
>>
>> Nikolai,
>>
>> Just in case you've missed my comment in the thread (guess you have) -
>> increasing sstable size does nothing (in our case at least). That is,
>> it's not worse but the load pattern is still the same - doing nothing
>> most of the time. So, I switched to STCS and we will have to live with
>> extra storage cost - storage is way cheaper than cpu etc anyhow :-)
>>
>> On Tue, Nov 25, 2014 at 5:53 PM, Nikolai Grigoriev <ngrigor...@gmail.com> wrote:
>> > Hi Jean-Armel,
>> >
>> > I am using the latest and greatest DSE 4.5.2 (4.5.3 in another
>> > cluster but there are no relevant changes between 4.5.2 and 4.5.3) -
>> > thus, Cassandra 2.0.10.
>> >
>> > I have about 1.8 TB of data per node now in total, which falls into
>> > that range.
>> >
>> > As I said, it is really a problem with a large amount of data in a
>> > single CF, not the total amount of data. Quite often the nodes are
>> > idle yet have quite a few pending compactions. I have discussed it
>> > with other members of the C* community and the DataStax guys, and
>> > they have confirmed my observation.
>> >
>> > I believe that increasing the sstable size won't help at all and
>> > probably will make things worse - everything else being equal, of
>> > course. But I would like to hear from Andrei when he is done with
>> > his test.
>> >
>> > Regarding the last statement - yes, C* clearly likes many small
>> > servers more than fewer large ones. But it is all relative - and can
>> > all be recalculated to $$$ :) C* is all about partitioning of
>> > everything - storage, traffic... Less data per node and more nodes
>> > give you lower latency, lower heap usage etc, etc. I think I have
>> > learned this with my project. The somewhat hard way but still,
>> > nothing is better than personal experience :)
>> >
>> > On Tue, Nov 25, 2014 at 3:23 AM, Jean-Armel Luce <jaluc...@gmail.com> wrote:
>> >>
>> >> Hi Andrei, Hi Nikolai,
>> >>
>> >> Which version of C* are you using?
>> >>
>> >> There are some recommendations about the max storage per node:
>> >>
>> >> http://www.datastax.com/dev/blog/performance-improvements-in-cassandra-1-2
>> >>
>> >> "For 1.0 we recommend 300-500GB. For 1.2 we are looking to be able
>> >> to handle 10x (3-5TB)".
>> >>
>> >> I have the feeling that those recommendations are sensitive to many
>> >> criteria, such as:
>> >> - your hardware
>> >> - the compaction strategy
>> >> - ...
>> >>
>> >> It looks like LCS lowers those limits.
>> >>
>> >> Increasing the size of sstables might help if you have enough CPU
>> >> and you can put more load on your I/O system (@Andrei, I am
>> >> interested in the results of your experimentation with large
>> >> sstable files)
>> >>
>> >> From my point of view, there are some usage patterns where it is
>> >> better to have many small servers than a few large servers.
>> >> Probably, it is better to have many small servers if you need LCS
>> >> for large tables.
>> >>
>> >> Just my 2 cents.
>> >>
>> >> Jean-Armel
>> >>
>> >> 2014-11-24 19:56 GMT+01:00 Robert Coli <rc...@eventbrite.com>:
>> >>>
>> >>> On Mon, Nov 24, 2014 at 6:48 AM, Nikolai Grigoriev <ngrigor...@gmail.com> wrote:
>> >>>>
>> >>>> One of the obvious recommendations I have received was to run
>> >>>> more than one instance of C* per host. Makes sense - it will
>> >>>> reduce the amount of data per node and will make better use of
>> >>>> the resources.
>> >>>
>> >>> This is usually a Bad Idea to do in production.
>> >>>
>> >>> =Rob
>> >>>
>> >
>> > --
>> > Nikolai Grigoriev
>> > (514) 772-5178
>
> --
> Nikolai Grigoriev
> (514) 772-5178