Re: Newsletter / Marketing: Re: Compaction Strategy

2018-09-21 Thread Ali Hubail
Hi Ali, Please find my answers 1) The table holds customer history data, where we receive the transaction data every day for multiple vendors, and a batch job is executed which updates the data if the customer

Re: Compaction Strategy

2018-09-20 Thread rajasekhar kommineni
Hello, Can anyone respond to my questions? Is it a good idea to disable auto compaction and schedule it every 3 days? I am unable to control compaction and it is causing timeouts. Also will reducing or increasing compacti

Re: Compaction Strategy

2018-09-20 Thread Ali Hubail

Re: Compaction Strategy

2018-09-19 Thread Nitan Kainth
It’s not recommended to disable compaction; you will end up with hundreds to thousands of sstables and increased read latency. If your data is immutable, meaning no updates/deletes, it will have the least impact. Decreasing compaction throughput will release resources for the application but don’t accumula

Re: Compaction Strategy

2018-09-19 Thread rajasekhar kommineni
Hello, Can anyone respond to my questions? Is it a good idea to disable auto compaction and schedule it every 3 days? I am unable to control compaction and it is causing timeouts. Also, will reducing or increasing compaction_throughput_mb_per_sec eliminate timeouts? Thanks, > On Sep 17, 2

Re: Compaction strategy for update heavy workload

2018-06-13 Thread kurt greaves
> > I wouldn't use TWCS if there's updates, you're going to risk having > data that's never deleted and really small sstables sticking around > forever. How do you risk having data sticking around forever when everything is TTL'd? If you use really large buckets, what's the point of TWCS? No one

Re: Compaction strategy for update heavy workload

2018-06-13 Thread Jonathan Haddad
I wouldn't use TWCS if there's updates, you're going to risk having data that's never deleted and really small sstables sticking around forever. If you use really large buckets, what's the point of TWCS? Honestly this is such a small workload you could easily use STCS or LCS and you'd likely neve

Re: Compaction strategy for update heavy workload

2018-06-13 Thread kurt greaves
TWCS is probably still worth trying. If you mean updating old rows, in TWCS "out of order updates" will only really mean you'll hit more SSTables on read. This might add a bit of complexity in your client if you're bucketing partitions (not strictly necessary), but that's about it. As long as you're n
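The point about out-of-order updates touching more SSTables on read can be illustrated with a rough sketch. It assumes a simplified model in which each time window ends up as exactly one SSTable; the function name `sstables_per_partition` and the one-day default window are hypothetical choices for illustration, not Cassandra APIs or settings taken from the thread.

```python
from collections import defaultdict


def sstables_per_partition(writes, window_seconds=86400):
    """Count how many time windows each partition has been written in.

    writes: iterable of (partition_key, write_timestamp_in_seconds).
    Assumption: one SSTable per window, so the count approximates how
    many SSTables a read of that partition may have to consult.
    """
    windows = defaultdict(set)
    for key, timestamp in writes:
        # Integer division maps a timestamp to its window index.
        windows[key].add(timestamp // window_seconds)
    return {key: len(seen) for key, seen in windows.items()}


# Partition "a" is updated again one day later, so it spans two windows.
writes = [("a", 0), ("a", 10), ("a", 90000), ("b", 5)]
print(sstables_per_partition(writes))
```

Under this model, a partition that is only ever written inside one window stays in a single SSTable, while each late update adds one more SSTable to the read path.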

Re: Compaction Strategy guidance

2014-11-25 Thread Andrei Ivanov
Ah, clear then. SSD usage imposes a different bias in terms of costs;-) On Tue, Nov 25, 2014 at 9:48 PM, Nikolai Grigoriev wrote: > Andrei, > > Oh, yes, I have scanned the top of your previous email but overlooked the > last part. > > I am using SSDs so I prefer to put extra work to keep my syste

Re: Compaction Strategy guidance

2014-11-25 Thread Nikolai Grigoriev
Andrei, Oh, yes, I have scanned the top of your previous email but overlooked the last part. I am using SSDs so I prefer to put extra work to keep my system performing and save expensive disk space. So far I've been able to size the system more or less correctly so these LCS limitations do not ca

Re: Compaction Strategy guidance

2014-11-25 Thread Andrei Ivanov
Nikolai, Just in case you've missed my comment in the thread (guess you have) - increasing sstable size does nothing (in our case at least). That is, it's not worse but the load pattern is still the same - doing nothing most of the time. So, I switched to STCS and we will have to live with extra s

Re: Compaction Strategy guidance

2014-11-25 Thread Nikolai Grigoriev
Hi Jean-Armel, I am using the latest and greatest DSE 4.5.2 (4.5.3 in another cluster but there are no relevant changes between 4.5.2 and 4.5.3) - thus, Cassandra 2.0.10. I have about 1.8 TB of data per node now in total, which falls into that range. As I said, it is really a problem with large amoun

Re: Compaction Strategy guidance

2014-11-25 Thread Andrei Ivanov
Yep, Marcus, I know. It's mainly a question of cost of those extra x2 disks, you know. Our "final" setup will be more like 30TB, so doubling it is still some cost. But i guess, we will have to live with it On Tue, Nov 25, 2014 at 1:26 PM, Marcus Eriksson wrote: > If you are that write-heavy you s

Re: Compaction Strategy guidance

2014-11-25 Thread Marcus Eriksson
If you are that write-heavy you should definitely go with STCS, LCS optimizes for reads by doing more compactions /Marcus On Tue, Nov 25, 2014 at 11:22 AM, Andrei Ivanov wrote: > Hi Jean-Armel, Nikolai, > > 1. Increasing sstable size doesn't work (well, I think, unless we > "overscale" - add mo

Re: Compaction Strategy guidance

2014-11-25 Thread Andrei Ivanov
Hi Jean-Armel, Nikolai, 1. Increasing sstable size doesn't work (well, I think, unless we "overscale" - add more nodes than really necessary, which is prohibitive for us in a way). Essentially there is no change. I gave up and will go for STCS;-( 2. We use 2.0.11 as of now 3. We are running on EC

Re: Compaction Strategy guidance

2014-11-25 Thread Jean-Armel Luce
Hi Andrei, Hi Nikolai, Which version of C* are you using? There are some recommendations about the max storage per node: http://www.datastax.com/dev/blog/performance-improvements-in-cassandra-1-2 "For 1.0 we recommend 300-500GB. For 1.2 we are looking to be able to handle 10x (3-5TB)". I have

Re: Compaction Strategy guidance

2014-11-24 Thread Robert Coli
On Mon, Nov 24, 2014 at 6:48 AM, Nikolai Grigoriev wrote: > One of the obvious recommendations I have received was to run more than > one instance of C* per host. Makes sense - it will reduce the amount of > data per node and will make better use of the resources. > This is usually a Bad Idea to

Re: Compaction Strategy guidance

2014-11-24 Thread Andrei Ivanov
sure if this node will ever get back to normal life. And believe me - this is not because of I/O, I have SSDs everywhere and 16 physical cores.

Re: Compaction Strategy guidance

2014-11-24 Thread Nikolai Grigoriev
lose to 2x disk space on EVERY disk in my JBOD configuration...this will kill the node sooner or later. This is all because all sstables after bootstrap end at L0 and

Re: Compaction Strategy guidance

2014-11-24 Thread Andrei Ivanov
otstrap end at L0 and then the process slowly slowly moves them to other levels. If you have write traffic to that CF then the number of sstables and L0 will grow

Re: Compaction Strategy guidance

2014-11-24 Thread Nikolai Grigoriev
e https://issues.apache.org/jira/browse/CASSANDRA-8301 is implemented it may be better. On Sun, Nov 23, 2014 at 4:53 AM, Andrei Ivanov wrote:

Re: Compaction Strategy guidance

2014-11-24 Thread Andrei Ivanov
ewhat similar C* load profile. Hence some comments in addition Nikolai's answer. 1. Fallback to STCS - you can disable it actually 2. Based on our experience, if you have a lot of data per node, LCS may work j

Re: Compaction Strategy guidance

2014-11-24 Thread Nikolai Grigoriev
ill not be able to compact what it gets from old nodes. In your case, if you switch strategy the same thing may happen. This is all due to limitations mentioned by Nikolai. Andrei,

Re: Compaction Strategy guidance

2014-11-23 Thread Andrei Ivanov
may work just fine. That is, till the moment you decide to join another node - chances are that the newly added node will not be able to compact what it gets from old nodes. In your case, if you switch strategy the same thing may happen. This is all due to li

Re: Compaction Strategy guidance

2014-11-23 Thread Jean-Armel Luce
ai. Andrei, On Sun, Nov 23, 2014 at 8:51 AM, Servando Muñoz G. wrote: ABUSE I DON'T WANT ANY MORE EMAILS, I'M FROM MEXICO

Re: Compaction Strategy guidance

2014-11-23 Thread Jean-Armel Luce
I DON'T WANT ANY MORE EMAILS, I'M FROM MEXICO From: Nikolai Grigoriev [mailto:ngrigor...@gmail.com] Sent: Saturday, November 22, 2014 07:13 PM To: user@cassandra.apache.org

Re: Compaction Strategy guidance

2014-11-23 Thread Nikolai Grigoriev
From: Nikolai Grigoriev [mailto:ngrigor...@gmail.com] Sent: Saturday, November 22, 2014 07:13 PM To: user@cassandra.apache.org Subject: Re: Compaction Strategy guidance Importance: High

Re: Compaction Strategy guidance

2014-11-23 Thread Andrei Ivanov
ABUSE I DON'T WANT ANY MORE EMAILS, I'M FROM MEXICO From: Nikolai Grigoriev [mailto:ngrigor...@gmail.com] Sent: Saturday, November 22, 2014 07:13 PM To: user@cassandra.apache.org Subject: Re: Compaction Strategy guidance Importancia: A

RE: Compaction Strategy guidance

2014-11-22 Thread Servando Muñoz G .
ABUSE I DON'T WANT ANY MORE EMAILS, I'M FROM MEXICO From: Nikolai Grigoriev [mailto:ngrigor...@gmail.com] Sent: Saturday, November 22, 2014 07:13 PM To: user@cassandra.apache.org Subject: Re: Compaction Strategy guidance Importance: High Stephane, As everything good, LCS comes at

Re: Compaction Strategy guidance

2014-11-22 Thread Nikolai Grigoriev
Stephane, Like everything good, LCS comes at a certain price. LCS will put the most load on your I/O system (if you use spindles, you may need to be careful about that) and on CPU. Also LCS (by default) may fall back to STCS if it is falling behind (which is very possible with heavy writing activity) and

Re: compaction strategy

2011-05-11 Thread Jonathan Ellis
You are of course free to reduce the min per bucket to 2. The fundamental idea of sstables + compaction is to trade disk space for higher write performance. For most applications this is the right trade to make on modern hardware... I don't think you'll get very far trying to get the 2nd without t

Re: compaction strategy

2011-05-11 Thread Terje Marthinussen
> Not sure I follow you. 4 sstables is the minimum compaction look for (by default). If there is 30 sstables of ~20MB sitting there because compaction is behind, you will compact those 30 sstables together (unless there is not enough space for that and considering you haven't change

Re: compaction strategy

2011-05-10 Thread Sylvain Lebresne
On Tue, May 10, 2011 at 6:20 PM, Terje Marthinussen wrote: >> Everyone may be well aware of that, but I'll still remark that a minor compaction will try to merge "as many 20MB sstables as it can" up to the max compaction threshold (which is configurable). So if you do accumulate some
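The behavior described here (merge as many similarly sized sstables as possible, bounded by the compaction thresholds) can be sketched roughly as below. This is a simplified illustration, not Cassandra's actual implementation. The minimum of 4 comes from the thread; the maximum of 32 and the 0.5x to 1.5x similarity band are assumptions chosen for the example, and both function names are hypothetical.

```python
def size_tiered_buckets(sizes_mb, bucket_low=0.5, bucket_high=1.5):
    """Group sstable sizes into buckets of roughly similar size.

    A size joins a bucket when it lies within [bucket_low, bucket_high]
    times that bucket's current average (assumed similarity band).
    """
    buckets = []
    for size in sorted(sizes_mb):
        for bucket in buckets:
            average = sum(bucket) / len(bucket)
            if bucket_low * average <= size <= bucket_high * average:
                bucket.append(size)
                break
        else:
            # No existing bucket is close enough in size: start a new one.
            buckets.append([size])
    return buckets


def pick_minor_compaction(sizes_mb, min_threshold=4, max_threshold=32):
    """Return the sstable sizes one minor compaction would merge, or []."""
    for bucket in size_tiered_buckets(sizes_mb):
        if len(bucket) >= min_threshold:
            # Merge as many as possible, capped at the max threshold.
            return bucket[:max_threshold]
    return []


# Thirty ~20MB sstables that piled up, plus one large older sstable.
backlog = [20] * 30 + [500]
print(len(pick_minor_compaction(backlog)))
```

With thirty 20MB sstables waiting, the sketch picks all thirty in one pass rather than merging them four at a time, which matches the remark in the thread. Passing `min_threshold=2` models the option of lowering the minimum per bucket.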

Re: compaction strategy

2011-05-10 Thread Terje Marthinussen
> Everyone may be well aware of that, but I'll still remark that a minor compaction will try to merge "as many 20MB sstables as it can" up to the max compaction threshold (which is configurable). So if you do accumulate some newly created sstable at some point in time, the next minor co

Re: compaction strategy

2011-05-09 Thread Terje Marthinussen
Sorry, I was referring to the claim that "one big file" was a problem, not the non-overlapping part. If you never compact to a single file, you never get rid of all generations/duplicates. With non-overlapping files covering small enough token ranges, compacting down to one file is not a big issue

Re: compaction strategy

2011-05-09 Thread David Boxenhorn
If they each have their own copy of the data, then they are *not* non-overlapping! If you have non-overlapping SSTables (and you know the min/max keys), it's like having one big SSTable because you know exactly where each row is, and it becomes easy to merge a new SSTable in small batches, rather
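The claim that non-overlapping SSTables with known min/max keys behave like one big SSTable can be illustrated with a small lookup sketch. It assumes each SSTable is described by a hypothetical `(min_key, max_key, name)` tuple and that the list is sorted by `min_key`; nothing here is a Cassandra API.

```python
from bisect import bisect_right


def find_sstable(ranges, key):
    """Return the name of the only sstable that can contain key, or None.

    ranges: list of (min_key, max_key, name), sorted by min_key and
    non-overlapping (the assumption that makes one lookup sufficient).
    """
    starts = [min_key for min_key, _, _ in ranges]
    # Rightmost range whose min_key is <= key.
    index = bisect_right(starts, key) - 1
    if index >= 0:
        min_key, max_key, name = ranges[index]
        if min_key <= key <= max_key:
            return name
    return None


ranges = [("a", "f", "sstable-1"), ("g", "m", "sstable-2"), ("n", "z", "sstable-3")]
print(find_sstable(ranges, "h"))
```

Because the ranges do not overlap, a binary search identifies at most one candidate file per key. The same property is what would let a new SSTable be merged in small batches: only the files whose ranges intersect the incoming keys need to be rewritten.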

Re: compaction strategy

2011-05-09 Thread Terje Marthinussen
Yes, agreed. I actually think cassandra has to. And if you do not go down to that single file, how do you avoid getting into a situation where you can very realistically end up with 4-5 big sstables each having its own copy of the same data massively increasing disk requirements? Terje On Mon,

Re: compaction strategy

2011-05-09 Thread David Boxenhorn
"I'm also not too much in favor of triggering major compactions, because it mostly have a nasty effect (create one huge sstable)." If that is the case, why can't major compactions create many, non-overlapping SSTables? In general, it seems to me that non-overlapping SSTables have all the advantag

Re: compaction strategy

2011-05-09 Thread Sylvain Lebresne
On Sat, May 7, 2011 at 7:20 PM, Terje Marthinussen wrote: > This is an all ssd system. I have no problems with read/write performance > due to I/O. > I do have a potential with the crazy explosion you can get in terms of disk > use if compaction cannot keep up. > > As things falls behind and you g

Re: compaction strategy

2011-05-07 Thread Peter Schuller
> It does not really make sense to me to go through all these minor merges when a full compaction will do a much faster and better job.
In a system heavily reliant on caching (platter drives, large data sizes, much larger than RAM) major compactions can be very detrimental to performance due to

Re: compaction strategy

2011-05-07 Thread Terje Marthinussen
This is an all ssd system. I have no problems with read/write performance due to I/O. I do have a potential with the crazy explosion you can get in terms of disk use if compaction cannot keep up. As things falls behind and you get many generations of data, yes, read performance gets a problem due

Re: compaction strategy

2011-05-07 Thread Peter Schuller
> If you are seeing 600 pending compaction tasks regularly you almost definitely need more hardware.
Note that pending compactions is pretty misleading and you can't really draw conclusions just based on the pending compactions number/graph. For example, standard behavior during e.g. a long repa

Re: compaction strategy

2011-05-07 Thread Edward Capriolo
On Sat, May 7, 2011 at 8:54 AM, Jonathan Ellis wrote: > On Sat, May 7, 2011 at 2:01 AM, Terje Marthinussen > wrote: >> 1. Would it make sense to make full compactions occur a bit more aggressive. > > I'd rather reduce the performance impact of being behind, than do more > full compactions: https:

Re: compaction strategy

2011-05-07 Thread Jonathan Ellis
On Sat, May 7, 2011 at 2:01 AM, Terje Marthinussen wrote: > 1. Would it make sense to make full compactions occur a bit more aggressive. I'd rather reduce the performance impact of being behind, than do more full compactions: https://issues.apache.org/jira/browse/CASSANDRA-2498 > 2. I > would th