Subject: Re: Compaction Strategy
Hi Ali,
Please find my answers:
1) The table holds customer history data, where we receive transaction data every day for multiple vendors, and a batch job is executed which updates the data if the customer...
rajasekhar kommineni, 09/19/2018 04:44 PM
Subject: Re: Compaction Strategy
It’s not recommended to disable compaction: you will end up with hundreds to thousands of sstables and increased read latency. If your data is immutable, meaning no updates/deletes, it will have the least impact.
Decreasing compaction throughput will release resources for the application, but don’t let pending compactions accumulate...
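For reference, compaction throughput can be read and changed at runtime with nodetool; a minimal sketch (the 16 MB/s value is illustrative only, not a recommendation):

    # Show the current compaction throughput cap (MB/s)
    nodetool getcompactionthroughput
    # Throttle compaction to 16 MB/s; 0 removes the throttle entirely
    nodetool setcompactionthroughput 16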
Hello,
Can anyone respond to my questions? Is it a good idea to disable auto compaction and schedule it every 3 days? I am unable to control compaction and it is causing timeouts.
Also, will reducing or increasing compaction_throughput_mb_per_sec eliminate timeouts?
Thanks,
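For concreteness, what the poster proposes would look roughly like the sketch below (keyspace/table names are placeholders, and the replies in this thread advise against doing this):

    # Stop automatic minor compactions for one table
    nodetool disableautocompaction myks mytable
    # ...then trigger a manual compaction from an external scheduler every 3 days
    nodetool compact myks mytable
    # To revert to normal behavior
    nodetool enableautocompaction myks mytable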
> I wouldn't use TWCS if there are updates: you're going to risk having
> data that's never deleted and really small sstables sticking around
> forever.
How do you risk having data sticking around forever when everything is TTL'd?
If you use really large buckets, what's the point of TWCS?
No one...
I wouldn't use TWCS if there are updates: you're going to risk having data that's never deleted and really small sstables sticking around forever. If you use really large buckets, what's the point of TWCS?
Honestly, this is such a small workload you could easily use STCS or LCS and you'd likely never have a problem.
TWCS is probably still worth trying. If you mean updating old rows in TWCS, "out of order updates" will only really mean you'll hit more SSTables on read. This might add a bit of complexity in your client if you're bucketing partitions (not strictly necessary), but that's about it. As long as you're not...
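Bucketing by time window, as described, pairs with a TWCS table definition; a minimal sketch, assuming a hypothetical table and the option names used since Cassandra 3.0:

    cqlsh -e "
      ALTER TABLE myks.events
      WITH compaction = {'class': 'TimeWindowCompactionStrategy',
                         'compaction_window_unit': 'DAYS',
                         'compaction_window_size': 1}
      AND default_time_to_live = 604800;"   # rows expire after 7 days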
Ah, clear then. SSD usage imposes a different bias in terms of costs ;-)
On Tue, Nov 25, 2014 at 9:48 PM, Nikolai Grigoriev wrote:
Andrei,
Oh, yes, I had scanned the top of your previous email but overlooked the last part.
I am using SSDs, so I prefer to put in extra work to keep my system performing and save expensive disk space. So far I've been able to size the system more or less correctly, so these LCS limitations do not cause us problems.
Nikolai,
Just in case you've missed my comment in the thread (guess you have) - increasing sstable size does nothing (in our case at least). That is, it's not worse, but the load pattern is still the same - doing nothing most of the time. So, I switched to STCS and we will have to live with the extra space.
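For context, "increasing sstable size" here refers to the per-table LCS target size; a minimal sketch with placeholder names (sstable_size_in_mb defaults to 160 MB on the 2.0 line):

    cqlsh -e "
      ALTER TABLE myks.mytable
      WITH compaction = {'class': 'LeveledCompactionStrategy',
                         'sstable_size_in_mb': 512};"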
Hi Jean-Armel,
I am using the latest and greatest DSE 4.5.2 (4.5.3 in another cluster, but there are no relevant changes between 4.5.2 and 4.5.3) - thus, Cassandra 2.0.10.
I have about 1.8 TB of data per node now in total, which falls into that range.
As I said, it is really a problem with large amounts of data per node...
Yep, Marcus, I know. It's mainly a question of the cost of those extra 2x disks, you know. Our "final" setup will be more like 30 TB, so doubling it is still some cost. But I guess we will have to live with it.
On Tue, Nov 25, 2014 at 1:26 PM, Marcus Eriksson wrote:
If you are that write-heavy you should definitely go with STCS; LCS optimizes for reads by doing more compactions.
/Marcus
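For completeness, switching a table over to STCS is a single statement; a sketch with placeholder names (existing sstables get reorganized by the new strategy over time):

    cqlsh -e "
      ALTER TABLE myks.mytable
      WITH compaction = {'class': 'SizeTieredCompactionStrategy'};"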
On Tue, Nov 25, 2014 at 11:22 AM, Andrei Ivanov wrote:
Hi Jean-Armel, Nikolai,
1. Increasing sstable size doesn't work (well, I think, unless we "overscale" - add more nodes than really necessary, which is prohibitive for us in a way). Essentially there is no change. I gave up and will go for STCS ;-(
2. We use 2.0.11 as of now.
3. We are running on EC2.
Hi Andrei, Hi Nicolai,
Which version of C* are you using?
There are some recommendations about the max storage per node:
http://www.datastax.com/dev/blog/performance-improvements-in-cassandra-1-2
"For 1.0 we recommend 300-500GB. For 1.2 we are looking to be able to handle 10x (3-5TB)."
I have...
On Mon, Nov 24, 2014 at 6:48 AM, Nikolai Grigoriev wrote:
> One of the obvious recommendations I have received was to run more than
> one instance of C* per host. Makes sense - it will reduce the amount of
> data per node and will make better use of the resources.
This is usually a Bad Idea to...
...not sure if this node will ever get back to normal life. And believe me - this is not because of I/O, I have SSDs everywhere and 16 physical cores. [...] ...close to 2x disk space on EVERY disk in my JBOD configuration... this will kill the node sooner or later. This is all because all sstables after bootstrap end at L0, and then the process slowly, slowly moves them to other levels. If you have write traffic to that CF, then the number of sstables in L0 will grow. [...] ...once https://issues.apache.org/jira/browse/CASSANDRA-8301 is implemented it may be better.
On Sun, Nov 23, 2014 at 4:53 AM, Andrei Ivanov wrote:
...a somewhat similar C* load profile, hence some comments in addition to Nikolai's answer.
1. Fallback to STCS - you can disable it actually (see the sketch below).
2. Based on our experience, if you have a lot of data per node, LCS may work just fine. That is, till the moment you decide to join another node - chances are that the newly added node will not be able to compact what it gets from old nodes. In your case, if you switch strategy the same thing may happen. This is all due to limitations mentioned by Nikolai.
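A sketch of the switch mentioned in point 1 above, assuming the cassandra.disable_stcs_in_l0 system property (present around the 2.0 line; verify it exists in your version):

    # In cassandra-env.sh: keep LCS from falling back to STCS for L0 sstables
    JVM_OPTS="$JVM_OPTS -Dcassandra.disable_stcs_in_l0=true"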
On Sun, Nov 23, 2014 at 8:51 AM, Servando Muñoz G. wrote:
> ABUSE
>
> I DON'T WANT ANY MORE EMAILS, I AM FROM MEXICO
>
> From: Nikolai Grigoriev [mailto:ngrigor...@gmail.com]
> Sent: Saturday, November 22, 2014, 7:13 PM
> To: user@cassandra.apache.org
> Subject: Re: Compaction Strategy guidance
> Importance: High
Stephane,
As with everything good, LCS comes at a certain price.
LCS will put the most load on your I/O system (if you use spindles, you may need to be careful about that) and on CPU. Also, LCS (by default) may fall back to STCS if it is falling behind (which is very possible with heavy write activity), and...
You are of course free to reduce the min per bucket to 2.
The fundamental idea of sstables + compaction is to trade disk space for higher write performance. For most applications this is the right trade to make on modern hardware... I don't think you'll get very far trying to get the 2nd without the 1st.
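In today's CQL (this 2011 thread predates it), that bucket tuning would look roughly like the sketch below; the table name is a placeholder, and the defaults are min_threshold=4, max_threshold=32:

    cqlsh -e "
      ALTER TABLE myks.mytable
      WITH compaction = {'class': 'SizeTieredCompactionStrategy',
                         'min_threshold': 2,
                         'max_threshold': 32};"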
> Not sure I follow you. 4 sstables is the minimum a compaction looks for
> (by default). If there are 30 sstables of ~20MB sitting there because
> compaction is behind, you will compact those 30 sstables together (unless
> there is not enough space for that, and assuming you haven't changed the
> max compaction threshold).
On Tue, May 10, 2011 at 6:20 PM, Terje Marthinussen wrote:
> Everyone may be well aware of that, but I'll still remark that a minor
> compaction will try to merge "as many 20MB sstables as it can" up to the
> max compaction threshold (which is configurable). So if you do accumulate
> some newly created sstables at some point in time, the next minor
> compaction will pick them up...
Sorry, I was referring to the claim that "one big file" was a problem, not
the non-overlapping part.
If you never compact to a single file, you never get rid of all
generations/duplicates.
With non-overlapping files covering small enough token ranges, compacting
down to one file is not a big issue
If they each have their own copy of the data, then they are *not* non-overlapping!
If you have non-overlapping SSTables (and you know the min/max keys), it's like having one big SSTable because you know exactly where each row is, and it becomes easy to merge a new SSTable in small batches rather than all at once.
Yes, agreed. I actually think Cassandra has to.
And if you do not go down to that single file, how do you avoid getting into a situation where you can very realistically end up with 4-5 big sstables, each having its own copy of the same data, massively increasing disk requirements?
Terje
"I'm also not too much in favor of triggering major compactions, because it
mostly have a nasty effect (create one huge sstable)."
If that is the case, why can't major compactions create many,
non-overlapping SSTables?
In general, it seems to me that non-overlapping SSTables have all the
advantag
On Sat, May 7, 2011 at 7:20 PM, Terje Marthinussen wrote:
> It does not really make sense to me to go through all these minor merges
> when a full compaction will do a much faster and better job.
In a system heavily reliant on caching (platter drives, data sizes much larger than RAM), major compactions can be very detrimental to performance due to the cache invalidation they cause...
This is an all-SSD system. I have no problems with read/write performance due to I/O.
I do have a potential problem with the crazy explosion you can get in terms of disk use if compaction cannot keep up.
As things fall behind and you get many generations of data, yes, read performance becomes a problem due to the growing number of sstables per read...
> If you are seeing 600 pending compaction tasks regularly you almost
> definitely need more hardware.
Note that pending compactions is a pretty misleading metric, and you can't really draw conclusions just based on the pending compactions number/graph. For example, it is standard behavior during e.g. a long repair...
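For anyone wanting to watch this themselves, a minimal sketch of inspecting the compaction backlog with nodetool (the table name is a placeholder; exact output varies by version):

    # Pending and currently running compaction tasks
    nodetool compactionstats
    # Per-table sstable counts (and per-level counts when using LCS)
    nodetool cfstats myks.mytable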
On Sat, May 7, 2011 at 8:54 AM, Jonathan Ellis wrote:
On Sat, May 7, 2011 at 2:01 AM, Terje Marthinussen wrote:
> 1. Would it make sense to make full compactions occur a bit more aggressively?
I'd rather reduce the performance impact of being behind than do more full compactions: https://issues.apache.org/jira/browse/CASSANDRA-2498
> 2. I would th...