> On Nov 19, 2015, at 3:58 PM, Antoine Bonavita <anto...@stickyads.tv> wrote:
>
> Sebastian,
>
> I took into account your suggestion and set max_sstable_age_days to 1.
>
> I left the TTL at 432000 and the gc_grace_seconds at 172800. So, I
> expect SSTables older than 7 days to get deleted. Am I right?
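>
> In cqlsh terms that is something like the sketch below (views.views is
> my guess at the keyspace/table from the data paths further down this
> thread; the arithmetic is 432000 + 172800 = 604800 seconds, i.e. 7 days):
>
>     # TTL 432000 s = 5 days; gc_grace_seconds 172800 s = 2 days
>     $ cqlsh -e "ALTER TABLE views.views
>           WITH default_time_to_live = 432000
>           AND gc_grace_seconds = 172800;"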
>
> I did not change dclocal_read_repair_chance because I have only one DC
> at this point in time. Did you mean that I should set
> read_repair_chance to 0?
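>
> In case it clarifies my question, I assume you mean something like this
> (zeroing one or both of the chances):
>
>     $ cqlsh -e "ALTER TABLE views.views
>           WITH read_repair_chance = 0
>           AND dclocal_read_repair_chance = 0;"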
>
> Thanks again for your time and help. Really appreciated.
>
> A.
>
>
> On 11/19/2015 02:36 AM, Sebastian Estevez wrote:
>> When you say drop, you mean reduce the value (to 1 day for example),
>> not "don't set the value", right?
>>
>>
>> Yes.
>>
>> If I set max_sstable_age_days to 1, my understanding is that SSTables
>> with expired data (5 days) are never going to be compacted, and
>> therefore my disk usage will keep growing forever. Did I miss
>> something here?
>>
>>
>> We will expire sstables whose highest TTL is beyond gc_grace_seconds,
>> as of CASSANDRA-5228
>> <https://issues.apache.org/jira/browse/CASSANDRA-5228>. This is nice
>> because the sstable is simply dropped for free: there is no need to
>> scan it and remove tombstones, which is very expensive, and DTCS
>> guarantees that all the data within an sstable is close together in
>> time.
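>>
>> If you want to check whether anything is holding fully expired
>> sstables back from being dropped, recent versions also ship an
>> sstableexpiredblockers tool (in tools/bin, if I recall correctly;
>> arguments are keyspace then table, so for your case something like):
>>
>>     $ sstableexpiredblockers views views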
>>
>> So, if I set max_sstable_age_days to 1, I have to run repairs at least
>> once a day, correct?
>>
>> I'm afraid I don't get your point about painful compactions.
>>
>>
>> I was referring to the problems described in CASSANDRA-9644
>> <https://issues.apache.org/jira/browse/CASSANDRA-9644>.
>>
>>
>>
>>
>> All the best,
>>
>>
>> Sebastián Estévez
>> Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com
>>
>> On Wed, Nov 18, 2015 at 5:53 PM, Antoine Bonavita
>> <anto...@stickyads.tv> wrote:
>>
>> Sebastian,
>>
>> Your help is very much appreciated. I re-read the blog post and also
>> https://labs.spotify.com/2014/12/18/date-tiered-compaction/ but some
>> things are still confusing me.
>>
>> Please see my questions inline below.
>>
>> On 11/18/2015 04:21 PM, Sebastian Estevez wrote:
>>
>> Yep, I think you've mixed up your DTCS levers. I would read, or
>> re-read, Marcus's post:
>> http://www.datastax.com/dev/blog/datetieredcompactionstrategy
>>
>> *TL;DR:*
>>
>>   * *base_time_seconds* is the size of your initial window
>>   * *max_sstable_age_days* is the time after which you stop compacting
>>     sstables
>>   * *default_time_to_live* is the time after which data expires and
>>     sstables start to become available for GC (432000 is 5 days)
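>>
>> Putting the three levers together, here is a sketch with your current
>> numbers (views.views below is a stand-in for your actual table):
>>
>>     $ cqlsh -e "ALTER TABLE views.views
>>           WITH compaction = {
>>             'class': 'DateTieredCompactionStrategy',
>>             'base_time_seconds': '3600',
>>             'max_sstable_age_days': '10'
>>           }
>>           AND default_time_to_live = 432000;"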
>>
>>
>> Could it be that compaction is putting those in cache constantly?
>>
>> Yep, you'll keep compacting sstables until they're 10 days old per
>> your current settings, and when you compact there are reads and then
>> writes.
>>
>>
>>
>> If you aren't doing any updates and most of your reads are within 1
>> hour, you can probably afford to drop max_sstable_age_days.
>>
>> When you say drop, you mean reduce the value (to 1 day for example),
>> not "don't set the value", right?
>>
>> If I set max_sstable_age_days to 1, my understanding is that SSTables
>> with expired data (5 days) are never going to be compacted, and
>> therefore my disk usage will keep growing forever. Did I miss
>> something here?
>>
>> Just make sure you're doing your repairs more often than
>> max_sstable_age_days to avoid some painful compactions.
>>
>> So, if I set max_sstable_age_days to 1, I have to run repairs at least
>> once a day, correct?
>> I'm afraid I don't get your point about painful compactions.
>>
>> Along the same lines, you should probably set
>> dclocal_read_repair_chance to 0.
>>
>> Will try that.
>>
>>
>> Regarding the heap configuration, both are very similar
>>
>>
>> Probably unrelated, but is there a reason why they're not identical?
>> Especially since the different new gen sizes could have GC
>> implications.
>>
>> Both are calculated by cassandra-env.sh. If my bash skills are still
>> intact, the NewGen size difference comes from the number of cores: the
>> 64G machine has 12 cores where the 32G machine has 8 (I did not even
>> realize this before looking into it, which is why I did not mention it
>> in my previous emails).
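>>
>> For reference, the sizing logic in cassandra-env.sh is roughly the
>> following (paraphrased from memory, so treat it as a sketch): new gen
>> is 100 MB per core, capped at a quarter of the heap, which gives 800M
>> for 8 cores and 1200M for 12. The 8049M vs 8192M heap would then just
>> be a quarter of the slightly-under-32G of RAM the first machine
>> reports, vs the 8192M cap hit on the 64G one.
>>
>>     # paraphrase of the cassandra-env.sh heap/new-gen sizing;
>>     # system_memory_in_mb and system_cpu_cores are detected from the OS
>>     max_heap_mb=$((system_memory_in_mb / 4))
>>     [ "$max_heap_mb" -gt 8192 ] && max_heap_mb=8192
>>     yg_by_cores=$((100 * system_cpu_cores))   # 100 MB per core
>>     yg_by_heap=$((max_heap_mb / 4))
>>     if [ "$yg_by_heap" -gt "$yg_by_cores" ]; then
>>         HEAP_NEWSIZE="${yg_by_cores}M"        # 8 cores -> 800M, 12 -> 1200M
>>     else
>>         HEAP_NEWSIZE="${yg_by_heap}M"
>>     fi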
>>
>> Thanks a lot for your help.
>>
>> A.
>>
>>
>>
>>
>>
>> All the best,
>>
>> Sebastián Estévez
>> Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com
>>
>> On Wed, Nov 18, 2015 at 6:44 AM, Antoine Bonavita
>> <anto...@stickyads.tv> wrote:
>>
>> Sebastian, Robert,
>>
>> First, a big thank you to both of you for your help.
>>
>> It looks like you were right. I used pcstat (awesome tool, thanks for
>> that as well) and it appears some files I would not expect to be in
>> cache actually are. Here is a sample of my output (edited for
>> convenience, adding the file timestamp from the OS):
>>
>> * /var/lib/cassandra/data/views/views-451e4d8061ef11e5896f091196a360a0/la-5951-big-Data.db - 000.619 % - Nov 16 12:25
>> * /var/lib/cassandra/data/views/views-451e4d8061ef11e5896f091196a360a0/la-5954-big-Data.db - 000.681 % - Nov 16 13:44
>> * /var/lib/cassandra/data/views/views-451e4d8061ef11e5896f091196a360a0/la-5955-big-Data.db - 000.610 % - Nov 16 14:11
>> * /var/lib/cassandra/data/views/views-451e4d8061ef11e5896f091196a360a0/la-5956-big-Data.db - 015.621 % - Nov 16 14:26
>> * /var/lib/cassandra/data/views/views-451e4d8061ef11e5896f091196a360a0/la-5957-big-Data.db - 015.558 % - Nov 16 14:50
>>
>> The SSTables that come before these are all at about 0% and the ones
>> that come after are all at about 15%.
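>>
>> For anyone wanting to reproduce, the pcstat invocation was along these
>> lines (the glob is just shorthand for the files above):
>>
>>     $ pcstat /var/lib/cassandra/data/views/views-*/la-*-big-Data.db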
>>
>> As you can see, the first SSTable at 15% dates back 24h. Given my
>> application, I'm pretty sure those pages are not there because of
>> reads (reads of data older than 1h are definitely under 0.1% of all
>> reads). Could it be that compaction is putting those in cache
>> constantly?
>> If so, then I'm probably confused about the meaning/effect of
>> max_sstable_age_days (set at 10 in my case) and base_time_seconds (not
>> set in my case, so the default of 3600 applies). I would not expect
>> any compaction to happen beyond the first hour; the 10 days is there
>> to make sure data still gets expired and SSTables removed (thus
>> releasing disk space). I don't see where the 24h comes from.
>> If you guys can shed some light on this, it would be awesome. I'm sure
>> I got something wrong.
>>
>> Regarding the heap configuration, both are very similar:
>> * 32G machine: -Xms8049M -Xmx8049M -Xmn800M
>> * 64G machine: -Xms8192M -Xmx8192M -Xmn1200M
>> I think we can rule that out.
>>
>> Thanks again for your help, I truly appreciate it.
>>
>> A.
>>
>> On 11/17/2015 08:48 PM, Robert Coli wrote:
>>
>> On Tue, Nov 17, 2015 at 11:08 AM, Sebastian Estevez
>> <sebastian.este...@datastax.com> wrote:
>>
>> Your sstables are probably falling out of page cache on the smaller
>> nodes and your slow disks are killing your latencies.
>>
>>
>> +1 most likely.
>>
>> Are the heaps the same size on both machines?
>>
>> =Rob
>>
>>
>>
>>
>
> --
> Antoine Bonavita (anto...@stickyads.tv) - CTO StickyADS.tv
> Tel: +33 6 34 33 47 36/+33 9 50 68 21 32
> NEW YORK | LONDON | HAMBURG | PARIS | MONTPELLIER | MILAN | MADRID