Re: Help diagnosing performance issue

Sebastian Estevez Wed, 18 Nov 2015 07:23:15 -0800

Yep, I think you've mixed up your DTCS levers. I would read, or re-read,
Marcus's post: http://www.datastax.com/dev/blog/datetieredcompactionstrategy


*TL;DR:*

   - *base_time_seconds* is the size of your initial window
   - *max_sstable_age_days* is the time after which you stop compacting
   sstables
   - *default_time_to_live* is the time after which data expires and
   sstables will start to become available for GC. (432000 is 5 days)
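
To put the three settings on one scale, here is a small Python sketch that converts the values mentioned in this thread into days. The 3600 and 10 come from Antoine's message and the 432000 from the note above; the `settings` dictionary and the `in_days` helper are illustrative names only, not anything Cassandra provides.

```python
from datetime import timedelta

# Values taken from this thread. base_time_seconds is the default that
# Antoine says applies because he did not set it explicitly.
settings = {
    "base_time_seconds": timedelta(seconds=3600),
    "max_sstable_age_days": timedelta(days=10),
    "default_time_to_live": timedelta(seconds=432000),
}


def in_days(delta):
    """Return a timedelta as a (possibly fractional) number of days."""
    return delta.total_seconds() / 86400


for name, delta in settings.items():
    print(f"{name}: {in_days(delta):.3f} days")
```

Seen this way, the TTL (5 days) is shorter than the compaction cutoff (10 days), so with these settings sstables can remain compaction candidates after their data has expired.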


> Could it be that compaction is putting those in cache constantly?


Yep, you'll keep compacting sstables until they're 10 days old per your
current settings, and every compaction reads the sstables and then writes
new ones.
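
As a rough illustration of why compaction reaches well past the first hour, here is a simplified Python model of how DTCS window sizes grow. It is a sketch of the idea described in Marcus's post, not Cassandra's actual code, and it assumes `min_threshold` is 4 (an assumption; the thread does not state the value in use). The function name is hypothetical.

```python
SECONDS_PER_DAY = 86400


def window_sizes_seconds(base_time_seconds, max_sstable_age_days, min_threshold=4):
    """List window sizes, smallest first, that fit under the age cutoff.

    Simplified model: each tier is min_threshold times larger than the
    previous one, starting from base_time_seconds.
    """
    max_age = max_sstable_age_days * SECONDS_PER_DAY
    sizes = []
    size = base_time_seconds
    while size <= max_age:
        sizes.append(size)
        size *= min_threshold
    return sizes


# Settings from this thread: default base window, 10-day cutoff.
for size in window_sizes_seconds(3600, 10):
    print(f"window of {size // 3600} hour(s)")
```

Under this model the windows are 1, 4, 16 and 64 hours, so an sstable that is about a day old can still sit in a window that gets compacted.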



If you aren't doing any updates and most of your reads are within 1 hour,
you can probably afford to lower max_sstable_age_days. Just make sure you
run your repairs more often than max_sstable_age_days to avoid some
painful compactions.

Along the same lines, you should probably set dclocal_read_repair_chance to
0.
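
A minimal sketch of what the two changes above could look like as a CQL statement, built here in Python so the values are easy to vary. The keyspace and table names (`views.views`) are inferred from the sstable paths in Antoine's message, and the `max_sstable_age_days` value of 1 is only a placeholder, since no specific number was recommended. The helper name is hypothetical.

```python
def build_alter_table(keyspace, table, max_sstable_age_days, base_time_seconds=3600):
    """Build an ALTER TABLE statement for the DTCS and read-repair settings."""
    compaction = {
        "class": "DateTieredCompactionStrategy",
        "base_time_seconds": str(base_time_seconds),
        "max_sstable_age_days": str(max_sstable_age_days),
    }
    # CQL map literal: {'key': 'value', ...}
    options = ", ".join(f"'{key}': '{value}'" for key, value in compaction.items())
    return (
        f"ALTER TABLE {keyspace}.{table} "
        f"WITH compaction = {{{options}}} "
        f"AND dclocal_read_repair_chance = 0;"
    )


# Placeholder value of 1 day; pick the real number from your read pattern
# and repair schedule.
print(build_alter_table("views", "views", max_sstable_age_days=1))
```

The compaction map is written out in full here because any sub-options you already rely on would need to be restated alongside the new value.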





> Regarding the heap configuration, both are very similar


Probably unrelated, but is there a reason why they're not identical?
The different new gen sizes in particular could have GC implications.
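
To make the difference concrete, here is a short Python sketch that parses the two flag strings from Antoine's message and reports the new gen size as a share of the heap. The parsing helper is hypothetical and only handles the `M` suffix used in those flags.

```python
import re


def parse_heap_flags(flags):
    """Extract -Xms/-Xmx/-Xmn values (in megabytes) from a JVM flag string."""
    sizes = {}
    for name, value in re.findall(r"-(Xms|Xmx|Xmn)(\d+)M", flags):
        sizes[name] = int(value)
    return sizes


# Flag strings as quoted in the message below.
machines = {
    "32G machine": "-Xms8049M -Xmx8049M -Xmn800M",
    "64G machine": "-Xms8192M -Xmx8192M -Xmn1200M",
}

for label, flags in machines.items():
    sizes = parse_heap_flags(flags)
    ratio = sizes["Xmn"] / sizes["Xmx"]
    print(f"{label}: new gen is {ratio:.1%} of the heap")
```

The total heaps are within 2% of each other, but the new gen on the 64G machine is 50% larger (1200M against 800M).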




All the best,



Sebastián Estévez

Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com



On Wed, Nov 18, 2015 at 6:44 AM, Antoine Bonavita <anto...@stickyads.tv>
wrote:

> Sebastian, Robert,
>
> First, a big thank you to both of you for your help.
>
> It looks like you were right. I used pcstat (awesome tool, thanks for that
> as well) and it appears some files I would not expect to be in cache
> actually are. Here is a sample of my output (edited for convenience, adding
> the file timestamp from the OS):
>
> * /var/lib/cassandra/data/views/views-451e4d8061ef11e5896f091196a360a0/la-5951-big-Data.db - 000.619 % - Nov 16 12:25
> * /var/lib/cassandra/data/views/views-451e4d8061ef11e5896f091196a360a0/la-5954-big-Data.db - 000.681 % - Nov 16 13:44
> * /var/lib/cassandra/data/views/views-451e4d8061ef11e5896f091196a360a0/la-5955-big-Data.db - 000.610 % - Nov 16 14:11
> * /var/lib/cassandra/data/views/views-451e4d8061ef11e5896f091196a360a0/la-5956-big-Data.db - 015.621 % - Nov 16 14:26
> * /var/lib/cassandra/data/views/views-451e4d8061ef11e5896f091196a360a0/la-5957-big-Data.db - 015.558 % - Nov 16 14:50
>
> The SSTables that come before are all at about 0% and the ones that come
> after it are all at about 15%.
>
> As you can see, the first SSTable at 15% dates back 24h. Given my
> application, I'm pretty sure those are not from the reads (reads of data
> older than 1h are definitely under 0.1% of reads). Could it be that
> compaction is putting those in cache constantly?
> If so, then I'm probably confused on the meaning/effect of
> max_sstable_age_days (set at 10 in my case) and base_time_seconds (not set
> in my case so the default of 3600 applies). I would not expect any
> compaction to happen beyond the first hour and the 10 days is here to make
> sure data still gets expired and SSTables removed (thus releasing disk
> space). I don't see where the 24h comes from.
> If you guys can shed some light on this, it would be awesome. I'm sure I
> got something wrong.
>
> Regarding the heap configuration, both are very similar:
> * 32G machine: -Xms8049M -Xmx8049M -Xmn800M
> * 64G machine: -Xms8192M -Xmx8192M -Xmn1200M
> I think we can rule that out.
>
> Thanks again for your help, I truly appreciate it.
>
> A.
>
> On 11/17/2015 08:48 PM, Robert Coli wrote:
>
>> On Tue, Nov 17, 2015 at 11:08 AM, Sebastian Estevez
>> <sebastian.este...@datastax.com <mailto:sebastian.este...@datastax.com>>
>> wrote:
>>
>>     Your sstables are probably falling out of page cache on the
>>     smaller nodes and your slow disks are killing your latencies.
>>
>>
>> +1 most likely.
>>
>> Are the heaps the same size on both machines?
>>
>> =Rob
>>
>
> --
> Antoine Bonavita (anto...@stickyads.tv) - CTO StickyADS.tv
> Tel: +33 6 34 33 47 36/+33 9 50 68 21 32
> NEW YORK | LONDON | HAMBURG | PARIS | MONTPELLIER | MILAN | MADRID
>
