Re: Global TTL vs Insert TTL

Romain Hardouin Thu, 02 Feb 2017 04:39:56 -0800

A default TTL is also a nice way to document tables for ops people: at a 
glance, we know that the data in such tables is ephemeral.


    On Wednesday, 1 February 2017 at 21:47, Carlos Rolo <r...@pythian.com> wrote:
 

 Awesome to know this!

Thanks Jon and DuyHai!

Regards,

Carlos Juzarte Rolo
Cassandra Consultant / Datastax Certified Architect / Cassandra MVP
Pythian - Love your data
rolo@pythian | Twitter: @cjrolo | Skype: cjr2k3 | Linkedin: linkedin.com/in/carlosjuzarterolo
Mobile: +351 918 918 100
www.pythian.com

On Wed, Feb 1, 2017 at 6:57 PM, Jonathan Haddad <j...@jonhaddad.com> wrote:

The optimization is there: an entire sstable can be dropped, but not because 
of the default TTL.  The default TTL only applies if a TTL isn't specified 
explicitly.  It can't be used to drop an sstable automatically, since it can 
be overridden at insert time.  Check out this example.  The first insert uses 
the default TTL.  The second insert overrides the default.  Using the default 
TTL to drop the sstable would be pretty terrible in this case:

CREATE TABLE test.b (
    k int PRIMARY KEY,
    v int
) WITH default_time_to_live = 10000;

insert into b (k, v) values (1, 1);
cqlsh:test> select k, v, TTL(v) from b where k = 1;

 k | v | ttl(v)
---+---+--------
 1 | 1 |   9943

(1 rows)

cqlsh:test> insert into b (k, v) values (2, 1) USING TTL 99999999;
cqlsh:test> select k, v, TTL(v) from b where k = 2;

 k | v | ttl(v)
---+---+----------
 2 | 1 | 99999995

(1 rows)

TL;DR: The default TTL is there as a convenience so you don't have to keep the 
TTL in your code.  From a performance perspective it does not matter.
Jon
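
The point of the example above can be restated as a small model. This is only a sketch in Python, not Cassandra code: `effective_ttl` is a hypothetical helper name, and the write time of 0 is an arbitrary assumption chosen to keep the arithmetic readable. It shows why an sstable holding both rows could not safely be dropped based on the table's default TTL alone.

```python
from typing import Optional

def effective_ttl(default_ttl: int, insert_ttl: Optional[int] = None) -> int:
    """Return the TTL a write ends up with (hypothetical helper).

    An explicit USING TTL wins when present; otherwise the table's
    default_time_to_live applies.
    """
    return default_ttl if insert_ttl is None else insert_ttl

# Values taken from the cqlsh session above.
DEFAULT_TTL = 10000
write_time = 0  # assumed write time in seconds, for illustration only

# Row k=1 relies on the default; row k=2 overrides it at insert time.
rows = {1: None, 2: 99999999}
expirations = {k: write_time + effective_ttl(DEFAULT_TTL, ttl)
               for k, ttl in rows.items()}

# Dropping the sstable at write_time + DEFAULT_TTL would discard row 2
# long before its own TTL has elapsed.
naive_drop_time = write_time + DEFAULT_TTL
rows_lost = [k for k, exp in expirations.items() if exp > naive_drop_time]
```

In this model `rows_lost` contains only the row written with the explicit TTL, which is exactly the row that would be destroyed too early.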

On Wed, Feb 1, 2017 at 10:39 AM DuyHai Doan <doanduy...@gmail.com> wrote:

I was referring to this JIRA, 
https://issues.apache.org/jira/browse/CASSANDRA-3974, when talking about 
dropping an entire SSTable at compaction time.
But the JIRA is pretty old, and it is quite possible that the optimization is 
no longer there.



On Wed, Feb 1, 2017 at 6:53 PM, Jonathan Haddad <j...@jonhaddad.com> wrote:

This is incorrect: there's no optimization that references the table-level 
TTL setting.  The max local deletion time is stored in the sstable metadata.  
See org.apache.cassandra.io.sstable.metadata.StatsMetadata#maxLocalDeletionTime 
in the Cassandra 3.0 branch.  The default TTL is stored here: 
org.apache.cassandra.schema.TableParams#defaultTimeToLive and is never 
referenced during compaction.
Here's an example from a table I created without a default TTL; you can use 
the sstablemetadata tool to see it:

jhaddad@rustyrazorblade ~/dev/cassandra/data/data/test$ ../../../tools/bin/sstablemetadata a-7bca6b50e8a511e6869a5596edf4dd35/mc-1-big-Data.db
.....
SSTable max local deletion time: 1485980862

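
To make the role of the max local deletion time concrete, here is a simplified model in Python. It is a sketch under stated assumptions, not Cassandra's implementation: the `Cell` class, the function names, and the `NO_EXPIRY` sentinel are all hypothetical, and the real check also has to consider overlapping sstables, which this model ignores. The idea it illustrates is that the per-sstable maximum is derived from the TTL each cell actually carries, so the table-level default never needs to be consulted.

```python
from dataclasses import dataclass
from typing import Iterable, Optional

# Stand-in for "this cell never expires" (assumption for this sketch).
NO_EXPIRY = 2**31 - 1

@dataclass(frozen=True)
class Cell:
    write_time: int       # seconds since epoch when the cell was written
    ttl: Optional[int]    # TTL stored with the cell, None if it has none

def local_deletion_time(cell: Cell) -> int:
    """Time at which a single cell expires."""
    return NO_EXPIRY if cell.ttl is None else cell.write_time + cell.ttl

def max_local_deletion_time(cells: Iterable[Cell]) -> int:
    """Latest expiry over all cells, tracked once per sstable."""
    return max(local_deletion_time(c) for c in cells)

def is_fully_expired(cells: Iterable[Cell], now: int,
                     gc_grace_seconds: int = 0) -> bool:
    """True when every cell expired before now - gc_grace_seconds."""
    return max_local_deletion_time(cells) < now - gc_grace_seconds

# Two cells written at the same time with different TTLs.
sstable = [Cell(write_time=1000, ttl=100), Cell(write_time=1000, ttl=500)]
```

With these values the sstable only becomes droppable once the longest-lived cell has expired, regardless of what any table-level default might be.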
On Wed, Feb 1, 2017 at 6:59 AM DuyHai Doan <doanduy...@gmail.com> wrote:

Global TTL is better than dynamic runtime TTL.
Why?
 Because Global TTL is a table property, and Cassandra can perform 
optimizations when compacting.
For example, if it can see that the maxTimestamp of an SSTable is older than 
the table Global TTL, the SSTable can be entirely dropped during compaction.
With dynamic TTL at runtime, since Cassandra doesn't know and cannot track 
each individual TTL value, the previous optimization is not possible (even if 
you always use the SAME TTL for all queries, Cassandra is not supposed to know 
that).


On Wed, Feb 1, 2017 at 3:01 PM, Cogumelos Maravilha 
<cogumelosmaravi...@sapo.pt> wrote:

  Thank you all, for your answers.
  
 On 02/01/2017 01:06 PM, Carlos Rolo wrote:
  
 To reinforce Alain's statement:
 
 "I would say that the unsafe part is more about using C* 3.9": this is key. 
You would be better off on 3.0.x unless you need features from the 3.x series.
 
 Regards,
 Carlos Juzarte Rolo
 On Wed, Feb 1, 2017 at 8:32 AM, Alain RODRIGUEZ <arodr...@gmail.com> wrote:
 
  
Is it safe to use TWCS in C* 3.9?
 
  I would say that the unsafe part is more about using C* 3.9 than using TWCS 
in C* 3.9 :-). I see no reason to say TWCS would be specifically unsafe in 
C* 3.9, but I might be missing something. 
  Going from STCS to TWCS is often smooth. Coming from LCS, from what we saw 
in the field, you might expect extra load from compacting a lot (all?) of the 
SSTables. In this case, be sure that your compaction options are safe enough 
to handle this. 
  TWCS is even easier to use on C* 3.0.8+ and C* 3.8+, as it now ships with 
Cassandra as the replacement for DTCS, so no extra jar is needed: you can 
enable TWCS like any other built-in compaction strategy.  
  C*heers,
  -----------------------
  Alain Rodriguez - @arodream - al...@thelastpickle.com
  France
  The Last Pickle - Apache Cassandra Consulting
  http://www.thelastpickle.com
 
 2017-01-31 23:29 GMT+01:00 Cogumelos Maravilha <cogumelosmaravi...@sapo.pt>:
 
  Hi Alain, Thanks for your response and the links. I've also checked "Time 
series data model and tombstones".  Is it safe to use TWCS in C* 3.9? Thanks in 
advance.
    
 On 31-01-2017 11:27, Alain RODRIGUEZ wrote:
  
  
 Is there an overhead using the line by line option, or wasted disk space? 
  There is a very recent topic about that on the mailing list; look for "Time 
series data model and tombstones". I believe DuyHai answers your question 
there in more detail :).
  
  tl;dr: 
  Yes, if you know the TTL in advance, and it is fixed, you might want to go 
with the table option instead of adding the TTL in each insert. You might also 
want to consider using the TWCS compaction strategy. 
  Here are some blog posts my coworkers recently wrote about TWCS that might 
be useful: 
  http://thelastpickle.com/blog/2016/12/08/TWCS-part1.html
  http://thelastpickle.com/blog/2017/01/10/twcs-part2.html
 
 C*heers,
 Alain Rodriguez
 
   
 2017-01-31 10:43 GMT+01:00 Cogumelos Maravilha <cogumelosmaravi...@sapo.pt>:
 
  Hi,

  I'm just wondering which option is fastest.

Global:

create table xxx (.....
AND default_time_to_live = XXX;

and

UPDATE xxx USING TTL XXX;

Line by line:

INSERT INTO xxx (...
 USING TTL xxx;

Is there an overhead using the line by line option, or wasted disk space?

Thanks in advance.
    
  
  
 
    
     
    
 
 







