Awesome to know this! Thanks Jon and DuyHai!
Regards,

Carlos Juzarte Rolo
Cassandra Consultant / Datastax Certified Architect / Cassandra MVP

Pythian - Love your data

rolo@pythian | Twitter: @cjrolo | Skype: cjr2k3 | Linkedin: linkedin.com/in/carlosjuzarterolo <http://linkedin.com/in/carlosjuzarterolo>
Mobile: +351 918 918 100
www.pythian.com

On Wed, Feb 1, 2017 at 6:57 PM, Jonathan Haddad <j...@jonhaddad.com> wrote:

> The optimization is there. The entire sstable can be dropped but it's not
> because of the default TTL. The default TTL only applies if a TTL isn't
> specified explicitly. The default TTL can't be used to drop an sstable
> automatically since it can be overridden at insert time. Check out this
> example. The first insert uses the default TTL. The second insert
> overrides the default. Using the default TTL to drop the sstable would be
> pretty terrible in this case:
>
> CREATE TABLE test.b (
>     k int PRIMARY KEY,
>     v int
> ) WITH default_time_to_live = 10000;
>
> insert into b (k, v) values (1, 1);
> cqlsh:test> select k, v, TTL(v) from b where k = 1;
>
>  k | v | ttl(v)
> ---+---+--------
>  1 | 1 |   9943
>
> (1 rows)
>
> cqlsh:test> insert into b (k, v) values (2, 1) USING TTL 99999999;
> cqlsh:test> select k, v, TTL(v) from b where k = 2;
>
>  k | v | ttl(v)
> ---+---+----------
>  2 | 1 | 99999995
>
> (1 rows)
>
> TL;DR: The default TTL is there as a convenience so you don't have to keep
> the TTL in your code. From a performance perspective it does not matter.
>
> Jon
>
>
> On Wed, Feb 1, 2017 at 10:39 AM DuyHai Doan <doanduy...@gmail.com> wrote:
>
>> I was referring to this JIRA
>> https://issues.apache.org/jira/browse/CASSANDRA-3974
>> when talking about dropping an entire SSTable at compaction time
>>
>> But the JIRA is pretty old and it is very possible that the optimization
>> is no longer there
>>
>>
>>
>> On Wed, Feb 1, 2017 at 6:53 PM, Jonathan Haddad <j...@jonhaddad.com>
>> wrote:
>>
>> This is incorrect, there's no optimization used that references the table
>> level TTL setting.
>> The max local deletion time is stored in sstable
>> metadata. See
>> org.apache.cassandra.io.sstable.metadata.StatsMetadata#maxLocalDeletionTime
>> in the Cassandra 3.0 branch. The default ttl is stored
>> here: org.apache.cassandra.schema.TableParams#defaultTimeToLive and is
>> never referenced during compaction.
>>
>> Here's an example from a table I created without a default TTL, you can
>> use the sstablemetadata tool to see:
>>
>> jhaddad@rustyrazorblade ~/dev/cassandra/data/data/test$ ../../../tools/bin/sstablemetadata a-7bca6b50e8a511e6869a5596edf4dd35/mc-1-big-Data.db
>> .....
>> SSTable max local deletion time: 1485980862
>>
>> On Wed, Feb 1, 2017 at 6:59 AM DuyHai Doan <doanduy...@gmail.com> wrote:
>>
>> Global TTL is better than dynamic runtime TTL
>>
>> Why?
>>
>> Because Global TTL is a table property and Cassandra can perform
>> optimization when compacting.
>>
>> For example if it can see that the maxTimestamp of an SSTable is older
>> than the table Global TTL, the SSTable can be entirely dropped during
>> compaction
>>
>> Using dynamic TTL at runtime, since Cassandra doesn't know and cannot
>> track each individual TTL value, the previous optimization is not possible
>> (even if you always use the SAME TTL for all queries, Cassandra is not
>> supposed to know that)
>>
>>
>>
>> On Wed, Feb 1, 2017 at 3:01 PM, Cogumelos Maravilha <
>> cogumelosmaravi...@sapo.pt> wrote:
>>
>> Thank you all, for your answers.
>>
>> On 02/01/2017 01:06 PM, Carlos Rolo wrote:
>>
>> To reinforce Alain's statement:
>>
>> "I would say that the unsafe part is more about using C* 3.9" this is
>> key. You would be better on 3.0.x unless you need features on the 3.x
>> series.
>>
>> Regards,
>>
>> Carlos Juzarte Rolo
>>
>> On Wed, Feb 1, 2017 at 8:32 AM, Alain RODRIGUEZ <arodr...@gmail.com>
>> wrote:
>>
>> Is it safe to use TWCS in C* 3.9?
>>
>>
>> I would say that the unsafe part is more about using C* 3.9 than using
>> TWCS in C*3.9 :-). I see no reason to say TWCS would be specifically unsafe
>> in C*3.9, but I might be missing something.
>>
>> Going from STCS to TWCS is often smooth; from LCS you might expect extra
>> load from compacting a lot (all?) of the SSTables, from what we saw in the
>> field. In this case, be sure that your compaction options are safe enough
>> to handle this.
>>
>> TWCS is even easier to use on C*3.0.8+ and C*3.8+ as it is now bundled by
>> default, replacing DTCS, so no extra jar is needed; you can enable TWCS like
>> any other built-in compaction strategy.
>>
>> C*heers,
>> -----------------------
>> Alain Rodriguez - @arodream - al...@thelastpickle.com
>> France
>>
>> The Last Pickle - Apache Cassandra Consulting
>> http://www.thelastpickle.com
>>
>> 2017-01-31 23:29 GMT+01:00 Cogumelos Maravilha <
>> cogumelosmaravi...@sapo.pt>:
>>
>> Hi Alain,
>>
>> Thanks for your response and the links.
>>
>> I've also checked "Time series data model and tombstones".
>>
>> Is it safe to use TWCS in C* 3.9?
>>
>> Thanks in advance.
>>
>> On 31-01-2017 11:27, Alain RODRIGUEZ wrote:
>>
>> Is there an overhead using line by line option or wasted disk space?
>>
>> There is a very recent topic about that in the mailing list, look for "Time
>> series data model and tombstones". I believe DuyHai answered your question
>> there with more details :).
>>
>> *tl;dr:*
>>
>> Yes, if you know the TTL in advance, and it is fixed, you might want to
>> go with the table option instead of adding the TTL in each insert. Also you
>> might want to consider using the TWCS compaction strategy.
>>
>> Here are some blog posts my coworkers recently wrote about TWCS, they might
>> be useful:
>>
>> http://thelastpickle.com/blog/2016/12/08/TWCS-part1.html
>> http://thelastpickle.com/blog/2017/01/10/twcs-part2.html
>>
>> C*heers,
>> Alain Rodriguez
>>
>> 2017-01-31 10:43 GMT+01:00 Cogumelos Maravilha <
>> cogumelosmaravi...@sapo.pt>:
>>
>> Hi I'm just wondering what option is fastest:
>>
>> Global: create table xxx (..... AND default_time_to_live = XXX;
>> and UPDATE xxx USING TTL XXX;
>>
>> Line by line:
>> INSERT INTO xxx (... USING TTL xxx;
>>
>> Is there an overhead using line by line option or wasted disk space?
>>
>> Thanks in advance.
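
For anyone reading the archive later, here is a rough sketch of the point Jon makes in this thread: whether a whole sstable can be dropped depends on the largest per-cell expiry recorded for it, not on the table's default TTL. This is a simplified Python model, not Cassandra's code. The names `Cell`, `local_deletion_time`, `max_local_deletion_time` and `fully_expired` are made up for illustration, and the `gc_grace_seconds` check is an assumption about how the expiry comparison is done. Real compaction also has to consider overlapping sstables, which this ignores.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Cell:
    """Hypothetical stand-in for one written cell."""
    write_time: int      # seconds since epoch when the cell was written
    ttl: Optional[int]   # explicit USING TTL value, or None if not given


def local_deletion_time(cell: Cell, default_ttl: int) -> Optional[int]:
    """Expiry time of one cell: an explicit TTL wins, else the table default."""
    ttl = cell.ttl if cell.ttl is not None else default_ttl
    if not ttl:
        return None  # a TTL of 0 means no expiry in this model
    return cell.write_time + ttl


def max_local_deletion_time(cells: List[Cell], default_ttl: int) -> Optional[int]:
    """Largest expiry time over all cells, or None if any cell never expires."""
    times = [local_deletion_time(c, default_ttl) for c in cells]
    if any(t is None for t in times):
        return None
    return max(times)


def fully_expired(cells: List[Cell], default_ttl: int,
                  now: int, gc_grace_seconds: int) -> bool:
    """True if every cell expired more than gc_grace_seconds ago (assumed rule)."""
    latest = max_local_deletion_time(cells, default_ttl)
    return latest is not None and latest < now - gc_grace_seconds


# Jon's example: default TTL 10000, second insert overrides it with 99999999.
cells = [Cell(write_time=0, ttl=None), Cell(write_time=0, ttl=99999999)]
print(max_local_deletion_time(cells, default_ttl=10000))
print(fully_expired(cells, default_ttl=10000, now=20000, gc_grace_seconds=0))
```

In this model the single overriding insert keeps the sstable alive long after the default TTL has passed, which is why the table-level setting alone cannot decide when an sstable is safe to drop.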