Cassandra snapshot with TTL

2024-11-20 Thread edi mari
Hi, I'm attempting to use the `snapshot` command with the `--ttl` option, but I keep encountering an error. Can you help me figure out what I might be doing wrong? I'm using Cassandra v4.1.5. *nodetool snapshot --ttl 1HOURS -t my_snapshot my_keyspace* nodetool: Invalid duratio

Re: Cassandra snapshot with TTL

2024-11-19 Thread edi mari
>> Can you help me figure out what I might be doing wrong? >> >> I'm using Cassandra v4.1.5. >> >> *nodetool snapshot --ttl 1HOURS -t my_snapshot my_keyspace* >> >> nodetool: Invalid duration: 1[HOURS] Accepted units:[NANOSECONDS, >> MICROSECONDS,

Re: Cassandra snapshot with TTL

2024-11-19 Thread guo Maxwell
.1.5. > > *nodetool snapshot --ttl 1HOURS -t my_snapshot my_keyspace* > > nodetool: Invalid duration: 1[HOURS] Accepted units:[NANOSECONDS, > MICROSECONDS, MILLISECONDS, SECONDS, MINUTES, HOURS, DAYS] where case > matters and only non-negative values. > See 'nodetool help' or 'nodetool help '. > > Thanks > Edi >
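The error above comes from the duration parser rejecting `1HOURS`. As a hedged note, Cassandra 4.1's nodetool is generally reported to accept compact duration forms such as `1h` or `2d` for `--ttl`; the sketch below is a hypothetical Python re-implementation of that style of parser (illustrative only, not Cassandra's actual code):

```python
import re

# Hypothetical parser for compact durations such as "1h" or "2d",
# the style of value nodetool's --ttl flag is reported to accept.
UNITS = {"s": 1, "m": 60, "h": 3600, "d": 86400}

def parse_duration(text: str) -> int:
    """Return the duration in seconds, or raise on input like '1HOURS'."""
    match = re.fullmatch(r"(\d+)([smhd])", text.strip())
    if not match:
        raise ValueError(f"invalid duration: {text!r}")
    value, unit = match.groups()
    return int(value) * UNITS[unit]

print(parse_duration("1h"))  # 3600
```

With a grammar like this, `1HOURS` fails exactly as in the error message, while `1h` parses cleanly.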

Re: Documentation about TTL and tombstones

2024-03-18 Thread Sebastian Marsching
you would get rid of the tombstones before running > a repair, you might have nodes that still have that data. > Then following a repair, that data will be copied to other replicas, and the > data you thought you deleted will be brought back to life. Sure, for regular data that does not have a

Re: Documentation about TTL and tombstones

2024-03-17 Thread Gil Ganz
ay. > > I thought that gc_grace_seconds is all about being able to repair the > table before tombstones are removed, so that deleted data cannot reappear. > But when the data has a TTL, it should not matter whether the original data > or the tombstone is synchronized as part of th

Re: Documentation about TTL and tombstones

2024-03-16 Thread Sebastian Marsching
thought that gc_grace_seconds is all about being able to repair the table before tombstones are removed, so that deleted data cannot reappear. But when the data has a TTL, it should not matter whether the original data or the tombstone is synchronized as part of the repair process. After all, t

Re: Documentation about TTL and tombstones

2024-03-15 Thread Gil Ganz
ar 16 8am that data will be a tombstone, and only after March 26 8am, a compaction *might* remove it, if all other conditions are met. gil On Fri, Mar 15, 2024 at 12:58 AM Sebastian Marsching < sebast...@marsching.com> wrote: > > by reading the documentation about TTL > > https
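The timeline Gil describes can be sketched with plain date arithmetic (dates reconstructed from the truncated snippet; a 10-day TTL and the 10-day default gc_grace_seconds are assumed):

```python
from datetime import datetime, timedelta

# Data written Mar 6 08:00 with a 10-day TTL becomes a tombstone on
# Mar 16 08:00; only after gc_grace_seconds (assumed at its 864000 s
# default, i.e. 10 days) elapses, on Mar 26 08:00, *might* a compaction
# actually remove it.
written = datetime(2024, 3, 6, 8, 0)
ttl = timedelta(days=10)
gc_grace = timedelta(seconds=864_000)  # 10 days, the default

expires_at = written + ttl             # data turns into a tombstone
purgeable_at = expires_at + gc_grace   # earliest point compaction may drop it
print(expires_at, purgeable_at)
```

The key point: the TTL marks the data as a tombstone, but gc_grace_seconds still gates when compaction is allowed to purge it.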

Re: Documentation about TTL and tombstones

2024-03-14 Thread Sebastian Marsching
> by reading the documentation about TTL > https://cassandra.apache.org/doc/4.1/cassandra/operating/compaction/index.html#ttl > It mentions that it creates a tombstone when data expires; how is that > possible without writing the tombstone to the table? I thought TTL >

Documentation about TTL and tombstones

2024-03-14 Thread Jean Carlo
Hello community, by reading the documentation about TTL https://cassandra.apache.org/doc/4.1/cassandra/operating/compaction/index.html#ttl It mentions that it creates a tombstone when data expires. How is that possible without writing the tombstone to the table? I thought TTL doesn't c

Re: TTL and disk space releasing

2021-10-07 Thread Michel Barret
des don't consume more data than expected! Thank you all dears for help On 06/10/2021 at 17:59, Jeff Jirsa wrote: I think this is a bit extreme. If you know that 100% of all queries that write to the table include a TTL, not having a TTL on the table is just fine. You just need to ensure t

Re: TTL and disk space releasing

2021-10-06 Thread Jeff Jirsa
I think this is a bit extreme. If you know that 100% of all queries that write to the table include a TTL, not having a TTL on the table is just fine. You just need to ensure that you always write correctly. On Wed, Oct 6, 2021 at 8:57 AM Bowen Song wrote: > TWCS without a table TTL is unlik

Re: TTL and disk space releasing

2021-10-06 Thread Bowen Song
TWCS without a table TTL is unlikely to work correctly, and adding the table TTL retrospectively alone is also unlikely to fix the existing issue. You may need to add the table default TTL and update all existing data to reflect the TTL change, and then trigger a major compaction to update the

Re: TTL and disk space releasing

2021-10-06 Thread Michel Barret
Hi, it's not set before. I set it to ensure all data have a ttl. Thanks for your help. On 06/10/2021 at 13:47, Bowen Song wrote: What is the table's default TTL? (Note: it may be different than the TTL of the data in the table) On 06/10/2021 09:42, Michel Barret wrote: Hel

Re: TTL and disk space releasing

2021-10-06 Thread Michel Barret
Thank you for your pointers. sstablemetadata seems to show that we have data without ttl (= 0). I don't know how it can appear in our system. - I replaced our per-query ttl with the default ttl. - I reduced the gc grace seconds to one day - I applied the unchecked_tombstone_compaction (on

Re: TTL and disk space releasing

2021-10-06 Thread Bowen Song
What is the table's default TTL? (Note: it may be different than the TTL of the data in the table) On 06/10/2021 09:42, Michel Barret wrote: Hello, I try to use cassandra (3.11.5) with 8 nodes (in single datacenter). I use one simple table, all data are inserted with 31 days TTL

Re: TTL and disk space releasing

2021-10-06 Thread Paul Chandler
datacenter). I use > one simple table, all data are inserted with 31 days TTL (the data are never > updated). > > I use the TWCS strategy with: > - 'compaction_window_size': '24' > - 'compaction_window_unit': 'HOURS' > - 'ma

TTL and disk space releasing

2021-10-06 Thread Michel Barret
Hello, I try to use cassandra (3.11.5) with 8 nodes (in single datacenter). I use one simple table, all data are inserted with 31 days TTL (the data are never updated). I use the TWCS strategy with: - 'compaction_window_size': '24' - 'compaction_window_unit':
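With the settings quoted above (31-day TTL, 24-hour TWCS windows), a quick back-of-the-envelope count of live windows looks like this (illustrative arithmetic only):

```python
# A 31-day TTL with 24-hour TWCS buckets keeps roughly one SSTable
# window per day on disk, i.e. about 31 live windows; fully expired
# windows still wait out gc_grace_seconds before they can be dropped,
# and data written without a TTL never expires at all.
ttl_days = 31
window_hours = 24
live_windows = ttl_days * 24 // window_hours
print(live_windows)  # 31
```

This is why the later replies focus on rows that slipped in with ttl = 0: a single non-expiring row in a window keeps that whole SSTable from being dropped.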

Re: Change of Cassandra TTL

2021-10-01 Thread Jim Shaw
. On Tue, Sep 14, 2021 at 6:10 AM raman gugnani wrote: > HI all, > > 1. I have a table with default_time_to_live = 31536000 (1 year) . We want > it to reduce the value to 7884000 (3 months). > If we alter the table , is there a way to update the existing data? > > 1. I have a

Re: Change of Cassandra TTL

2021-09-29 Thread Erick Ramirez
d it back. So I can imagine you do this and then you > re-enable TTL on it which is different. > > (1) https://github.com/instaclustr/cassandra-ttl-remover > > Regards. >

Re: Change of Cassandra TTL

2021-09-29 Thread Stefan Miklosovic
expired TTLs, removed them and voila, they had it back. So I can imagine you do this and then you re-enable TTL on it which is different. (1) https://github.com/instaclustr/cassandra-ttl-remover Regards. On Tue, 14 Sept 2021 at 16:24, raman gugnani wrote: > > Thanks Eric for the update. >

Re: TWCS on Non TTL Data

2021-09-20 Thread Bowen Song
By switching to this method, you will have two tables - the short-term storage table and the permanent storage table. The short-term table will use TWCS and have a TTL on it. The permanent table doesn't have TTL, and it will use STCS or LCS instead of TWCS. Whenever the application

RE: TWCS on Non TTL Data

2021-09-19 Thread Isaeed Mohanna
The point is that I am NOT using TTL and I need to keep the data, so when I do the switch to TWCS, will the old files be recompacted or they will remain the same and only new data coming in will use TWCS? From: Bowen Song Sent: Friday, September 17, 2021 9:04 PM To: user@cassandra.apache.org

Re: TWCS on Non TTL Data

2021-09-17 Thread Bowen Song
If you use TWCS with TTL, the old SSTables won't be compacted, the entire SSTable file will get dropped after it expires. I don't think you will need to manage the compaction or cleanup at all, as they are automatic. There's no space limit on the table holding the near-term data

RE: TWCS on Non TTL Data

2021-09-17 Thread Isaeed Mohanna
Non TTL Data You may try rolling up the data, i.e. a table keeps only 1 month of data, and old data rolls up to a table keeping a year of data. Thanks, Jim On Wed, Sep 15, 2021 at 1:26 AM Isaeed Mohanna mailto:isa...@xsense.co>> wrote: My cluster column is the time series timestamp, so basically sourceId,

Re: TWCS on Non TTL Data

2021-09-15 Thread Jim Shaw
hange the time > window right? > > > > Thanks > > > > > > *From:* Jeff Jirsa > *Sent:* Tuesday, September 14, 2021 10:35 PM > *To:* cassandra > *Subject:* Re: TWCS on Non TTL Data > > > > Inline > > > > On Tue, Sep 14, 2021 at 11:4

RE: TWCS on Non TTL Data

2021-09-14 Thread Isaeed Mohanna
week bucket, I could later change the time window right? Thanks From: Jeff Jirsa Sent: Tuesday, September 14, 2021 10:35 PM To: cassandra Subject: Re: TWCS on Non TTL Data Inline On Tue, Sep 14, 2021 at 11:47 AM Isaeed Mohanna mailto:isa...@xsense.co>> wrote: Hi Jeff My data is partitio

Re: TWCS on Non TTL Data

2021-09-14 Thread Jeff Jirsa
a lot for your help. > > > > > > > > > > > > > > > > > > *From:* Jeff Jirsa > *Sent:* Tuesday, September 14, 2021 4:51 PM > *To:* cassandra > *Subject:* Re: TWCS on Non TTL Data > > > > > > > > On Tue, Sep 14,

RE: TWCS on Non TTL Data

2021-09-14 Thread Isaeed Mohanna
Subject: Re: TWCS on Non TTL Data On Tue, Sep 14, 2021 at 5:42 AM Isaeed Mohanna mailto:isa...@xsense.co>> wrote: Hi I have a table that stores time series data, the data is not TTLed since we want to retain the data for the foreseeable future, and there are no updates or deletes. (deletes

Re: Change of Cassandra TTL

2021-09-14 Thread raman gugnani
Thanks Eric for the update. On Tue, 14 Sept 2021 at 16:50, Erick Ramirez wrote: > You'll need to write an ETL app (most common case is with Spark) to scan > through the existing data and update it with a new TTL. You'll need to make > sure that the ETL job is throttled

Re: TWCS on Non TTL Data

2021-09-14 Thread Jeff Jirsa
>this way the read will scan 10s of sstables instead of hundreds today. Does >it sound reasonable? > > 10s is better than hundreds, but it's still a lot. > >1. Is there a recommended size of a window bucket in terms of disk >space? > > When I wrote it, I wro
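A commonly cited TWCS rule of thumb (not from this thread's truncated text; the 30-window target below is an assumption) is to size the window so a full retention period spans a few dozen buckets:

```python
# Hypothetical sizing helper: pick a window, in hours, so that data
# retained for `ttl_days` spans roughly `target_windows` TWCS buckets.
# The 30-bucket default is an assumed rule of thumb, not a hard limit.
def window_hours_for(ttl_days: int, target_windows: int = 30) -> int:
    return max(1, ttl_days * 24 // target_windows)

print(window_hours_for(31))  # ~24h windows for a 31-day retention
```

Reads that stay within one or two windows then touch a handful of SSTables instead of "10s" or hundreds.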

TWCS on Non TTL Data

2021-09-14 Thread Isaeed Mohanna
Hi I have a table that stores time series data, the data is not TTLed since we want to retain the data for the foreseeable future, and there are no updates or deletes. (deletes could happens rarely in case some scrambled data reached the table, but its extremely rare). Usually we do constant wri

Re: Change of Cassandra TTL

2021-09-14 Thread Erick Ramirez
You'll need to write an ETL app (most common case is with Spark) to scan through the existing data and update it with a new TTL. You'll need to make sure that the ETL job is throttled down so it doesn't overload your production cluster. Cheers! >

Change of Cassandra TTL

2021-09-14 Thread raman gugnani
HI all, 1. I have a table with default_time_to_live = 31536000 (1 year). We want to reduce the value to 7884000 (3 months). If we alter the table, is there a way to update the existing data? 1. I have a table without TTL we want to add TTL = 7884000 (3 months) on the table. If we alter
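The replies in this thread point at re-writing existing rows with a new TTL via an ETL job. The per-row arithmetic such a job would apply is sketched below (a hedged illustration; `NEW_TTL` and the helper name are mine, and this is not driver API code):

```python
# Shrinking the TTL from 1 year to 3 months: each surviving row is
# re-written with the time it has left under the new policy; rows whose
# age already exceeds the new TTL get 0, i.e. they should simply be
# deleted or allowed to expire.
NEW_TTL = 7_884_000  # 3 months in seconds, from the question above

def remaining_ttl(write_time_secs: float, now_secs: float) -> int:
    age = int(now_secs - write_time_secs)
    return max(NEW_TTL - age, 0)

print(remaining_ttl(0, 1_000_000))  # 6884000: 3 months minus ~11.5 days of age
```

In practice the write timestamp would come from `writetime()` on an existing column, and the job should be throttled as Erick suggests below.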

Re: various TTL datas in one table (TWCS)

2020-10-28 Thread Jeff Jirsa
ombstone_compaction': true, 'tombstone_threshold' : 0.05, > 'tombstone_compaction_interval' : 21600 } > AND gc_grace_seconds = 600 > > Apache Cassandra Version 3.11.4 > > >> On 2020-10-29 12:26, Jeff Jirsa wrote: >> >> Works but requires

Re: various TTL datas in one table (TWCS)

2020-10-28 Thread Eunsu Kim
URS', 'compaction_window_size' : 12, 'unchecked_tombstone_compaction': true, 'tombstone_threshold' : 0.05, 'tombstone_compaction_interval' : 21600 } AND gc_grace_seconds = 600 Apache Cassandra Version 3.11.4 > On 2020-10-29 12:26, Jeff Jirsa wrote: > > Works but requ

Re: various TTL datas in one table (TWCS)

2020-10-28 Thread Jeff Jirsa
Works but requires you to enable tombstone compaction subproperties if you need to purge the 2w ttl data before the highest ttl time you chose > On Oct 28, 2020, at 5:58 PM, Eunsu Kim wrote: > > Hello, > > I have a table with a default TTL(2w). I'm using TWCS(window s

various TTL datas in one table (TWCS)

2020-10-28 Thread Eunsu Kim
Hello, I have a table with a default TTL(2w). I'm using TWCS(window size : 12h) on the recommendation of experts. This table is quite big, high WPS. I would like to insert data different TTL from the default in this table according to the type of data. About four different TTLs (4w, 6

data model for TWCS+TTL

2020-06-04 Thread Arvinder Dhillon
Hi everyone, In our use-case, we need to insert 200 million rows per day. By default we need to retain data for 10 days unless a certain condition is matched from the client within the same day (in that case we need to update ONE column and set ttl to 1 day). In 98% of cases we will find that match and 2

Re: TTL on UDT

2019-12-09 Thread Carl Mueller
ion, thus > applying ttl() on them makes sense. I'm not sure however if the CQL parser > allows this syntax > > On Mon, Dec 9, 2019 at 9:13 PM Carl Mueller > wrote: > >> I could be wrong, but UDTs I think are written (and overwritten) as one >> unit, so the notion of a

Re: TTL on UDT

2019-12-09 Thread DuyHai Doan
It depends on.. Latest version of Cassandra allows unfrozen UDT. The individual fields of UDT are updated atomically and they are stored effectively in distinct physical columns inside the partition, thus applying ttl() on them makes sense. I'm not sure however if the CQL parser allows this s

Re: TTL on UDT

2019-12-09 Thread Carl Mueller
I could be wrong, but UDTs I think are written (and overwritten) as one unit, so the notion of a TTL on a UDT field doesn't exist, the TTL is applied to the overall structure. Think of it like a serialized json object with multiple fields. To update a field they deserialize the json,

TTL on UDT

2019-12-03 Thread Mark Furlong
When I run the command 'select ttl(udt_field) from table;' I'm getting an error 'InvalidRequest: Error from server: code=2200 [Invalid query] message="Cannot use selection function ttl on collections"'. How can I get the TTL from a UDT field? Mark Furlong

Re: How to query TTL on collections ?

2019-06-21 Thread Maxim Parkachov
oment I have two (actually more) "normalised" tables, which have data as separate columns. But the use case, actually, requires to read all items every time product is queried, thus move to collection to reduce amount of queries. Moreover, items in collection is append only and expires usi

Re: How to query TTL on collections ?

2019-06-20 Thread Alain RODRIGUEZ
20: {csn: 200, name: 'item200'}} ``` Furthermore, you cannot query the TTL for a single item in a collection, and as distinct columns can have distinct TTLs, you cannot query the TTL for the whole map (collection). As you cannot get the TTL for the whole thing, nor query a single item of

How to query TTL on collections ?

2019-06-19 Thread Maxim Parkachov
Hi everyone, I'm struggling to understand how can I query TTL on the row in collection ( Cassandra 3.11.4 ). Here is my schema: CREATE TYPE item ( csn bigint, name text ); CREATE TABLE products ( product_id bigint PRIMARY KEY, items map> ); And I'm creating records with

RE: [EXTERNAL] Re: Default TTL on CF

2019-03-14 Thread Durity, Sean R
I spent a month of my life on similar problem... There wasn't an easy answer, but this is what I did #1 - Stop the problem from growing further. Get new inserts using a TTL (or set the default on the table so they get it). App team had to do this one. #2 - Delete any data that should al

Re: Default TTL on CF

2019-03-14 Thread Nick Hatfield
te it >>> >>> >>> -- >>> Jeff Jirsa >>> >>> >>>> On Mar 14, 2019, at 1:16 PM, Nick Hatfield >>>> >>>> wrote: >>>> >>>> Hello, >>>> >>>> Can anyone tell m

Re: Default TTL on CF

2019-03-14 Thread Jeff Jirsa
only impacts newly written data >> >> If you need to change the expiration time on existing data, you must >> update it >> >> >> -- >> Jeff Jirsa >> >> >>> On Mar 14, 2019, at 1:16 PM, Nick Hatfield >>> wrote: >>&

Re: Default TTL on CF

2019-03-14 Thread Nick Hatfield
Nick Hatfield >>wrote: >> >> Hello, >> >> Can anyone tell me if setting a default TTL will affect existing data? >>I would like to enable a default TTL and have cassandra add that TTL to >>any rows that don't currently have a TTL set. >> >> T

Re: Default TTL on CF

2019-03-14 Thread Jeff Jirsa
ote: > > Hello, > > Can anyone tell me if setting a default TTL will affect existing data? I > would like to enable a default TTL and have cassandra add that TTL to any > rows that don’t currently have a TTL set. > > Thanks, -

Default TTL on CF

2019-03-14 Thread Nick Hatfield
Hello, Can anyone tell me if setting a default TTL will affect existing data? I would like to enable a default TTL and have cassandra add that TTL to any rows that don’t currently have a TTL set. Thanks,

Re: TTL documentation

2019-02-01 Thread Jeff Jirsa
Those aren’t the project docs, they’re datastax’s docs, but that line makes no sense. I assume they meant that once a column reaches its TTL it is treated as a tombstone. That’s per column and not the entire table. -- Jeff Jirsa > On Feb 1, 2019, at 1:47 AM, Enrico Cavallin wrote: >

TTL documentation

2019-02-01 Thread Enrico Cavallin
Hi all, I cannot understand what this statement means: <> in https://docs.datastax.com/en/dse/6.7/cql/cql/cql_using/useExpire.html I have already done some tests with TTL set on columns, rows and default on table and all seems in line with the logic: an already written row/value maintai

Re: High CPU usage on reading single row with Set column with short TTL

2019-01-28 Thread Jeff Jirsa
t; Hello, >> >> We have noticed CPU usage spike after several minutes of consistent load >> when querying: >> - a single column of set type (same partition key) >> - relatively frequently (couple hundred times per second, for comparison, we >> do an or

Re: High CPU usage on reading single row with Set column with short TTL

2019-01-28 Thread Jonathan Haddad
ady with much bigger payloads) > - with the elements in the set having a very short TTL ( single digit > seconds) and several inserts per second > - gc_grace set to 0 (that should remove hints and should prevent > tombstones) > - reads and writes are using local quorum consistency

High CPU usage on reading single row with Set column with short TTL

2019-01-28 Thread Tom Wollert
) - with the elements in the set having a very short TTL ( single digit seconds) and several inserts per second - gc_grace set to 0 (that should remove hints and should prevent tombstones) - reads and writes are using local quorum consistency - replication factor of 3 (on 4 node setup) I am struggling

Re: TTL tombstones in Cassandra using LCS are created in the same level as TTLed data?

2018-10-04 Thread Gabriel Giussi
At 14:11, Alain RODRIGUEZ () wrote: > Hello Gabriel, > > Another clue to explore would be to use the TTL as a default value if >> that's a good fit. TTLs set at the table level with 'default_time_to_live' >> should not generate any tombstone at all in C*3

Re: TTL tombstones in Cassandra using LCS are created in the same level as TTLed data?

2018-09-27 Thread Alain RODRIGUEZ
Hello Gabriel, Another clue to explore would be to use the TTL as a default value if > that's a good fit. TTLs set at the table level with 'default_time_to_live' > should not generate any tombstone at all in C*3.0+. Not tested on my hand, > but I read about this. >

TTL tombstones in Cassandra using LCS are created in the same level as TTLed data?

2018-09-25 Thread Gabriel Giussi
I'm using LCS and a relatively large TTL of 2 years for all inserted rows and I'm concerned about the moment at which C* would drop the corresponding tombstones (neither explicit deletes nor updates are being performed). >From [Missing Manual for Leveled Compaction Str

Re: Too many tombstones using TTL

2018-09-07 Thread Charulata Sharma (charshar)
Thanks, Charu From: Python_Max Reply-To: "user@cassandra.apache.org" Date: Tuesday, January 16, 2018 at 7:26 AM To: "user@cassandra.apache.org" Subject: Re: Too many tombstones using TTL Thanks for a very helpful reply. Will try to refactor the code accordingly. On Tu

Re: default_time_to_live vs TTL on insert statement

2018-07-12 Thread Nitan Kainth
Okay, so it means a regular update and any TTL set at write time overrides the default setting. Which means the datastax documentation is incorrect and should be updated. Sent from my iPhone > On Jul 12, 2018, at 9:35 AM, DuyHai Doan wrote: > > To set TTL on a column only and not on the whole CQL

Re: default_time_to_live vs TTL on insert statement

2018-07-12 Thread DuyHai Doan
To set TTL on a column only and not on the whole CQL row, use UPDATE instead: UPDATE <table> USING TTL xxx SET <column> = <value> WHERE partition=yyy On Thu, Jul 12, 2018 at 2:42 PM, Nitan Kainth wrote: > Kurt, > > It is same mentioned on apache documentation too, I am not able to find it > right now

Re: default_time_to_live vs TTL on insert statement

2018-07-12 Thread Nitan Kainth
Kurt, It is same mentioned on apache documentation too, I am not able to find it right now. But my question is: How to set TTL for a whole column? On Wed, Jul 11, 2018 at 11:36 PM, kurt greaves wrote: > The Datastax documentation is wrong. It won't error, and it shouldn't. If >

Re: default_time_to_live vs TTL on insert statement

2018-07-11 Thread kurt greaves
gt; statement: > > You can set a default TTL for an entire table by setting the table's > default_time_to_live > <https://docs.datastax.com/en/cql/3.3/cql/cql_reference/cqlCreateTable.html#tabProp__cqlTableDefaultTTL> > property. If you try to set a TTL for a specific colu

Re: default_time_to_live vs TTL on insert statement

2018-07-11 Thread Nitan Kainth
Hi DuyHai, Could you please explain in what case C* will error based on documented statement: You can set a default TTL for an entire table by setting the table's default_time_to_live <https://docs.datastax.com/en/cql/3.3/cql/cql_reference/cqlCreateTable.html#tabProp__cqlTableDe

Re: default_time_to_live vs TTL on insert statement

2018-07-11 Thread DuyHai Doan
default_time_to_live <https://docs.datastax.com/en/cql/3.3/cql/cql_reference/cqlCreateTable.html#tabProp__cqlTableDefaultTTL> property applies if you don't specify any TTL on your CQL statement However you can always override the default_time_to_live <https://docs.datastax.com/

default_time_to_live vs TTL on insert statement

2018-07-11 Thread Nitan Kainth
Hi, As per document: https://docs.datastax.com/en/cql/3.3/cql/cql_using/useExpireExample.html - You can set a default TTL for an entire table by setting the table's default_time_to_live <https://docs.datastax.com/en/cql/3.3/cql/cql_reference/cqlCreateTa

adding a non-used column just to debug ttl

2018-07-07 Thread onmstester onmstester
Hi, Because of "Cannot use selection function ttl on PRIMARY KEY part type", I'm adding a boolean column to a table with no non-primary key columns; I'm just worried that someday I would need to debug ttl! Is this a right approach? Anyone else doing this? Sent using Zoho Mail

CDC and TTL

2018-06-18 Thread Joy Gao
Hi all! I recently started to look into Cassandra CDC implementation. One question that occurred to me is how/if TTL is handled for CDC. For example, If I insert some data with TTL enabled and expiring in 60 seconds, will CDC be aware of these changes 60 seconds later when the TTL expired? If not

Re: estimated number of keys vs ttl

2018-05-23 Thread Eric Stevens
I believe that key estimates won't immediately respond to expired TTL, not until after compaction has completely dropped the records (which will include subtle logic related to gc_grace and partitions with data in multiple SSTables). On Wed, May 23, 2018 at 6:18 AM Rahul Singh wrote: >

Re: estimated number of keys vs ttl

2018-05-23 Thread Rahul Singh
If the TTL actually reduces the key count, it should. It's possible to TTL a row from a partition but not the whole partition. 1 key = 1 partition != 1 row != 1 cell -- Rahul Singh rahul.si...@anant.us Anant Corporation On May 23, 2018, 6:07 AM -0500, Grzegorz Pietrusza, wrote: > Hi >

estimated number of keys vs ttl

2018-05-23 Thread Grzegorz Pietrusza
Hi I'm using tablestats to get estimated number of partitioning keys. In my case all writes are done with TTL of a few days. Is the key count decreased when TTL hits? Regards Grzegorz

Re: multiple tables vs. partitions and TTL

2018-02-01 Thread Alain RODRIGUEZ
> > I have a design issue here: > We want to store bigger amounts of data (> 30mio rows containing blobs) > which will be deleted depending on the type > of data on a monthly base (not in the same order as the data entered the > system). > Some data would survive for two month onl

Re: multiple tables vs. partitions and TTL

2018-02-01 Thread James Shaw
3-5 years. > > The choice now is to have one table only with TTL per partition and > partitions per deletion month (when the data should be deleted) > which will allow a single delete command, followed by a compaction > or alternatively to have multiple tables (one per month whe

multiple tables vs. partitions and TTL

2018-02-01 Thread Marcus Haarmann
her data for 3-5 years. The choice now is to have one table only with TTL per partition and partitions per deletion month (when the data should be deleted) which will allow a single delete command, followed by a compaction or alternatively to have multiple tables (one per month when the delet

Re: Too many tombstones using TTL

2018-01-16 Thread Python_Max
at 2:12 PM Python_Max wrote: > >> Hello. >> >> I was planning to remove a row (not partition). >> >> Most of the tombstones are seen in the use case of geographic grid with >> X:Y as partition key and object id (timeuuid) as clustering key where >> obj

Re: Too many tombstones using TTL

2018-01-16 Thread Alexander Dejanovski
e space on disk. On Tue, Jan 16, 2018 at 2:12 PM Python_Max wrote: > Hello. > > I was planning to remove a row (not partition). > > Most of the tombstones are seen in the use case of geographic grid with > X:Y as partition key and object id (timeuuid) as clustering key where >

Re: Too many tombstones using TTL

2018-01-16 Thread Python_Max
Hello. I was planning to remove a row (not partition). Most of the tombstones are seen in the use case of geographic grid with X:Y as partition key and object id (timeuuid) as clustering key where objects could be temporary with TTL about 10 hours or fully persistent. When I select all objects

Re: Too many tombstones using TTL

2018-01-16 Thread Alexander Dejanovski
d from disk nor exchanged between replicas. But that's of course if your use case allows to delete full partitions. We usually model so that we can restrict our reads to live data. If you're creating time series, your clustering key should include a timestamp, which you can use to avoid reading

Re: Too many tombstones using TTL

2018-01-16 Thread Python_Max
d theoretically be set for different > cells of the same row. And one TTLed cell could be shadowing another cell > that has no TTL (say you forgot to set a TTL and set one afterwards by > performing an update), or vice versa. > One cell could also be missing from a node without Cassand

Re: Too many tombstones using TTL

2018-01-12 Thread Alexander Dejanovski
Hi, As DuyHai said, different TTLs could theoretically be set for different cells of the same row. And one TTLed cell could be shadowing another cell that has no TTL (say you forgot to set a TTL and set one afterwards by performing an update), or vice versa. One cell could also be missing from a

Re: Too many tombstones using TTL

2018-01-12 Thread Python_Max
Thank you for response. I know about the option of setting TTL per column or even per item in collection. However in my example entire row has expired, shouldn't Cassandra be able to detect this situation and spawn a single tombstone for entire row instead of many? Is there any reason not

Re: Too many tombstones using TTL

2018-01-11 Thread kurt greaves
You should be able to avoid querying the tombstones if it's time series data. Using TWCS just make sure you don't query data that you know is expired (assuming you have the time component in your clustering key)​.

Re: Too many tombstones using TTL

2018-01-10 Thread DuyHai Doan
"The question is why Cassandra creates a tombstone for every column instead of single tombstone per row?" --> Simply because technically it is possible to set different TTL value on each column of a CQL row On Wed, Jan 10, 2018 at 2:59 PM, Python_Max wrote: > Hello, C* users an

Too many tombstones using TTL

2018-01-10 Thread Python_Max
tems(a text, b text, c1 text, c2 text, c3 text, primary key (a, b)); cqlsh> insert into items(a,b,c1,c2,c3) values('AAA', 'BBB', 'C111', 'C222', 'C333') using ttl 60; bash$ nodetool flush bash$ sleep 60 bash$ nodetool compact test_ttl items bash$ ss

Re: Global TTL vs Insert TTL

2017-02-02 Thread Romain Hardouin
Default TTL is nice to provide information on tables for ops guys. I mean we know that data in such tables are ephemeral at a glance. On Wednesday 1 February 2017 21:47, Carlos Rolo wrote: Awesome to know this! Thanks Jon and DuyHai! Regards, Carlos Juzarte Rolo, Cassandra Consultant

Re: Global TTL vs Insert TTL

2017-02-01 Thread Carlos Rolo
/in/carlosjuzarterolo>* Mobile: +351 918 918 100 www.pythian.com On Wed, Feb 1, 2017 at 6:57 PM, Jonathan Haddad wrote: > The optimization is there. The entire sstable can be dropped but it's not > because of the default TTL. The default TTL only applies if a TTL isn't > specified

Re: Global TTL vs Insert TTL

2017-02-01 Thread Jonathan Haddad
The optimization is there. The entire sstable can be dropped but it's not because of the default TTL. The default TTL only applies if a TTL isn't specified explicitly. The default TTL can't be used to drop a table automatically since it can be overridden at insert time. Check o

Re: Global TTL vs Insert TTL

2017-02-01 Thread DuyHai Doan
: > This is incorrect, there's no optimization used that references the table > level TTL setting. The max local deletion time is stored in table > metadata. See > org.apache.cassandra.io.sstable.metadata.StatsMetadata#maxLocalDeletionTime > in the Cassandra 3.0 branch.The d

Re: Global TTL vs Insert TTL

2017-02-01 Thread Jonathan Haddad
This is incorrect, there's no optimization used that references the table level TTL setting. The max local deletion time is stored in table metadata. See org.apache.cassandra.io.sstable.metadata.StatsMetadata#maxLocalDeletionTime in the Cassandra 3.0 branch.The default ttl is stored

Re: Global TTL vs Insert TTL

2017-02-01 Thread DuyHai Doan
Global TTL is better than dynamic runtime TTL. Why? Because Global TTL is a table property and Cassandra can perform optimization when compacting. For example if it can see that the maxTimestamp of an SSTable is older than the table Global TTL, the SSTable can be entirely dropped during

Re: Global TTL vs Insert TTL

2017-02-01 Thread Cogumelos Maravilha
ed disk >> space? >> >> There is a very recent topic about that in the mailing list, >> look for "Time series data model and tombstones". I believe >> DuyHai answer your question there with more details :). >> >> *tl;dr:* &

Re: Global TTL vs Insert TTL

2017-02-01 Thread Carlos Rolo
hanks in advance. >> >> On 31-01-2017 11:27, Alain RODRIGUEZ wrote: >> >> Is there a overhead using line by line option or wasted disk space? >>> >>> There is a very recent topic about that in the mailing list, look for "Time >> series data mode

Re: Global TTL vs Insert TTL

2017-02-01 Thread Alain RODRIGUEZ
by line option or wasted disk space? >> >> There is a very recent topic about that in the mailing list, look for "Time > series data model and tombstones". I believe DuyHai answer your question > there with more details :). > > *tl;dr:* > > Yes, if you know

Re: Global TTL vs Insert TTL

2017-01-31 Thread Cogumelos Maravilha
isk space? > > There is a very recent topic about that in the mailing list, look for > "Time series data model and tombstones". I believe DuyHai answer your > question there with more details :). > > *tl;dr:* > > Yes, if you know the TTL in advance, and it is fixe

Re: Global TTL vs Insert TTL

2017-01-31 Thread Alain RODRIGUEZ
> > Is there a overhead using line by line option or wasted disk space? > > There is a very recent topic about that in the mailing list, look for "Time series data model and tombstones". I believe DuyHai answer your question there with more details :). *tl;dr:* Yes,

Global TTL vs Insert TTL

2017-01-31 Thread Cogumelos Maravilha
Hi I'm just wondering what option is fastest: Global: create table xxx (... AND default_time_to_live = XXX; and UPDATE xxx USING TTL XXX; Line by line: INSERT INTO xxx (... USING TTL xxx; Is there an overhead using the line by line option or wasted disk

Re: Insert with both TTL and timestamp behavior

2016-12-30 Thread Jeff Jirsa
Your last sentence is correct - TWCS and dtcs add meaning (date/timestamp) to the long writetime that the rest of Cassandra ignores. If you're trying to backload data, you'll need to calculate the TTL yourself per write like you calculate the writetime. The TTL behavior doesn'

Re: Insert with both TTL and timestamp behavior

2016-12-28 Thread DuyHai Doan
Indeed, the TTL is computed based on the LOCAL timestamp of the server and not based on the PROVIDED timestamp by the client ... (according to Mastering Apache Cassandra, 2nd edition, Nishant Neeraj, Packt Publishing) On Wed, Dec 28, 2016 at 10:15 PM, Voytek Jarnot wrote: > >It's not clea

Re: Insert with both TTL and timestamp behavior

2016-12-28 Thread Voytek Jarnot
>It's not clear to me why for your use case you would want to manipulate the timestamps as you're loading the records unless you're concerned about conflicting writes getting applied in the correct order. Simple use-case: want to load historical data, want to use TWCS, want to
