On Mon, May 14, 2018 at 11:04 AM, Lucas Benevides < lu...@maurobenevides.com.br> wrote:
> Thank you, Jeff Jirsa, for your comments.
>
> How can we do this: "fix this by not scheduling the major compaction
> until we know all of the sstables in the window are available to be
> compacted"?

Would require a change to TWCS itself. Right here where we grab
not-currently-compacting sstables (
https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategy.java#L110
), we'd also grab the compacting set, and if the candidate sstables for the
task overlapped the same window as the compacting sstables (respecting the
repaired/unrepaired/pending-repaired sets), then we'd skip compacting until
the previous compactions finished (a rough sketch of that check is at the
end of this message).

> About the column-family schema, I had to customize the cassandra-stress
> tool so that it could create a reasonable number of rows per partition.
> In the default behavior it keeps creating repeated clustering keys for
> each partition, so most data gets updated instead of inserted.

A similar customization may be useful to create partitions that are
narrowly bucketed into fixed-size time windows, which is a common and
typical schema in IoT use cases (see the second sketch at the end of this
message).

- Jeff

> Lucas B. Dias
>
> 2018-05-14 14:03 GMT-03:00 Jeff Jirsa <jji...@gmail.com>:
>
>> Interesting!
>>
>> I suspect I know what's causing the increased disk usage in TWCS, and
>> it's a solvable problem. The problem is roughly this:
>> - Window 1 has sstables 1, 2, 3, 4, 5, 6
>> - We start compacting 1, 2, 3, 4 (using STCS-in-TWCS in the first window)
>> - The TWCS window rolls over
>> - We flush (sstable 7) and trigger the TWCS window major compaction,
>>   which starts compacting 5, 6, 7 + any other sstable from that window
>> - If the first compaction (1, 2, 3, 4) has finished by the time sstable
>>   7 is flushed, we'll include its result in that compaction; if it
>>   hasn't, we'll have to do the major compaction twice to guarantee we
>>   have exactly one sstable per window, which will temporarily increase
>>   disk space
>>
>> We can likely fix this by not scheduling the major compaction until we
>> know all of the sstables in the window are available to be compacted.
>>
>> Also, your data model is probably typical, but not well suited for time
>> series cases - if you find my 2016 Cassandra Summit TWCS talk (it's on
>> YouTube), I mention aligning partition keys to TWCS windows, which
>> involves adding a second component to the partition key. This is hugely
>> important for making sure TWCS data expires quickly and for avoiding
>> having to read from more than one TWCS window at a time.
>>
>> - Jeff
>>
>> On Mon, May 14, 2018 at 7:12 AM, Lucas Benevides <
>> lu...@maurobenevides.com.br> wrote:
>>
>>> Dear community,
>>>
>>> I want to tell you about my paper, published at a conference in March.
>>> The title is "NoSQL Database Performance Tuning for IoT Data -
>>> Cassandra Case Study" and it is available (not for free) at
>>> http://www.scitepress.org/DigitalLibrary/Link.aspx?doi=10.5220/0006782702770284
>>>
>>> TWCS is used and compared with DTCS.
>>>
>>> I hope you can download it; unfortunately I cannot send copies, as the
>>> publisher holds the copyright.
>>>
>>> Lucas B. Dias
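
For illustration, here is a rough sketch of the scheduling check described
above. The WindowGuard and SSTable names are hypothetical stand-ins, not
Cassandra's actual internal compaction classes, and a window is identified
simply by flooring an sstable's maximum timestamp to the window size:

    import java.util.Collection;
    import java.util.concurrent.TimeUnit;

    // Hypothetical sketch only: WindowGuard and SSTable are stand-ins and do
    // not correspond to Cassandra's internal compaction classes.
    final class WindowGuard
    {
        interface SSTable
        {
            long maxTimestampMillis();
        }

        private final long windowSizeMillis;

        WindowGuard(long windowSize, TimeUnit unit)
        {
            this.windowSizeMillis = unit.toMillis(windowSize);
        }

        // Identify a window by flooring the sstable's max timestamp to the
        // window size.
        private long windowFor(SSTable sstable)
        {
            return sstable.maxTimestampMillis() / windowSizeMillis;
        }

        // Only schedule the per-window major compaction when no
        // currently-compacting sstable falls in the same window as any
        // candidate, so the compaction sees every sstable of that window at once.
        boolean canScheduleMajor(Collection<SSTable> candidates, Collection<SSTable> compacting)
        {
            for (SSTable candidate : candidates)
            {
                long window = windowFor(candidate);
                for (SSTable busy : compacting)
                {
                    if (windowFor(busy) == window)
                        return false; // wait for the in-flight compaction to finish first
                }
            }
            return true;
        }
    }

A real patch would also have to respect the repaired/unrepaired/pending-repair
sets mentioned above; this only shows the window-overlap test.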
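
And a minimal sketch of the second partition-key component mentioned above,
assuming a fixed one-day TWCS window. The TimeBucket class, the one-day
window, and the sensor_id/bucket/event_time names are all made up for
illustration:

    import java.time.Duration;
    import java.time.Instant;

    // Hypothetical sketch: derives a "bucket" value aligned to a fixed TWCS
    // window (one day here), to be written as a second partition-key
    // component, e.g. in a table keyed PRIMARY KEY ((sensor_id, bucket), event_time).
    public final class TimeBucket
    {
        private static final Duration WINDOW = Duration.ofDays(1);

        // Floor the event time to the window size; every row written with the
        // same bucket lands in the same partition and the same TWCS window.
        static long bucketFor(Instant eventTime)
        {
            return eventTime.toEpochMilli() / WINDOW.toMillis();
        }

        public static void main(String[] args)
        {
            System.out.printf("sensor_id=%s bucket=%d%n", "sensor-42", bucketFor(Instant.now()));
        }
    }

The idea is that each partition then lives entirely inside one TWCS window,
so data expires with its window and a read never has to straddle windows -
which is the point made above about aligning partition keys to TWCS windows.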