Re: leveled compaction and tombstoned data

2012-11-11 Thread Radim Kolar
> I would be careful with the patch that was referred to above, it hasn't been reviewed, and from a glance it appears that it will cause an infinite compaction loop if you get more than 4 SSTables at max size.

It will; you need to set up the max sstable size correctly.

Re: leveled compaction and tombstoned data

2012-11-11 Thread Sylvain Lebresne
On Sat, Nov 10, 2012 at 7:17 PM, Edward Capriolo wrote: > No it does not exist. Rob and I might start a donation page and give > the money to whoever is willing to code it. If someone would write a > tool that would split an sstable into 4 smaller sstables (even an > offline command line tool) S

Re: leveled compaction and tombstoned data

2012-11-10 Thread Edward Capriolo
No it does not exist. Rob and I might start a donation page and give the money to whoever is willing to code it. If someone would write a tool that would split an sstable into 4 smaller sstables (even an offline command line tool) I would paypal them a hundo. On Sat, Nov 10, 2012 at 1:10 PM, Aaron

Re: leveled compaction and tombstoned data

2012-11-10 Thread Aaron Turner
Nope. I think at least once a week I hear someone suggest one way to solve their problem is to "write an sstablesplit tool". I'm pretty sure that: Step 1. Write sstablesplit Step 2. ??? Step 3. Profit! On Sat, Nov 10, 2012 at 9:40 AM, Alain RODRIGUEZ wrote: > @Rob Coli > > Does the "sstable

Re: leveled compaction and tombstoned data

2012-11-10 Thread Alain RODRIGUEZ
@Rob Coli Does the "sstablesplit" function exist somewhere? 2012/11/10 Jim Cistaro > For some of our clusters, we have taken the periodic major compaction > route. > > There are a few things to consider: > 1) Once you start major compacting, depending on data size, you may be > committed to

Re: leveled compaction and tombstoned data

2012-11-10 Thread Jim Cistaro
For some of our clusters, we have taken the periodic major compaction route. There are a few things to consider: 1) Once you start major compacting, depending on data size, you may be committed to doing it periodically because you create one big file that will take forever to naturally compact aga

Re: leveled compaction and tombstoned data

2012-11-09 Thread Rob Coli
On Thu, Nov 8, 2012 at 10:12 AM, B. Todd Burruss wrote: > my question is would leveled compaction help to get rid of the tombstoned > data faster than size tiered, and therefore reduce the disk space usage? You could also... 1) run a major compaction 2) code up sstablesplit 3) profit! This meth

Re: leveled compaction and tombstoned data

2012-11-09 Thread Ben Coverston
The rules for tombstone eviction are as follows (regardless of your compaction strategy): 1. gc_grace must be expired, and 2. No other row fragments can exist for the row that aren't also participating in the compaction. For LCS, there is no 'rule' that the tombstones can only be evicted at the h
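
The two eviction rules above can be sketched as a small check. This is a simplified model, not Cassandra's implementation: the `Tombstone` class, the `can_evict` function, and the representation of SSTables as a mapping from name to a set of row keys are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tombstone:
    row_key: str
    deleted_at: int  # seconds since epoch when the delete was written


def can_evict(tombstone, now, gc_grace_seconds, compacting, all_sstables):
    """Return True if the tombstone may be dropped by this compaction.

    compacting   -- set of SSTable names taking part in the compaction
    all_sstables -- dict mapping every SSTable name to the set of row
                    keys it contains (an assumption made for this sketch)
    """
    # Rule 1: gc_grace must be expired.
    if now < tombstone.deleted_at + gc_grace_seconds:
        return False
    # Rule 2: no fragment of the row may live in an SSTable that is
    # not also participating in the compaction.
    for name, row_keys in all_sstables.items():
        if name not in compacting and tombstone.row_key in row_keys:
            return False
    return True


sstables = {"a": {"k1"}, "b": {"k1", "k2"}, "c": {"k3"}}
old_delete = Tombstone(row_key="k1", deleted_at=0)
```

With these sample values, compacting only `a` cannot evict the tombstone for `k1` because `b` still holds a fragment of that row, while compacting `a` and `b` together can, provided gc_grace has passed.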

Re: leveled compaction and tombstoned data

2012-11-09 Thread Mina Naguib
On 2012-11-08, at 1:12 PM, B. Todd Burruss wrote: > we are having the problem where we have huge SSTABLEs with tombstoned data in > them that is not being compacted soon enough (because size tiered compaction > requires, by default, 4 like sized SSTABLEs). this is using more disk space > th

Re: leveled compaction and tombstoned data

2012-11-08 Thread B. Todd Burruss
@ben, thx, we will be deploying 2.2.1 of DSE soon and will try to setup a traffic sampling node so we can test leveled compaction. we essentially keep a rolling window of data written once. it is written, then after N days it is deleted, so it seems that leveled compaction should help On Thu, No

Re: leveled compaction and tombstoned data

2012-11-08 Thread B. Todd Burruss
thanks for the links! i had forgotten about live sampling On Thu, Nov 8, 2012 at 11:41 AM, Brandon Williams wrote: > On Thu, Nov 8, 2012 at 1:33 PM, Aaron Turner wrote: >> There are also ways to bring up a test node and just run Level Compaction on >> that. Wish I had a URL handy, but hopefull

Re: leveled compaction and tombstoned data

2012-11-08 Thread Ben Coverston
Also to answer your question, LCS is well suited to workloads where overwrites and tombstones come into play. The tombstones are _much_ more likely to be merged with LCS than STCS. I would be careful with the patch that was referred to above, it hasn't been reviewed, and from a glance it appears t

Re: leveled compaction and tombstoned data

2012-11-08 Thread Brandon Williams
On Thu, Nov 8, 2012 at 1:33 PM, Aaron Turner wrote: > There are also ways to bring up a test node and just run Level Compaction on > that. Wish I had a URL handy, but hopefully someone else can find it. This rather handsome fellow wrote a blog about it: http://www.datastax.com/dev/blog/whats-new

Re: leveled compaction and tombstoned data

2012-11-08 Thread Ben Coverston
http://www.datastax.com/docs/1.1/operations/tuning#testing-compaction-and-compression Write Survey mode. After you have it up and running you can modify the column family mbean to use LeveledCompactionStrategy on that node to see how your hardware/load fares with LCS. On Thu, Nov 8, 2012 at 11:

Re: leveled compaction and tombstoned data

2012-11-08 Thread Jeremy Hanna
LCS works well in specific circumstances, this blog post gives some good considerations: http://www.datastax.com/dev/blog/when-to-use-leveled-compaction On Nov 8, 2012, at 1:33 PM, Aaron Turner wrote: > "kill performance" is relative. Leveled Compaction basically costs 2x disk > IO. Look at

Re: leveled compaction and tombstoned data

2012-11-08 Thread Aaron Turner
"kill performance" is relative. Leveled Compaction basically costs 2x disk IO. Look at iostat, etc and see if you have the headroom. There are also ways to bring up a test node and just run Level Compaction on that. Wish I had a URL handy, but hopefully someone else can find it. Also, if you'r

Re: leveled compaction and tombstoned data

2012-11-08 Thread B. Todd Burruss
we are running Datastax enterprise and cannot patch it. how bad is "kill performance"? if it is so bad, why is it an option? On Thu, Nov 8, 2012 at 10:17 AM, Radim Kolar wrote: > On 8.11.2012 19:12, B. Todd Burruss wrote: > >> my question is would leveled compaction help to get rid of the

Re: leveled compaction and tombstoned data

2012-11-08 Thread Radim Kolar
On 8.11.2012 19:12, B. Todd Burruss wrote: > my question is would leveled compaction help to get rid of the tombstoned data faster than size tiered, and therefore reduce the disk space usage? leveled compaction will kill your performance. get patch from jira for maximum sstable size per CF

leveled compaction and tombstoned data

2012-11-08 Thread B. Todd Burruss
we are having the problem where we have huge SSTABLEs with tombstoned data in them that is not being compacted soon enough (because size tiered compaction requires, by default, 4 like sized SSTABLEs). this is using more disk space than we anticipated. we are very write heavy compared to reads, an
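
The size-tiered behavior described in this post (compaction waits for 4 like-sized SSTables) can be illustrated with a rough sketch. The function names are hypothetical, and the 0.5x to 1.5x "like sized" window is an assumption made for this example, not something stated in the thread; only the default threshold of 4 comes from the post.

```python
def bucket_by_size(sizes, low=0.5, high=1.5):
    """Group SSTable sizes into buckets of similar size.

    A size joins the first bucket whose current average it falls within
    (low * avg .. high * avg); otherwise it starts a new bucket.
    """
    buckets = []
    for size in sorted(sizes):
        for bucket in buckets:
            avg = sum(bucket) / len(bucket)
            if low * avg <= size <= high * avg:
                bucket.append(size)
                break
        else:
            buckets.append([size])
    return buckets


def compaction_candidates(sizes, min_threshold=4):
    """Return only the buckets with enough like-sized SSTables to compact."""
    return [b for b in bucket_by_size(sizes) if len(b) >= min_threshold]


# Four small SSTables qualify; the single huge one waits indefinitely.
candidates = compaction_candidates([10, 11, 12, 9, 1000])
```

In this model a very large SSTable is only compacted once three more of similar size exist, which is why tombstones inside it can linger and hold disk space.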