On Tue, May 1, 2012 at 10:00 PM, Oleg Proudnikov wrote:
> There is this note regarding major compaction in the tuning guide:
>
> "once you run a major compaction, automatic minor compactions are no longer
> triggered frequently forcing you to manually run major compactions on a
> routine basis"
On Tue, May 1, 2012 at 9:06 PM, Edward Capriolo wrote:
> Also there are some tickets in JIRA to impose a max sstable size and
> some other related optimizations that I think got stuck behind levelDB
> in coolness factor. Not every use case is good for leveled so adding
> more tools and optimizatio
On Tue, May 1, 2012 at 6:07 PM, Rob Coli wrote:
>
> The primary differences, as I understand it, are that the index
> performance and bloom filter false positive rate for your One Big File
> are worse. First, you are more likely to get a bloom filter false
> positive due to the intrinsic degradat
Henrik Schröder writes:
> But what's the difference between doing an extra read from that
> One Big File and doing an extra read from whatever SSTable
> happens to be largest in the course of automatic minor compaction?
There is this note regarding major compaction in the tuning gu
+1
On Tue, May 1, 2012 at 12:06 PM, Edward Capriolo wrote:
> Also there are some tickets in JIRA to impose a max sstable size and
> some other related optimizations that I think got stuck behind levelDB
> in coolness factor. Not every use case is good for leveled so adding
> more tools and optimi
Also there are some tickets in JIRA to impose a max sstable size and
some other related optimizations that I think got stuck behind levelDB
in coolness factor. Not every use case is good for leveled so adding
more tools and optimizations of the Size Tiered tables would be
awesome.
On Tue, May 1, 2
On Tue, May 1, 2012 at 4:31 AM, Henrik Schröder wrote:
> But what's the difference between doing an extra read from that One Big
> File and doing an extra read from whatever SSTable happens to be largest in
> the course of automatic minor compaction?
The primary differences, as I understand it,
I wonder if TieredMergePolicy [1] could be used in Cassandra for compaction?
[1] http://blog.mikemccandless.com/2011/02/visualizing-lucenes-segment-merges.html
On Tue, May 1, 2012 at 6:38 AM, Edward Capriolo wrote:
> Henrik,
>
> There are use cases where major compaction works well like yours an
Henrik,
There are use cases where major compaction works well, like yours and
mine. Essentially, in cases with a high amount of churn (updates and
deletes), we get a lot of benefit from forced tombstone removal in the
form of less physical data.
However we end up with really big sstables that naturally
But what's the difference between doing an extra read from that One Big
File and doing an extra read from whatever SSTable happens to be largest
in the course of automatic minor compaction?
We have a pretty update-heavy application, and doing a major compaction can
remove up to 30% of the used di
Thank you Aaron.
That explanation cleared things up.
2012/4/30 aaron morton:
> Depends on your definition of significantly, there are a few things to
> consider.
>
> * Reading from SSTables for a request is a serial operation. Reading from 2
> SSTables will take twice as long as 1.
>
> * If the d
Depends on your definition of significantly, there are a few things to
consider.
* Reading from SSTables for a request is a serial operation. Reading from 2
SSTables will take twice as long as 1.
* If the data in the One Big File™ has been overwritten, reading it is a waste
of time. And it w
Exactly, but why would reads be significantly slower over time when
including just one more, although sometimes large, SSTable in the read?
Ji Cheng wrote 2012-04-26 11:11:
I'm also quite interested in this question. Here's my understanding of
this problem.
1. If your workload is append-only,
I'm also quite interested in this question. Here's my understanding of this
problem.
1. If your workload is append-only, doing a major compaction shouldn't
affect the read performance too much, because each row appears in one
sstable anyway.
2. If your workload is mostly updating existing rows, t