On 25.12.2011 20:58, Peter Schuller wrote:
Read Count: 68844
[snip]
why is the reported bloom filter FP ratio not computed like this:
10 / 68844.0 = 0.00014525594096798558
Because the read count is the total number of reads to the CF, while the
bloom filter is per sstable. The number
I would go with composites because Cassandra can do better validation. Also,
with composites you have a few more options for your slice start: key
inclusive, start key exclusive, etc. If you are going to concat, tilde is a
better option than ':' because of its ASCII value.
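To see the ASCII point concretely, here is a standalone snippet (nothing
Cassandra-specific, the keys are made up): ':' is 58, which sorts between
digits and letters, while '~' is 126, which sorts after all alphanumerics,
so tilde-delimited keys never interleave with plain alphanumeric keys in
byte order.

public class SeparatorOrder {
    public static void main(String[] args) {
        System.out.println((int) ':'); // 58  -> between '9' (57) and 'A' (65)
        System.out.println((int) '~'); // 126 -> after 'z' (122)

        // With ':' as the separator, "user:1" sorts *before* "userX",
        // so a byte-ordered range over "user..." keys can interleave.
        String[] keys = {"user:1", "userX", "user~1"};
        java.util.Arrays.sort(keys);
        System.out.println(java.util.Arrays.toString(keys));
        // -> [user:1, userX, user~1]
    }
}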
On Wednesday, December 21, 2
> but the reported ratio is "Bloom Filter False Ratio: 0.00495", which is higher
> than my computed ratio of 0.000145. If that were true, the reported ratio should
> be lower than mine computed from CF reads, because there are more reads to
> sstables than to the CF.
The ratio is the ratio of false positives to
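To make the distinction concrete, here is a toy calculation. The
truePositives figure is made up to reproduce the 0.00495 above, and the
f / (f + t) formula is my assumption about how the per-sstable metric is
derived, not a quote of the actual code:

public class BloomRatioSketch {
    public static void main(String[] args) {
        // Assumed/illustrative counts of bloom filter outcomes across sstables.
        long falsePositives = 10;    // bloom said "maybe", key was absent
        long truePositives  = 2010;  // bloom said "maybe", key was present
        long cfReads        = 68844; // CF-level "Read Count" from cfstats

        // Ratio over bloom filter positives (the per-sstable view).
        double reported = (double) falsePositives / (falsePositives + truePositives);
        // Naive ratio over CF-level reads, as computed earlier in the thread.
        double naive = (double) falsePositives / cfReads;

        System.out.printf("reported ~ %.5f, naive ~ %.6f%n", reported, naive);
        // -> reported ~ 0.00495, naive ~ 0.000145
    }
}

The point is that the denominators are different populations, so the two
ratios are not comparable.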
Hello everybody.
I am a developer of a finance-related application, and I'm currently evaluating
various NoSQL databases for our current goal: storing various views which show
the state of the system in different aspects after each transaction.
The write load seems to be bigger than for a typical SQL database
my misunderstanding of the FP ratio was based on the assumption that the ratio is
counted from node start, while it is actually getRecentBloomFilterFalseRatio()
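In other words, the getRecent* variants report the ratio over the window
since the metric was last polled, then reset their counters. A toy
illustration of the reset-on-read pattern (this is not Cassandra's code,
and it is not thread-safe; sketch only):

public class RecentRatio {
    private long recentFalse, recentTrue;     // reset on each read
    private long lifetimeFalse, lifetimeTrue; // accumulate from start

    void recordFalsePositive() { recentFalse++; lifetimeFalse++; }
    void recordTruePositive()  { recentTrue++;  lifetimeTrue++;  }

    // Ratio over the window since the last call; resets the window,
    // which is the getRecent*-style behavior referred to above.
    double recentFalseRatio() {
        long f = recentFalse, t = recentTrue;
        recentFalse = recentTrue = 0;
        return (f + t) == 0 ? 0.0 : (double) f / (f + t);
    }

    // Ratio from process start; never resets.
    double lifetimeFalseRatio() {
        return (lifetimeFalse + lifetimeTrue) == 0
                ? 0.0
                : (double) lifetimeFalse / (lifetimeFalse + lifetimeTrue);
    }
}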
> I don't understand how you reached that conclusion.
On my nodes, most memory is consumed by bloom filters. Also, 1.0 creates
larger bloom filters than 0.
If 3 rows in a column family always need to be read together, is it
preferable to just merge them into 1 row using composite column names (instead
of keeping them in 3 rows)? Does this improve read performance, anyway?
If a node is low on memory (0.95+ heap used), it could do the following (see
the sketch after this list):
1. stop repair
2. stop the largest compaction
3. reduce the number of compaction slots
4. switch compaction to single-threaded
Flushing the largest memtable / reducing caches is not enough.
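A rough sketch of the kind of watchdog being proposed, assuming a 0.95 heap
threshold; the shed* methods below are hypothetical placeholders, not real
Cassandra APIs:

import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

// When heap usage crosses the threshold, progressively shed background work.
public class HeapWatchdog implements Runnable {
    private static final double THRESHOLD = 0.95;
    private final MemoryMXBean heap = ManagementFactory.getMemoryMXBean();

    @Override
    public void run() {
        MemoryUsage u = heap.getHeapMemoryUsage();
        double used = (double) u.getUsed() / u.getMax();
        if (used < THRESHOLD)
            return;
        stopRepair();                       // 1. stop repair
        stopLargestCompaction();            // 2. stop largest compaction
        reduceCompactionSlots();            // 3. reduce compaction slots
        switchToSingleThreadedCompaction(); // 4. single-threaded compaction
    }

    // Placeholders standing in for whatever hooks Cassandra would expose.
    private void stopRepair() { }
    private void stopLargestCompaction() { }
    private void reduceCompactionSlots() { }
    private void switchToSingleThreadedCompaction() { }
}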
>> I don't understand how you reached that conclusion.
>
> On my nodes, most memory is consumed by bloom filters. Also, 1.0 creates
The point is that just because that's the problem you have doesn't
mean the default is wrong, since it quite clearly depends on the use-case.
If your relative amounts of r
> If a node is low on memory (0.95+ heap used), it could do the following:
>
> 1. stop repair
> 2. stop the largest compaction
> 3. reduce the number of compaction slots
> 4. switch compaction to single-threaded
>
> Flushing the largest memtable / reducing caches is not enough.
Note that the "emergency" flushing is just a stop-gap. Yo
I suggest you describe exactly what problem you are having and why you
think stopping compaction/repair is the appropriate solution.
compacting a 41.7 GB CF with about 200 million rows adds ~600 MB to the
heap; the node logs messages like:
WARN [ScheduledTasks:1] 2011-12-27 00:20:57,972 GCInspector
> I suggest you describe exactly what problem you are having and why you
> think stopping compaction/repair is the appropriate solution.
>
> compacting a 41.7 GB CF with about 200 million rows adds ~600 MB to the heap;
> the node logs messages like:
I don't know what you are basing that on. It seems unli
I need to store data about all activities by a user's followees in a single
row. I am trying to do that by making use of composite column names in a
single user-specific row named 'rowX'.
On any activity by a user's followee on an item, a column is stored in
'rowX'. The column has a composite-type column name
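Since the message is cut off, here is a sketch of the layout I understand it
to describe, with assumed composite components (timestamp, itemId); the
TreeMap just stands in for Cassandra's on-disk ordering of composite column
names within a row:

import java.util.NavigableMap;
import java.util.TreeMap;

public class ActivityFeedSketch {
    // Stand-in for a composite column name: compares component by component,
    // the way a CompositeType comparator would.
    record ColName(long timestampMillis, String itemId)
            implements Comparable<ColName> {
        public int compareTo(ColName o) {
            int c = Long.compare(timestampMillis, o.timestampMillis);
            return c != 0 ? c : itemId.compareTo(o.itemId);
        }
    }

    public static void main(String[] args) {
        // Columns of the single user-specific row 'rowX'.
        NavigableMap<ColName, String> rowX = new TreeMap<>();
        rowX.put(new ColName(1324900000000L, "item42"), "liked");
        rowX.put(new ColName(1324900005000L, "item7"),  "commented");

        // One contiguous slice yields the whole feed in time order.
        rowX.forEach((name, value) ->
                System.out.println(name + " -> " + value));
    }
}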
I'm pleased to announce Peregrine 0.5.0 - a new map reduce framework optimized
for iterative and pipelined map reduce jobs.
http://peregrine_mapreduce.bitbucket.org/
This originally started off with some internal work at Spinn3r to build a fast
and efficient Pagerank implementation. We realized