Re: reported bloom filter FP ratio

2011-12-26 Thread Radim Kolar
On 25.12.2011 20:58, Peter Schuller wrote: Read Count: 68844 [snip] Why is the reported bloom filter FP ratio not computed like this: 10/68844.0 = 0.00014525594096798558? Because the read count is the total number of reads to the CF, while the bloom filter is per sstable. The number
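As a quick check, the figure quoted in the question can be reproduced in Python. This is only a sketch of the arithmetic from the message; it assumes the 10 is a false-positive count, and the variable names are illustrative rather than anything Cassandra reports.

```python
# Reproduce the ratio from the question: 10 divided by the CF-level read count.
# Assumption: the 10 is a false-positive count; the snippet does not say so explicitly.
false_positives = 10
cf_read_count = 68844

naive_ratio = false_positives / float(cf_read_count)
print(naive_ratio)
```

As the reply points out, this divides by reads to the CF, whereas bloom filters are consulted per sstable, so the two numbers are not directly comparable.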

Re: Doubts related to composite type column names/values

2011-12-26 Thread Edward Capriolo
I would go with composites because Cassandra can do better validation. Also, with composites you have a few more options for your slice start: key inclusive, start key exclusive, etc. If you are going to concatenate, tilde is a better option than ':' because of its ASCII value. On Wednesday, December 21, 2
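One plausible reading of the ASCII remark can be illustrated with a small Python sketch. The sample pairs are made up for illustration, and Python tuple and string sorting stand in for a composite comparator and a bytewise comparator respectively; this is not Cassandra code.

```python
# Illustrative (first, second) component pairs; not taken from the thread.
pairs = [("ab", "1"), ("a", "2"), ("a1", "x")]

def joined(sep):
    """Concatenate each pair with `sep` and sort the resulting strings."""
    return sorted(first + sep + second for first, second in pairs)

# ASCII codes: digits 0x30-0x39, ':' 0x3A, letters 0x41-0x7A, '~' 0x7E.
by_component = sorted(pairs)   # component-wise comparison, like a composite
by_colon = joined(":")
by_tilde = joined("~")

print(by_component)  # [('a', '2'), ('a1', 'x'), ('ab', '1')]
print(by_colon)      # ['a1:x', 'a:2', 'ab:1']
print(by_tilde)      # ['a1~x', 'ab~1', 'a~2']
```

With ':' the short key "a" lands between "a1" and "ab", because ':' sorts after digits but before letters. With '~' it sorts after every alphanumeric continuation, which is at least consistent. Only the component-wise comparison keeps the first components in their natural order.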

Re: reported bloom filter FP ratio

2011-12-26 Thread Peter Schuller
> but reported ratio is Bloom Filter False Ratio: 0.00495, which is higher
> than my computed ratio 0.000145. If you were right, the reported ratio should
> be lower than mine computed from CF reads, because there are more reads to
> sstables than to the CF.
The ratio is the ratio of false positives to

Newbie question about writer/reader consistency

2011-12-26 Thread Vladimir Mosgalin
Hello everybody. I am a developer of a finance-related application, and I'm currently evaluating various NoSQL databases for our current goal: storing various views which show the state of the system in different aspects after each transaction. The write load seems to be bigger than typical SQL databas

Re: reported bloom filter FP ratio

2011-12-26 Thread Radim Kolar
My misunderstanding of the FP ratio was based on the assumption that the ratio is counted from node start, while it is actually getRecentBloomFilterFalseRatio().
> I don't understand how you reached that conclusion.
On my nodes most memory is consumed by bloom filters. Also 1.0 creates larger bloom filters than 0.

Merging 3 rows that are mostly read together from CF into single rows with composite col names ?

2011-12-26 Thread Asil Klin
If 3 rows in a column family 'always' need to be read together, is it preferable to merge them into 1 row using composite column names (instead of keeping them in 3 rows)? Does this improve read performance at all?

better anti OOM

2011-12-26 Thread Radim Kolar
If a node is low on memory (0.95+ of heap used), it can:
1. stop repair
2. stop the largest compaction
3. reduce the number of compaction slots
4. switch compaction to single-threaded
Flushing the largest memtable or reducing caches is not enough.
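The proposal can be sketched in Python as a simple escalation policy. This is a hypothetical illustration only: none of these names are Cassandra APIs, the actions are plain strings, and applying them one at a time in the listed order is an assumption the message does not state.

```python
# Hypothetical sketch of the escalation proposed above. None of these names
# are Cassandra APIs; the actions are plain strings for illustration.
HEAP_THRESHOLD = 0.95  # "0.95+ heap used" from the proposal

ACTIONS = [
    "stop repair",
    "stop largest compaction",
    "reduce number of compaction slots",
    "switch compaction to single threaded",
]

def next_action(heap_used_fraction, already_taken):
    """Return the next proposed action, or None if the heap is below the
    threshold or every action has already been taken."""
    if heap_used_fraction < HEAP_THRESHOLD:
        return None
    for action in ACTIONS:
        if action not in already_taken:
            return action
    return None

# Simulated heap readings (made-up values): only the two readings at or
# above the threshold trigger an action.
taken = []
for heap in (0.90, 0.96, 0.97, 0.93):
    action = next_action(heap, taken)
    if action is not None:
        taken.append(action)
print(taken)
```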

Re: reported bloom filter FP ratio

2011-12-26 Thread Peter Schuller
>> I don't understand how you reached that conclusion.
>
> On my nodes most memory is consumed by bloom filters. Also 1.0 creates
The point is that just because that's the problem you have doesn't mean the default is wrong, since it quite clearly depends on use-case. If your relative amounts of r

Re: better anti OOM

2011-12-26 Thread Peter Schuller
> If node is low on memory 0.95+ heap used it can do:
>
> 1. stop repair
> 2. stop largest compaction
> 3. reduce number of compaction slots
> 4. switch compaction to single threaded
>
> flushing largest memtable/ cache reduce is not enough
Note that the "emergency" flushing is just a stop-gap. Yo

Re: better anti OOM

2011-12-26 Thread Radim Kolar
I suggest you describe exactly what the problem is you have and why you think stopping compaction/repair is the appropriate solution.
compacting 41.7 GB CF with about 200 million rows adds - 600 MB to heap, node logs messages like: WARN [ScheduledTasks:1] 2011-12-27 00:20:57,972 GCInspector

Re: better anti OOM

2011-12-26 Thread Peter Schuller
> I suggest you describe exactly what the problem is you have and why you
> think stopping compaction/repair is the appropriate solution.
>
> compacting 41.7 GB CF with about 200 million rows adds - 600 MB to heap,
> node logs messages like:
I don't know what you are basing that on. It seems unli

Retrieve all composite columns from a row, whose composite name's first component matches from a list of Integers

2011-12-26 Thread Aditya
I need to store data on all activities by a user's followees in a single row. I am trying to do that by making use of composite column names in a single user-specific row named 'rowX'. On any activity by a user's followee on an item, a column is stored in 'rowX'. The column has a composite type column na
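The lookup in the subject line can be modelled in Python as one contiguous range per requested integer. This is a rough in-memory sketch, not Cassandra client code: the (followee_id, item_id) layout and all names and values are assumptions made for illustration, since the message is cut off before it describes the composite.

```python
import bisect

# Hypothetical row: composite column names as (followee_id, item_id) tuples,
# kept sorted component-wise the way a composite comparator would order them.
column_names = sorted([(3, 10), (5, 7), (5, 9), (8, 1), (8, 2), (12, 4)])

def columns_for(first_components):
    """Collect every column whose first component is in `first_components`,
    using one contiguous range lookup per requested integer."""
    matches = []
    for wanted in sorted(first_components):
        # (wanted,) sorts before every (wanted, x), so this finds the range start.
        start = bisect.bisect_left(column_names, (wanted,))
        end = bisect.bisect_left(column_names, (wanted + 1,))
        matches.extend(column_names[start:end])
    return matches

print(columns_for([5, 8]))  # [(5, 7), (5, 9), (8, 1), (8, 2)]
```

Because columns sharing a first component are adjacent in sorted order, each integer in the list maps to a single slice; a list of N integers therefore needs N slices in this model.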

Peregrine: A new map reduce framework for iterative/pipelined jobs.

2011-12-26 Thread Kevin Burton
I'm pleased to announce Peregrine 0.5.0 - a new map reduce framework optimized for iterative and pipelined map reduce jobs. http://peregrine_mapreduce.bitbucket.org/ This originally started off with some internal work at Spinn3r to build a fast and efficient Pagerank implementation. We realized