RE: MapReduce performance problem

2013-03-07 Thread Kevin Burton
Any comments? It seems to run intermittently on my local VMs and just outright fails on a Joyent Cloud or AWS image. From: Kevin Burton [mailto:rkevinbur...@charter.net] Sent: Tuesday, February 26, 2013 6:36 PM To: 'Jeremiah Peschka'; 'riak-users' Subject: RE: MapReduce performance probl

Re: after raising of n_val all keys exists multiple times in ?keys=true

2013-03-07 Thread Simon Effenberg
Any idea what happened? We had to remove the Riak db and start from scratch to get rid of the ghost keys... On Thu, 7 Mar 2013 11:35:12 +0100 Simon Effenberg wrote: > Now we see only 3 occurrences of the keys. So maybe the reducing of the > n_val could be a problem.. after we removed the

Re: cluster capacity planning, 50M of writes per day

2013-03-07 Thread Wes James
I understand. I was just suggesting different hardware. That's all. wes On Thu, Mar 7, 2013 at 2:20 PM, Paul Peregud wrote: > It is not really a question of saving 55$ per disk - I'm wondering if such > Riak cluster setup is feasible. > > > On Thu, Mar 7, 2013 at 10:14 PM, Wes James wrote: >

Re: cluster capacity planning, 50M of writes per day

2013-03-07 Thread Paul Peregud
It is not really a question of saving 55$ per disk - I'm wondering if such Riak cluster setup is feasible. On Thu, Mar 7, 2013 at 10:14 PM, Wes James wrote: > You can get crucial m4 512 for $395 on amazon.com. ocz are $450. > > > http://www.amazon.com/Crucial-512GB-2-5-Inch-Solid-CT512M4SSD2/d

Re: cluster capacity planning, 50M of writes per day

2013-03-07 Thread Wes James
You can get crucial m4 512 for $395 on amazon.com. ocz are $450. http://www.amazon.com/Crucial-512GB-2-5-Inch-Solid-CT512M4SSD2/dp/B004W2JL3Y I don't know if you can get these out of the USA or not (you are out of USA right?) wes On Thu, Mar 7, 2013 at 1:38 PM, Paul Peregud wrote: > My fir

Re: Delete keys still in Riak db

2013-03-07 Thread John Daily
Thanks for the detailed explanation of what you're seeing. As it turns out, your results are typical. Usually the deletion happens automatically across all nodes, but as you've seen, there's no guarantee of that. Reading a key will trigger read repair to clean up lingering objects, and is effec

cluster capacity planning, 50M of writes per day

2013-03-07 Thread Paul Peregud
My first post to the list. I want to store 20M of bounded FIFO queues (each queue has at most 50 items, 1KB each). Distribution: 80% of updates at 20% of queues. Insertion rate: at most 5K per second, random; 50M per day. Reading rate: at most 10K per second, each read reads all 50 items. Once in
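The figures in the post above can be turned into a rough sizing estimate. The following is only a back-of-envelope sketch: it assumes 1 KB means 1024 bytes and a replication factor (n_val) of 3, neither of which is stated in the post, and it ignores per-object and backend overhead entirely. All function and constant names are illustrative.

```python
# Back-of-envelope sizing from the numbers quoted in the post.
# ASSUMPTIONS (not from the post): 1 KB = 1024 bytes, n_val = 3,
# and no per-object or storage-backend overhead.

QUEUES = 20_000_000          # number of bounded FIFO queues
ITEMS_PER_QUEUE = 50         # maximum items per queue
ITEM_BYTES = 1024            # assumed size of a 1 KB item
WRITES_PER_DAY = 50_000_000  # stated daily insertion volume
PEAK_READS_PER_SEC = 10_000  # stated peak read rate, each read returns a full queue
N_VAL = 3                    # assumed replication factor

SECONDS_PER_DAY = 24 * 60 * 60


def raw_bytes() -> int:
    """Payload size if every queue is full, before replication."""
    return QUEUES * ITEMS_PER_QUEUE * ITEM_BYTES


def replicated_bytes(n_val: int = N_VAL) -> int:
    """Payload size across the cluster once each object is stored n_val times."""
    return raw_bytes() * n_val


def avg_writes_per_second() -> float:
    """Average insertion rate implied by the daily volume."""
    return WRITES_PER_DAY / SECONDS_PER_DAY


def peak_read_bytes_per_second() -> int:
    """Payload bytes per second at the stated peak read rate."""
    return PEAK_READS_PER_SEC * ITEMS_PER_QUEUE * ITEM_BYTES


if __name__ == "__main__":
    print(f"raw payload:        {raw_bytes() / 1e12:.2f} TB")
    print(f"with n_val={N_VAL}:       {replicated_bytes() / 1e12:.2f} TB")
    print(f"average writes/sec: {avg_writes_per_second():.0f}")
    print(f"peak read payload:  {peak_read_bytes_per_second() / 1e6:.0f} MB/s")
```

Under these assumptions the daily average write rate is far below the stated 5K per second peak, so the peak figures, not the daily total, would drive the hardware choice.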

Delete keys still in Riak db

2013-03-07 Thread Daniel Iwan
In our tests we are adding 3000 keys into 3-node Riak db right after nodes have joined. For each key one node reads it and modifies it and another node does the same but also deletes the key when it sees other change (key is no longer needed). After all keys are processed our test framework checks

Re: No more disk space on a node of my Riak cluster

2013-03-07 Thread Mark Phillips
Salut Godefroy On Thu, Mar 7, 2013 at 6:50 AM, Godefroy de Compreignac wrote: > Hello, > > I'm running a cluster of 4 nodes (1,8 TB on each) and I have a problem of > balancing. Current data repartition is 18%, 19%, 30%, 34%. The node with 34% > of cluster data is completely full and doesn't want

question regarding fix of 'PB Search ignores "sort" parameter'

2013-03-07 Thread Harald Lapp
hi, i would like to ask if there is any timeline when this bug in "riak_search" gets fixed, see pull requests: https://github.com/basho/riak_search/pull/136 https://github.com/basho/riak_search/pull/137 for me it's no problem to apply the patches during installation of riak, still it would be ni

No more disk space on a node of my Riak cluster

2013-03-07 Thread Godefroy de Compreignac
Hello, I'm running a cluster of 4 nodes (1,8 TB on each) and I have a problem of balancing. Current data repartition is 18%, 19%, 30%, 34%. The node with 34% of cluster data is completely full and doesn't want to start anymore. I don't know what to do. Do you have a solution for such a problem? Th

Re: after raising of n_val all keys exists multiple times in ?keys=true

2013-03-07 Thread Simon Effenberg
Now we see only 3 occurrences of the keys. So maybe the reducing of the n_val could be a problem.. after we removed the keys (or tried it) they were deleted and in the keys=keys output only 3 (was 4 before) are listed. So somehow the increasing n_val from 3 to 12 (factor 4...) and then reducing aga

Re: after raising of n_val all keys exists multiple times in ?keys=true

2013-03-07 Thread Simon Effenberg
Hi Mark, we have 12 Riak nodes running. The exact command for getting keys is: curl http://localhost:8098/buckets/config/keys?keys=true Properties are: curl -s http://localhost:8098/riak/config | python -mjson.tool { "props": { "allow_mult": true, "basic_quorum": false,

Re: using links or list of ids

2013-03-07 Thread Mikhail Tyamin
Hello, sorry that I did not clarify that but I am talking about using Riak's links and "Link walking". I don't know how it is implemented under the hood, but it's still a special case of a MapReduce request, which could be (in most cases) slower than a fetch by keys? We are using have's http client but

Re: using links or list of ids

2013-03-07 Thread Guido Medina
This is only an idea: You could mark your related objects with a 2i, when you need them retrieve the list of IDs from that 2i, fetch them concurrently into some local memory cache, and then navigate through your object graph by fetching them one by one from your local cache. It will be hard to
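The idea in the message above (collect the related IDs, fetch them concurrently into a local cache, then walk the graph from the cache) could look roughly like the sketch below. It deliberately does not use any Riak client API: `fetch_one` is a hypothetical callable that you would supply to fetch a single object by key, and the list of keys is assumed to have come from a 2i query made elsewhere. `warm_cache` and `max_workers` are illustrative names as well.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable, TypeVar

V = TypeVar("V")


def warm_cache(
    keys: Iterable[str],
    fetch_one: Callable[[str], V],
    max_workers: int = 8,
) -> Dict[str, V]:
    """Fetch every key concurrently and return a key -> object cache.

    fetch_one is a HYPOTHETICAL single-key fetch function supplied by the
    caller (for example a thin wrapper around whatever client is in use).
    """
    # Drop duplicate keys while keeping the original order.
    unique_keys = list(dict.fromkeys(keys))
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the same order as unique_keys.
        values = list(pool.map(fetch_one, unique_keys))
    return dict(zip(unique_keys, values))


if __name__ == "__main__":
    # Stand-in for a remote store, used only to demonstrate the call shape.
    fake_store = {"a": {"name": "A"}, "b": {"name": "B"}}
    cache = warm_cache(["a", "b", "a"], fake_store.__getitem__)
    print(cache["a"]["name"])
```

Threads are used here because single-key fetches are I/O bound. Any exception raised by `fetch_one` propagates out of `warm_cache`, so a real caller would need to decide how to treat missing keys.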

using links or list of ids

2013-03-07 Thread Mikhail Tyamin
Hello guys, I have a question about using links. In our data model each object A has a lot of links to other objects and we are doing a lot of queries to fetch all objects that are linked with A. Could it be a performance issue? Will it be faster to store a list of ids of related objects