Hi Bryan, 

Thanks for your excellent answer; it helped me a lot in understanding Riak
more deeply.

You are right: I am using the default one-node development cluster. After
decreasing the ring size to 8, as you suggested, everything works fine.

I would like to post your answer under my question on Stack Overflow, if that is OK with you.

Thanks a lot,
Cornelius

--
Cornelius Schmale
FH Wedel Gemeinnützige Schulgesellschaft mbH Feldstraße 143, 22880 Wedel
Tel.: +49 4103 8048-736
E-Mail: s...@fh-wedel.de
Web: http://www.fh-wedel.de/

Registered office: Wedel
Commercial register: Amtsgericht Pinneberg, HRB 1578
Managing director: Prof. Dr. Eike Harms


-----Original Message-----
From: Bryan Fink [mailto:br...@basho.com]
Sent: Monday, 19 November 2012 16:07
To: Cornelius Schmale
Cc: riak-users@lists.basho.com
Subject: Re: Pipe worker startup failed:fitting was gone before startup

On Mon, Nov 12, 2012 at 12:04 PM, Cornelius Schmale <s...@fh-wedel.de> wrote:
> I have some problems using Riak and MapReduce queries. I’ve described
> the whole problem at
>
> http://stackoverflow.com/questions/13345448/riak-fails-at-mapreduce-queries-which-configuration-to-use

Hi, Cornelius. Could you describe your Riak configuration a bit?
Specifically, how many nodes are in your cluster, and what is the
ring_creation_size from your app.config?

If, for example, you're using a default setup {ring_creation_size, 64} on a 
one-node development cluster, this behavior is quite likely.
155,000 items is enough to get all 64 vnodes working.

In the first case, before raising map_js_vm_count, those 64 vnodes are fighting 
over just 8 Javascript VMs, and so some are likely to be starved long enough to 
time out, which will cause the "All VMs are busy" log message.

In the second case, after raising map_js_vm_count, it's likely that those 36 
Javascript VMs just aren't able to process all 155,000 items before the query 
timeout arrives. The "fitting was gone before startup" log message is saying 
that the pipe running the query shut down while there were still inputs 
arriving at vnodes.
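
The second case can be checked with back-of-the-envelope arithmetic. The sketch below is only a rough model: it assumes the items are spread evenly over the VMs and that every map call takes the same time. The function name, the 20 ms per-item cost, and the 60-second timeout are illustrative assumptions, not measured values.

```python
def estimated_query_seconds(items: int, vm_count: int, ms_per_item: float) -> float:
    """Rough lower bound on map-phase time if items are spread evenly over the VMs."""
    if vm_count <= 0:
        raise ValueError("vm_count must be positive")
    return items * (ms_per_item / 1000.0) / vm_count


# Hypothetical numbers: 20 ms per map call and a 60 s query timeout.
ITEMS = 155_000
TIMEOUT_S = 60.0

for vms in (8, 36):
    seconds = estimated_query_seconds(ITEMS, vms, ms_per_item=20.0)
    verdict = "finishes in time" if seconds <= TIMEOUT_S else "exceeds the timeout"
    print(f"{vms} VMs: about {seconds:.0f} s, {verdict}")
```

Under these assumed numbers, adding VMs shortens the run but may still leave it longer than the timeout, which matches the behavior described above.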

You're not seeing either of these behaviors in the simple case with no map 
function because no interaction with Javascript VMs is required.
In addition, for that case, objects are not even read off of disk, further 
alleviating resource contention.

The two configuration changes I expect will help the most are lowering 
ring_creation_size and raising the query timeout. Lowering ring_creation_size 
to 16, or even 8, on a single-node cluster will cause less contention for 
Javascript VMs because there will be less attempted parallelism in the map 
function processing. Raising the query timeout (should be an argument to the 
'run' function, or similar, but I'm not familiar with the riak-js client) will 
give the query more time to finish before shutting down, which may be necessary 
to overcome slow processing.
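
To see why a smaller ring eases contention, it helps to compare how many vnodes compete for each Javascript VM. This is a minimal sketch, assuming a single node that hosts every vnode and the 8 map VMs mentioned earlier; the helper name is made up for illustration.

```python
def vnodes_per_vm(ring_creation_size: int, map_js_vm_count: int) -> float:
    """Vnodes competing for each Javascript VM on a one-node cluster."""
    if map_js_vm_count <= 0:
        raise ValueError("map_js_vm_count must be positive")
    return ring_creation_size / map_js_vm_count


for ring_size in (64, 16, 8):
    ratio = vnodes_per_vm(ring_size, map_js_vm_count=8)
    print(f"ring_creation_size={ring_size}: {ratio:.0f} vnodes per VM")
```

With 8 VMs, a ring of 64 puts eight vnodes in line for each VM, while a ring of 8 gives each vnode a VM of its own.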

Rewriting your map function in Erlang should also help, since it will be 
faster, and will not have the same sort of VM contention. But, I understand, 
that's not as easy to use in early-stage development.

HTH,
Bryan



_______________________________________________
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
