On Wed, Feb 22, 2012 at 11:26 AM, Matthew A. Brown
<mat.a.br...@gmail.com> wrote:
> Hi all,
>
> We're seeing a new timeout error after upgrading our cluster to Riak
> 1.1. The error message:
>
> {"phase":0,"error":"[timeout]","input":"{{<<\"service_profiles\">>,<<\"8fh/2\">>},{struct,[{<<\"key\">>,<<\"s-10925\">>}]}}","type":"forward_preflist","stack":"[]"}

Hi, Matthew.  This is, indeed, a bit of a bug.  I've filed an issue:
https://github.com/basho/riak_kv/issues/290

The general problem is that the inputs to the fetch half of your map
phase are outrunning the rate at which it can pump outputs to the
processing half.  This means that the queues for the fetchers get
backed up, and that leaves no place for retry requests to go when
fetchers run into not-founds and the like.  The result is errors
(internal timeouts of a sort, unrelated to the timeout you've set),
which cause the MapReduce endpoint to kill the query.
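
To make the backed-up-queue effect concrete, here is a toy model in Python.
It is only a sketch of the general idea, not how riak_pipe is actually
implemented: the function name `simulate`, its parameters, and the tick-based
timing are all made up for illustration.  The one name borrowed from above is
`q_limit`, used here simply as the queue capacity.

```python
from collections import deque


def simulate(q_limit, inputs_per_tick, drains_per_tick, retry_every, ticks):
    """Toy model of a single bounded fetcher queue.

    Returns the number of retry requests that found the queue full.
    All parameters are hypothetical knobs for this illustration.
    """
    queue = deque()
    dropped_retries = 0
    for tick in range(ticks):
        # The processing half removes a fixed number of items per tick.
        for _ in range(min(drains_per_tick, len(queue))):
            queue.popleft()
        # New inputs take whatever free slots exist, up to their own rate.
        free = q_limit - len(queue)
        for _ in range(min(inputs_per_tick, free)):
            queue.append("input")
        # Every `retry_every` ticks a fetcher hits a not-found and tries to
        # re-queue the request; if the queue is full, the retry has nowhere
        # to go, which stands in for the internal timeout.
        if tick % retry_every == 0:
            if len(queue) < q_limit:
                queue.append("retry")
            else:
                dropped_retries += 1
    return dropped_retries


if __name__ == "__main__":
    # Inputs arrive faster than they are processed: retries get squeezed out.
    print("slow map:", simulate(4, 3, 1, 2, 10))
    # Processing keeps up: the queue never fills, so retries always fit.
    print("fast map:", simulate(4, 1, 3, 2, 10))
```

With these made-up numbers, the slow case loses retries once the queue
fills, while the fast case loses none.  Raising `q_limit` in the slow case
only delays the point at which the queue fills; speeding up the consumer is
what keeps it from filling at all.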

If you're willing to recompile Riak, you can modify the `q_limit`
fields in the pipe spec I mentioned in `riak_kv_mrc_pipe`.  Otherwise,
the most likely fix (until we resolve the issue) is to make the map
function (and anything else downstream) as fast as possible so it will
keep up.

Hope that helps,
Bryan
