The `badfun` is a new error; it wasn't in your original email, and I'm not sure why you are seeing it. Are all your Riak nodes running 1.2.0-rc1? Can you give me more information on your cluster setup? Are there any other errors in your logs? The more information you can provide, the more I can help.

The repair "command" is not actually available from the command line yet; you need to attach to the Riak console to access it. The APIs are `riak_kv_vnode:repair(PartitionNumber)` and `riak_search_vnode:repair(PartitionNumber)`.
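A minimal sketch of what that console session might look like (the node name and partition numbers below are only placeholders, and I'm assuming riak_core's ring API to list the partitions owned by the local node; substitute the partition you actually want to rebuild):

    $ riak attach
    %% List the partitions owned by this node (placeholder output for a default 64-partition ring):
    (riak@127.0.0.1)1> {ok, Ring} = riak_core_ring_manager:get_my_ring().
    (riak@127.0.0.1)2> riak_core_ring:my_indices(Ring).
    [0,22835963083295358096932575511191922182123945984,...]
    %% Rebuild the KV data and the Search index for one of those partitions:
    (riak@127.0.0.1)3> riak_kv_vnode:repair(22835963083295358096932575511191922182123945984).
    (riak@127.0.0.1)4> riak_search_vnode:repair(22835963083295358096932575511191922182123945984).
    %% Detach with Ctrl-D; do not call q(), which would stop the node.

Each call kicks off the repair for a single partition, so repeat it for every partition that needs rebuilding.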
The repair "command" is not actually available from the command line yet. You need to attach to the Riak console to access it. The APIs are `riak_kv_vnode:repair(PartitionNumber)` and `riak_search_vnode:repair(PartitionNumber)`. On Wed, Jul 18, 2012 at 1:02 PM, Arnaud Wetzel <arnaud.wet...@gmail.com>wrote: > Ryan, > Increasing "ulimit -n" (current value is 4096, I have tested from 1024 to > 200000) does not change anything, always the same errors : > {timeout,range_loop} > lookup/range failure: > {{badfun,#Fun<riak_search_client.9.8393097>},[{mi_server,iterate,6},{mi_server,lookup,8}]} > > I cannot find the command "repair" that you talk about in your email (on > riak1.2.0-rc1), is it a function directly in an erlang module and not > accessible yet with riak-admin ? > > Thank you very much. > > -- > Arnaud Wetzel > KBRW Ad-Venture > 13 rue st Anastase, 75003 Paris > > 2012/7/16 Ryan Zezeski <rzeze...@basho.com> > >> Arnaud, >> >> The 'stream_timeout' and 'emfile' should be correlated. Whenever you see >> the 'emfile' you should see a corresponding timeout. The index server >> errors causing the result collector to timeout later. First, adjust your >> file descriptor limit and then go from there. >> >> For the 1.2 release a "repair" command has been added to rebuild KV or >> index data for a given partition. In releases before that you must reindex >> all your data. You don't have to worry about removing the current indexes >> as merge index will garbage collect that for you as it merges. As I said, >> first I would fix the 'emfile' issue and then see if further action is >> needed. >> >> -Z >> >> P.S. If you want to be absolutely sure what your FD limit is in Riak you >> can `riak attach` and then `os:cmd("ulimit -n").` Make sure to use Ctrl-D >> to exit from the Riak shell. >> >> On Mon, Jul 16, 2012 at 5:21 AM, Arnaud Wetzel >> <arnaud.wet...@gmail.com>wrote: >> >>> Hi, >>> Friday evening one of our riak node has reach his disk space limit >>> during indexing in riak-search. Then after adding some nodes, some requests >>> fail, and it is impossible to find the correlation between requests with >>> error or those who succeed. >>> The errors are : >>> >>> {{nocatch,stream_timeout},[{riak_search_op_utils,gather_stream_results,4}]} >>> {timeout,range_loop} >>> >>> and sometimes (not always) : >>> >>> {{badmatch,{error,emfile}},[{mi_segment,iterate_by_keyinfo,7},{mi_server,'-lookup/8-lc$^1/1-1-',4},{mi_server,'-lookup/8-lc$^1/1-1-',4},{mi_server,lookup,8}]} >>> >>> So anyone else has experienced these errors ? Is it possible that they >>> come from the disk over limit error ? How can I try to repair merge index >>> data ? If it is not possible, what is the good process to delete entirely >>> all the indexes (only indexes, keeping riak datas). >>> >>> Thank you very much. >>> >>> Regards. >>> >>> -- >>> Arnaud Wetzel >>> KBRW Ad-Venture >>> 13 rue st Anastase, 75003 Paris >>> >>> _______________________________________________ >>> riak-users mailing list >>> riak-users@lists.basho.com >>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com >>> >>> >> >
_______________________________________________
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com