Actually, you get the same error if you copy a bitcask directory while
it is being written to, so I assume any incompletely written bitcask file
can cause it. The easy way out looks like dropping the bitcask directory.
On Wed, Apr 18, 2012 at 2:26 PM, Nico Meyer <nico.me...@adition.com> wrote:
Oh, I forgot to mention:
My workaround was to patch riak_kv_bitcask_backend to map all
errors to {error,not_found}. This raises the question of whether the
'get/3' function of any backend should ever return anything other than
{ok, Value, State} or {error, not_found, State}, given that other
return values are not handled by riak_kv_vnode.
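
For illustration, the workaround described above could look roughly like the sketch below. This is not the actual patch: the module name get_error_mapping_sketch and the helper normalize_get/2 are hypothetical, and the shape of the raw storage result ({ok, Value}, not_found, or {error, Reason}) is an assumption made for the example.

```erlang
-module(get_error_mapping_sketch).
-export([normalize_get/2]).

%% Hypothetical helper, not part of Riak: takes the raw result of a
%% storage lookup plus the backend state and returns only the two
%% result shapes discussed above.
-spec normalize_get({ok, term()} | not_found | {error, term()}, term()) ->
          {ok, term(), term()} | {error, not_found, term()}.
normalize_get({ok, Value}, State) ->
    {ok, Value, State};
normalize_get(not_found, State) ->
    {error, not_found, State};
normalize_get({error, _Reason}, State) ->
    %% Every storage error (for example bad_crc) is reported as a
    %% missing key, so the caller never sees an unexpected tuple.
    {error, not_found, State}.
```

The obvious trade-off is that real corruption is then hidden behind not_found, so logging the swallowed reason before discarding it would probably be wise.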
BTW: I think the -spec() for get/3 is wrong in both
riak_kv_bitcask_backend and riak_kv_eleveldb_backend. It states a
possible return value of the form '{ok, not_found, state()}' for
the not_found case, instead of the form that is actually returned,
'{error, not_found, state()}'.
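
A minimal sketch of how the not_found part of such a spec would read once corrected. The module name, the placeholder state() type, the binary() argument types and the stub body are all assumptions for the sake of a compilable example; the real backends declare their own types and may list further return values.

```erlang
-module(get_spec_sketch).
-export([get/3]).

%% Placeholder type; the real backends define their own state.
-type state() :: term().

%% The not_found case is tagged 'error', not 'ok'.
-spec get(binary(), binary(), state()) ->
          {ok, binary(), state()} |
          {error, not_found, state()}.
get(_Bucket, _Key, State) ->
    %% Stub body so the module compiles; always reports a missing key.
    {error, not_found, State}.
```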
Cheers,
Nico
On 18.04.2012 at 12:18, Nico Meyer wrote:
Hello,
I just encountered a problem with one of our Riak nodes, which
is caused by a bug in either the disk controller or the
firmware of our SSD disks.
Anyway, the obvious symptom is that all writes to the disks
suddenly fail, which of course leads to truncated bitcask
files. However, this time the files got corrupted in a way
that led to CRC errors while fetching keys from bitcask. This
in turn leads to a crash of the vnode every time such a key is
read. So the log is filled with these messages:
11:55:52.621 [error] CRASH REPORT Process <0.23175.3> with 0
neighbours crashed with reason: no case clause matching
{error,bad_crc,{state,#Ref<0.0.0.196598>,"262613575457896618114724618378707105094425378816",[{async_folds,true},[{vnode_vclocks,false},{included_applications,[]},{allow_strfun,false},{reduce_js_vm_count,6},{storage_backend,riak_kv_bitcask_backend},{legacy_keylisting,false},{pb_ip,"0.0.0.0"},{hook_js_vm_count,2},{listkeys_backpressure,false},{mapred_name,"mapred"},{stats_urlpath,"stats"},{legacy_stats,true},{js_thread_stack,16},{riak_kv_stat,true},{add_paths,[]},{http_url_encoding,on},{map_js_vm_count,...},...],...],...}}
in riak_kv_vnode:prepare_put/3
Also, those keys cannot be (over)written, since a put without
last_write_wins set to true first does a get internally.
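
The failure mode can be reproduced in isolation with the sketch below. It is not the riak_kv_vnode code: the module and the function prepare_put_like/1 are hypothetical stand-ins for a caller whose case expression only covers the two expected result shapes of a backend get.

```erlang
-module(case_clause_sketch).
-export([prepare_put_like/1, demo/0]).

%% Hypothetical stand-in for a caller that only expects
%% {ok, Value, State} or {error, not_found, State} from a get.
prepare_put_like(GetResult) ->
    case GetResult of
        {ok, Value, _State} ->
            {found, Value};
        {error, not_found, _State} ->
            not_found
        %% There is no clause for {error, bad_crc, State}, so such a
        %% result raises a case_clause error.
    end.

%% Feeds an unexpected result into the function and catches the
%% resulting error to show what the process would crash with.
demo() ->
    try
        prepare_put_like({error, bad_crc, some_state})
    catch
        error:{case_clause, Unmatched} ->
            {crashed, Unmatched}
    end.
```

Without the try/catch in demo/0 the calling process would exit with the case_clause reason, which corresponds to the "no case clause matching" text in the crash report.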
I think the cause of the error should be obvious to anyone
familiar with the Riak internals. Otherwise I can provide more
information.
Cheers,
Nico