Hi Bogunov!

Simple truncation of the bitcask files won't trigger this error, since bitcask notices that the last written entry is truncated and ignores it. In that case a 'not found' is returned to the layer above bitcask. If, on the other hand, an entry (not necessarily the last one written) has the right length but does not match the checksum that bitcask writes with each entry, the error is returned as such. The layer above bitcask (riak_kv_vnode) doesn't handle this case and therefore crashes. Of course, a checksum error in the middle of the file means the file is corrupted. But if the only way to resolve the problem is to delete the whole file, bitcask might as well pretend the key was not found (and maybe delete it internally). That way at least the rest of the file might still be usable.
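
To make the distinction concrete, here is a minimal Python sketch of the two cases. It is not bitcask's real on-disk format or code: the header layout (CRC32 followed by payload length) and the names encode_entry, read_entry and lenient_get are made up for illustration. It only shows how a truncated entry can be told apart from a full-length entry with a bad checksum, and what "pretend the key was not found" would look like.

```python
import struct
import zlib

# Hypothetical entry layout for illustration only: 4-byte CRC32 of the
# payload, 4-byte payload length, then the payload itself.
HEADER = struct.Struct(">II")


def encode_entry(payload: bytes) -> bytes:
    """Serialize one entry with its checksum and length."""
    return HEADER.pack(zlib.crc32(payload), len(payload)) + payload


def read_entry(buf: bytes, offset: int = 0):
    """Return ('ok', payload), ('not_found', None) or ('bad_crc', None)."""
    header = buf[offset:offset + HEADER.size]
    if len(header) < HEADER.size:
        # Header itself is cut off: treat as a truncated last entry.
        return ("not_found", None)
    crc, length = HEADER.unpack(header)
    start = offset + HEADER.size
    payload = buf[start:start + length]
    if len(payload) < length:
        # Entry is shorter than its header claims: truncated, ignore it.
        return ("not_found", None)
    if zlib.crc32(payload) != crc:
        # Right length, wrong content: this is the case that surfaces
        # as an error to the caller.
        return ("bad_crc", None)
    return ("ok", payload)


def lenient_get(buf: bytes, offset: int = 0):
    """Variant that reports a checksum failure as a missing key."""
    status, payload = read_entry(buf, offset)
    if status == "bad_crc":
        return ("not_found", None)
    return (status, payload)
```

With this sketch, cutting bytes off the end of an entry yields 'not_found', while flipping a byte inside a full-length entry yields 'bad_crc' from read_entry and 'not_found' from lenient_get.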

I think what happened in my case is that the file had the right length to fully contain the last entry, but the data was not fully written. This is what you get and rightly deserve for using ext4 as the filesystem :-(.

Still, I would think that crashing the vnode when the bitcask files are corrupted is always the wrong behaviour. At the very least an error should be returned to the node performing the get, so that it fails fast in the case where R is set to N. Otherwise the request hangs until the timeout is reached, which is 60 seconds by default.

Cheers,
Nico

On 19.04.2012 11:19, Bogunov wrote:
Actually, you get the same error if you try to copy the bitcask directory while it is being written to, so I assume any incompletely written bitcask file can cause it. The easy way out looks like dropping the bitcask directory.

On Wed, Apr 18, 2012 at 2:26 PM, Nico Meyer <nico.me...@adition.com> wrote:

    Oh, I forgot to mention:

    My workaround was to patch riak_kv_bitcask_backend to map all
    errors to {error, not_found}. This raises the question of whether
    the 'get/3' function of any backend should ever return anything
    other than {ok, Value, State} and {error, not_found, State}, as
    long as nothing else is handled by riak_kv_vnode.

    BTW: I think the -spec() for get/3 is wrong both in
    riak_kv_bitcask_backend and riak_kv_eleveldb_backend. It states a
    possible return value of the form '{ok, not_found, state()}' for
    the not_found case, instead of the actually returned form '{error,
    not_found, state()}'

    Cheers,
    Nico

    On 18.04.2012 12:18, Nico Meyer wrote:

        Hello,

        I just encountered a problem with one of our Riak nodes, which
        is caused by a bug in either the disk controller or the
        firmware of our SSD disks.
        Anyway, the obvious symptom is that all writes to the disks
        suddenly fail, which of course leads to truncated bitcask
        files. However, this time the files got corrupted in a way
        that led to CRC errors while fetching keys from bitcask. This
        in turn leads to a crash of the vnode every time such a key is
        read. So the log is filled with these messages:

        11:55:52.621 [error] CRASH REPORT Process <0.23175.3> with 0
        neighbours crashed with reason: no case clause matching
        
        {error,bad_crc,{state,#Ref<0.0.0.196598>,"262613575457896618114724618378707105094425378816",[{async_folds,true},[{vnode_vclocks,false},{included_applications,[]},{allow_strfun,false},{reduce_js_vm_count,6},{storage_backend,riak_kv_bitcask_backend},{legacy_keylisting,false},{pb_ip,"0.0.0.0"},{hook_js_vm_count,2},{listkeys_backpressure,false},{mapred_name,"mapred"},{stats_urlpath,"stats"},{legacy_stats,true},{js_thread_stack,16},{riak_kv_stat,true},{add_paths,[]},{http_url_encoding,on},{map_js_vm_count,...},...],...],...}}
        in riak_kv_vnode:prepare_put/3

        Also those keys cannot be (over)written, since a put without
        last_write_wins set to true does a get first internally.
        I think the cause of the error should be obvious to anyone
        familiar with the riak internals. Otherwise I can provide more
        information.

        Cheers,
        Nico



_______________________________________________
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
