Bitcask is a write-only (append-only) log: it eats disk by keeping every update, old versions included, until a compaction phase reclaims the space at some defined interval.
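
To make that concrete, here is a minimal toy model in Python. It is not Bitcask's code or its on-disk format; the class ToyAppendOnlyLog and its method names are hypothetical, made up for illustration. It only sketches why rewriting a key with the same value still grows the log until a compaction pass runs.

```python
class ToyAppendOnlyLog:
    """Toy in-memory model of an append-only key/value log (not Bitcask itself)."""

    def __init__(self):
        self.records = []  # every record ever written, in write order
        self.index = {}    # key -> position of the latest record for that key

    def put(self, key, value):
        # Never overwrite in place: always append a new record.
        self.records.append((key, value))
        self.index[key] = len(self.records) - 1

    def get(self, key):
        # Follow the index to the newest record for this key.
        return self.records[self.index[key]][1]

    def size_on_disk(self):
        # Stand-in for file size: total bytes of all stored records.
        return sum(len(k) + len(v) for k, v in self.records)

    def compact(self):
        # Keep only the latest record per key, then rebuild the index.
        live = [(k, self.get(k)) for k in self.index]
        self.records = live
        self.index = {k: i for i, (k, _) in enumerate(live)}


log = ToyAppendOnlyLog()
for _ in range(1000):
    log.put(b"key", b"same value")  # identical value on every update

before = log.size_on_disk()  # grows with every put
log.compact()
after = log.size_on_disk()   # only the live record remains
```

In this model the stored size scales with the number of writes, not with the number of live keys, which matches the symptom described below.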

-Alexander


@siculars on twitter
http://siculars.posterous.com


On Aug 17, 2010, at 11:27, Dmitry Demeshchuk <demeshc...@gmail.com> wrote:

Greetings.

This problem has already been discussed on IRC a bit.

I use Riak 0.12.1 (have been using 0.12.0 but then updated to the
latest version and got the same problem) with bitcask storage.

All Riak settings are default, i.e., all buckets are
default-configured (allow_mult=false), replication is 3x. Currently
Riak runs on a single machine. The problem reproduces on
different machines with different Riak clusters brought up.

Though the total size of the database records doesn't grow, update operations
(I'll describe them in detail later) make the total size of the
"data/bitcask" folder grow. For example, I made a database backup on our
test server and the backup size was 2.5MB. But the size of the
"data/bitcask" folder was 17GB!

Careful investigation showed that the database size on disk
grows whenever a Riak update operation is performed, even when the
value written by the update is exactly the same.

The update operation is like this:

%% get/3 returns {ok, Object} on success, so match on the tuple
{ok, RiakObject} = RiakClient:get(Bucket, Key, 1),
OldValue = riak_object:get_value(RiakObject),
NewValue = do_something(),
NewRiakObject = riak_object:update_value(RiakObject, NewValue),
RiakClient:put(NewRiakObject, 1).

And it appeared that even if I make NewValue exactly the same as
OldValue, this update operation increases the database size on
disk. Still, the total size of this Riak object stays the same.

I thought that maybe I was doing something wrong when handling the data,
and that there was some data I was missing. But, again, the backup file is
very small, much smaller than the disk space occupied by the database.

If I do list_buckets or list_keys, these operations are extremely
slow, but they eventually return the right values, without any garbage.
Values of the Riak objects are okay as well.

When I had a look at data files, it appeared that *.bitcask.data are
the files that keep growing.
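
One way to track that growth between updates is a small helper like the following. It is only a sketch: the function name bitcask_data_bytes is hypothetical, and it assumes the data files live somewhere under "data/bitcask" and end in ".bitcask.data", as observed above.

```python
from pathlib import Path


def bitcask_data_bytes(root):
    """Return (file_count, total_bytes) for *.bitcask.data files under root."""
    # rglob walks subdirectories too; other files (e.g. hint files) are skipped.
    files = [p for p in Path(root).rglob("*.bitcask.data") if p.is_file()]
    return len(files), sum(p.stat().st_size for p in files)


if __name__ == "__main__":
    # Assumed location, relative to the Riak node's working directory.
    count, total = bitcask_data_bytes("data/bitcask")
    print(f"{count} data files, {total} bytes")
```

Running it before and after a batch of updates gives the number of bytes those updates added.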

That's all I found for now.

Any clues?

--
Best regards,
Dmitry Demeshchuk

_______________________________________________
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com

