Hi Charl,

> The problem is that even though documents seem to no longer be 
> available (doing a GET on a deleted document returns the expected 404) 
> the disk usage does not seem to be reducing much and has been at 
> ~80% utilisation across all nodes for almost a week. 
When you delete a document, a tombstone record is written to bitcask, and the 
reference to the key is removed from memory (which is why you get 404s).  The 
old entry isn't actually removed from disk until the next bitcask merge.
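
(To make that concrete, here's a minimal standalone sketch using the public 
bitcask API -- hypothetical directory and keys, run against a scratch dir 
rather than a live vnode's data dir:)

%% Deleted keys disappear from the keydir immediately, but their old
%% entries stay on disk until a merge rewrites the data files.
Dir = "/tmp/bitcask_demo",
Ref = bitcask:open(Dir, [read_write]),
ok = bitcask:put(Ref, <<"doc1">>, <<"{\"some\":\"json\"}">>),
ok = bitcask:delete(Ref, <<"doc1">>),      %% writes a tombstone
not_found = bitcask:get(Ref, <<"doc1">>),  %% hence the 404s
ok = bitcask:close(Ref),
ok = bitcask:merge(Dir).                   %% this is what reclaims the disk space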

> At first I thought the large amount of deletes being performed might be 
> causing fragmentation of the merge index, so I've been regularly 
> running forced compaction as documented here: 
> https://gist.github.com/rzezeski/3996286. 
That merge index is for Riak Search, not bitcask.

There are ways of forcing a merge, but let's double check your settings/logs 
first. Can you send me your app.config and a console.log from one of your nodes?
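
(For reference, forcing a bitcask merge usually comes down to calling 
bitcask:merge/1 on each partition directory, e.g. from `riak attach` -- a 
rough sketch only, assuming the default data_root from your config, and 
something I'd hold off on until we've looked at the settings/logs:)

%% Sketch: force a merge of every partition's cask under data_root.
%% Assumes data_root is /var/lib/riak/bitcask -- adjust to your app.config.
Dirs = [D || D <- filelib:wildcard("/var/lib/riak/bitcask/*"),
             filelib:is_dir(D)],
[bitcask:merge(D) || D <- Dirs].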

Thanks,
Alex 

 -- 
Alex Moore
Sent with Airmail

On September 16, 2013 at 4:43:07 AM, Charl Matthee (ch...@ntrippy.net) wrote:

Hi, 

We have an 8-node Riak v1.4.0 cluster writing data to bitcask backends. 

We've recently started running out of disk across all nodes, so we 
implemented a 30-day sliding-window data retention policy. This policy 
is enforced by a Go app that concurrently deletes documents outside 
the window. 

The problem is that even though documents seem to no longer be 
available (doing a GET on a deleted document returns the expected 404) 
the disk usage does not seem to be reducing much and has been at 
~80% utilisation across all nodes for almost a week. 

At first I thought the large amount of deletes being performed might be 
causing fragmentation of the merge index, so I've been regularly 
running forced compaction as documented here: 
https://gist.github.com/rzezeski/3996286. 

This has helped somewhat, but I suspect it has reached the limit of 
what it can do, so I wonder if there is further fragmentation 
elsewhere that is not being compacted. 

Could this be an issue? How can I tell whether merge indexes or 
something else needs compaction/attention? 

Our nodes were initially configured to run with the default settings 
for the bitcask backend, but when this all started I switched to the 
following to see whether I could trigger compaction more frequently: 

{bitcask, [
    %% Configure how Bitcask writes data to disk.
    %%   erlang: Erlang's built-in file API
    %%   nif: Direct calls to the POSIX C API
    %%
    %% The NIF mode provides higher throughput for certain
    %% workloads, but has the potential to negatively impact
    %% the Erlang VM, leading to higher worst-case latencies
    %% and possible throughput collapse.
    {io_mode, erlang},

    {data_root, "/var/lib/riak/bitcask"},

    %% Trigger a merge if fragmentation is > 40% (default: 60%)
    {frag_merge_trigger, 40},
    %% Trigger a merge if dead bytes for keys > 64MB (default: 512MB)
    {dead_bytes_merge_trigger, 67108864},
    %% Include files with fragmentation >= 20% (default: 40%)
    {frag_threshold, 20},
    %% Include files with dead bytes > 64MB (default: 128MB)
    {dead_bytes_threshold, 67108864}
]},
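
(For what it's worth, here's a rough sketch of how I'd expect to check 
whether a single partition's cask qualifies for a merge under these 
thresholds -- the partition path and the option handling are assumptions 
on my part, not something I've verified:)

%% Open one partition's cask (read-only by default) with the same
%% thresholds and ask whether any of its files qualify for merging.
Dir = "/var/lib/riak/bitcask/0",  %% hypothetical partition directory
Ref = bitcask:open(Dir, [{frag_merge_trigger, 40},
                         {dead_bytes_merge_trigger, 67108864},
                         {frag_threshold, 20},
                         {dead_bytes_threshold, 67108864}]),
case bitcask:needs_merge(Ref) of
    {true, Files} -> io:format("would merge: ~p~n", [Files]);
    false         -> io:format("nothing to merge~n")
end,
bitcask:close(Ref).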

From my observations this change did not make much of a difference. 

The data we're inserting is hierarchical JSON data that roughly falls 
into the following size (in bytes) profile: 

Max: 10320 
Min: 1981 
Avg: 3707 
Med: 2905 

-- 
Ciao 

Charl 

"I will either find a way, or make one." -- Hannibal 

_______________________________________________ 
riak-users mailing list 
riak-users@lists.basho.com 
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com 