Great, this explains why bitcask went merging crazy after we updated
from 0.14 to 1.1! The workaround was to constrain merging to the
nighttime, and since that has kind of worked(TM) ever since, I never got
around to looking for the reason for the increased I/O load after the update.
I just checked, and this behaviour was introduced in 1.0.3, but nothing
was mentioned in the release notes :-(. In older versions, key
expiration had no impact at all on triggering merges. That can cause
problems of its own: if, say, all you ever do is add new keys, merging
is never triggered, which is bad if you are recording time series, for
example.
I assume that this is the reason Basho changed the merge triggering
logic. No disrespect, but the current solution (including the proposed
fix) is just wrong. At least if there is a continuous, evenly spaced (in
time) stream of expiring keys, as should be the case in most scenarios
where one would use key expiration, it amounts to triggering a merge
on every merge check, just like setting frag_merge_trigger to zero.
The proposed fix will not improve anything if we are talking about the
expiry_grace_time option
(https://github.com/basho/bitcask/blob/master/src/bitcask.erl#L569). It
just shifts the trigger in time, which makes no difference if you have an
endless stream of expiring keys.
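To illustrate the point about the grace time, here is a toy simulation in Python. It is not bitcask's actual logic: the function simulate, the one-key-per-second write rate and the 10-second check interval are all assumptions made up for this sketch. It only models "merge as soon as any key is older than expiry plus grace".

```python
def simulate(expiry_secs, grace_secs, horizon, check_interval=10):
    """Toy model (hypothetical, not bitcask code): return the times at which
    a merge is triggered when one key is written every second."""
    keys = []         # write timestamps of live keys, oldest first
    merge_times = []
    for now in range(horizon):
        keys.append(now)  # steady stream of new keys
        if now % check_interval == 0:
            cutoff = now - expiry_secs - grace_secs
            if keys and keys[0] < cutoff:
                # At least one key is past expiry + grace: merge and drop them.
                keys = [ts for ts in keys if ts >= cutoff]
                merge_times.append(now)
    return merge_times


no_grace = simulate(expiry_secs=100, grace_secs=0, horizon=300)
with_grace = simulate(expiry_secs=100, grace_secs=50, horizon=300)
print(no_grace[0], with_grace[0])
print({b - a for a, b in zip(with_grace, with_grace[1:])})
```

In this model the grace time only delays the first merge. Once merging starts, it fires on every single check, with or without the grace time.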
At the very least, a freshly merged bitcask file should be exempt from
being merged for a certain time, to limit the rate of merging the same keys.
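A minimal sketch of such an exemption, in Python. The function merge_allowed and the cooldown_secs parameter are hypothetical names for illustration; no such option exists in bitcask as far as this thread shows.

```python
from typing import Optional


def merge_allowed(last_merged_at: Optional[float], now: float,
                  cooldown_secs: float) -> bool:
    """Return True if a file may be considered for merging again.

    last_merged_at is the time the file was produced by a merge, or None
    if it has never been through a merge.
    """
    if last_merged_at is None:
        return True
    return now - last_merged_at >= cooldown_secs
```

With a cooldown of one hour, a file written by a merge at t=1000 would be skipped by any merge check before t=4600, no matter how many of its keys have expired in the meantime.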
I believe the real solution has to involve some form of counting the
number of expired keys for each file, and then adding that count to the
number of deleted/outdated keys.
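Roughly what I have in mind, sketched in Python rather than Erlang. FileStats, fragmentation_pct and needs_merge are made-up names, and the trigger value of 60 percent is only an assumed example setting for frag_merge_trigger, so take this as an illustration of the idea and not as bitcask's code.

```python
from dataclasses import dataclass


@dataclass
class FileStats:
    total_keys: int    # all entries ever written to the file
    dead_keys: int     # deleted or overwritten entries
    expired_keys: int  # entries that are still current but past their expiry


def fragmentation_pct(stats: FileStats) -> int:
    """Percentage of entries that a merge could drop, expired ones included."""
    if stats.total_keys == 0:
        return 0
    return 100 * (stats.dead_keys + stats.expired_keys) // stats.total_keys


def needs_merge(stats: FileStats, frag_merge_trigger: int = 60) -> bool:
    # Expired keys count towards fragmentation instead of being a
    # separate, unconditional trigger.
    return fragmentation_pct(stats) >= frag_merge_trigger
```

A file with 1000 entries, 100 of them dead and 200 expired, comes out at 30 percent and is left alone. A single expired key no longer forces a merge, but a file that is mostly expired still gets cleaned up.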
If scanning the key dir in memory periodically is too expensive, it is
easy enough to do it approximately by storing a histogram of timestamps
for each file. At least if the expiration time is large compared to the
time between file rotations, this should be fairly accurate, and the
accuracy vs. CPU/mem trade-off can be easily tuned by changing the size of
the histogram. But talk is cheap. I will have to look into this anyway, now
that I know about it, so hopefully a patch will fall out at the end.
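A rough Python sketch of the histogram idea, to make it concrete. The class TimestampHistogram and its parameters are hypothetical, and it treats a key as expired when its timestamp is strictly older than now minus the expiry time, which is my assumption here. It ignores keys that were overwritten or deleted in the meantime, so it only approximates the count.

```python
class TimestampHistogram:
    """Per-file histogram of key write times (hypothetical sketch)."""

    def __init__(self, start, bucket_secs, n_buckets):
        self.start = start              # time the file was opened
        self.bucket_secs = bucket_secs  # width of one bucket in seconds
        self.counts = [0] * n_buckets   # last bucket also takes any overflow
        self.max_ts = start             # newest timestamp seen so far

    def add(self, ts):
        idx = (ts - self.start) // self.bucket_secs
        idx = max(0, min(idx, len(self.counts) - 1))
        self.counts[idx] += 1
        self.max_ts = max(self.max_ts, ts)

    def expired_estimate(self, now, expiry_secs):
        """Lower bound on the number of keys with ts < now - expiry_secs."""
        cutoff = now - expiry_secs
        if cutoff > self.max_ts:
            return sum(self.counts)  # everything in the file has expired
        # Count only buckets that lie entirely before the cutoff. The last
        # bucket is open-ended, so it is never counted by this rule.
        full = (cutoff - self.start) // self.bucket_secs
        full = max(0, min(full, len(self.counts) - 1))
        return sum(self.counts[:full])


h = TimestampHistogram(start=0, bucket_secs=10, n_buckets=4)
for ts in (1, 5, 12, 25, 38):
    h.add(ts)
print(h.expired_estimate(now=125, expiry_secs=100))
```

The estimate never overcounts, since a partially expired bucket is ignored until all of it has expired. More buckets give a tighter estimate at the price of a little more memory per file.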
Cheers,
Nico
On 26.09.2012 16:58, Anthony Molinaro wrote:
One thing to note about expiration and bitcask, which unfortunately has been
causing us much consternation, is that key expiration is a silent merge
trigger. I guess what happens is that a key expiring somehow causes
riak to want to merge that bitcask. If you have keys constantly expiring
(which most people probably do, since the reason you want expiry is that you
have data you no longer want after 30 or 90 days), this will cause constant
merging. That expiration has anything to do with merging is not documented.
We worked with Basho support and have a patch which we will be trying soon
which should help but mostly by delaying merging (as opposed to only
expiring on merge and not using expiration as a trigger).
But if you are using expiry and bitcask and notice that for some reason
you are constantly merging small bitcasks despite tuning triggers and
file sizes, then you are probably running into this case.
Anyway, just an FYI as I believe others have complained about constant
merging on this list before,
-Anthony
On Thu, Sep 20, 2012 at 03:44:22PM -0600, David Smith wrote:
Sorry -- there is not currently an event that notifies external
systems when keys are being expired. The expiration is done at merge
time, so you could potentially get a very large number of events all
at once.
D.
On Thu, Sep 20, 2012 at 12:14 PM, Mike Oxford <moxf...@gmail.com> wrote:
When a key comes up on expiration, I would like to know about it.
Is this supported anywhere in the Bitcask backend (like {expiry_secs,N}), by
any chance?
Thanks!
-mox
_______________________________________________
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
--
Dave Smith
VP, Engineering
Basho Technologies, Inc.
diz...@basho.com