That's a hex number: 16#80000000 is Erlang's base#value notation for
hexadecimal 80000000, which is 2147483648 in decimal, i.e. 2 GB.
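
As a sketch of where that value lives, the bitcask section of app.config
could carry the setting like this. The data_root path is copied from Steve's
config quoted further down; putting max_file_size next to it is an
illustrative assumption, not a setting shown anywhere in this thread.

```erlang
%% Bitcask section of app.config (illustrative sketch, not Steve's actual file)
{bitcask, [
          {data_root, "/var/lib/riaksearch/bitcask" },
          %% Erlang integer literals can be written Base#Digits.
          %% 16#80000000 = 8 * 16^7 = 2147483648 bytes = 2^31 = 2 GB
          {max_file_size, 16#80000000 }
        ]},
```

Writing the value as plain 2147483648 is equivalent; the hex form just makes
the power of two easier to spot.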

Chad DePue - development consulting | skype cdepue | @chaddepue
+1 206.866.5707

On Mon, Jun 13, 2011 at 7:08 PM, Steve Webb <> wrote:

> Dan -
> Q: What does the syntax: 16#80000000 represent in the max_file_size
> parameter?  It's supposed to be 2GB, but I can't see where that means 2GB
> anywhere.
> Even if that meant 16 files of 80MB each, that only comes out to slightly
> over 1GB.
> - Steve
> --
> Steve Webb - Senior System Administrator for
> On Mon, 13 Jun 2011, Dan Reverri wrote:
>  Hi Steve,
>> The article points out that the active data file is not considered during
>> merge checks. Your 250-ish MB data file is the active file and not
>> considered during the merge check. The file will eventually roll over to a
>> non-active file when it hits 2 GB in size. Once the file is not active it
>> will be considered during the merge check and merging will take place.
>> The 2 GB file size is configurable via the max_file_size parameter:
>> Thanks,
>> Dan
>> Daniel Reverri
>> Developer Advocate
>> Basho Technologies, Inc.
>> On Mon, Jun 13, 2011 at 2:38 PM, Steve Webb <> wrote:
>>  Dan -
>>> I've got dead_bytes_threshold=5242880 (5M) and
>>> dead_bytes_merge_trigger=10242880.  My bitcask *.data files are 250-ish MB
>>> in size:
>>> root@ha2:/data/riaksearch/bitcask/1027618338748291114361965898003636498195577569280# ls -lah
>>> total 771M
>>> drwxr-xr-x  2 riak riak 4.0K 2011-06-12 01:08 .
>>> drwxr-xr-x 34 riak riak 4.0K 2011-06-12 01:10 ..
>>> -rw-------  1 riak riak 229M 2011-06-08 13:11
>>> -rw-r--r--  1 riak riak 4.3M 2011-06-08 13:11 1307415077.bitcask.hint
>>> -rw-------  1 riak riak 276M 2011-06-10 13:30
>>> -rw-r--r--  1 riak riak 5.1M 2011-06-10 13:30 1307562153.bitcask.hint
>>> -rw-------  1 riak riak 1.4M 2011-06-08 13:45
>>> -rw-r--r--  1 riak riak  27K 2011-06-08 13:45 1307562333.bitcask.hint
>>> -rw-------  1 riak riak 246M 2011-06-13 15:34
>>> -rw-r--r--  1 riak riak 9.4M 2011-06-13 15:34 1307862506.bitcask.hint
>>> -rw-------  1 riak riak  107 2011-06-12 01:08 bitcask.write.lock
>>> I'm pretty sure that 50% or more of the data in these files should've
>>> aged off by now and the merge trigger should've happened.  The article
>>> shows why merges happen when a restart is done, but it doesn't really
>>> explain why merges don't happen at normal runtime.
>>> I really don't want to restart riak every day to merge files.
>>> Q: What are some good trigger settings for my use case?
>>> I want to collect and store 1 day's worth of tweets from the twitter
>>> spritzer feed and have the data files auto-merge once in a while (once a
>>> day or more frequently) when they've gotten 10% of 'dead' data in them
>>> (aka, the tweets expire after 1 day).
>>> - Steve
>>> --
>>> Steve Webb - Senior System Administrator for
>>> On Mon, 13 Jun 2011, Dan Reverri wrote:
>>>  Hi Steve,
>>>> This Knowledge Base article may be related:
>>>> Thanks,
>>>> Dan
>>>> Daniel Reverri
>>>> Developer Advocate
>>>> Basho Technologies, Inc.
>>>> On Mon, Jun 13, 2011 at 10:25 AM, Steve Webb <> wrote:
>>>>  Justin -
>>>>> My current bitcask settings are:
>>>>>  %% Bitcask Config
>>>>>  {bitcask, [
>>>>>           {data_root, "/var/lib/riaksearch/bitcask" },
>>>>>           {dead_bytes_merge_trigger, 10242880 },
>>>>>           {dead_bytes_threshold, 5242880 },
>>>>>           {expiry_secs, 86400}
>>>>>         ]},
>>>>> My understanding is that these settings mean the data should
>>>>> auto-expire after one day.  Also, each bitcask file in
>>>>> .../riaksearch/bitcask/xxx/*.data should be merged once it has 10M of
>>>>> "dead" or expired data in it, right?
>>>>> I'm collecting the spritzer twitter stream and loading it into two
>>>>> buckets (one non-indexed bucket holds the full tweet, one indexed bucket
>>>>> holds the tweet string, id, date and username).  I used to see about
>>>>> 10 GB of data total, but it's growing and currently at 26GB of data
>>>>> total.
>>>>> I'm seeing these in the logs:
>>>>> =INFO REPORT==== 13-Jun-2011::08:28:19 ===
>>>>> Pid <0.6844.0> compacted 3 segments for 942232 bytes in 4.900694 seconds, 0.18 MB/sec
>>>>> =INFO REPORT==== 13-Jun-2011::08:29:01 ===
>>>>> Pid <0.6267.0> compacted 3 segments for 1721790 bytes in 9.690511 seconds, 0.17 MB/sec
>>>>> =INFO REPORT==== 13-Jun-2011::08:31:23 ===
>>>>> Pid <0.6924.0> compacted 3 segments for 6988416 bytes in 44.659753 seconds, 0.15 MB/sec
>>>>> ... but I'm not seeing any "merging" related entries.
>>>>> - Steve
>>>>> --
>>>>> Steve Webb - Senior System Administrator for
>>>>> On Wed, 8 Jun 2011, Justin Sheehy wrote:
>>>>>  Hi, Steve.
>>>>>> Check out this page:
>>>>>> Basically, a "merge trigger" must be met in order to have the merge
>>>>>> process occur.  When it does occur, it will affect all existing files
>>>>>> that meet a "merge threshold."
>>>>>> One note that is relevant for your specific use: the expiry_secs
>>>>>> parameter will cause a given item to disappear from the client API
>>>>>> immediately after expiry, and to be cleaned if it is in a file already
>>>>>> being merged, but will not currently contribute toward merge triggers
>>>>>> or thresholds on its own if not otherwise "dead".
>>>>>> -Justin
>>>>>> On Jun 7, 2011, at 4:29 PM, Steve Webb wrote:
>>>>>>  Hello there.
>>>>>>> I'm curious - I'm up to about 10GB of storage and I'm guessing that
>>>>>>> I'll be full in 3-4 more days of ingesting data.  I have no idea
>>>>>>> if/when a merge will run to expire the older data.
>>>>>>> I'm loading a 2-node (1GB mem, 20GB storage, vmware VMs) riaksearch
>>>>>>> cluster with the spritzer twitter feed.  I used the bitcask
>>>>>>> 'expiry_secs' to expire data after 3 days.
>>>>>>> Q: Is there a method or command to force a merge at any time?
>>>>>>> Q: Is there a way to run a merge when the storage size reaches a
>>>>>>> specific threshold?
>>>>>>> - Steve
>>>>>>> --
>>>>>>> Steve Webb - Senior System Administrator for
>>>>>>> _______________________________________________
>>>>>>> riak-users mailing list