The mr_queue is a bitcask, so you should expect it to grow monotonically until compaction. Bitcask is append-only: removing an entry writes a tombstone instead of reclaiming space, so the file size is not an indication of the number of pending jobs. You can read the contents using any bitcask utility. For example, using https://github.com/aphyr/bitcask-ruby:

$ bitcask --no-riak /var/lib/riak/mr_queue/ all
...
3264
bitcask_tombstone
3265
bitcask_tombstone
3266
bitcask_tombstone
3267
bitcask_tombstone
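
To turn a dump like the one above into a count of pending versus already-removed jobs, something along these lines may help. It is only a sketch: it assumes the utility prints each key on one line followed by its value on the next, as in the excerpt, and that a removed entry's value is the literal string bitcask_tombstone. The function name count_entries is made up for illustration, and values that span several lines would break the pairing.

```python
TOMBSTONE = "bitcask_tombstone"


def count_entries(lines):
    """Return (live, tombstoned) counts from alternating key/value lines.

    Assumption: each key occupies one line and its value the next line.
    Blank lines and the "..." elision marker are skipped.
    """
    cleaned = [ln.strip() for ln in lines]
    cleaned = [ln for ln in cleaned if ln and ln != "..."]
    live = dead = 0
    # Pair every even-indexed line (key) with the line after it (value).
    for _key, value in zip(cleaned[0::2], cleaned[1::2]):
        if value == TOMBSTONE:
            dead += 1
        else:
            live += 1
    return live, dead


# The four entries shown in the excerpt above.
sample = """3264
bitcask_tombstone
3265
bitcask_tombstone
3266
bitcask_tombstone
3267
bitcask_tombstone
""".splitlines()

live, dead = count_entries(sample)
print(f"live={live} tombstones={dead}")  # prints "live=0 tombstones=4"
```

To run it against a real queue, pass sys.stdin to count_entries and pipe the bitcask command into the script. A large tombstone count with few live entries would suggest the directory size reflects uncompacted history, not pending work.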

--Kyle

On 06/30/2011 01:52 PM, Sylvain Niles wrote:
So I backrev'd everything to: Erlang R13B04, Riak 0.14.2 (no riak
search) and got rid of any functionality using search. After importing
the 7k objects the bitcask dir is ~41MB. Starting up our app
everything works fine until a worker starts updating objects with new
values at the rate of about 1-2/second. It finds those objects via a
map function that looks for one json integer and compares it with an
input (currently with a javascript function but I'm slowly porting
them all to erlang). While this worker is running the mr_queue
directory grows at about 1MB every 2 minutes, forever. It's my
understanding that pending m/r jobs are persisted to disk in this
directory, but the amount of work is trivial and the mr_queue never
gets smaller even after we shut down all our workers and leave riak
alone.

Is there a way to list the m/r jobs in the queue in case there's
something else going on? Is there a reason they never get removed?

Thanks in advance,
Sylvain


On Wed, Jun 29, 2011 at 12:59 AM, Mathias Meyer <math...@basho.com> wrote:
Sylvain,

you should not be using riak HEAD for anything that's close to your production 
environment. Development in Riak is in big flux right now, and it'll be hard 
for us to help you find the specific problem.

Could you please install Riak Search 0.14.2, best with a fresh installation, 
and try running this setup again to see if you get the same erroneous results? 
If you do, some more details on your data and the MapReduce jobs you're running 
would be great to reproduce and figure out the problem.

Mathias Meyer
Developer Advocate, Basho Technologies


On Wednesday, 29 June 2011 at 00:41, Sylvain Niles wrote:

We recently started trying to move our production environment over to
riak and we're seeing some weird behavior that's preventing us from
doing so:

Running: riak HEAD from github as of last Friday, riak_search turned
on with indexing of the problem bucket "events".
When we turn on our processes that start creating objects in the
bucket "events", the mr_queue directory starts growing massively and
the riak process starts spending most of its time in io_wait. With a
total of about 7000 objects (each is a ripple document that's got
maybe 10-20 lines of text in it) in the events bucket our bitcask dir
was ~240MB and the mr_queue dir was 1.2GB. Something is going horribly
wrong. Our logs only show flow_timeouts for normal requests trying to
do simple map/reduce lookups. Any ideas where to look?

Thanks in advance,
Sylvain

_______________________________________________
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com



