Riak Recap for November 6 - 14

2012-11-14 Thread Mark Phillips
Morning, Afternoon, Evening To All - Here's a Recap for the last week or so: meetups, code, slides, and more. Enjoy, and thanks for being a part of Riak. Mark twitter.com/pharkmillups Riak Recap for November 6 - 14 1) Telefonica is

Re: timeout error for size>40k & changing q_limit has no affect

2012-11-14 Thread Venki Yedidha
Thanks Mark. Eagerly waiting for Riak 1.3. Venkatesh On Wed, Nov 14, 2012 at 11:37 AM, Mark Phillips wrote: > Hi Venki, > > I know this email is about two months late, but I figured it was worth > responding to as I've had this response as a draft for the last 60 > days or so. :) > > At

Re: mysterious Riak problems

2012-11-14 Thread David Lowell
I've quieted down most other Riak traffic on this system, and we still get this crashy behavior even under light load. Right now, it's 100% reproducible that if I run 'kv.bucket("ctv_tvdata").get_keys()' on this host, Riak logs hundreds of errors about leveldb workers crashing. We don't see lev

Re: Riak 1.2.1 memory usage

2012-11-14 Thread Matthew Von-Maszewski
See: http://docs.basho.com/riak/latest/tutorials/choosing-a-backend/LevelDB/ Look for the section titled "Parameter Planning". It has the best content. Keep in mind that leveldb maps most of its files into memory. So the RSS (resident set size) is BOTH the memory you allocate via parameters a

Re: mysterious Riak problems

2012-11-14 Thread David Lowell
Hi Matthew, I would like to understand these leveldb stalls, for sure. However, I wonder if they are a separate issue? I've been sitting here with a live tail of all the leveldb LOG files, along with the riak console log tailing side by side. And I've seen three periods whereby we get hundreds

Re: mysterious Riak problems

2012-11-14 Thread David Lowell
The 512 vnodes will run on 5 physical nodes in production, but we're running all 512 on a single node in dev. And it's on one of these single node "clusters" that we're seeing these issues. Dave -- Dave Lowell d...@connectv.com On Nov 14, 2012, at 1:35 PM, Matthew Von-Maszewski wrote: > Dave,

Re: mysterious Riak problems

2012-11-14 Thread Matthew Von-Maszewski
Dave, The problem seems most pronounced when you are averaging 6 megabyte values. Honestly, my previous test suite only included 150k values. The WriteThrottle is NOT giving you the support you need in this situation (you need it to be throttling a tad more). I need to think on how to help

Re: mysterious Riak problems

2012-11-14 Thread David Lowell
Thanks Matthew. Yep, there are quite a few hits on 'waiting'. Interesting. I'll send the merged log separately. Dave -- Dave Lowell d...@connectv.com On Nov 14, 2012, at 10:43 AM, Matthew Von-Maszewski wrote: > Dave, > > Ok, heavy writes. Let's see if leveldb has hit one of its intentional

Riak 1.2.1 memory usage

2012-11-14 Thread David Lowell
What major factors drive memory usage in Riak? As users of Riak 1.2.1 with eleveldb back-end, I had figured that the dominant source of memory use would be the eleveldb cache. However, we're using the default 8 MB of cache per vnode, which would therefore max out at 4 GB of cache for our 512 vno
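The cache arithmetic in this message can be checked with a minimal sketch. The function name is hypothetical, and the sketch assumes the 8 MB default is a per-vnode upper bound and that 1 GB = 1024 MB:

```python
def max_leveldb_cache_gb(vnodes: int, cache_mb_per_vnode: int) -> float:
    """Upper bound on total cache, assuming every vnode fills its own cache."""
    return vnodes * cache_mb_per_vnode / 1024


if __name__ == "__main__":
    # 512 vnodes at the default 8 MB each, as described in the message.
    print(max_leveldb_cache_gb(512, 8))  # 4.0
```

This covers only the configured cache; it says nothing about other contributors to resident memory.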

Re: mysterious Riak problems

2012-11-14 Thread Matthew Von-Maszewski
Dave, Ok, heavy writes. Let's see if leveldb has hit one of its intentional "stalls": > sort /var/db/riak/leveldb/*/LOG* | grep -i waiting See if that shows any indication of stall in the LOG files of leveldb. If so, pick one server and send me a combined LOG file from that server: sort /var
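For readers who prefer a script, the first command above (sort the LOG files, then keep lines mentioning "waiting") can be approximated in Python. This is a rough sketch, not Basho tooling: the function name is hypothetical, and Python sorts by code point, which may differ from the ordering of `sort` under some locales.

```python
import glob


def find_stall_lines(log_glob: str) -> list[str]:
    """Return sorted lines from matching files that contain 'waiting' (case-insensitive)."""
    lines = []
    for path in glob.glob(log_glob):
        # errors="replace" keeps the scan going if a LOG file has odd bytes.
        with open(path, errors="replace") as handle:
            lines.extend(line.rstrip("\n") for line in handle)
    return [line for line in sorted(lines) if "waiting" in line.lower()]


if __name__ == "__main__":
    # Path pattern taken from the message above.
    for line in find_stall_lines("/var/db/riak/leveldb/*/LOG*"):
        print(line)
```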

Re: mysterious Riak problems

2012-11-14 Thread David Lowell
Thanks Matthew. I've run both greps with no hits, unfortunately. A couple of details that I want to highlight. Since I first posted about this issue, we upgraded from Riak 1.2.0 to 1.2.1. Following that upgrade, we continue to see these periods of instability with errors in the logs like "riak_k

Re: Key removal using Bitcask and expiry_secs

2012-11-14 Thread Pavan Venkatesh
Hi Scott, To further add to this conversation, the key is removed immediately when there is a get request on that expired key, as you indicated in the email. Pavan On 11/13/12 7:38 AM, "Sean Cribbs" wrote: >Hi again Scott, > >Expired keys in bitcask will be removed from disk when >compaction/m

MapReduce queries fail while node is starting

2012-11-14 Thread Nico Meyer
Hi, since we upgraded our riak cluster from 0.14 to 1.2, we see MapReduce queries failing while any of the nodes is starting but not yet ready (that is while the 'Waiting for service riak_kv' message still appears in the logs). This is quite problematic, since it takes almost 50 minutes for t

Re: mysterious Riak problems

2012-11-14 Thread Matthew Von-Maszewski
Dave, Just getting my head back into the game. Was away for a few days. Random thought, maybe there is a hard drive with a read problem. That can cause issues similar to this. 1.2.1 does NOT percolate the read errors seen in leveldb to riak-admin (yes, that should start to happen in 1.3).

Re: mysterious Riak problems

2012-11-14 Thread Alexander Nilsson
> > Hmm. Thanks for the extensive info. We're looking into this. Give us a few > hours to do some theorizing. You'll hear from us tomorrow unless another > enterprising list member has thoughts here. > Was there any update to this? We had the exact same issue with one of the nodes in our 35

Re: More Migration Questions

2012-11-14 Thread Martin Woods
I'd still be interested in an answer from the Basho folk regarding the product questions. Putting to one side the "how" for now, *should* Riak be able to satisfy any or all those three scenarios? Regards, Martin. On 13 November 2012 23:34, Shane McEwan wrote: > G'day Tom and Matt. Thanks for yo