Unit testing persistence
Hi all ~

Great meetup today - looking forward to upgrading to 1.4. I have a question Mark suggested posting here, which we then discussed with a few other folks too: how do we unit / integration test persistence with Riak?

Given a basic dev environment, e.g. a single physical Riak node running locally with all configs at their defaults, how do we reliably read the data we just wrote? I have tried setting DW=all (durable write - recommended for best consistency in the financial example from The Little Riak Book, in the section for developers covering more than N/R/W), and I have also tried {delete_mode, keep} in the riak_kv section of app.config (since I truncate the buckets after each test suite). Still, I get intermittent test failures, because occasionally the data isn't available for reading right after writing.

Please note I'm trying to avoid mocking / stubbing, as well as hacks like "keep trying to read until a timeout". Ideally I'm looking for a simple configuration or any known best practices.

Thanks
~W
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
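[Editor's note: a minimal sketch of the per-request quorum options one could try for read-your-writes in tests, assuming riak-ruby-client. The helper names are mine, and whether options like :notfound_ok are honored per-request depends on the client and Riak version.]

```ruby
begin
  require 'riak' # riak-ruby-client gem; only needed to actually run the calls
rescue LoadError
end

# Strictest per-request options for a write followed by a read.
# Note: even on a single physical dev node the default n_val is 3,
# so :all means "all 3 (virtual) replicas".
STRICT_WRITE_OPTS = { w: :all, dw: :all, returnbody: true }.freeze
STRICT_READ_OPTS  = { r: :all, notfound_ok: false }.freeze

def store_strict(client, bucket_name, key, json)
  obj = Riak::RObject.new(client.bucket(bucket_name), key)
  obj.content_type = 'application/json'
  obj.data = json
  obj.store(STRICT_WRITE_OPTS) # waits for durable writes on all replicas
end

def fetch_strict(client, bucket_name, key)
  client.bucket(bucket_name).get(key, STRICT_READ_OPTS)
end
```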
Secondary indexes in ruby (using riak-ruby-client)
Hi,

I'm trying to get a hello world example working as follows:

    require 'riak'

    client = Riak::Client.new
    bucket = client.bucket('revisions')

    object = Riak::RObject.new(bucket, 'foo').tap do |o|
      o.content_type = 'application/json'
      o.data = ''
    end
    object.indexes[:bars_bin] = %w(foo bar)
    object.store

    bucket.get_index 'bars_bin', 'foo'

But it is failing with:

    Zlib::DataError: incorrect header check
      from /Users/wagner/.rbenv/versions/2.0.0-p195/lib/ruby/2.0.0/net/http/response.rb:357:in `finish'

Any help would be appreciated. More error details below:

    -- Control frame information ---
    c:0024 p:0040 s:0122 e:000122 BLOCK  /crunchbase/open_source/riak-ruby-client/lib/riak/client/net_http_backend.rb:65 [FINISH]
    c:0023 p:      s:0119 e:000118 CFUNC  :each
    c:0022 p:0035 s:0116 e:000115 BLOCK  /crunchbase/open_source/riak-ruby-client/lib/riak/client/net_http_backend.rb:58
    c:0021 p:0014 s:0113 e:000112 BLOCK  /Users/wagner/.rbenv/versions/2.0.0-p195/lib/ruby/2.0.0/net/http.rb:1413
    c:0020 p:0033 s:0111 e:000110 METHOD /Users/wagner/.rbenv/versions/2.0.0-p195/lib/ruby/2.0.0/net/http/response.rb:162
    c:0019 p:0096 s:0106 e:000105 BLOCK  /Users/wagner/.rbenv/versions/2.0.0-p195/lib/ruby/2.0.0/net/http.rb:1412 [FINISH]
    c:0018 p:      s:0104 e:000103 CFUNC  :catch
    c:0017 p:0024 s:0100 e:000099 METHOD /Users/wagner/.rbenv/versions/2.0.0-p195/lib/ruby/2.0.0/net/http.rb:1403
    c:0016 p:0061 s:0093 e:000092 METHOD /Users/wagner/.rbenv/versions/2.0.0-p195/lib/ruby/2.0.0/net/http.rb:1376
    c:0015 p:0017 s:0086 E:002090 BLOCK  /crunchbase/open_source/riak-ruby-client/lib/riak/client/net_http_backend.rb:56 [FINISH]
    c:0014 p:      s:0083 e:000082 CFUNC  :tap
    c:0013 p:0190 s:0080 E:001ba0 METHOD /crunchbase/open_source/riak-ruby-client/lib/riak/client/net_http_backend.rb:54
    c:0012 p:0032 s:0070 e:000069 METHOD /crunchbase/open_source/riak-ruby-client/lib/riak/client/http_backend/transport_methods.rb:44
    c:0011 p:0239 s:0063 e:000062 METHOD /crunchbase/open_source/riak-ruby-client/lib/riak/client/http_backend.rb:297
    c:0010 p:0016 s:0053 e:000052 BLOCK  /crunchbase/open_source/riak-ruby-client/lib/riak/client.rb:266
    c:0009 p:0010 s:0050 e:000049 BLOCK  /crunchbase/open_source/riak-ruby-client/lib/riak/client.rb:434
    c:0008 p:0078 s:0046 E:001ac0 METHOD /Users/wagner/.rbenv/versions/2.0.0-p195/lib/ruby/gems/2.0.0/gems/innertube-1.0.2/lib/innertube.rb:127
    c:0007 p:0050 s:0040 E:001a50 METHOD /crunchbase/open_source/riak-ruby-client/lib/riak/client.rb:432
    c:0006 p:0012 s:0032 e:000031 METHOD /crunchbase/open_source/riak-ruby-client/lib/riak/client.rb:285
    c:0005 p:0035 s:0028 e:000027 METHOD /crunchbase/open_source/riak-ruby-client/lib/riak/client.rb:138
    c:0004 p:0011 s:0024 E:001da8 METHOD /crunchbase/open_source/riak-ruby-client/lib/riak/client.rb:265
    c:0003 p:0020 s:0017 e:000016 METHOD /crunchbase/open_source/riak-ruby-client/lib/riak/bucket.rb:179
    c:0002 p:0088 s:0011 E:001930 EVAL   bench/2i.rb:14 [FINISH]
    c:0001 p:      s:0002 E:000228 TOP    [FINISH]

    bench/2i.rb:14:in `<main>'
    /crunchbase/open_source/riak-ruby-client/lib/riak/bucket.rb:179:in `get_index'
    /crunchbase/open_source/riak-ruby-client/lib/riak/client.rb:265:in `get_index'
    /crunchbase/open_source/riak-ruby-client/lib/riak/client.rb:138:in `backend'
    /crunchbase/open_source/riak-ruby-client/lib/riak/client.rb:285:in `http'
    /crunchbase/open_source/riak-ruby-client/lib/riak/client.rb:432:in `recover_from'
    /Users/wagner/.rbenv/versions/2.0.0-p195/lib/ruby/gems/2.0.0/gems/innertube-1.0.2/lib/innertube.rb:127:in `take'
    /crunchbase/open_source/riak-ruby-client/lib/riak/client.rb:434:in `block in recover_from'
    /crunchbase/open_source/riak-ruby-client/lib/riak/client.rb:266:in `block in get_index'
    /crunchbase/open_source/riak-ruby-client/lib/riak/client/http_backend.rb:297:in `get_index'
    /crunchbase/open_source/riak-ruby-client/lib/riak/client/http_backend/transport_methods.rb:44:in `get'
    /crunchbase/open_source/riak-ruby-client/lib/riak/client/net_http_backend.rb:54:in `perform'
    /crunchbase/open_source/riak-ruby-client/lib/riak/client/net_http_backend.rb:54:in `tap'
    /crunchbase/open_source/riak-ruby-client/lib/riak/client/net_http_backend.rb:56:in `block in perform'
    /Users/wagner/.rbenv/versions/2.0.0-p195/lib/ruby/2.0.0/net/http.rb:1376:in `request'
    /Users/wagner/.rbenv/versions/2.0.0-p195/lib/ruby/2.0.0/net/http.rb:1403:in `transport_request'
    /Users/wagner/.rbenv/versions/2.0.0-p195/lib/ruby/2.0.0/net/http.rb:1403:in `catch'
    /Users/wagner/.rbenv/versions/2.0.0-p195/lib/ruby/2.0.0/net/http.rb:1412:in `block in transport_request'
    /Users/wagner/.rbenv/versions/2.0.0-p195/lib/ruby/2.0.0/net/http/response.rb:162:in `reading_body'
    /Users/wagner/.rbenv/versions/2.0.0-p195/lib/ruby/2.0.0/net/http.rb:1413:in `block (2 levels) in transport_request'
    /crunchbase/open_source/riak-ruby-client/lib/riak/client/net_http_backend.rb:58:in `block (2 levels) in perform'
    /crunchbase/open_source/riak-ruby-client/lib/riak/client/net_http_backend.rb:58:in `each'
    /cr
Re: Secondary indexes in ruby (using riak-ruby-client)
Yeah, that error wasn't helping much, and you're right - I was using Bitcask, and now it's working with LevelDB. Thanks!

On Tue, Sep 17, 2013 at 11:51 PM, Charl Matthee wrote:
> Hi,
>
> On 17 September 2013 23:43, Wagner Camarao wrote:
>
> > bucket.get_index 'bars_bin', 'foo'
> >
> > But am failing with:
> >
> > Zlib::DataError: incorrect header check
> >   from /Users/wagner/.rbenv/versions/2.0.0-p195/lib/ruby/2.0.0/net/http/response.rb:357:in `finish'
>
> I think the Zlib error is obscuring what's really happening in the
> background.
>
> What backend are you using?
>
> If it is Bitcask then this will not work, and you need to switch to one
> that supports 2i, like LevelDB:
>
> https://github.com/basho/riak-ruby-client/wiki/Secondary-Indexes#how-secondary-indexes-aka-2i-work
>
> --
> Ciao
>
> Charl
>
> "I will either find a way, or make one." -- Hannibal
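[Editor's note: for anyone hitting the same wall, the backend swap Charl describes is a one-line change in app.config, followed by a node restart. This sketch assumes the default Riak 1.4.x app.config layout.]

```erlang
%% app.config -- riak_kv section (Riak 1.4.x layout assumed).
%% Bitcask, the default backend, has no secondary index support;
%% eLevelDB does.
{riak_kv, [
    {storage_backend, riak_kv_eleveldb_backend}
    %% ... other riak_kv settings unchanged ...
]},
```

Data already written under Bitcask is not migrated by this change; reload it after switching.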
2i at large scale?
Hi,

I'm benchmarking 2i at a scale of a billion records, running one physical node locally with mostly default configs - except for LevelDB instead of Bitcask. Up to this point (14MM records in the bucket being indexed) it's still performing lookups well for my use case (reads ~7ms using riak-ruby-client over HTTP).

However, along the way I've seen Riak go down twice. The first time (8MM records) I could just start it again and continue my benchmarking from where it left off, but the second time (14MM records), when I started Riak again, it took about 3 minutes to respond to my first request.

What was happening during those long startup minutes after my second crash?

Up to what scale have you been successfully using secondary indexes?

Any other ideas given my use case / benchmarking scenario?

Thanks,
Wagner
Re: 2i at large scale?
Vincenzo / Jared,

That makes sense, and yes, I'm planning on benchmarking on a real cluster, although I don't have one set up yet. For now I'll adjust my local config per Jared's recommendation.

Reid,

Thanks for bringing that up, and yes, I'm using paginated 2i.

Has anyone here stopped using 2i because it stopped scaling well at some point? If so, at what point was that? Any quantifiable information would be helpful. I'm asking this again because I received a strong recommendation not to use 2i at large scale, and to instead implement such capabilities with basic Riak key/value operations. Does anyone share a similar experience? Or a different one altogether?

Thanks,
Wagner

On Wed, Sep 25, 2013 at 12:04 PM, Reid Draper wrote:
> Wagner,
>
> Are you using paginated 2i? Or asking for all of the results buffered at
> once? If the latter, I'd highly recommend trying the new paginated 2i that
> landed in Riak 1.4.0.
>
> Reid
>
> On Sep 25, 2013, at 10:50 AM, Wagner Camarao wrote:
>
> > Hi,
> >
> > I'm benchmarking 2i at a scale of a billion records, running one
> > physical node locally with mostly default configs - except for LevelDB
> > instead of Bitcask. Up to this point (14MM records in the bucket that's
> > being indexed) it's still performing lookups well for my use case
> > (read ~ 7ms using riak-ruby-client over http).
> >
> > However, along this process I've noticed Riak go down twice. The first
> > time (8MM records) I could just start it again and continue my
> > benchmarking from where it left off, but the second time (14MM records),
> > when I started Riak again, it took about 3 minutes to respond to my
> > first request.
> >
> > What was happening during these long startup minutes, after my second
> > crash?
> >
> > Up to which scale have you been successfully using secondary indexes?
> >
> > Any other ideas given my use case / benchmarking scenario?
> >
> > Thanks,
> > Wagner
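[Editor's note: to make Reid's suggestion concrete, here is a sketch of driving the paginated 2i API from riak-ruby-client. The helper name is mine; max_results and continuation are the pagination options that shipped alongside Riak 1.4, assuming a client version that supports them.]

```ruby
begin
  require 'riak' # riak-ruby-client; not needed to exercise the loop itself
rescue LoadError
end

# Stream every key matching a 2i query one page at a time, instead of
# buffering the whole result set in one request. Larger pages (e.g. 1000)
# mean far fewer round trips than the page size of 10 used in the thread.
def each_indexed_key(bucket, index, query, page_size: 1000)
  continuation = nil
  loop do
    page = bucket.get_index(index, query,
                            max_results: page_size,
                            continuation: continuation)
    page.each { |key| yield key }       # page behaves like an Array of keys
    continuation = page.continuation    # opaque token for the next page
    break if continuation.nil?          # no token means no more results
  end
end
```

Usage would look like `each_indexed_key(bucket, 'bars_bin', 'foo') { |k| ... }`.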
Re: 2i at large scale?
What size page are you using (ie. how many results per query)?
A: 10

When you said the results would sometimes take three minutes, is that per page, or to paginate through all of the results?
A: I don't observe any difference between paging the first page or any subsequent one. That wait time happens only when I start Riak and run the first request (whatever that request is: paging, a key lookup, or just getting a reference to a bucket).

Are you observing CPU and memory utilization while these stalls happen?
A: Yes, I watch both CPU and memory, but nothing goes above 50% utilization. Running bucket.keys (yes, not recommended) is the only way I get CPU usage close to 100%. I don't do that as part of my benchmarking; I did it a couple of times just to see it happen, and none of those times took Riak down.

Do you have swap disabled?
A: Nope. Why would you recommend doing so? Could there be any downsides?

On Wed, Sep 25, 2013 at 2:18 PM, Reid Draper wrote:
>
> On Sep 25, 2013, at 3:39 PM, Wagner Camarao wrote:
>
> Reid,
> Thanks for bringing that up and yes I'm using paginated 2i.
>
>
> * What size page are you using (ie. how many results per query)?
> * When you said the results would sometimes take three minutes, is that
> per page, or to paginate through all of the results?
> * Are you observing CPU and memory utilization while these stalls happen?
> * Do you have swap disabled?
>
> Reid
Re: 2i at large scale?
Hey Reid,

Just wanted to say thanks for your last thoughts on my 2i thread. I'm sorry I didn't reply in a timely manner - I only just realized that now.

Best
~W

On Wed, Sep 25, 2013 at 3:44 PM, Reid Draper wrote:
>
> On Sep 25, 2013, at 4:35 PM, Wagner Camarao wrote:
>
> What size page are you using (ie. how many results per query)?
>
> A: 10
>
>
> If I understood correctly, you're paginating through a few million results
> total. If so, I'd try setting your page size much larger: try 1000.
>
>
> When you said the results would sometimes take three minutes, is that per
> page, or to paginate through all of the results?
>
> A: I don't observe any difference by paging the first or any next page.
> That wait time is only when I start Riak and run the first request
> (whatever that request is: paging, a key lookup, or just getting a
> reference to a bucket).
>
>
> So performance levels out after the first couple of requests? If so, this
> might be explained by the leveldb cache being built up.
>
>
> Are you observing CPU and memory utilization while these stalls happen?
>
> A: Yes, I see both CPU and memory, but nothing above 50% utilization.
> Doing bucket.keys (yes, not recommended) is the only way I get my CPU
> usage close to 100%. I don't do that as part of my benchmarking. I did it
> a couple of times just to see it happen, and actually none of those times
> took Riak down.
>
>
> Do you have swap disabled?
>
> A: Nope. Why would you recommend doing so? Could there be any downsides?
>
>
> You can read about our rationale for this here [1].
>
> [1] http://docs.basho.com/riak/latest/ops/tuning/linux/
Issue with 2i sorting on Linux
I'm using 2i to model "a document has many revisions", so the index name is a lowercase string and the indexed values are formatted as timestamp_uuid. When I query 2i to grab the latest revisions on OS X, they're always returned in the correct order - as the Riak docs say, 2i results are sorted first by index value and then by key. But when I run the same query on Linux, the results come back in random order.

Any idea what I could be doing wrong? Are there any known issues with 2i sorting on Linux? What further details could I provide?

I have considered OS X filesystem case-insensitivity, but index names and values are all lowercase. I have checked that the timestamps have the same number of digits, since they're concatenated with UUIDs. I'm running riak-1.4.2 with the backend storage set to eleveldb in app.config on both Linux and OS X.

Thanks
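[Editor's note: one thing worth ruling out is the index-value format itself. 2i terms sort as binaries, so lexicographic order only matches chronological order when the timestamp portion is fixed-width and zero-padded. A small sketch of the timestamp_uuid convention described above; the helper name and the 20-digit width are my assumptions.]

```ruby
require 'securerandom'

# Build a "timestamp_uuid" index value whose lexicographic (binary)
# order matches chronological order: a fixed-width, zero-padded epoch
# timestamp followed by a lowercase UUID.
def revision_index_value(epoch_seconds, uuid = SecureRandom.uuid)
  format('%020d_%s', epoch_seconds, uuid.downcase)
end

# A 2i range query over a time window would then look something like:
#   bucket.get_index('revisions_bin', low_value..high_value)
# (riak-ruby-client accepts a Range for 2i range queries)
```

If timestamps are not padded to the same width (e.g. "99_..." vs "100_..."), binary order and chronological order disagree, which can look like "random" sorting.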