new cluster -- riak starts and dies on each vm.
I'm setting up a new small Riak cluster: 8 VMs with 4 GB each. Riak starts and then dies after a few seconds on each VM, and I'd like to figure out why.

I'm running Riak 1.4.1, built from source on Ubuntu 12.04 (32-bit). I have AAE and search turned off and my backend is set to leveldb. Ring size is 64. All the other settings are default.

The crash.log files show errors that look like these:

    2013-09-14 22:48:25 =SUPERVISOR REPORT
        Supervisor: {local,riak_core_vnode_proxy_sup}
        Context:    shutdown_error
        Reason:     {{{function_clause,
                       [{riak_kv_vnode,terminate,
                         [{bad_return_value,
                           {stop,{db_open,"IO error: ./data/leveldb/1027618338748291114361965898003636498195577569280/02.dbtmp: Cannot allocate memory"}}},
                          undefined],
                         [{file,"src/riak_kv_vnode.erl"},{line,838}]},
                        {riak_core_vnode,terminate,3,[{file,"src/riak_core_vnode.erl"},{line,849}]},
                        {gen_fsm,terminate,7,[{file,"gen_fsm.erl"},{line,586}]},
                        {proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,227}]}]},
                      {gen_fsm,sync_send_event,[<0.19541.3>,wait_for_init,infinity]}},
                     {gen_server,call,
                      [riak_core_vnode_manager,
                       {205523667749658222872393179600727299639115513856,riak_kv_vnode,get_vnode},
                       infinity]}}
        Offender:   [{pid,<0.471.0>},
                     {name,{riak_kv_vnode,205523667749658222872393179600727299639115513856}},
                     {mfargs,{riak_core_vnode_proxy,start_link,
                              [riak_kv_vnode,205523667749658222872393179600727299639115513856]}},
                     {restart_type,permanent},{shutdown,5000},{child_type,worker}]

    2013-09-14 22:48:25 =SUPERVISOR REPORT
        Supervisor: {local,riak_core_sup}
        Context:    child_terminated
        Reason:     shutdown
        Offender:   [{pid,<0.151.0>},
                     {name,riak_core_vnode_proxy_sup},
                     {mfargs,{riak_core_vnode_proxy_sup,start_link,[]}},
                     {restart_type,permanent},{shutdown,5000},{child_type,supervisor}]

I don't understand why Riak "Cannot allocate memory" when my system memory use is under 20%. Is there a config parameter I must change?

Thanks,
Daniel

___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
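[Editor's note] One detail worth checking here: this is a 32-bit build, so each Erlang VM has only roughly 2-3 GB of address space, and with a ring size of 64 on 8 nodes every node runs 8 leveldb vnodes whose block caches and mmap'd table files all share that space. leveldb can then fail to allocate even while the OS reports plenty of free RAM. A sketch of the eleveldb knobs involved, assuming an app.config-style setup (the values below are illustrative, not recommendations):

```erlang
%% app.config, eleveldb section -- illustrative values only.
%% On a 32-bit VM, roughly (cache_size + open-file/mmap overhead) x
%% vnodes-per-node must fit inside the process's ~2-3 GB address space.
{eleveldb, [
    {data_root, "./data/leveldb"},
    {cache_size, 8388608},   %% 8 MB block cache per vnode
    {max_open_files, 20}     %% fewer mmap'd table files per vnode
]}
```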
Performance testing different virtual machine cluster configurations
Hi group,

I've been happily using Riak in production on VMs for years. It's time to scale up, so I'm running performance tests with basho_bench and my own app code on a new VM cluster. Here's an example question I'd like to answer: will a cluster perform better with 5 VMs at 8 GB each, or 10 VMs at 4 GB each? The cost of the VMs is the same, and I'm guessing the performance will not be exactly the same.

As I'm going to the trouble, I'll share my results. I hope this will be interesting to other people, especially with all the changes in Riak 2.0. I really like Riak, so I want to make sure I give it a fair showing. Please review my config and initial results below.

So far I've set up the first cluster of 5 VMs with 8 GB each. I'm driving the test off a separate (6th) VM. I anticipate I'll set up the second cluster with 10 VMs at 4 GB each in the next few days. Please let me know if you have any suggestions.

Thanks!
Daniel

--

initial results:

basho_bench riakc_pb for 5-VM cluster, 8 GB on each machine:
https://www.dropbox.com/s/mpt9ame3t082lct/5vm-8gb-riakc_pb.png

basho_bench counters for 5-VM cluster, 8 GB on each machine:
https://www.dropbox.com/s/8eyp1zbpy5h16s1/5vm-8gb-counters.png

riak.conf used for these tests:
https://www.dropbox.com/s/krufixa2wmnaxum/base.riak.conf.txt

I'm currently loading records into the cluster through my app code. After 18h and 68M records, they're still going in at 1k records/second. The records are around 750 bytes each. They're each indexed with either 4 or 6 different 2i values. Search indexing is on as well. I'm going to use this test dataset to compare 2i vs. search for my application.
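[Editor's note] As a sanity check on the numbers quoted above (68M records, ~750 bytes each, going in at ~1k records/second), the implied dataset sizes and load time work out as follows. The n_val = 3 replication factor is an assumption; it is Riak's default, not something stated in the post:

```python
# Back-of-envelope sizing from the figures in the post above.
records = 68_000_000      # records loaded so far
record_bytes = 750        # approximate record size
rate_per_sec = 1_000      # observed ingest rate
n_val = 3                 # ASSUMED: Riak's default replication factor

raw_gb = records * record_bytes / 1e9
replicated_gb = raw_gb * n_val
load_hours = records / rate_per_sec / 3600

print(f"raw data:         {raw_gb:.0f} GB")        # ~51 GB
print(f"with n_val={n_val}:     {replicated_gb:.0f} GB")  # ~153 GB
print(f"load time so far: {load_hours:.1f} h")     # ~18.9 h, matching "after 18h"
```

The ~18.9 h result agrees with the "After 18h and 68M records" observation, which suggests the 1k records/second rate has held steady over the whole load.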
system stuff:

VM plan through Linode:
    8 GB RAM
    6 CPU Cores
    192 GB SSD Storage
    8 TB Transfer
    40 Gbit Network In
    1000 Mbit Network Out

filesystem:
    /dev/xvda on / type ext4 (rw,noatime,errors=remount-ro)

file limits:
    riak soft nofile 65536
    riak hard nofile 65536
    root soft nofile 65536
    root hard nofile 65536
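[Editor's note] For reference, a basho_bench run like the riakc_pb one above is driven by a small Erlang-terms config file. A minimal sketch follows; the option names are from basho_bench's shipped examples (see examples/riakc_pb.config in the basho_bench repo), but the host list, key space, and operation mix here are placeholders, not the settings used for the graphs above:

```erlang
%% riakc_pb.config -- minimal basho_bench config (placeholder values).
{mode, max}.                           %% run as fast as possible
{duration, 10}.                        %% minutes
{concurrent, 32}.                      %% worker processes
{driver, basho_bench_driver_riakc_pb}.
{riakc_pb_ips, [{127,0,0,1}]}.         %% one entry per cluster node
{key_generator, {int_to_bin_bigendian, {uniform_int, 1000000}}}.
{value_generator, {fixed_bin, 750}}.   %% ~750-byte records, as in the post
{operations, [{get, 4}, {put, 1}]}.    %% 4:1 read/write mix
```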
simple riaksearch mapreduce search input syntax question
Hi list,

I've got a simple question of syntax, and I can't find it documented anywhere. I have a working riaksearch javascript mapreduce call that looks like the following:

    {"inputs": {
         "module": "riak_search",
         "function": "mapred_search",
         "arg": ["tub-0.3", "title:Daniel"]
     },
     "query": [{"map": {"language": "javascript",
                        "source": "function(v) { return [v.values[0].data]; }",
                        "keep": true}}]
    }

I simply want to limit the number of results returned. I'd like to do this with the 'rows' parameter into search (as opposed to a limit in the map phase). I've tried adding "&rows=10" to my "arg" value after "title:Daniel", but this doesn't work. What's the solution?
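[Editor's note] For experimenting with variations of the request above, the job body is easier to tweak when built programmatically. This sketch just reproduces the working job from the message; the index name "tub-0.3" and the query string come straight from the post, and nothing new (no 'rows' handling) is added:

```python
import json

# Rebuild the working riak_search MapReduce job from the message above,
# making the structure of the "inputs" section explicit.
job = {
    "inputs": {
        "module": "riak_search",
        "function": "mapred_search",
        "arg": ["tub-0.3", "title:Daniel"],
    },
    "query": [
        {"map": {
            "language": "javascript",
            "source": "function(v) { return [v.values[0].data]; }",
            "keep": True,
        }},
    ],
}

# This is the JSON body you would POST to the /mapred endpoint.
print(json.dumps(job, indent=2))
```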
riaksearch performance, row limit, sorting not necessary
Hi list,

I'm wondering how riaksearch performance will degrade as I add documents. For my purposes I limit rows at 1k, and sorting is not necessary. I have a single-node cluster for development. I know I can increase performance if I add nodes, but I'd like to understand this before I do.

My documents are small, ~200 bytes. With an index of 30k documents and rows limited to 1k, there were no problems. When I added 100k documents, I hit the too_many_results error. Since I still have my row limit set at 1k, this indicates that the query does not return as soon as it finds the first 1k hits. Is there a way to short-circuit my queries so that they don't have to scan the whole index?

I got around too_many_results by increasing my max_search_results (I read https://help.basho.com/entries/480664-i-get-the-error-too-many-results). I wonder, though, if I'll keep bumping into memory boundaries as I add a few million docs to my index.

Thanks,
Daniel
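[Editor's note] For reference, the max_search_results cap mentioned above lives in the riak_search section of app.config. A sketch, with the value purely illustrative:

```erlang
%% app.config, riak_search section -- the value here is illustrative.
{riak_search, [
    {enabled, true},
    {max_search_results, 100000}  %% Solr-interface queries that would queue
                                  %% more results than this are aborted
]}
```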
Re: riaksearch performance, row limit, sorting not necessary
To be clear, I'm only talking about the Solr interface. I'm wondering whether my query time will remain fixed (since it's capped at rows=1000) as I add several million docs to the index.

If I use my search as an input into Map/Reduce, won't my response time grow with my index? My search query would queue up a very large result set, and I expect performance to suffer if I trim this down in a reduce phase. It would seem that I can prevent that slowdown by limiting the rows in the search (with rows=1000). Despite that limit, though, I hit the too_many_results error, which indicates that the search queues up a very large result set before it applies the row limit. Is there something I'm missing here?

thanks,
Daniel

On Thu, Apr 14, 2011 at 7:53 AM, Gordon Tillman wrote:
> Daniel the max_search_results only applies to searches done via the solr
> interface. From
> http://lists.basho.com/pipermail/riak-users_lists.basho.com/2011-January/002974.html :
>
>   - System now aborts queries that would queue up too many documents in
>     a result set. This is controlled by a 'max_search_results' setting
>     in riak_search. Note that this only affects the Solr
>     interface. Searches through the Riak Client API that feed into a
>     Map/Reduce job are still allowed to execute because the system
>     streams those results.
>
> So you can use a map-reduce operation (with the search phase providing the
> inputs) and you should be OK.
>
> --gordon
>
> On Apr 14, 2011, at 04:49 , Daniel Rathbone wrote:
>
> Hi list,
>
> I'm wondering how riaksearch performance will degrade as I add documents.
> For my purpose I limit rows at 1k and sorting is not necessary. I have a
> single node cluster for development. I know I can increase performance if I
> add nodes but I'd like to understand this before I do.
>
> My documents are small ~200 bytes. With an index of 30k and rows limited
> to 1k, no problems. I added 100k documents, and then I hit
> the too_many_results error. Since I still have my row limit set at 1k, this
> indicates that the query does not return as soon as it finds the first 1k
> hits. Is there a way to short circuit my queries so that they don't have to
> scan the whole index?
>
> I got around too_many_results by increasing my max_search_results (I read
> https://help.basho.com/entries/480664-i-get-the-error-too-many-results).
> I wonder, though, if I'll keep bumping memory boundaries as I add a few
> million docs to my index.
>
> Thanks,
> Daniel
Re: Attempting to use secondary indexes
I'm trying out 2i, and I'm having a little trouble similar to this old thread. I don't see any keys returned when I try to query a secondary index.

My storage_backend is set to riak_kv_eleveldb_backend. I seem to have added values to the secondary index well enough - I can see a header "X-Riak-Meta-All: 2" returned when I request the object by its regular key.

When I try to query the 2i,

    curl -v "127.0.0.1:8071/buckets/cpg_mb/index/all_int/0/5"

I simply get

    {"keys":[]}

Is there any other configuration I need to set? I simply changed my backend to eLevelDB and restarted my (single) node. Any other ideas?

Thanks,
Daniel

On Thu, Aug 4, 2011 at 10:35 AM, Craig Muth wrote:
> > you'll need to look in etc/app.config in the release directory and
> > change the value for the storage_backend setting to riak_kv_index_backend
>
> Thanks.
>
> How stable is the riak_kv_index_backend? If we're aiming to go to prod in
> a couple months, would we be better advised to use riak search? It seems as
> though its functionality is a superset of secondary indexing, though more
> painful to implement.
>
> --Craig
>
> On Wed, Aug 3, 2011 at 9:48 PM, Jeremiah Peschka <jeremiah.pesc...@gmail.com> wrote:
>
>> I believe 1.0 is scheduled for November.
>>
>> ---
>> Jeremiah Peschka
>> Founder, Brent Ozar PLF, LLC
>>
>> On Aug 3, 2011, at 9:47 PM, Antonio Rohman Fernandez wrote:
>>
>> > Another question... the old problem of querying smaller buckets with
>> > MapReduce is resolved with secondary indexes?
>> > Following this pattern => curl
>> > http://127.0.0.1:8098/buckets/loot/index/category_bin/eq/armor
>> > I think Riak doesn't have to check all the buckets/keys in memory to
>> > match the one you look for as it happened on a MapReduce, right? So
>> > http://127.0.0.1:8098/buckets/rohman_messages/index/date_int/eq/20110801
>> > would give me all messages from Rohman (in a personalized bucket) for
>> > the 1st of August of 2011 in a much faster way, right?
>> > thanks
>> >
>> > Rohman
>> >
>> > On Thu, 04 Aug 2011 12:38:17 +0800, Antonio Rohman Fernandez wrote:
>> >
>> >> Seeing OSCON's PDF... when will we be able to have Riak 1.0 with
>> >> secondary indexes out? This is an improvement that can help me pretty
>> >> well on my project. Any ETA?
>> >> thanks
>> >>
>> >> Rohman
>> >>
>> >> On Wed, 3 Aug 2011 22:02:53 -0600, Kelly McLaughlin wrote:
>> >>
>> >> Craig,
>> >> The default backend is Bitcask, and if you want to use the indexes
>> >> you'll need to look in etc/app.config in the release directory and change
>> >> the value for the storage_backend setting to riak_kv_index_backend instead
>> >> of riak_kv_bitcask_backend. I suspect that's the problem. Cheers.
>> >> Kelly
>> >>
>> >> On Aug 3, 2011, at 9:42 PM, Craig Muth wrote:
>> >>
>> >> I'm running the code example from
>> >> http://rusty.basho.com.s3.amazonaws.com/Presentations/2011-OSCONData-Portland.pdf
>> >>
>> >> This succeeds:
>> >>
>> >>     curl \
>> >>       -X PUT \
>> >>       -d "OPAQUE_VALUE" \
>> >>       -H "x-riak-index-category_bin: armor" \
>> >>       -H "x-riak-index-price_int: 400" \
>> >>       http://127.0.0.1:8098/buckets/loot/keys/gauntlet24
>> >>
>> >> This verifies it's getting there:
>> >>
>> >>     curl http://127.0.0.1:8098/riak/loot/gauntlet24
>> >>     => OPAQUE_VALUE
>> >>
>> >> However, this finds nothing:
>> >>
>> >>     curl http://127.0.0.1:8098/buckets/loot/index/category_bin/eq/armor
>> >>     => {"keys":[]}
>> >>
>> >> Whereas the presentation says it should return {"keys":["gauntlet24"]}.
>> >> Any ideas? Just grabbed the latest from github master.
>> >> --Craig
>> >>
>> >> --
>> >> Antonio Rohman Fernandez
>> >> CEO, Founder & Lead Engineer
>> >> roh...@mahalostudio.com
>> >> Projects: MaruBatsu.es | PupCloud.com | Wedding Album

--
sent from Dan Rathbone's personal email account
http://danrathbone.com -- personal site
http://eastlakeblog.org -- neighborhood blog
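[Editor's note] The thread above turns on the difference between two header families: "X-Riak-Meta-*" carries plain user metadata, while "x-riak-index-<field>_int" / "x-riak-index-<field>_bin" headers are what actually populate a secondary index (as in the curl PUT above). A small illustrative helper (not part of any Riak client library) that separates the two when inspecting response headers:

```python
def split_riak_headers(headers):
    """Split HTTP headers into 2i entries and user metadata.

    Secondary-index headers follow the x-riak-index-<name>_<type>
    convention shown in the curl examples above; X-Riak-Meta-* headers
    are user metadata and are NOT queryable as a secondary index.
    """
    indexes, meta = {}, {}
    for name, value in headers.items():
        key = name.lower()
        if key.startswith("x-riak-index-"):
            field = key[len("x-riak-index-"):]   # e.g. "price_int"
            # *_int index values are integers, *_bin values are strings
            indexes[field] = int(value) if field.endswith("_int") else value
        elif key.startswith("x-riak-meta-"):
            meta[key[len("x-riak-meta-"):]] = value
    return indexes, meta

# Headers like those in the thread above:
hdrs = {
    "X-Riak-Meta-All": "2",               # user metadata only
    "x-riak-index-category_bin": "armor",
    "x-riak-index-price_int": "400",
}
indexes, meta = split_riak_headers(hdrs)
print(indexes)   # {'category_bin': 'armor', 'price_int': 400}
print(meta)      # {'all': '2'}
```

An object carrying only the "X-Riak-Meta-All" header would show an empty `indexes` dict here, which is exactly the symptom reported at the start of this thread.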
Re: Attempting to use secondary indexes
That cleared it up, thanks! Indeed, I was trying to add the index k/v as regular meta in the headers.

On Mon, Jan 9, 2012 at 12:37 AM, Russell Brown wrote:
> Hi Daniel,
>
> On 9 Jan 2012, at 08:22, Daniel Rathbone wrote:
>
> I'm trying out 2i, and I'm having a little trouble similar to this old
> thread. I don't see any keys returned when I try to query a secondary
> index.
>
> My storage_backend is set to riak_kv_eleveldb_backend. I seem to have
> added values to the secondary index well enough - I can see a header
> "X-Riak-Meta-All: 2" returned when I request the object by its regular
> key.
>
>
> How did you add an index to the object? A header of 'X-Riak-Meta-All'
> suggests that you added some user meta. I would hope to see a header like
> 'x-riak-index-all_int' in the response. You can add a secondary index like
> this:
>
>     curl -X POST -H 'x-riak-index-all_int: 2' -d 'YOUR DATA' \
>       http://localhost:8071/buckets/YOUR_BUCKET/keys/YOUR_KEY
>
> When I try to query the 2i,
>
>     curl -v "127.0.0.1:8071/buckets/cpg_mb/index/all_int/0/5"
>
>
> Your query looks right. You can read more on the Basho wiki here[1] and
> here[2].
>
> Cheers
>
> Russell
>
> [1] http://wiki.basho.com/HTTP-Store-Object.html
> [2] http://wiki.basho.com/HTTP-Secondary-Indexes.html
>
>
> I simply get
>
>     {"keys":[]}
>
> Is there any other configuration I need to set? I simply changed my
> backend to eLevelDB and restarted my (single) node. Any other ideas?
>
> Thanks,
> Daniel
>
> On Thu, Aug 4, 2011 at 10:35 AM, Craig Muth wrote:
>
>> > you'll need to look in etc/app.config in the release directory and
>> > change the value for the storage_backend setting to riak_kv_index_backend
>>
>> Thanks.
>>
>> How stable is the riak_kv_index_backend? If we're aiming to go to prod
>> in a couple months, would we be better advised to use riak search? It
>> seems as though its functionality is a superset of secondary indexing,
>> though more painful to implement.
>> >> --Craig
>>
>> On Wed, Aug 3, 2011 at 9:48 PM, Jeremiah Peschka <jeremiah.pesc...@gmail.com> wrote:
>>
>>> I believe 1.0 is scheduled for November.
>>>
>>> ---
>>> Jeremiah Peschka
>>> Founder, Brent Ozar PLF, LLC
>>>
>>> On Aug 3, 2011, at 9:47 PM, Antonio Rohman Fernandez wrote:
>>>
>>> > Another question... the old problem of querying smaller buckets with
>>> > MapReduce is resolved with secondary indexes?
>>> > Following this pattern => curl
>>> > http://127.0.0.1:8098/buckets/loot/index/category_bin/eq/armor
>>> > I think Riak doesn't have to check all the buckets/keys in memory to
>>> > match the one you look for as it happened on a MapReduce, right? So
>>> > http://127.0.0.1:8098/buckets/rohman_messages/index/date_int/eq/20110801
>>> > would give me all messages from Rohman (in a personalized bucket) for
>>> > the 1st of August of 2011 in a much faster way, right?
>>> > thanks
>>> >
>>> > Rohman
>>> >
>>> > On Thu, 04 Aug 2011 12:38:17 +0800, Antonio Rohman Fernandez wrote:
>>> >
>>> >> Seeing OSCON's PDF... when will we be able to have Riak 1.0 with
>>> >> secondary indexes out? This is an improvement that can help me pretty
>>> >> well on my project. Any ETA?
>>> >> thanks
>>> >>
>>> >> Rohman
>>> >>
>>> >> On Wed, 3 Aug 2011 22:02:53 -0600, Kelly McLaughlin wrote:
>>> >>
>>> >> Craig,
>>> >> The default backend is Bitcask, and if you want to use the indexes
>>> >> you'll need to look in etc/app.config in the release directory and change
>>> >> the value for the storage_backend setting to riak_kv_index_backend instead
>>> >> of riak_kv_bitcask_backend. I suspect that's the problem. Cheers.
>>> >> Kelly
>>> >>
>>> >> On Aug 3, 2011, at 9:42 PM, Craig Muth wrote:
>>> >>
>>> >> I'm running the code example from
>>> >> http://rusty.basho.com.s3.amazonaws.com/Presentations/2011-OSCONData-Portland.pdf
>>> >>
>>> >> This succeeds:
>>> >>
>>> >>     curl \
>>> >>       -X PUT \
>>> >>       -d "OPAQUE_VALUE" \
>>> >>       -H "x-riak-index-category_bin: armor" \
>>> >>       -H "x-riak-index-price_int: 400" \
>>