Thanks David. If there is anything I can do from this end to help, please don't hesitate to ask.
--gordon

On Jun 7, 2011, at 15:34 , David Smith wrote:

> Gordon,
>
> Thanks for the test case. I've queued it up for review by a dev, as
> time permits.
>
> D.
>
> On Tue, Jun 7, 2011 at 1:33 PM, Gordon Tillman <gtill...@mezeo.com> wrote:
>> Guys I have put together a simple test to reproduce the error that we are
>> seeing.
>> It is on github here:
>> https://github.com/gordyt/riaksearch-test
>> This is a multi-threaded test that connects to Riak using the protocol
>> buffers interface. Each iteration of the run loop issues one simple search
>> and uploads one small json object.
>> Thanks very much for any input you might have.
>> Regards,
>> --gordon
>> On Jun 6, 2011, at 10:01 , Gordon Tillman wrote:
>>
>> Good Morning Gilbert,
>> I have posted this gist:
>> https://gist.github.com/1010384
>> It is a minor update we made to it_op_collector_loop/3 in
>> riak_search_op_utils. This update was done to alleviate the situation that
>> we observe here:
>> https://gist.github.com/1000735
>> But it was made with the understanding that this is treating a symptom and
>> not fixing the cause of the problem.
>> A little bit of followup information: The problem seems to be exacerbated
>> when Riak is hit with a series of operations that are all generating the
>> same search/map/reduce operation (albeit with differing search input
>> parameters).
>> We installed 0.14.2 and tested this weekend (without our update applied) and
>> observed the same issues.
>> If I find out anything else I will let you know.
>> --gordon
>>
>>
>>
>> On May 31, 2011, at 18:09 , Gilbert Glåns wrote:
>>
>> Gordon,
>>
>> Great news! Much appreciated.
>>
>> Gilbert
>>
>> On Tue, May 31, 2011 at 2:25 PM, Gordon Tillman <gtill...@mezeo.com> wrote:
>>
>> Howdy Gilbert,
>>
>> Hey we are testing a fix now. If this works I will send you a copy of the
>> update file.
>>
>> --gordon
>>
>>
>> On May 31, 2011, at 12:55 , Gilbert Glåns wrote:
>> >> Hi Gordon,
>> >> Thank you for sharing the information. We are seeing the same exact
>> >> type of behavior from our search cluster. I have tracked the
>> >> problem(s) through the query system. It looks like the mailboxes we
>> >> are both seeing are "abandoned" and / or the messages are never
>> >> matched within the Erlang code (it_op_collector_loop,
>> >> riak_search_op_utils.erl); the messages are then never processed,
>> >> therefore the resources they utilize are never released. This is a major
>> >> problem.
>> >> I have been debugging this for some time and I wish I could say it was
>> >> going well. The implementation is convoluted -- have you gotten
>> >> through it? Can you verify the same cause?
>> >> We have been internally discussing the possibility of removing this
>> >> query processing implementation completely and replacing it with
>> >> something built in-house because the problems we have uncovered trying
>> >> to debug the "abandoned mailbox" problem are related and systemic: 1)
>> >> indeterminate and possibly very large data structures created and
>> >> manipulated for intermediate and final sets of results, 2) very poor
>> >> or non-existent ability to gain any insight into what is executing
>> >> within the "plumbing" of the current query execution system without
>> >> "herculean" effort (in my opinion), and 3) unacceptable performance
>> >> (predictably or subjectively) from the merge_index riak_search
>> >> backend.
>> >> Are there any other backends available for riak_search with the
>> >> Enterprise Riak offering? I really like the design of riak_search but
>> >> the performance seems to be only a very small fraction of our
>> >> equivalent SOLR installation, even with several times the amount of
>> >> resources "thrown at it" -- it does not seem to use resources we
>> >> "throw at it" well, either, or in the mailboxes case, responsibly.
>> >> I will quickly admit I may be doing something wrong. Is there a
>> >> user-error situation in which mailboxes should be abandoned taking up
>> >> memory?
>> >> Does anyone else have experiences with equivalent riak_search vs. SOLR
>> >> installations?
>> >> Thanks again for sharing Gordon. Your results make me feel like this
>> >> may not be entirely stupidity on my part.
>> >> Gilbert
>> >>
>> On Tue, May 31, 2011 at 8:51 AM, Gordon Tillman <gtill...@mezeo.com> wrote:
>> >> Howdy Gilbert,
>> >> I reproduced the issue this morning and then ran the command that you
>> >> specified on two of the non-empty mailboxes.
>> >> The output from that is posted here:
>> >> https://gist.github.com/1000735
>> >> Please let me know if this corresponds to the issue that you are seeing.
>> >> Thank you,
>> >> --gordon
>> >> On May 27, 2011, at 20:10 , Gilbert Glåns wrote:
>> >> Gordon,
>> >> Could you try:
>> >> erlang:process_info(list_to_pid("<0.16614.32>"), [messages,
>> >> current_function, initial_call, links, memory, status]).
>> >> in a riak search console for one/some of those mailboxes and share the
>> >> results? I am curious to see if you are having the same systemic
>> >> memory consumption I am experiencing.
>> >> Gilbert
>> >> On Fri, May 27, 2011 at 5:15 PM, Gordon Tillman <gtill...@mezeo.com> wrote:
>> >> Howdy Gang,
>> >> We are having a bit of an issue with our 3-node riaksearch cluster. What is
>> >> happening is this:
>> >> Cluster is up and running. We start testing our application against it. As
>> >> the application runs the erlang process consumes more and more memory
>> >> without ever releasing it.
>> >> In trying to investigate the issue we ran the riaksearch-admin cluster_info
>> >> command. It appears that the bulk of this memory is being consumed by a
>> >> bunch of mailboxes.
>> >> I have posted both the output of the cluster_info command and the app.config
>> >> from one of the nodes here:
>> >> https://gist.github.com/996419
>> >> I would be very grateful if someone from Basho would take a look at the
>> >> cluster_info and see if they can spot anything obvious.
>> >> Each machine in the cluster has an 8-core Xeon and 16GB RAM. I believe all
>> >> of the platform details, etc., are in the cluster_info dump.
>> >> Many thanks,
>> >> --gordon
>> >> _______________________________________________
>> >> riak-users mailing list
>> >> riak-users@lists.basho.com
>> >> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>>
>
> --
> Dave Smith
> Director, Engineering
> Basho Technologies, Inc.
> diz...@basho.com