Re: riaksearch memory growth issues

2011-05-31 Thread Gordon Tillman
Howdy Gilbert,

I reproduced the issue this morning and then ran the command that you specified 
on two of the non-empty mailboxes.

The output from that is posted here:

https://gist.github.com/1000735

Please let me know if this corresponds to the issue that you are seeing.

Thank you,

--gordon


On May 27, 2011, at 20:10 , Gilbert Glåns wrote:

Gordon,
Could you try:

erlang:process_info(list_to_pid("<0.16614.32>"), [messages,
current_function, initial_call, links, memory, status]).

in a riak search console for one/some of those mailboxes and share the
results? I am curious to see if you are having the same systemic
memory consumption I am experiencing.

Gilbert

On Fri, May 27, 2011 at 5:15 PM, Gordon Tillman <gtill...@mezeo.com> wrote:

Howdy Gang,

We are having a bit of an issue with our 3-node riaksearch cluster.  What is 
happening is this:

Cluster is up and running.  We start testing our application against it.  As 
the application runs, the Erlang process consumes more and more memory without 
ever releasing it.

In trying to investigate the issue we ran the riaksearch-admin cluster_info 
command.  It appears that the bulk of this memory is being consumed by a bunch 
of mailboxes.

I have posted both the output of the cluster_info command and the app.config 
from one of the nodes here:

https://gist.github.com/996419

I would be very grateful if someone from Basho would take a look at the 
cluster_info and see if they can spot anything obvious.

Each machine in the cluster has an 8-core Xeon and 16GB RAM.  I believe all of 
the platform details, etc., are in the cluster_info dump.

Many thanks,

--gordon
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com



Re: haproxy and protobuffers working example?

2011-05-31 Thread Mark Phillips
Hey Bob,

Any objections if I clean this up and add it to the "Other" section of
Riak Function Contrib? It's clearly not a function, but I think others
would find this useful (as Scott did) as a point of reference.

http://contrib.basho.com/other-functions.html

Let me know and I'll throw together a pull request for your review.

Mark

On Mon, May 30, 2011 at 4:55 PM, Bob Feldbauer  wrote:
> Your haproxy.cfg looks fine, but since you asked, here's my haproxy.cfg
> for Riak use:
>
> global
>        user haproxy
>        group haproxy
>        daemon
>        maxconn 10
>
> defaults
>        log     global
>        mode    tcp
>        option  tcplog
>        option  dontlognull
>        balance leastconn
>        clitimeout      6
>        srvtimeout      6
>        contimeout      5000
>        retries  3
>        option  redispatch
>        option contstats
>        stats enable
>
> listen riak 192.168.1.1:8087
>        server riak1 riak1:8087 weight 1 maxconn 1000
>        server riak2 riak2:8087 weight 1 maxconn 1000
>        server riak3 riak3:8087 weight 1 maxconn 1000
>
>
> Good luck!
>
> - Bob Feldbauer
>
> On 5/30/2011 7:11 PM, Scott M. Likens wrote:
>>
>> Hey,
>>
>> So I thought I had a working haproxy configuration with protobuffers (It
>> worked once upon a time, but no longer) so I was wondering if anyone had any
>> working examples?
>>
>> 
>> irb(main):001:0>  client = Riak::Client.new(:protocol =>  "pbc")
>> =>  #
>> irb(main):002:0>  client.ping
>> Riak::ProtobuffsFailedRequest: Expected success from Riak but received
>> server_error. Unexpected EOF on PBC socket
>> 
>>
>> 
>> irb(main):008:0>  client2=Riak::Client.new(:protocol =>  "pbc", :host =>
>>  "10.170.121.116")
>> =>  #
>> irb(main):009:0>  client2.ping
>> =>  true
>> 
>>
>> If I remove all the servers except for one, it connects but still doesn't
>> really work.  Below is my haproxy configuration; I hope it helps.
>>
>>   listen riak_pbc :8087
>>   mode tcp
>>   server app-0 ip-10-170-121-116.us-west-1.compute.internal:8087 check inter 5000 fastinter 1000 fall 1 weight 50
>>   server app-1 ip-10-171-43-226.us-west-1.compute.internal:8087 check inter 5000 fastinter 1000 fall 1 weight 50
>>   server app-2 ip-10-170-142-202.us-west-1.compute.internal:8087 check inter 5000 fastinter 1000 fall 1 weight 50
>>
>> Actual full haproxy erb chef template can be found here
>> https://github.com/damm/ey-riak/blob/master/templates/default/haproxy.cfg.erb
>> (if you see anything I should turn off for http please let me know and be
>> gentle)
>


Re: haproxy and protobuffers working example?

2011-05-31 Thread Sean Cribbs
Actually, I think it should be part of the wiki, and have opened an issue to 
that effect. https://github.com/basho/riak_wiki/issues/106

Sean Cribbs 
Developer Advocate
Basho Technologies, Inc.
http://basho.com/

On May 31, 2011, at 12:05 PM, Mark Phillips wrote:

> Hey Bob,
> 
> Any objections if I clean this up and add it to the "Other" section of
> Riak Function Contrib? It's clearly not a function, but I think others
> would find this useful (as Scott did) as a point of reference.
> 
> http://contrib.basho.com/other-functions.html
> 
> Let me know and I'll throw together a pull request for your review.
> 
> Mark
> 


riak not starting, Reason: {badmatch,already_exists}

2011-05-31 Thread David Mitchell
Hi, one of the riak nodes is not starting.  I am getting  
{badmatch,already_exists} in sasl-error.log.

At first, I was getting 'eacces' errors, so I changed all of the files to be 
owned by "root:root".  Now I am getting {badmatch,already_exists}.

In erlang.log.1, I see:
[{application,riak_kv},{exited,{shutdown,{riak_kv_app,start,[normal,[]]}}},{type,permanent}]
/home/me/dev/poc/riak/riak/rel/riak/lib/os_mon-2.2.5/priv/bin/memsup: Erlang has closed.
Erlang has closed
{"Kernel pid terminated",application_controller,"{application_start_failure,riak_kv,{shutdown,{riak_kv_app,start,[normal,[]]}}}"}

Crash dump was written to: erl_crash.dump
Kernel pid terminated (application_controller) ({application_start_failure,riak_kv,{shutdown,{riak_kv_app,start,[normal,[]]}}})

In my etc/vm.args, I have -name riak@10.0.60.208 (which is a unique name).

Any ideas on why I cannot start riak?

David

___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: riak not starting, Reason: {badmatch,already_exists}

2011-05-31 Thread Dan Reverri
Hi David,

Can you check to see if you have any beam processes running (perhaps another
Riak instance)?
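
A minimal sketch of that check, assuming a Unix-like host with a POSIX-style ps. The function name filter_erlang_procs is made up for illustration; it only filters a process listing down to Erlang VM (beam, beam.smp) and epmd entries:

```shell
# Filter a process listing down to Erlang VM (beam, beam.smp) and epmd lines.
# Reads the listing on stdin, so it works with live `ps` output or a saved file.
# The name filter_erlang_procs is hypothetical, not part of Riak or Erlang.
filter_erlang_procs() {
  # Match the executable name only when it follows a space or a slash and is
  # followed by a space or end of line; "|| true" keeps an empty result from
  # being treated as a failure.
  grep -E '(^|[/ ])(beam(\.smp)?|epmd)( |$)' || true
}

# Typical use, listing processes of every user on the host:
#   ps -eo pid,user,args | filter_erlang_procs
```

An empty result means no Erlang VM is running; a lone epmd line only means the port mapper daemon is still up.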

Thanks,
Dan

Daniel Reverri
Developer Advocate
Basho Technologies, Inc.
d...@basho.com




Re: riaksearch memory growth issues

2011-05-31 Thread Gilbert Glåns
Hi Gordon,
Thank you for sharing the information.  We are seeing exactly the same
type of behavior from our search cluster.  I have tracked the
problem(s) through the query system.  It looks like the mailboxes we
are both seeing are "abandoned" and/or the messages are never
matched within the Erlang code (it_op_collector_loop,
riak_search_op_utils.erl); the messages are then never processed, and
therefore the resources they use are never released.  This is a major
problem.

I have been debugging this for some time and I wish I could say it was
going well.  The implementation is convoluted -- have you gotten
through it?  Can you verify the same cause?

We have been internally discussing the possibility of removing this
query processing implementation completely and replacing it with
something built in-house because the problems we have uncovered trying
to debug the "abandoned mailbox" problem are related and systemic:  1)
indeterminate and possibly very large data structures created and
manipulated for intermediate and final sets of results, 2) very poor
or non-existent ability to gain any insight into what is executing
within the "plumbing" of the current query execution system without
"herculean" effort (in my opinion), and 3) unacceptable performance
(predictably or subjectively) from the merge_index riak_search
backend.

Are there any other backends available for riak_search with the
Enterprise Riak offering?  I really like the design of riak_search but
the performance seems to be only a very small fraction of our
equivalent SOLR installation, even with several times the amount of
resources "thrown at it" -- it does not seem to use resources we
"throw at it" well, either, or in the mailboxes case, responsibly.

I will quickly admit I may be doing something wrong.  Is there a
user-error situation in which mailboxes should be abandoned taking up
memory?

Does anyone else have experiences with equivalent riak_search vs. SOLR
installations?

Thanks again for sharing Gordon.  Your results make me feel like this
may not be entirely stupidity on my part.

Gilbert




RE: riak not starting, Reason: {badmatch,already_exists}

2011-05-31 Thread David Mitchell
Hi Dan,

There are no other beam processes running.  The only thing running that is 
related to riak is "riak/erts-5.7.5/bin/epmd -daemon".

David





Re: riak not starting, Reason: {badmatch,already_exists}

2011-05-31 Thread Dan Reverri
Hi David,

Can you confirm you have built Riak from source? What branch/tag are you
using? What version of Erlang are you using?

I would suggest undoing the "root:root" ownership change. Do you have any
data in the Riak instance or is this a fresh install? How are you starting
Riak (the exact command)? Can you provide the full set of log files from the
node?

Thanks,
Dan

Daniel Reverri
Developer Advocate
Basho Technologies, Inc.
d...@basho.com




Re: riak not starting, Reason: {badmatch,already_exists}

2011-05-31 Thread Dan Reverri
Hi David,

This looks similar to bug 989:
https://issues.basho.com/show_bug.cgi?id=989

Can you remove all mr_queue data files and restart the node?
rm data/mr_queue/*
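
If you would rather keep a copy than delete outright, here is a minimal sketch that moves the files aside instead. It assumes the node is stopped and that you pass the Riak release directory (the one containing data/); the function name backup_mr_queue and the mr_queue.bak.<timestamp> directory are made up for illustration:

```shell
# Move the contents of data/mr_queue aside instead of deleting them.
# Usage: backup_mr_queue <riak_root>
# backup_mr_queue and the mr_queue.bak.<timestamp> name are hypothetical.
backup_mr_queue() {
  root="$1"
  src="$root/data/mr_queue"
  dest="$root/data/mr_queue.bak.$(date +%Y%m%d%H%M%S)"
  [ -d "$src" ] || { echo "no such directory: $src" >&2; return 1; }
  mkdir -p "$dest" || return 1
  # Move every non-hidden entry, the same set that `rm data/mr_queue/*` removes.
  # The mr_queue directory itself is left in place.
  for f in "$src"/*; do
    [ -e "$f" ] || continue
    mv "$f" "$dest"/ || return 1
  done
  # Print where the files went so they can be inspected or removed later.
  echo "$dest"
}
```

Run it as backup_mr_queue /path/to/riak, then restart the node; the backup directory can be removed once the node is healthy.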

This issue has been fixed as of Riak 0.14.1. You can perform a rolling
upgrade if you'd like to upgrade your Riak cluster:
http://wiki.basho.com/Rolling-Upgrades.html

Thanks,
Dan

Daniel Reverri
Developer Advocate
Basho Technologies, Inc.
d...@basho.com


On Tue, May 31, 2011 at 12:03 PM, David Mitchell wrote:

> Hi Dan,
>
> Yes, it has been built from source, and it was running.
>
> We got it from git://github.com/basho/riak.git in February: Riak 0.14.0,
> aka "Dakota."
>
> We installed Erlang from http://erlang.org/download/otp_src_R13B04.tar.gz
>
> OK, I undid the “root:root” change.
>
> Yes, there is some data.  I was writing to the second instance, and the
> data was propagated from the second instance to the other two instances.  My
> map/reduce functions were very slow, and I was getting errors on the first
> instance.  So, I tried to restart the first instance.  Now, the first
> instance will not start.
>
> Attached are the logs.
>
> Thank you for your help.
>
> David


Re: riaksearch memory growth issues

2011-05-31 Thread Gordon Tillman
Howdy Gilbert,

Hey we are testing a fix now.  If this works I will send you a copy of the 
update file.

--gordon




Re: riaksearch memory growth issues

2011-05-31 Thread Gilbert Glåns
Gordon,

Great news!  Much appreciated.

Gilbert

On Tue, May 31, 2011 at 2:25 PM, Gordon Tillman  wrote:
> Howdy Gilbert,
>
> Hey we are testing a fix now.  If this works I will send you a copy of the 
> update file.
>
> --gordon
>
>
> On May 31, 2011, at 12:55 , Gilbert Glåns wrote:
>
>> Hi Gordon,
>> Thank you for sharing the information.  We are seeing the same exact
>> type of behavior from our search cluster.  I have tracked the
>> problem(s) though the query system.  It looks like the mailboxes we
>> are both seeing are "abandoned" and / or the messages are never
>> matched within the Erlang code (it_op_collector_loop,
>> riak_search_op_utils.erl); the messages are then never processed,
>> therefore the resources they utilize never released.  This is a major
>> problem.
>>
>> I have been debugging this for some time and I wish I could say it was
>> going well.  The implementation is convoluted -- have you gotten
>> through it?  Can you verify the same cause?
>>
>> We have been internally discussing the possibility of removing this
>> query processing implementation completely and replacing it with
>> something built in-house because the problems we have uncovered trying
>> to debug the "abandoned mailbox" problem are related and systemic:  1)
>> indeterminate and possibly very large data structures created and
>> manipulated for intermediate and final sets of results, 2) very poor
>> or non-existent ability to gain any insight into what is executing
>> within the "plumbing" of the current query execution system without
>> "herculean" effort (in my opinion), and 3) unacceptable performance
>> (predictably or subjectively) from the merge_index riak_search
>> backend.
>>
>> Are there any other backends available for riak_search with the
>> Enterprise Riak offering?  I really like the design of riak_search but
>> the performance seems to be only a very small fraction of our
>> equivalent SOLR installation, even with several times the amount of
>> resources "thrown at it" -- it does not seem to use resources we
>> "throw at it" well, either, or in the mailboxes case, responsibly.
>>
>> I will quickly admit I may be doing something wrong.  Is there a
>> user-error situation in which mailboxes should be abandoned taking up
>> memory?
>>
>> Does anyone else have experiences with equivalent riak_search vs. SOLR
>> installations?
>>
>> Thanks again for sharing Gordon.  Your results make me feel like this
>> may not be entirely stupidity on my part.
>>
>> Gilbert
>>
>>
>> On Tue, May 31, 2011 at 8:51 AM, Gordon Tillman  wrote:
>>> Howdy Gilbert,
>>> I reproduced the issue this morning and then ran the command that you
>>> specified on two of the non-empty mailboxes.
>>> The output from that is posted here:
>>> https://gist.github.com/1000735
>>> Please let me know if this corresponds to the issue that you are seeing.
>>> Thank you,
>>> --gordon
>>>
>>> On May 27, 2011, at 20:10 , Gilbert Glåns wrote:
>>>
>>> Gordon,
>>> Could you try:
>>>
>>> erlang:process_info(list_to_pid("<0.16614.32>"), [messages,
>>> current_function, initial_call, links, memory, status]).
>>>
>>> in a riak search console for one/some of those mailboxes and share the
>>> results? I am curious to see if you are having the same systemic
>>> memory consumption I am experiencing.
>>>
>>> Gilbert
>>>
>>> On Fri, May 27, 2011 at 5:15 PM, Gordon Tillman  wrote:
>>>
>>> Howdy Gang,
>>>
>>> We are having a bit of an issue with our 3-node riaksearch cluster.  What is
>>> happing is this:
>>>
>>> Cluster is up and running.  We start testing our application against it.  As
>>> the application runs the erlang process consumes more and more memory
>>> without ever releasing it.
>>>
>>> In trying to investigate the issue we ran the riaksearch-admin cluster_info
>>> command.  It appears that the bulk of this memory is being consumed by a
>>> bunch of mailboxes.
>>>
>>> I have posted both the output of the cluster_info command and the app.config
>>> from one of the nodes here:
>>>
>>> https://gist.github.com/996419
>>>
>>> I would be very grateful if someone from Basho would take a look at the
>>> cluster_info and see if they can spot anything obvious.
>>>
>>> Each machine in the cluster has an 8-core Xeon and 16GB RAM.  I believe all
>>> of the platform details, etc., are in the cluster_info dump.
>>>
>>> Many thanks,
>>>
>>> --gordon
>>>
>
>



deleting keys

2011-05-31 Thread Dingwell, Robert A.
Hi,

When deleting a key from a bucket I'm noticing that the object associated with
the key is gone but the key itself is still sticking around.  I loop through all
of the keys in a bucket and then call delete on each one; the object for the
key is then gone, so if I try to get the object for that key I get a 404 as
expected.  But if I look at the bucket in the browser with the keys=true
parameter, all of the keys are still there.  Is this normal and if so how do I
get rid of the keys?

Thanks



Re: deleting keys

2011-05-31 Thread Keith Bennett
Robert -

Until a source code change a few days ago, riak would by default cache the keys 
reported to be in a bucket, so after fetching them once they would not be 
updated after deletions, additions, etc.  The key is indeed gone, but the keys 
API did not report the change.

If you go to the message archive at 
http://lists.basho.com/pipermail/riak-users_lists.basho.com/2011-May/thread.html,
 and search for "Riak cleint resources", you'll see the ruckus that I started a 
week and a half ago about this very subject. ;)

There is an option to force the reloading of keys but I forget what it is, and 
anyway it is now gone from the current code base since the strategy was 
changed.  Be warned that using the keys method is, anyway, as Sean Cribbs 
pointed out to me, in general an awful idea, and almost always should be 
avoided.  This is because it's a very expensive operation -- in order to 
accomplish it, all keys in the data store need to be accessed.

My guess is that testing for the exception you encountered is probably the best 
way to test for existence/absence of a key, but hopefully those more 
knowledgeable than I will enlighten us on that.
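
For illustration, a minimal sketch of that existence check in Python.  It
assumes the node's HTTP interface serves objects at /riak/<bucket>/<key>
(the same interface the keys=true listing above is read from); the function
name key_exists, the base_url argument and the opener parameter are
hypothetical names introduced only for this sketch, not part of any Riak
client.

```python
import urllib.error
import urllib.parse
import urllib.request


def key_exists(base_url, bucket, key, opener=urllib.request.urlopen):
    """Return True if a HEAD request for the key succeeds, False on a 404.

    base_url is an assumed node address such as "http://127.0.0.1:8098".
    opener is injectable so the function can be exercised without a cluster.
    """
    url = "{}/riak/{}/{}".format(
        base_url.rstrip("/"),
        urllib.parse.quote(bucket, safe=""),
        urllib.parse.quote(key, safe=""),
    )
    request = urllib.request.Request(url, method="HEAD")
    try:
        with opener(request):
            return True
    except urllib.error.HTTPError as err:
        if err.code == 404:
            # The "not found" response Robert saw after deleting.
            return False
        # Any other status is left for the caller to interpret.
        raise
```

This asks about one key at a time, so it avoids walking every key in the
store the way a key listing does.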

- Keith

On May 31, 2011, at 7:23 PM, Dingwell, Robert A. wrote:

> Hi,
> 
> When deleting a key from a bucket I'm noticing that the object associated
> with the key is gone but the key itself is still sticking around.  I loop
> through all of the keys in a bucket and then call delete on each one; the
> object for the key is then gone, so if I try to get the object for that key I
> get a 404 as expected.  But if I look at the bucket in the browser with the
> keys=true parameter, all of the keys are still there.  Is this normal and if
> so how do I get rid of the keys?
> 
> Thanks




Re: deleting keys

2011-05-31 Thread Sean Cribbs
Robert,

What Keith said is misleading -- that key cache was solely in the Ruby client 
driver and not part of Riak itself.

In Riak, deletes have two phases; in the first, so-called "tombstones" are 
written to the partitions that own replicas of the key.  The tombstone has 
special metadata marking it as such and an empty value, but has a descendant 
vector clock from the last known value. In the second phase, the tombstones are 
read back from the replicas, and iff they all are tombstones (that is, all 
replicas respond, and all are tombstones), a reaping command is sent such that 
they will be cleared from the backend.

In your case, what may have occurred is that the replica chosen for key-listing 
did not receive the tombstone write (only 1/n_val of all partitions are 
consulted for key-lists), or had not yet received the reaping command. When you 
read the key again, you obviously get a "not found" because the other replicas 
will resolve to a tombstone. Eventually your read requests will invoke 
read-repair, updating the stale partition and causing the value to be reaped.
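
The sequence above can be sketched as a toy, in-memory model in Python.  This
is not Riak code: Replica, delete, reap, get and list_keys are hypothetical
names made up for the illustration, vector clocks are left out, and the
tombstone is assumed to win over any older value.

```python
from dataclasses import dataclass, field

TOMBSTONE = object()  # stand-in for a value carrying tombstone metadata


@dataclass
class Replica:
    store: dict = field(default_factory=dict)
    reachable: bool = True


def reap(replicas, key):
    """Phase 2: reap only if every replica responds and holds a tombstone."""
    if all(r.reachable and r.store.get(key) is TOMBSTONE for r in replicas):
        for r in replicas:
            del r.store[key]
        return True
    return False


def delete(replicas, key):
    """Phase 1: write tombstones to the replicas that can be reached."""
    for r in replicas:
        if r.reachable:
            r.store[key] = TOMBSTONE
    return reap(replicas, key)


def list_keys(replica):
    """A key listing reports whatever this one replica still stores."""
    return sorted(replica.store)


def get(replicas, key):
    """Read from all replicas; a tombstone resolves to 'not found'."""
    found = [r.store.get(key) for r in replicas if r.reachable]
    if any(v is TOMBSTONE for v in found):
        # Read-repair: bring stale replicas up to date, then try to reap.
        for r in replicas:
            if r.reachable:
                r.store[key] = TOMBSTONE
        reap(replicas, key)
        return None
    return next((v for v in found if v is not None), None)


replicas = [Replica({"k": "v"}) for _ in range(3)]
replicas[2].reachable = False        # this replica misses the tombstone write
reaped = delete(replicas, "k")       # False: not all replicas responded
replicas[2].reachable = True
stale_listing = list_keys(replicas[2])   # the key is still listed here
value = get(replicas, "k")               # None, and read-repair reaps the key
```

In this model a listing taken from the third replica still shows the key
after the delete, while a read returns "not found" and clears it everywhere.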

The moral of the story here is, again, don't rely on key-listings for strong 
indications of cluster state.

Sean Cribbs 
Developer Advocate
Basho Technologies, Inc.
http://basho.com/

On May 31, 2011, at 8:12 PM, Keith Bennett wrote:

> Robert -
> 
> Until a source code change a few days ago, riak would by default cache the 
> keys reported to be in a bucket, so after fetching them once they would not 
> be updated after deletions, additions, etc.  The key is indeed gone, but the 
> keys API did not report the change.
> 
> If you go to the message archive at 
> http://lists.basho.com/pipermail/riak-users_lists.basho.com/2011-May/thread.html,
>  and search for "Riak cleint resources", you'll see the ruckus that I started 
> a week and a half ago about this very subject. ;)
> 
> There is an option to force the reloading of keys but I forget what it is, 
> and anyway it is now gone from the current code base since the strategy was 
> changed.  Be warned that using the keys method is, anyway, as Sean Cribbs 
> pointed out to me, in general an awful idea, and almost always should be 
> avoided.  This is because it's a very expensive operation -- in order to 
> accomplish it, all keys in the data store need to be accessed.
> 
> My guess is that testing for the exception you encountered is probably the 
> best way to test for existence/absence of a key, but hopefully those more 
> knowledgeable than I will enlighten us on that.
> 
> - Keith
> 
> On May 31, 2011, at 7:23 PM, Dingwell, Robert A. wrote:
> 
>> Hi,
>> 
>> When deleting a key from a bucket I'm noticing that the object associated
>> with the key is gone but the key itself is still sticking around.  I loop
>> through all of the keys in a bucket and then call delete on each one; the
>> object for the key is then gone, so if I try to get the object for that key I
>> get a 404 as expected.  But if I look at the bucket in the browser with the
>> keys=true parameter, all of the keys are still there.  Is this normal and if
>> so how do I get rid of the keys?
>> 
>> Thanks
> 
> 

