riak key sharding

2013-12-10 Thread Georgio Pandarez

I have noticed that Riak-CS can shard (that is, split) large keys
automatically across nodes. I would like to achieve a similar outcome with
Riak itself. Is there a best practice for this? Could a portion of
Riak-CS be reused, or should I just bite the bullet and use Riak-CS?

Latency is key for my application and I wanted to avoid the additional
layer Riak-CS provides.
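
A minimal sketch of client-side chunking, offered as one way to get a similar effect without Riak-CS. It is not how Riak-CS is implemented. The chunk size, the `key:chunk:N` naming scheme, and the `fetch` callback are all assumptions made for illustration. Because Riak hashes each key onto the ring, chunks stored under distinct keys would normally land on different partitions.

```python
import hashlib

CHUNK_SIZE = 1024 * 1024  # assumed block size; tune it against your latency budget


def split_value(key, value, chunk_size=CHUNK_SIZE):
    """Split a bytes value into (chunk_key, chunk_bytes) pairs plus a manifest."""
    chunks = []
    for index, offset in enumerate(range(0, len(value), chunk_size)):
        chunk_key = f"{key}:chunk:{index}"  # hypothetical naming scheme
        chunks.append((chunk_key, value[offset:offset + chunk_size]))
    manifest = {
        "key": key,
        "size": len(value),
        "chunk_keys": [chunk_key for chunk_key, _ in chunks],
        "sha256": hashlib.sha256(value).hexdigest(),
    }
    return manifest, chunks


def join_value(manifest, fetch):
    """Reassemble the value; fetch(chunk_key) must return the bytes of one chunk."""
    value = b"".join(fetch(chunk_key) for chunk_key in manifest["chunk_keys"])
    if hashlib.sha256(value).hexdigest() != manifest["sha256"]:
        raise ValueError("checksum mismatch while reassembling " + manifest["key"])
    return value
```

In practice the manifest would be stored under the original key and each chunk written as its own object, so a read costs one manifest fetch plus one fetch per chunk (which can be issued in parallel).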

Upgrade from 1.3.1 to 1.4.2 => high IO

2013-12-10 Thread Simon Effenberg
Hi @list,

I'm trying to upgrade our Riak cluster from 1.3.1 to 1.4.2. After
upgrading the first node (out of 12), this node seems to do many merges.
The sst_* directories change in size "rapidly" and the node has
a disk utilization of 100% all the time.

I know that there is something like that:

"The first execution of 1.4.0 leveldb using a 1.3.x or 1.2.x dataset
will initiate an automatic conversion that could pause the startup of
each node by 3 to 7 minutes. The leveldb data in "level #1" is being
adjusted such that "level #1" can operate as an overlapped data level
instead of as a sorted data level. The conversion is simply the
reduction of the number of files in "level #1" to being less than eight
via normal compaction of data from "level #1" into "level #2". This is
a one time conversion."

but what I see looks much more invasive than described here, or it has
nothing to do with the merges I am (probably) seeing.

Is this "normal" behavior, or can I do anything about it?

At the moment I'm stuck with the upgrade procedure because this high
IO load would probably lead to high response times.

Also, we have a lot of data (~950 GB per node).


Re: Upgrade from 1.3.1 to 1.4.2 => high IO

2013-12-10 Thread Simon Effenberg
Hi Matthew,

see inline..

On Tue, 10 Dec 2013 10:38:03 -0500
Matthew Von-Maszewski  wrote:

> The sad truth is that you are not the first to see this problem.  And yes, it 
> has to do with your 950GB per node dataset.  And no, nothing to do but sit 
> through it at this time.
> While I did extensive testing around upgrade times before shipping 1.4, 
> apparently there are data configurations I did not anticipate.  You are 
> likely seeing a cascade where a shift of one file from level-1 to level-2 is 
> causing a shift of another file from level-2 to level-3, which causes a 
> level-3 file to shift to level-4, etc … then the next file shifts from 
> level-1.
> The bright side of this pain is that you will end up with better write 
> throughput once all the compaction ends.

I have to deal with that.. but my problem now is that, if I do this
node by node, it looks like 2i searches aren't possible while 1.3 and
1.4 nodes both exist in the cluster. Is there any problem that would lead
to a 2i repair marathon, or can I simply wait a few hours for each
node until all merges are done before I upgrade the next one? (2i
searches can fail for some time; the app has no problem with
that. But are new inserts with 2i indices processed successfully, or do
I have to do the 2i repair?)


One other good thing: saving disk space is an advantage ;)..

> Riak 2.0's leveldb has code to prevent/reduce compaction cascades, but that 
> is not going to help you today.
> Matthew

Re: Stalled handoffs on a prod cluster after server crash

2013-12-10 Thread Simon Effenberg
I had something like that once but with version 1.2 or 1.3 .. a rolling
restart helped in my case.


On Mon, 9 Dec 2013 09:48:12 -0500
Ivaylo Panitchkov  wrote:

> Hello,
> We have a prod cluster of four machines running riak (1.1.4 2012-06-19)
> Debian x86_64.
> Two days ago one of the servers went down because of a hardware failure.
> I force-removed the machine in question to re-balance the cluster before
> adding the new machine.
> Since then the cluster is operating properly, but I noticed some handoffs
> are stalled now.
> I had a similar situation a while ago that was solved by simply forcing the
> handoffs, but this time the same approach didn't work.
> Any ideas, solutions or just hints are greatly appreciated.
> Below are the cluster statuses. I replaced the IP addresses for security reasons.
> ~# riak-admin member_status
> Attempting to restart script through sudo -u riak
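> ================================= Membership ==================================
> Status     Ring    Pending    Node
> -------------------------------------------------------------------------------
> valid      45.3%     34.4%    'r...@aaa.aaa.aaa.aaa'
> valid      26.6%     32.8%    'r...@bbb.bbb.bbb.bbb'
> valid      28.1%     32.8%    'r...@ccc.ccc.ccc.ccc'
> -------------------------------------------------------------------------------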
> Valid:3 / Leaving:0 / Exiting:0 / Joining:0 / Down:0
> ~# riak-admin ring_status
> Attempting to restart script through sudo -u riak
> ================================== Claimant ===================================
> Claimant:  'r...@aaa.aaa.aaa.aaa'
> Status: up
> Ring Ready: true
> ============================== Ownership Handoff ==============================
> Owner:  r...@aaa.aaa.aaa.aaa
> Next Owner: r...@bbb.bbb.bbb.bbb
> Index: 22835963083295358096932575511191922182123945984
>   Waiting on: [riak_kv_vnode]
>   Complete:   [riak_pipe_vnode]
> Index: 570899077082383952423314387779798054553098649600
>   Waiting on: [riak_kv_vnode]
>   Complete:   [riak_pipe_vnode]
> Index: 1118962191081472546749696200048404186924073353216
>   Waiting on: [riak_kv_vnode]
>   Complete:   [riak_pipe_vnode]
> Index: 1392993748081016843912887106182707253109560705024
>   Waiting on: [riak_kv_vnode]
>   Complete:   [riak_pipe_vnode]
> ---
> Owner:  r...@aaa.aaa.aaa.aaa
> Next Owner: r...@ccc.ccc.ccc.ccc
> Index: 114179815416476790484662877555959610910619729920
>   Waiting on: [riak_kv_vnode]
>   Complete:   [riak_pipe_vnode]
> Index: 662242929415565384811044689824565743281594433536
>   Waiting on: [riak_kv_vnode]
>   Complete:   [riak_pipe_vnode]
> Index: 1210306043414653979137426502093171875652569137152
>   Waiting on: [riak_kv_vnode]
>   Complete:   [riak_pipe_vnode]
> ---
> ============================== Unreachable Nodes ==============================
> All nodes are up and reachable
> Thanks in advance,
> Ivaylo
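
The member_status output quoted above shows how much ring ownership each node holds now (Ring) and how much it will hold once the pending handoffs finish (Pending). A minimal sketch for computing the difference per node follows. The regular expression is an assumption based only on the rows pasted in this thread, and the function name is made up for illustration.

```python
import re

# Matches rows such as: valid  45.3% 34.4%'r...@aaa.aaa.aaa.aaa'
ROW = re.compile(
    r"(?P<status>[a-z]+)\s+(?P<ring>\d+(?:\.\d+)?)%\s+"
    r"(?P<pending>\d+(?:\.\d+)?|--)%?\s*'(?P<node>[^']+)'"
)


def ownership_changes(member_status_output):
    """Return {node: pending% - ring%}; None when no pending value is shown."""
    changes = {}
    for line in member_status_output.splitlines():
        match = ROW.search(line)
        if not match:
            continue  # header, rule, or unrelated line
        if match.group("pending") == "--":
            changes[match.group("node")] = None
        else:
            delta = float(match.group("pending")) - float(match.group("ring"))
            changes[match.group("node")] = round(delta, 1)
    return changes
```

For the rows in this thread, the first node is due to give up roughly 11 percentage points of the ring, which is consistent with the handoffs from it that are still listed as waiting.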

Re: Upgrade from 1.3.1 to 1.4.2 => high IO

2013-12-10 Thread Matthew Von-Maszewski
The sad truth is that you are not the first to see this problem.  And yes, it 
has to do with your 950GB per node dataset.  And no, nothing to do but sit 
through it at this time.

While I did extensive testing around upgrade times before shipping 1.4, 
apparently there are data configurations I did not anticipate.  You are likely 
seeing a cascade where a shift of one file from level-1 to level-2 is causing a 
shift of another file from level-2 to level-3, which causes a level-3 file to 
shift to level-4, etc … then the next file shifts from level-1.

The bright side of this pain is that you will end up with better write 
throughput once all the compaction ends.

Riak 2.0's leveldb has code to prevent/reduce compaction cascades, but that is 
not going to help you today.
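
To illustrate the cascade described above, here is a toy model. It is not leveldb's actual compaction logic. The per-level file limits and the one-file-at-a-time spill rule are assumptions chosen only to show why one shift out of level-1 can ripple through every level below it when those levels are already full.

```python
def push_file(levels, limits, level=0):
    """Add one file to `level`, spilling one file downward while a level is over its limit.

    `levels` holds the file count per level (index 0 stands for level-1) and is
    updated in place. Returns the number of file shifts that were triggered.
    """
    levels[level] += 1
    moves = 0
    while level < len(levels) - 1 and levels[level] > limits[level]:
        levels[level] -= 1      # one file leaves the overfull level...
        levels[level + 1] += 1  # ...and lands in the next level down
        moves += 1
        level += 1              # the next level may now be overfull too
    return moves


if __name__ == "__main__":
    limits = [8, 10, 10, 1000]   # made-up limits for illustration
    full = [8, 10, 10, 0]        # every upper level is already at its limit
    print(push_file(full, limits), full)
```

With slack in the upper levels the same call triggers no shifts at all, which matches the observation that only some data configurations hit the long upgrade compaction.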


Re: Upgrade from 1.3.1 to 1.4.2 => high IO

2013-12-10 Thread Matthew Von-Maszewski
2i is not my expertise, so I had to discuss your concerns with another Basho 
developer.  He says:

Between 1.3 and 1.4, the 2i query did change but not the 2i on-disk format.  
You must wait for all nodes to update if you desire to use the new 2i query.  
The 2i data will properly write/update on both 1.3 and 1.4 machines during the 
migration.
Does that answer your question?

And yes, you might see available disk space increase during the upgrade 
compactions if your dataset contains numerous delete "tombstones".  The Riak 
2.0 code includes a new feature called "aggressive delete" for leveldb.  This 
feature is more proactive in pushing delete tombstones through the levels to 
free up disk space much more quickly (especially if you perform block deletes 
every now and then).


Re: Upgrade from 1.3.1 to 1.4.2 => high IO

2013-12-10 Thread Simon Effenberg
Hi Matthew,

thanks!.. that answers my questions!


On Tue, 10 Dec 2013 11:08:32 -0500
Matthew Von-Maszewski  wrote:

> 2i is not my expertise, so I had to discuss your concerns with another Basho 
> developer.  He says:
> Between 1.3 and 1.4, the 2i query did change but not the 2i on-disk format.  
> You must wait for all nodes to update if you desire to use the new 2i query.  
> The 2i data will properly write/update on both 1.3 and 1.4 machines during 
> the migration.
> Does that answer your question?
> And yes, you might see available disk space increase during the upgrade 
> compactions if your dataset contains numerous delete "tombstones".  The Riak 
> 2.0 code includes a new feature called "aggressive delete" for leveldb.  This 
> feature is more proactive in pushing delete tombstones through the levels to 
> free up disk space much more quickly (especially if you perform block deletes 
> every now and then).
> Matthew
Simon Effenberg | Site Ops Engineer | mobile.international GmbH
Fon: + 49-(0)30-8109 - 7173
Fax: + 49-(0)30-8109 - 7131

Mail: seffenb...@team.mobile.de

Marktplatz 1 | 14532 Europarc Dreilinden | Germany


2i stopped working on LevelDB with multi backend

2013-12-10 Thread Chris Read
We just rebuilt our test environment (something we do every month) and
suddenly we get the following error when trying to use 2i:


But looking at the properties of the bucket it's set to use leveldb:

# curl -k https://localhost:8069/riak/eleveldb/ | jq .
{
  "props": {
    "young_vclock": 20,
    "w": "quorum",
    "small_vclock": 50,
    "rw": "quorum",
    "r": "quorum",
    "linkfun": {
      "fun": "mapreduce_linkfun",
      "mod": "riak_kv_wm_link_walker"
    },
    "last_write_wins": false,
    "dw": "quorum",
    "chash_keyfun": {
      "fun": "chash_std_keyfun",
      "mod": "riak_core_util"
    },
    "big_vclock": 50,
    "basic_quorum": false,
    "backend": "eleveldb_data",
    "allow_mult": false,
    "n_val": 3,
    "name": "eleveldb",
    "notfound_ok": true,
    "old_vclock": 86400,
    "postcommit": [],
    "pr": 0,
    "precommit": [],
    "pw": 0
  }
}

Here's the relevant app.config snippet:

{storage_backend, riak_kv_multi_backend},
{multi_backend_default, <<"bitcask_data">>},
{multi_backend, [
  {<<"bitcask_data">>, riak_kv_bitcask_backend, [
    {data_root, "/srv/riak/data/bitcask/data"},
    %%{io_mode, nif},
    {max_file_size, 2147483648},            %% 2G
    {merge_window, always},
    {frag_merge_trigger, 30},               %% Merge at 30% dead keys
    {dead_bytes_merge_trigger, 134217728},  %% Merge files that have more than 128MB dead
    {frag_threshold, 25},                   %% Files that have 25% dead keys will be merged too
    {dead_bytes_threshold, 67108864},       %% Include files that have 64MB of dead space in merges
    {small_file_threshold, 10485760},       %% Files smaller than 10MB will not be merged
    {log_needs_merge, true},                %% Log when we need to merge...
    {sync_strategy, none}
  ]},
  {<<"eleveldb_data">>, riak_kv_eleveldb_backend, [
    {data_root, "/srv/riak/data/eleveldb/files"},
    {write_buffer_size_min, 31457280},      %% 30 MB in bytes
    {write_buffer_size_max, 62914560},      %% 60 MB in bytes
    {max_open_files, 20},                   %% Maximum number of files open at once per partition
    {sst_block_size, 4096},                 %% 4K blocks
    {cache_size, 8388608}                   %% 8MB default cache size per-partition
  ]}
]},

Anyone have any ideas?

We're using Ubuntu 12.04 with the Basho Riak 1.4.2 .deb. The only
change to this environment has been to upgrade the kernel from
3.5.0-26 to 3.8.0-31-generic but I'd be very surprised if that broke
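
One quick sanity check is to confirm which backend module a bucket actually resolves to under riak_kv_multi_backend, since 2i is not supported on Bitcask. The sketch below mirrors the names from the app.config snippet in this message. The two helper functions are hypothetical and work on plain dictionaries, so nothing here talks to a running node.

```python
# Backend names and modules copied from the app.config snippet above.
BACKENDS = {
    "bitcask_data": "riak_kv_bitcask_backend",
    "eleveldb_data": "riak_kv_eleveldb_backend",
}
DEFAULT_BACKEND = "bitcask_data"


def resolve_backend(bucket_props, backends=BACKENDS, default=DEFAULT_BACKEND):
    """Return (backend_name, module) for a bucket, falling back to the default."""
    name = bucket_props.get("backend", default)
    if name not in backends:
        raise KeyError(f"bucket refers to unknown backend {name!r}")
    return name, backends[name]


def supports_2i(module):
    # Of the two modules configured here, only eleveldb supports 2i.
    return module == "riak_kv_eleveldb_backend"
```

A bucket whose props lack a "backend" entry falls back to `bitcask_data` in this configuration, so 2i queries against it would be expected to fail. The bucket shown above does name `eleveldb_data`, so this check alone would not explain the error.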



Re: riak nagios script

2013-12-10 Thread Hector Castro
Hello Kathleen,

Have you executed the `make escript` target to build the `check_node`
binary? [0] From there, I copied it to the Riak node and invoked it
like this:

$ /usr/lib/riak/erts-5.9.1/bin/escript check_node --node riak@ riak_kv_up
OKAY: riak_kv is running on riak@

I used the entire path to escript because the bin directory under erts
was not in my PATH by default.


[0] https://github.com/basho/riak_nagios#building
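
If the check is wrapped by another script, the invocation above can be assembled and its output line classified along these lines. This is a sketch under assumptions: the exit codes follow the usual Nagios plugin convention (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN), only the OKAY and UNKNOWN prefixes actually appear in this thread, and both function names are made up.

```python
# Usual Nagios plugin exit codes; WARNING and CRITICAL prefixes are assumed,
# only OKAY and UNKNOWN were seen in this thread.
NAGIOS_EXIT_CODES = {"OKAY": 0, "OK": 0, "WARNING": 1, "CRITICAL": 2, "UNKNOWN": 3}


def build_command(escript_path, script_path, node, check):
    """Argument list in the same order as the shell invocation above."""
    return [escript_path, script_path, "--node", node, check]


def exit_code_for(output_line):
    """Map the word before the first colon to a Nagios exit code (3 if unrecognised)."""
    prefix = output_line.split(":", 1)[0].strip().upper()
    return NAGIOS_EXIT_CODES.get(prefix, 3)
```

The list returned by `build_command` can be passed straight to `subprocess.run` without going through a shell.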

On Mon, Dec 9, 2013 at 7:35 PM, kzhang  wrote:
> Also, when running
> https://github.com/basho/riak_nagios/blob/master/src/check_node.erl
> I ran into the error:
> ** exception error: undefined function getopt:parse/2
>  in function  check_node:main/2 (check_node.erl, line 15)
> --
> View this message in context: 
> http://riak-users.197444.n3.nabble.com/riak-nagios-script-tp4030025p4030026.html
> Sent from the Riak Users mailing list archive at Nabble.com.

Re: riak nagios script

2013-12-10 Thread kzhang
Thanks Hector.

Here is how I executed the script.

 I downloaded and installed the erlang shell from 

started erlang OTP:

[root@MYRIAKNODE otp_src_R16B02]# erl -s toolbar
Erlang R16B02 (erts-5.10.3) [source] [64-bit] [async-threads:10] [hipe]

Eshell V5.10.3  (abort with ^G)

grabbed the source code
compiled it: 

ran it:
check_node:main([{node, 'xx.xx.xx.xx'}]).   

then got:

** exception error: undefined function getopt:parse/2
 in function  check_node:main/2 (check_node.erl, line 15)

Here is where I am. I found this:


I grabbed the source code, compiled it under otp_src_R16B02.

ran it again:
2> check_node:main([{node, 'xx.xx.xx.xx'}]).
UNKNOWN: invalid_option_arg {check,{node,'xx.xx.xx.xx'}}

Am I on the right path?



Riak Recap for December 4 - 9

2013-12-10 Thread Mark Phillips
Morning, Afternoon, Evening to All -

Here's today's Recap. Enjoy.

Also, if you're around Raleigh/Durham and want to have drinks next
week, let me know.


Riak Recap for December 4 - 9

The recording of last Friday's Riak Community Hangout is now
available. This one is all about Riak Security and the exciting
history behind "allow_mult=false". It's well worth your time.
- http://www.youtube.com/watch?v=n8m8xlizekg

John Daily et al. are talking about Riak 2.0 tomorrow night at the
Braintree offices in Chicago. This is not to be missed.
- www.meetup.com/Chicago-Riak-Meetup/events/151516252/

Tom Santero and I will be at the West End Ruby Meetup next week in
Durham, NC to talk about Riak.
- http://www.meetup.com/raleighrb/events/154001722/

Riakpbc, nlf's Node.js protocol buffers client for Riak hit version
1.0.5. (Also, h/t to nlf for cranking out bug fixes.)
- https://npmjs.org/package/riakpbc

Riaks, Noah Isaacson's Riak client, just hit 2.0.2
- https://npmjs.org/package/riaks

We wrote up some details on how the team at CityMaps is using Riak in

Vincent Chinedu Okonkwo open sourced a Lager backend for Mozilla’s Heka.
- https://github.com/codmajik/herlka

Vic Iglesias wrote a great post about getting Riak CS and Eucalyptus
running together.

Q & A
- http://stackoverflow.com/questions/20366695/truncate-a-riak-database
- http://stackoverflow.com/questions/20440450/riak-databse-and-spring-mvc

Re: riak nagios script

2013-12-10 Thread Alex Moore
Hi Kathleen,

If you’d like to run riak_nagios from the erl command line, you’ll need to 
compile everything in src and include it in the path along with the getopt 

You can compile everything with a simple call to make, and then include it in 
the path with "erl -pa deps/*/ebin ebin".
Once everything is loaded, you can call
check_node:main(["--node", "dev1@", "riak_kv_up"]).
or something similar to run it.  The last parameter in the Args list will be
the check to make.

Is there a reason you’re running it this way instead of compiling it to an 
escript and running it from bash? 

Alex Moore


Re: Stalled handoffs on a prod cluster after server crash

2013-12-10 Thread Jeppe Toustrup
What does "riak-admin transfers" tell you? Are there any transfers in progress?
You can try to set the amount of allowed transfers per host to 0 and
then back to 2 (the default) or whatever you want, in order to restart
any transfers which may be in progress. You can do that with the
"riak-admin transfer-limit " command.
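
The zero-then-restore sequence can be scripted. The sketch below only builds the two command lines and does not run them. It assumes the cluster-wide form of the command takes the limit as its only argument, so check `riak-admin` usage on your version before relying on it.

```python
def transfer_limit_commands(restore_limit=2):
    """Build the two invocations: drop the limit to 0, then restore it.

    Assumption: `riak-admin transfer-limit <limit>` is the cluster-wide form.
    """
    if restore_limit < 0:
        raise ValueError("restore_limit must not be negative")
    base = ["riak-admin", "transfer-limit"]
    return [base + ["0"], base + [str(restore_limit)]]


if __name__ == "__main__":
    for command in transfer_limit_commands():
        print(" ".join(command))
```

Each list can be handed to `subprocess.run(command, check=True)` on a node where `riak-admin` is on the PATH.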

Jeppe Fihl Toustrup
Operations Engineer
Falcon Social

On 9 December 2013 15:48, Ivaylo Panitchkov  wrote:
> Hello,
> We have a prod cluster of four machines running riak (1.1.4 2012-06-19) 
> Debian x86_64.
> Two days ago one of the servers went down because of a hardware failure.
> I force-removed the machine in question to re-balance the cluster before 
> adding the new machine.
> Since then the cluster is operating properly, but I noticed some handoffs are 
> stalled now.
> I had similar situation awhile ago that was solved by simply forcing the 
> handoffs, but this time the same approach didn't work.
> Any ideas, solutions or just hints are greatly appreciated.

Re: Stalled handoffs on a prod cluster after server crash

2013-12-10 Thread Ivaylo Panitchkov
Below is the transfers info:

~# riak-admin transfers
Attempting to restart script through sudo -u riak
'r...@ccc.ccc.ccc.ccc' waiting to handoff 7 partitions
'r...@bbb.bbb.bbb.bbb' waiting to handoff 7 partitions
'r...@aaa.aaa.aaa.aaa' waiting to handoff 5 partitions

~# riak-admin member_status
Attempting to restart script through sudo -u riak
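================================= Membership ==================================
Status     Ring    Pending    Node
-------------------------------------------------------------------------------
valid      45.3%     34.4%    'r...@aaa.aaa.aaa.aaa'
valid      26.6%     32.8%    'r...@bbb.bbb.bbb.bbb'
valid      28.1%     32.8%    'r...@ccc.ccc.ccc.ccc'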

It has been stuck with all those handoffs for a few days now.
riak-admin ring_status gives me the same info as the output I posted when I
opened the case.
I noticed AAA.AAA.AAA.AAA experiences more load than the other servers, as it's
responsible for almost half of the data.
Is it safe to add another machine to the cluster in order to relieve
AAA.AAA.AAA.AAA even though the issue with handoffs is not yet resolved?
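
To track whether the backlog shrinks between runs, the `riak-admin transfers` output shown above can be summarised with a small parser. The pattern is an assumption based only on the three lines pasted in this message, and the function name is made up.

```python
import re

# Matches lines such as: 'r...@ccc.ccc.ccc.ccc' waiting to handoff 7 partitions
WAITING = re.compile(r"'(?P<node>[^']+)' waiting to handoff (?P<count>\d+) partitions?")


def pending_handoffs(transfers_output):
    """Return {node: number of partitions waiting to hand off}."""
    pending = {}
    for line in transfers_output.splitlines():
        match = WAITING.search(line)
        if match:
            pending[match.group("node")] = int(match.group("count"))
    return pending
```

Summing the values gives a single number to compare over time. For the output above that total is 19.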


On Tue, Dec 10, 2013 at 3:04 PM, Jeppe Toustrup wrote:

> What does "riak-admin transfers" tell you? Are there any transfers in
> progress?
> You can try to set the amount of allowed transfers per host to 0 and
> then back to 2 (the default) or whatever you want, in order to restart
> any transfers which may be in progress. You can do that with the
> "riak-admin transfer-limit " command.
Re: Stalled handoffs on a prod cluster after server crash

2013-12-10 Thread Jeppe Toustrup
Try to take a look at this thread from November where I experienced a
similar problem:
http://lists.basho.com/pipermail/riak-users_lists.basho.com/2013-November/014027.html

The following mails in the thread mention things you can try to correct
the problem, and what I ended up doing with the help of Basho
employees.
Jeppe Fihl Toustrup
Operations Engineer
Falcon Social


Re: riak nagios script

2013-12-10 Thread kzhang
Hi Alex,


I am completely new to erlang. When googling how to run an erlang program, I
came across
. That's how I got started. 

To run the script using escript, based on
http://www.erlang.org/doc/man/escript.html, it looks like I don't need to
compile the scripts, so I ran:

/usr/local/bin/escript check_node --node riak@ check_riak_repl

escript: Failed to open file: check_node

Re: Stalled handoffs on a prod cluster after server crash

2013-12-10 Thread Mark Phillips
Hi Ivaylo,

Is there anything useful in console.log of any (or all) the nodes? If
so, throw it in a gist and we'll take a look at it.


On Tue, Dec 10, 2013 at 1:13 PM, Jeppe Toustrup  wrote:
> Try to take a look at this thread from November where I experienced a
> similar problem:
> http://lists.basho.com/pipermail/riak-users_lists.basho.com/2013-November/014027.html
> The following mails in the thread mentions things you try to correct
> the problem, and what I ended up doing with the help of Basho
> employees.
> --
> Jeppe Fihl Toustrup
> Operations Engineer
> Falcon Social
