Andrew,
There is no tuning necessary for read-repair. It happens automatically, but
only when you read an object. So if your access pattern involves frequent reads
of your data, the repairs will just happen.
The other option is active anti-entropy. This is basically a way for Riak to
repair data in the background without relying on your access pattern. It is
configured in the Riak app.config, and you can find more info about the
configuration options under the 'riak_kv settings' section here:
http://docs.basho.com/riak/latest/ops/advanced/configs/configuration-files/.
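For a quick check, here is a minimal shell sketch that reports whether active anti-entropy is switched on in a node's app.config. It assumes the Riak 1.3/1.4 style entry {anti_entropy, {on, []}} in the riak_kv section. The helper name aae_state and the path /etc/riak/app.config used in the note below are assumptions for illustration, not something taken from your setup.

```shell
#!/bin/sh
# Print "on" or "off" according to the anti_entropy entry in a Riak app.config.
# Assumption: the entry has the form {anti_entropy, {on, []}} or
# {anti_entropy, {off, []}}. Lines commented out with % are ignored.
aae_state() {
    grep -v '^[[:space:]]*%' "$1" |
        grep -Eo '\{anti_entropy,[[:space:]]*\{(on|off)' |
        head -n 1 |
        grep -Eo '(on|off)$'
}
```

On a node this would be run as aae_state /etc/riak/app.config (path assumed). If it prints on, riak-admin aae-status should start showing exchange and repair activity once the hash trees have been built.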
Hope that helps.
Kelly
On November 14, 2013 at 9:57:12 AM, Andrew Tynefield (atynefi...@gmail.com)
wrote:
Thanks again, Kelly, for your response.
I continued playing with things last night: I set
fold_objects_for_list_keys to true on all nodes and performed a full
roll of the stack. That seems to have sped up the replies, but the lol bucket
was still returning differing results.
I then added the second node to my nginx config, so it now load balances between
the two nodes, with the same results. However, I just performed 5-10 recursive gets
on the lol bucket and that seems to have done the trick. I'm now getting consistent
values in response to the ls:
(10:51:14) [andrew/desktop] ~ $ s3cmd ls s3://lol
DIR s3://lol/kitties/
2013-11-13 14:59 93107 s3://lol/_1
2013-11-13 22:20 84513 s3://lol/kitty.jpg
(10:51:39) [andrew/desktop] ~ $ s3cmd ls s3://lol
DIR s3://lol/kitties/
2013-11-13 14:59 93107 s3://lol/_1
2013-11-13 22:20 84513 s3://lol/kitty.jpg
(10:51:40) [andrew/desktop] ~ $ s3cmd ls s3://lol
DIR s3://lol/kitties/
2013-11-13 14:59 93107 s3://lol/_1
2013-11-13 22:20 84513 s3://lol/kitty.jpg
(10:51:41) [andrew/desktop] ~ $ s3cmd ls s3://lol
DIR s3://lol/kitties/
2013-11-13 14:59 93107 s3://lol/_1
2013-11-13 22:20 84513 s3://lol/kitty.jpg
(10:51:42) [andrew/desktop] ~ $ s3cmd ls s3://lol
DIR s3://lol/kitties/
2013-11-13 14:59 93107 s3://lol/_1
2013-11-13 22:20 84513 s3://lol/kitty.jpg
(10:51:43) [andrew/desktop] ~ $ s3cmd ls s3://lol
DIR s3://lol/kitties/
2013-11-13 14:59 93107 s3://lol/_1
2013-11-13 22:20 84513 s3://lol/kitty.jpg
I was wondering how I can tune against issues like this while keeping the same
performance that's currently in place. I'm not very familiar with performing
read-repairs. I look forward to any assistance with this.
Thanks,
Andrew
On Thu, Nov 14, 2013 at 10:37 AM, Kelly McLaughlin <ke...@basho.com> wrote:
Andrew,
Are you able to successfully download all the files in the lol bucket? It seems
like you have some replicas in Riak that do not have copies of all of the data
from that bucket. That can be resolved in two ways: read-repair or active
anti-entropy. So I would expect doing a read of each object would resolve the
issue with differences in bucket listing attempts.
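One way to read every object once is sketched below. It assumes s3cmd is already configured for the cluster and that keys contain no whitespace; the function names are made up for illustration. Since the listing itself may be incomplete until repair has run, more than one pass may be needed.

```shell
#!/bin/sh
# Extract object URIs from `s3cmd ls` output read on stdin.
# Object lines end in an s3:// URI; "DIR" lines are prefixes, not objects,
# so they are skipped.
list_object_uris() {
    awk '$1 != "DIR" && $NF ~ /^s3:\/\// { print $NF }'
}

# Fetch every listed object once so that Riak sees a read for each of them.
# The downloaded bytes go to a scratch file that is removed afterwards.
read_all_objects() {
    bucket="$1"
    scratch=$(mktemp) || return 1
    s3cmd ls --recursive "s3://$bucket" | list_object_uris |
    while IFS= read -r uri; do
        # --force overwrites the scratch file on every iteration.
        s3cmd get --force "$uri" "$scratch" > /dev/null
    done
    rm -f "$scratch"
}
```

For the bucket in this thread the call would be read_all_objects lol.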
Another thing I would recommend since you are on the Riak 1.4 series is to set
fold_objects_for_list_keys to true in your Riak CS app.config. It enables
improvements to bucket listing that rely on features in Riak 1.4 and should
give you better results.
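To confirm the setting is in place on each node, a small check along these lines could help. It is only a sketch: it assumes the entry is written as {fold_objects_for_list_keys, true} in the riak_cs section of the Riak CS app.config, and the function name is hypothetical.

```shell
#!/bin/sh
# Exit 0 when the given config file contains an uncommented
# {fold_objects_for_list_keys, true} entry, non-zero otherwise.
fold_objects_enabled() {
    grep -v '^[[:space:]]*%' "$1" |
        grep -Eq '\{fold_objects_for_list_keys,[[:space:]]*true\}'
}
```

Example: fold_objects_enabled /etc/riak-cs/app.config && echo enabled. Riak CS only reads app.config at startup, so the node has to be restarted after changing it.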
I am not certain about the upload issue with the proxy. It does seem like a
proxy issue since the upload succeeds when performed directly against the node,
but I am not familiar enough with nginx configuration to spot any issues.
Cheers,
Kelly
On November 13, 2013 at 10:24:42 PM, Andrew Tynefield (atynefi...@gmail.com)
wrote:
I appreciate the help, Kelly! (And sorry for the double mail you're going to
get; I accidentally didn't reply to all.) I've provided the requested
information below.
app.config:
http://pastebin.centos.org/5716/
app.config for riak-cs is managed by puppet, installing the same file on both
nodes.
riak-admin data:
(10:58:46) [riak] ~ $ riak-admin ring-status
================================== Claimant ===================================
Claimant: 'r...@riak.tyne.io'
Status: up
Ring Ready: true
============================== Ownership Handoff ==============================
No pending changes.
============================== Unreachable Nodes ==============================
All nodes are up and reachable
(10:58:54) [riak] ~ $ riak-admin member-status
================================= Membership ==================================
Status Ring Pending Node
-------------------------------------------------------------------------------
valid 50.0% -- 'r...@riak.tyne.io'
valid 50.0% -- 'r...@riak1.tyne.io'
-------------------------------------------------------------------------------
Valid:2 / Leaving:0 / Exiting:0 / Joining:0 / Down:0
network data:
(11:05:51) [riak] ~ $ ip a | grep /24
inet 192.168.122.90/24 brd 192.168.122.255 scope global eth0
inet 192.168.1.19/24 brd 192.168.1.255 scope global eth1
(10:59:43) [riak] ~ $ netstat -tunap | grep :8080
tcp 0 0 0.0.0.0:8080 0.0.0.0:* LISTEN 29437/beam.smp
(11:03:20) [riak] ~ $ ps auxf | grep 29437
root 1507 0.0 0.0 103236 820 pts/4 S+ 23:03 0:00 \_ grep 29437
riakcs 29437 0.9 1.8 768480 35364 pts/3 Ssl+ 17:23 3:08 \_ /usr/lib64/riak-cs/erts-5.9.1/bin/beam.smp -K true -A 64 -W w -- -root /usr/lib64/riak-cs -progname riak-cs -- -home /var/lib/riak-cs-control -- -boot /usr/lib64/riak-cs/releases/1.4.2/riak-cs -config /etc/riak-cs/app.config -pa /usr/lib64/riak-cs/lib/basho-patches -name riak...@riak.domain.com -setcookie [redacted] -- console
nginx config: [added the proxy_pass_header to ensure I was reaching riak-cs]
upstream riak-cs {
    server 192.168.1.19:8080;
}
server {
    listen 80;
    server_name cs.domain.com *.cs.domain.com;
    location / {
        proxy_pass http://riak-cs;
        proxy_set_header Host $host;
        proxy_connect_timeout 59s;
        proxy_send_timeout 600;
        proxy_read_timeout 600;
        #proxy_buffering off;
        proxy_buffers 16 32k;
        proxy_buffer_size 64k;
        proxy_pass_header Server;
        #return 403;
    }
}
(11:10:36) [andrew/desktop] ~ $ curl -I cs.domain.com/buckets
HTTP/1.1 404 Object Not Found
Date: Thu, 14 Nov 2013 05:10:38 GMT
Content-Type: application/xml
Connection: keep-alive
Server: Riak CS
Content-Length: 185
Please let me know if there's anything else I can provide; I'm more than
willing to do so. Also, it may be worth noting that domain.com in this case
stands in for an actual registered and resolving domain that I've sed'd out
because this list is archived.
Thank you so much,
Andrew
On Wed, Nov 13, 2013 at 10:57 PM, Kelly McLaughlin <ke...@basho.com> wrote:
Andy,
To get a better idea of what might be going on, it would be helpful to
see what your Riak and Riak CS app.config files look like. The output of
riak-admin ring-status and riak-admin member-status could also be useful. For the
upload issue, have you changed the port that Riak CS is
listening on? The default is 8080, and I don't see where your nginx config
sends requests to that port.
Kelly
On November 13, 2013 at 6:49:59 PM, Andrew Tynefield (atynefi...@gmail.com)
wrote:
Hey guys,
I'm working on a dev environment for a riak-cs setup: 2 VMs and an external
proxy.
Configuration of the riak/riak-cs nodes appears to be complete. I'm encountering
two issues and would like some pointers on where to begin diagnosing before I go
around stracing everything.
Firstly:
When using s3cmd to query riak-cs, I'm receiving differing results on the same
commands in succession. Here are the results when going through a proxy:
(07:06:09) [andrew/desktop] ~ $ s3cmd -c .s3cfg-riak ls s3://lol
2013-11-13 22:20 84513 s3://lol/kitty.jpg
(07:06:10) [andrew/desktop] ~ $ s3cmd -c .s3cfg-riak ls s3://lol
2013-11-13 22:20 84513 s3://lol/kitty.jpg
(07:06:11) [andrew/desktop] ~ $ s3cmd -c .s3cfg-riak ls s3://lol
2013-11-13 14:59 93107 s3://lol/_1
2013-11-13 22:20 84513 s3://lol/kitty.jpg
(07:06:12) [andrew/desktop] ~ $ s3cmd -c .s3cfg-riak ls s3://lol
DIR s3://lol/kitties/
2013-11-13 14:59 93107 s3://lol/_1
(07:06:13) [andrew/desktop] ~ $ s3cmd -c .s3cfg-riak ls s3://lol
2013-11-13 22:20 84513 s3://lol/kitty.jpg
(07:06:14) [andrew/desktop] ~ $ s3cmd -c .s3cfg-riak ls s3://lol
DIR s3://lol/kitties/
2013-11-13 14:59 93107 s3://lol/_1
And here they are querying one of the nodes directly:
(07:05:59) [andrew/desktop] ~ $ s3cmd ls s3://lol
DIR s3://lol/kitties/
2013-11-13 14:59 93107 s3://lol/_1
(07:06:00) [andrew/desktop] ~ $ s3cmd ls s3://lol
2013-11-13 14:59 93107 s3://lol/_1
2013-11-13 22:20 84513 s3://lol/kitty.jpg
(07:06:01) [andrew/desktop] ~ $ s3cmd ls s3://lol
2013-11-13 22:20 84513 s3://lol/kitty.jpg
(07:06:02) [andrew/desktop] ~ $ s3cmd ls s3://lol
2013-11-13 22:20 84513 s3://lol/kitty.jpg
(07:06:02) [andrew/desktop] ~ $ s3cmd ls s3://lol
DIR s3://lol/kitties/
2013-11-13 14:59 93107 s3://lol/_1
(07:06:03) [andrew/desktop] ~ $ s3cmd ls s3://lol
2013-11-13 22:20 84513 s3://lol/kitty.jpg
(07:06:04) [andrew/desktop] ~ $ s3cmd -c .s3cfg-riak ls s3://lol
The same thing happens regardless of which node I query directly: within 1-2
seconds of executing the command, a repeat execution returns different
results. (The same few result sets keep recurring; some of them are just
missing objects.)
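To put a number on that flapping, a small helper along these lines can run the same command several times and count how many distinct outputs come back. This is only a sketch; the function name count_distinct_outputs is made up for illustration.

```shell
#!/bin/sh
# Run a command N times and print how many distinct outputs were seen.
# Usage: count_distinct_outputs N command [args...]
count_distinct_outputs() {
    runs="$1"; shift
    i=0
    while [ "$i" -lt "$runs" ]; do
        # Reduce each run's output to one checksum line so runs are easy to compare.
        "$@" | cksum
        i=$((i + 1))
    done | sort -u | wc -l | tr -d ' '
}
```

For example, count_distinct_outputs 10 s3cmd ls s3://lol prints 1 when all ten listings agree and a larger number when they differ.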
The other issue I'm encountering is with puts. If I put directly to the node,
I see something like:
(07:09:09) [andrew/desktop] ~ $ s3cmd put Downloads/CentOS-6.4-x86_64-minimal.iso s3://big
Downloads/CentOS-6.4-x86_64-minimal.iso -> s3://big/CentOS-6.4-x86_64-minimal.iso [part 1 of 23, 15MB]
15728640 of 15728640 100% in 5s 2.62 MB/s done
Downloads/CentOS-6.4-x86_64-minimal.iso -> s3://big/CentOS-6.4-x86_64-minimal.iso [part 2 of 23, 15MB]
15728640 of 15728640 100% in 5s 2.86 MB/s done
(... Truncated some of the values for brevity ...)
Downloads/CentOS-6.4-x86_64-minimal.iso -> s3://big/CentOS-6.4-x86_64-minimal.iso [part 22 of 23, 15MB]
15728640 of 15728640 100% in 1s 12.06 MB/s done
Downloads/CentOS-6.4-x86_64-minimal.iso -> s3://big/CentOS-6.4-x86_64-minimal.iso [part 23 of 23, 12MB]
12929024 of 12929024 100% in 1s 11.70 MB/s done
That is what should ideally occur. However, when I go through the proxy, it
starts great for the first chunk but then hangs:
Start:
Downloads/CentOS-6.4-x86_64-minimal.iso -> s3://big/cent6.minimal.iso [part 19 of 23, 15MB]
8675328 of 15728640 55% in 1s 8.26 MB/s
Finish:
Downloads/CentOS-6.4-x86_64-minimal.iso -> s3://big/cent6.minimal.iso [part 19 of 23, 15MB]
15728640 of 15728640 100% in 22s 683.57 kB/s done
It immediately jumps to 55% (the percentage varies), pauses, sometimes for up to
30 seconds, and then jumps to [done].
I assume the cause is somewhere in my nginx configuration. I thought it was a
proxy buffer issue, but raising those limits and disabling proxy_buffering
entirely made no difference.
server {
    listen 80;
    server_name cs.domain.com *.cs.domain.com;
    location / {
        proxy_pass http://riak-cs;
        proxy_set_header Host $host;
        proxy_connect_timeout 59s;
        proxy_send_timeout 600;
        proxy_read_timeout 600;
        #proxy_buffering off;
        proxy_buffers 16 32k;
        proxy_buffer_size 64k;
        #return 403;
    }
}
(The two nodes are identical in versions)
(07:34:47) [riak] ~ $ cat /etc/redhat-release
CentOS release 6.4 (Final)
(07:45:53) [riak] ~ $ uname -a
Linux riak.tyne.io 2.6.32-358.23.2.el6.x86_64 #1 SMP Wed Oct 16 18:37:12 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
(07:46:09) [riak] ~ $ riak version
1.4.2
(07:46:13) [riak] ~ $ riak-cs version
1.4.2
(07:46:25) [riak] ~ $ rpm -qa | grep riak
riak-cs-1.4.2-1.el6.x86_64
riak-1.4.2-1.el6.x86_64
All recommended sysctl and ulimit values have been set as described in the docs.
I look forward to any assistance with further tracking this down.
--
[Andy Tynefield]
_______________________________________________
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com