Some questions about failures during a PUT
Hi,

I've started to look through the Riak sources, and I've been wondering how the system behaves in certain failure scenarios. In particular, it seems to me that it's quite easy to get into a state where the client thinks a PUT request failed, but the object was in fact written to storage and will be replicated to n_val nodes eventually.

For example, assume I have a cluster with three nodes, A, B, C. I use r=pr=w=pw=dw=quorum=2 and an n_val of 3. All nodes have a copy of object O with key K and vector clock V. A client reads O, modifies it locally, then attempts to write its modified copy O'. Let's say the client talks to node A. riak_client:put/2 will cause a riak_kv_put_fsm to be started. In execute_local/1, we tell the local vnode to coordinate this PUT, which will cause it to eventually call riak_kv_vnode:prepare_put/2, which increments the object's vector clock to V', and perform_put/3, which writes O' to disk with vector clock V'. The vnode replies {dw, ...} to the FSM, which has been idling in waiting_local_vnode/2.

If A now crashes, the client will get an {error, timeout}. If A then recovers and the client does a new GET for key K, it may see either O or O' (if the vnode on A is one of the first two vnodes to return a response, its version of the object will be used as the result, since V' descends from V). Is this actually possible?

Also, how would read repair work in this scenario? If all three nodes are up, I guess Riak could detect the inconsistency, but if e.g. C was down at the time, would O' be replicated to B and then later to C as well?

cheers,
-jakob
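P.S. For concreteness, here's a minimal sketch of the read-modify-write cycle I mean, using the internal Erlang client. The bucket, key, value, and the handling of the timeout branch are just illustrative assumptions on my part, not something I've actually run:

    {ok, C} = riak:local_client(),
    {ok, O} = C:get(<<"bucket">>, <<"K">>, 2),                %% r = 2
    O1 = riak_object:update_value(O, <<"modified value">>),
    case C:put(O1, 2, 2) of                                    %% w = dw = 2
        ok ->
            ok;
        {error, timeout} ->
            %% Ambiguous outcome: the coordinating vnode on A may already
            %% have stored O' with vector clock V' before the crash, so a
            %% later C:get/3 can return either O (V) or O' (V').
            ambiguous
    end.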
Re: http new endpoints
Hi Jordan,

On Wed, Dec 28, 2011 at 12:51 PM, Jordan Schatz wrote:
>
> I am working on bindings for the HTTP API for lisp, the docs list new and
> old URLs for requests, but it appears that not all of the new URLs are
> implemented yet? I am using 1.0.2 is there a planned release number for
> supporting all of the new URL formats?
>

Looks like you've caught an error in the wiki. The new URL formats are
supported, but we neglected to provide the full URL in the docs. You are
missing a "/keys" from the end of those URLs. The URLs in line below
should work for you.

I'll push a fix to the wiki today if I have a few minutes. Sorry for the
confusion. Also, quite excited to see lisp bindings. :)

Mark

> These don't seem to be working with the new URL format:
>
> Get bucket:
> curl -v http://127.0.0.1:8098/buckets/test
>

curl -v http://127.0.0.1:8098/buckets/test/keys

> Set bucket:
> curl -v -X PUT -H "Content-Type: application/json" -d \
> '{"props":{"n_val":5}}' http://127.0.0.1:8098/buckets/test
>

curl -v -d 'this is a test' -H "Content-Type: text/plain" \
http://127.0.0.1:8098/buckets/test/keys

> Store object:
> curl -v -d 'this is a test' -H "Content-Type: text/plain" \
> http://127.0.0.1:8098/buckets/test
>

curl -v -d 'this is a test' -H "Content-Type: text/plain" \
http://127.0.0.1:8098/buckets/test/keys
Best practices for using the PB client
Hey all,

I'm looking for some best practices in handling connections when using the protocol buffer client. Specifically, I have 3 nodes in my cluster, and need to figure out how to handle the situation when one of the nodes is down.

I'm currently using a pooler app (https://github.com/seth/pooler) and this helps me distribute the load to all of the nodes, but when one goes down, the app doesn't recover nicely.

I'm about to write some code in my app to handle this, but before I do, I thought I'd check for existing solutions and best practices:

- Is there an existing connection pooling mechanism that someone has created which handles node failures automatically?

If not, then I'm looking forward to writing it!

Thanks in advance,
Marc
Re: http new endpoints
Thank you Mark : )

BTW

> > Set bucket:
> > curl -v -X PUT -H "Content-Type: application/json" -d \
> > '{"props":{"n_val":5}}' http://127.0.0.1:8098/buckets/test
> >
>
> curl -v -d 'this is a test' -H "Content-Type: text/plain" \
> http://127.0.0.1:8098/buckets/test/keys

This is needed instead:

curl -v -X PUT -H "Content-Type: application/json" -d \
'{"props":{"n_val":5}}' http://127.0.0.1:8098/buckets/test/props

> Also, quite excited to see lisp bindings. :)

I noticed yesterday that you guys already list the bindings I made (and am
updating):
http://wiki.basho.com/Community-Developed-Libraries-and-Projects.html
Mine are the Racket (which is a lisp dialect) ones.

Thanks,
Jordan

On Fri, Dec 30, 2011 at 12:41:50PM -0500, Mark Phillips wrote:
> Hi Jordan,
>
> On Wed, Dec 28, 2011 at 12:51 PM, Jordan Schatz wrote:
> >
> > I am working on bindings for the HTTP API for lisp, the docs list new and
> > old URLs for requests, but it appears that not all of the new URLs are
> > implemented yet? I am using 1.0.2 is there a planned release number for
> > supporting all of the new URL formats?
> >
>
> Looks like you've caught an error in the wiki. The new URL formats are
> supported, but we neglected to provide the full URL in the docs. You
> are missing a "/keys" from the end of those URLs. The URLs in line
> below should work for you.
>
> I'll push a fix to the wiki today if I have a few minutes. Sorry for
> the confusion. Also, quite excited to see lisp bindings. :)
>
> Mark
>
> > These don't seem to be working with the new URL format:
> >
> > Get bucket:
> > curl -v http://127.0.0.1:8098/buckets/test
> >
>
> curl -v http://127.0.0.1:8098/buckets/test/keys
>
> > Set bucket:
> > curl -v -X PUT -H "Content-Type: application/json" -d \
> > '{"props":{"n_val":5}}' http://127.0.0.1:8098/buckets/test
> >
>
> curl -v -d 'this is a test' -H "Content-Type: text/plain" \
> http://127.0.0.1:8098/buckets/test/keys
>
> > Store object:
> > curl -v -d 'this is a test' -H "Content-Type: text/plain" \
> > http://127.0.0.1:8098/buckets/test
> >
>
> curl -v -d 'this is a test' -H "Content-Type: text/plain" \
> http://127.0.0.1:8098/buckets/test/keys
Re: http new endpoints
On Fri, Dec 30, 2011 at 1:10 PM, Jordan Schatz wrote:
> Thank you Mark : )
>
> BTW
>> > Set bucket:
>> > curl -v -X PUT -H "Content-Type: application/json" -d \
>> > '{"props":{"n_val":5}}' http://127.0.0.1:8098/buckets/test
>> >
>>
>> curl -v -d 'this is a test' -H "Content-Type: text/plain" \
>> http://127.0.0.1:8098/buckets/test/keys
>
> This is needed instead:
> curl -v -X PUT -H "Content-Type: application/json" -d \
> '{"props":{"n_val":5}}' http://127.0.0.1:8098/buckets/test/props
>

Of course. Just augmented the wiki PR accordingly. Thanks.

https://github.com/basho/riak_wiki/pull/237

>> Also, quite excited to see lisp bindings. :)
> I noticed yesterday that you guys already list the bindings I made (and
> am updating):
> http://wiki.basho.com/Community-Developed-Libraries-and-Projects.html
> Mine are the Racket (which is a lisp dialect) ones.

Excellent. Still excited regardless.

Mark

> Thanks,
> Jordan
>
>
> On Fri, Dec 30, 2011 at 12:41:50PM -0500, Mark Phillips wrote:
>> Hi Jordan,
>>
>> On Wed, Dec 28, 2011 at 12:51 PM, Jordan Schatz wrote:
>> >
>> > I am working on bindings for the HTTP API for lisp, the docs list new and
>> > old URLs for requests, but it appears that not all of the new URLs are
>> > implemented yet? I am using 1.0.2 is there a planned release number for
>> > supporting all of the new URL formats?
>> >
>>
>> Looks like you've caught an error in the wiki. The new URL formats are
>> supported, but we neglected to provide the full URL in the docs. You
>> are missing a "/keys" from the end of those URLs. The URLs in line
>> below should work for you.
>>
>> I'll push a fix to the wiki today if I have a few minutes. Sorry for
>> the confusion. Also, quite excited to see lisp bindings. :)
>>
>> Mark
>>
>> > These don't seem to be working with the new URL format:
>> >
>> > Get bucket:
>> > curl -v http://127.0.0.1:8098/buckets/test
>> >
>>
>> curl -v http://127.0.0.1:8098/buckets/test/keys
>>
>> > Set bucket:
>> > curl -v -X PUT -H "Content-Type: application/json" -d \
>> > '{"props":{"n_val":5}}' http://127.0.0.1:8098/buckets/test
>> >
>>
>> curl -v -d 'this is a test' -H "Content-Type: text/plain" \
>> http://127.0.0.1:8098/buckets/test/keys
>>
>> > Store object:
>> > curl -v -d 'this is a test' -H "Content-Type: text/plain" \
>> > http://127.0.0.1:8098/buckets/test
>> >
>>
>> curl -v -d 'this is a test' -H "Content-Type: text/plain" \
>> http://127.0.0.1:8098/buckets/test/keys
Re: Best practices for using the PB client
You should look into using HAProxy in front of your nodes. Let HAProxy load balance between all your nodes, and then if one goes down, HAProxy just pulls it out of the load balancing cluster automatically until it is restored. Then your pooler can just pool connections to HAProxy instead, so it doesn't have to worry at all about failed nodes. (A rough config sketch follows below the quoted message.)

Also, shameless plug, I have a pooler as well which has a few more options than pooler. You can check it out here: https://github.com/aberman/pooly

--Andrew

On Fri, Dec 30, 2011 at 9:58 AM, Marc Campbell wrote:
> Hey all,
>
> I'm looking for some best practices in handling connections when using the
> protocol buffer client. Specifically, I have 3 nodes in my cluster, and
> need to figure out how to handle the situation when one of the nodes is
> down.
>
> I'm currently using a pooler app (https://github.com/seth/pooler) and
> this helps me distribute the load to all of the nodes, but when one goes
> down, the app doesn't recover nicely.
>
> I'm about to write some code in my app to handle this, but before I do, I
> thought I'd check for existing solutions and best practices:
>
> - Is there an existing connection pooling mechanism that someone has
> created which handles node failures automatically?
>
> If not, then I'm looking forward to writing it!
>
> Thanks in advance,
> Marc
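Here's the rough kind of HAProxy frontend I mean. This is purely illustrative — the addresses and server names are assumptions, and you'd adjust the balance policy for your setup; Riak's PB interface listens on 8087 by default:

    listen riak_pb
        bind *:8087
        mode tcp
        balance roundrobin
        server riak1 10.0.1.1:8087 check
        server riak2 10.0.1.2:8087 check
        server riak3 10.0.1.3:8087 check

With "check" on each server line, HAProxy health-checks the nodes and drops a dead one from rotation until it comes back, and your pool only ever connects to the HAProxy address.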
Re: Best practices for using the PB client
Great, thanks for the feedback. I'll check out pooly, for sure.

I was thinking about using HAProxy/Zeus (I'm currently using Riak Smartmachines @ Joyent). I really like this idea; the logic for node failures shouldn't be in my code. I'll give this a try!

Thanks,
Marc

On Dec 30, 2011, at 11:31 AM, Andrew Berman wrote:
> You should look into using HAProxy in front of your nodes. Let HAProxy load
> balance between all your nodes and then if one goes down, HAProxy just pulls
> it out of the load balancing cluster automatically until it is restored.
> Then your pooler can just pool connections from HAProxy instead so it doesn't
> have to worry at all about failed nodes.
>
> Also, shameless plug, I have a pooler as well which has a few more options
> than pooler. You can check it out here: https://github.com/aberman/pooly
>
> --Andrew
>
> On Fri, Dec 30, 2011 at 9:58 AM, Marc Campbell wrote:
> Hey all,
>
> I'm looking for some best practices in handling connections when using the
> protocol buffer client. Specifically, I have 3 nodes in my cluster, and need
> to figure out how to handle the situation when one of the nodes is down.
>
> I'm currently using a pooler app (https://github.com/seth/pooler) and this
> helps me distribute the load to all of the nodes, but when one goes down, the
> app doesn't recover nicely.
>
> I'm about to write some code in my app to handle this, but before I do, I
> thought I'd check for existing solutions and best practices:
>
> - Is there an existing connection pooling mechanism that someone has created
> which handles node failures automatically?
>
> If not, then I'm looking forward to writing it!
>
> Thanks in advance,
> Marc
LevelDB and open files count
I'm confused by LevelDB's file handle usage.

I had a cluster of one machine running happily with two keys (both in the same bucket). I attached a second node and the first node halted with "too many open files". If I try to restart the service, it runs for a few seconds and then exits again with the same error.

The only config settings I've changed from the defaults are the binding ports (to attach to more than just the loopback interface) and the ring creation size (I upped it to 256).

In reading the LevelDB page on the wiki, it indicates that as the max ring size goes up, the number of open files should go down. However, later (in the "Tips and tricks" section) it says that there will be (by default) 20 open files per partition… does this mean I'll have 20*256 = 5120 open files, potentially? Or am I reading this wrong?

===
jeffrey k eliasen

Find and follow me on:
Blog: http://jeff.jke.net
Twitter: http://twitter.com/jeffreyeliasen
Facebook: http://facebook.com/jeffrey.eliasen
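P.S. For what it's worth, here is the arithmetic I'm working from and the knobs I assume are involved. The max_open_files name and default are just my reading of the wiki page, and the app.config fragment is illustrative, not my actual config:

    %% fragment of app.config (illustrative values)
    {eleveldb, [
        {data_root, "/var/lib/riak/leveldb"},
        {max_open_files, 20}    %% assumed per-partition cap on LevelDB file handles
    ]}

    %% Rough worst case for a node hosting all 256 partitions:
    %%   256 partitions * 20 files/partition = 5120 descriptors,
    %% which is well past the common default of `ulimit -n` = 1024.
    %% Raising the OS limit (e.g. `ulimit -n 65536` before starting Riak)
    %% or lowering ring_creation_size should avoid the crash, if my
    %% reading is right.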
Re: Best practices for using the PB client
On 30 dec 2011, at 20:31, Andrew Berman wrote:
>
> Also, shameless plug, I have a pooler as well which has a few more options
> than pooler. You can check it out here: https://github.com/aberman/pooly

Nice, I'll look into that.

> --Andrew
>
> On Fri, Dec 30, 2011 at 9:58 AM, Marc Campbell wrote:
> Hey all,
>
> I'm looking for some best practices in handling connections when using the
> protocol buffer client. Specifically, I have 3 nodes in my cluster, and need
> to figure out how to handle the situation when one of the nodes is down.
>
> I'm currently using a pooler app (https://github.com/seth/pooler) and this
> helps me distribute the load to all of the nodes, but when one goes down, the
> app doesn't recover nicely.

Hi, I ran into that exact problem, so I've forked and patched that pooler here: https://github.com/bipthelin/pooler

I also added OJ's patch which lets you do one-liners like:

pooler:use_member(fun(RiakPid) -> riakc_pb_socket:get(RiakPid, Bucket, Key) end).

(A slightly fuller sketch follows below my signature.)

Also, there's this pooler: https://github.com/devinus/poolboy

--
Bip Thelin
Evolope AB | Lugnets Allé 1 | 120 33 Stockholm
Tel 08-533 335 37 | Mob 0735-18 18 90
www.evolope.se
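P.S. A slightly fuller sketch of how you might wrap that in an application helper. The function name, bucket, and error handling are just illustrative assumptions, and I'm assuming use_member/1 checks a connection out of the pool, runs the fun, and checks it back in:

    get_user(Key) ->
        pooler:use_member(
          fun(RiakPid) ->
                  %% Any riakc_pb_socket call can go inside the fun.
                  case riakc_pb_socket:get(RiakPid, <<"users">>, Key) of
                      {ok, Obj}         -> {ok, riakc_obj:get_value(Obj)};
                      {error, notfound} -> not_found;
                      {error, Reason}   -> {error, Reason}
                  end
          end).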