Re: Adding New Node To Cluster
The max_concurrency error on handoff occurs because, by default, Riak allows only 2 concurrent handoffs; additional handoff requests are rejected. You can change this setting to increase the number of simultaneous transfers, at the expense of some cluster performance (as handoffs consume more network/CPU/disk resources). Please take a look at http://docs.basho.com/riak/latest/ops/running/handoff for additional details, and let us know if you need any more help.

Doug Rohrer

On Mon, Sep 14, 2015 at 7:28 PM ender wrote:
> I added a 6th node to a 5 node cluster, hoping to rebalance the cluster
> since I was approaching maximum disk usage on the original 5 nodes. Looks
> like the rebalancing is not taking place, and I see a whole bunch of these
> in the console logs:
>
> 688728495783936 was terminated for reason: {shutdown,max_concurrency}
> 2015-09-14 23:25:04.188 [info]
> <0.183.0>@riak_core_handoff_manager:handle_info:286 An outbound handoff of
> partition riak_search_vnode 91343852333181432387730302044767688728495783936
> was terminated for reason: {shutdown,max_concurrency}
> 2015-09-14 23:25:04.188 [info]
> <0.183.0>@riak_core_handoff_manager:handle_info:286 An outbound handoff of
> partition riak_search_vnode
> 548063113999088594326381812268606132370974703616 was terminated for reason:
> {shutdown,max_concurrency}
> 2015-09-14 23:25:14.189 [info]
> <0.183.0>@riak_core_handoff_manager:handle_info:286 An outbound handoff of
> partition riak_search_vnode
> 548063113999088594326381812268606132370974703616 was terminated for reason:
> {shutdown,max_concurrency}
> 2015-09-14 23:25:14.189 [info]
> <0.183.0>@riak_core_handoff_manager:handle_info:286 An outbound handoff of
> partition riak_search_vnode 91343852333181432387730302044767688728495783936
> was terminated for reason: {shutdown,max_concurrency}
> 2015-09-14 23:25:24.189 [info]
> <0.183.0>@riak_core_handoff_manager:handle_info:286 An outbound handoff of
> partition riak_search_vnode
> 548063113999088594326381812268606132370974703616 was terminated for reason:
> {shutdown,max_concurrency}
> 2015-09-14 23:25:24.189 [info]
> <0.183.0>@riak_core_handoff_manager:handle_info:286 An outbound handoff of
> partition riak_search_vnode 91343852333181432387730302044767688728495783936
> was terminated for reason: {shutdown,max_concurrency}
> 2015-09-14 23:25:34.190 [info]
> <0.183.0>@riak_core_handoff_manager:handle_info:286 An outbound handoff of
> partition riak_search_vnode
> 548063113999088594326381812268606132370974703616 was terminated for reason:
> {shutdown,max_concurrency}
> 2015-09-14 23:25:34.190 [info]
> <0.183.0>@riak_core_handoff_manager:handle_info:286 An outbound handoff of
> partition riak_search_vnode 91343852333181432387730302044767688728495783936
> was terminated for reason: {shutdown,max_concurrency}
>
> Any pointers to what's going on?
>
> ___
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
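[For reference, the handoff concurrency limit can be raised either at runtime with riak-admin or persistently in riak.conf. A rough sketch only - the node name and the limit of 4 below are placeholder values, not recommendations:

    # raise the limit on one running node (reverts on restart)
    riak-admin transfer-limit riak@10.0.0.1 4

    # raise it cluster-wide at runtime
    riak-admin transfer-limit 4

    # or persist it in riak.conf on each node
    transfer_limit = 4

The riak-admin form takes effect immediately; the riak.conf setting applies after a restart.]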
Re: how to determine riak version
David: riak_kv is actually a (semi-)independent library we incorporate (along with many others) into Riak, the product. There are times when riak_kv, the library, does not get a new version when we release a new Riak, so those two version numbers are not necessarily always going to be the same.

Doug Rohrer

> On Apr 22, 2016, at 2:28 PM, David Byron wrote:
>
> I'm pondering upgrading riak from 2.1.3 to 2.1.4 and got myself confused
> confirming that I really am running 2.1.3 at the moment.
>
> I installed riak from here:
> https://packagecloud.io/basho/riak/packages/ubuntu/trusty/riak_2.1.3-1_amd64.deb.
>
> and all of this looks promising:
>
> $ riak version
> 2.1.3
>
> $ ls /usr/lib/riak/releases/
> 2.1.3 RELEASES start_erl.data
>
> $ dpkg -l | grep riak
> ii riak 2.1.3-1 amd64 Riak is a distributed data store
>
> but then there's also this:
>
> $ sudo riak-admin status | grep riak_kv_version
> riak_kv_version : <<"2.1.2-0-gf969bba">>
>
> I really wanted riak_kv_version to say 2.1.3-.
>
> I'm clearly paranoid, but can someone help me feel better about this?
>
> Thanks much.
>
> -DB
>
> ___
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
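[As a side note, a quick way to see the versions of all the component applications bundled in a Riak release, not just riak_kv, is either of the following - a sketch only, and the exact status output varies by release:

    $ riak-admin status | grep _version

or, from an attached console:

    $ riak attach
    (riak@127.0.0.1)1> application:which_applications().

application:which_applications/0 returns one {App, Description, Version} tuple per loaded application.]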
Re: Precommit hook function - no error log - how to debug?
As to the SASL logging, unfortunately it's not "on by default" and the setting in riak.conf, as you found out, doesn't work correctly. However, you can enable SASL by adding a setting to your advanced.config:

{sasl,[{sasl_error_logger,tty}]} %% Enable TTY output for the SASL app
{sasl,[{sasl_error_logger,{file, "/path/to/log"}}]} %% Enable SASL and output to the "/path/to/log" file

We're evaluating whether to just remove the sasl setting from riak.conf altogether, as you're the first person (that we know of) since 2012 who has tried to turn it on and noticed this bug.

Doug

On Wed, May 11, 2016 at 10:14 AM Luke Bakken wrote: > Hi Sanket - > > I'd like to confirm some details. Is this a one-node cluster? Did you > install an official package or build from source? > > Thanks - > -- > Luke Bakken > Engineer > lbak...@basho.com > > > On Tue, May 10, 2016 at 6:49 PM, Sanket Agrawal > wrote: > > One more thing - I set up the hooks by bucket, not bucket type. The > > documentation for 2.1.4 says that hooks are defined on the bucket level. > > Here is how I set up precommit hook (derived from "Riak Handbook" p95): > > > > curl -X PUT localhost:8098/types/test_kv_wo/buckets/uuid_log/props -H > > 'Content-Type: application/json' -d '{ "props": { "precommit": [{"mod": > > "precommit", "fun": "pre_uuid"}]}}' -v > > > > > > On Tue, May 10, 2016 at 9:15 PM, Sanket Agrawal < > sanket.agra...@gmail.com> > > wrote: > >> > >> I just set up a precommit hook function in dev environment (KV 2.1.4) > >> which doesn't seem to be triggering off at all. The object is being > stored > >> in the bucket, but the precommit logic is not kicking off. I checked > couple > >> of things as listed below but came up with no error - so, it is a > >> head-scratcher why precommit hook is not triggering: > >>> > >>> - Verify precommit is set in bucket properties - snippet from curl > query > >>> for bucket props below: > >>> "precommit":[{"mod":"precommit","fun":"pre_uuid"}] > >>> > >>> - check there is no error in logs > >>> > >>> - check riak-console for commit errors: > >>> $ ./riak1/bin/riak-admin status|grep commit > >>> postcommit_fail : 0 > >>> precommit_fail : 0 > >>> > >>> - Run the precommit function manually on Riak console itself with a > riak > >>> object (that the hook failed to trigger on), and verify it works > >> > >> > >> > >> Also, there is no sasl-error.log. "sasl = on" doesn't work in 2.1.4 > >> because it fails with bad_config error. So, I am assuming sasl logging > is > >> enabled by default. > >> > >> Here is what precommit function does: > >> - For the object (an immutable log append of JSON), calculate the > location > >> of a LWW bucket, and update a easily calculated key with that JSON > body. It > >> works fine from Riak console itself.
Code below - we call pre_uuid in > >> precommit hook - both precommit.beam (where the function is) and > rutils.beam > >> have been copied to the relevant location as set in riak config, are > >> accessible through Riak console and work fine if manually executed on an > >> object: > >> > >>> %% Preprocess JSON, and copy to a LWW bucket type > >>> preprocessJ(RObj,B,Choplen) -> > >>> Bn = {rutils:calcBLWWType(RObj),B}, %%this returns the location of > LWW > >>> bucket - works fine in riak console > >>> %% We store uuid map in key - we take out timestamp of > >>> length 32 including "_" > >>> K = riak_object:key(RObj), > >>> Kn = binary:part(K,0,byte_size(K) - Choplen), > >>> NObj = > >>> > riak_object:new(Bn,Kn,riak_object:get_value(RObj),riak_object:get_metadata(RObj)), > >>> {ok, C} = riak:local_client(), > >>> case C:put(NObj) of > >>> ok -> RObj; > >>> _ -> {fail,<<"Error when trying to process in precommit hook">>} > >>> end. > >>> > >>> pre_uuid(RObj) -> preprocessJ(RObj,<<"uuid_latest">>,32). > >> > >> > >> Below is a manual execution from riak console of precommit function - > >> first we execute it to confirm it is returning the original object: > >>> > >>> (riak1@127.0.0.1)5> precommit:pre_uuid(O1). > >>> {r_object,{<<"test_kv_wo">>,<<"uuid_log">>}, > >>> <<"ahmed_2016-05-10T20%3a47%3a47.346299Z">>, > >>> [{r_content,{dict,3,16,16,8,80,48, > >>> > >>> {[],[],[],[],[],[],[],[],[],[],[],[],[],[],...}, > >>> > >>> {{[],[],[],[],[],[],[],[],[],[],[[...]|...],[],...}}}, > >>> > >>> > <<"{\"uname\":\"ahmed\",\"uuid\":\"df8c10e0-381d-5f65-bf43-cb8b4cb806fc\",\"timestamp\":\"2016-05-"...>>}], > >>> [{<<0>>,{1,63630132467}}], > >>> {dict,1,16,16,8,80,48, > >>> {[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],...}, > >>> {{[],[],[],[],[],[],[],[],[],[],[],[],[],...}}}, > >>> undefined} > >> > >> > >> Now, we check if the object has been written to test_lww/uuid_latest > >> bucket type: > >>> > >>> (riak1@127.0.0.1)6> > >>> C:get({<<"test_lww">>,<<"uuid_latest">>},<<"ahmed">>). > >>> > >>> {ok,{r_object,{<<"hnm_fsm_lww">>,<<"uuid_latest">>}, > >>> <<"ahmed">>, >
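[A note on the advanced.config syntax mentioned above: advanced.config is a plain Erlang term file, so the sasl tuple has to sit inside the single top-level list alongside any other application sections, and the file must end with a period. A minimal sketch - the log path is only an example:

    %% advanced.config
    [
      {sasl, [
        {sasl_error_logger, {file, "/var/log/riak/sasl.log"}}
      ]}
    ].
]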
Re: Precommit hook function - no error log - how to debug?
Huh - I get a huge amount of logging when I turn on SASL using advanced.config. Specifically, I have:

{sasl,[{sasl_error_logger,{file, "/tmp/sasl-1.log"}}]}

in my advanced.config, and for just a startup/shutdown cycle I get a 191555-byte file.

Just to confirm that you can, in fact, load the modules in question: if you do a `riak-admin attach` and run `m(precommit).`, what do you see?

Doug

On Wed, May 11, 2016 at 11:32 AM Sanket Agrawal wrote: > Thanks, Doug. I have enabled sasl logging now through advanced.config > though it doesn't seem to be creating any log yet. > > If this might help you folks with debugging precommit issue, what I have > observed is that erl-reload command doesn't load the precommit modules for > any of the three nodes (though precommit hook has been enabled on one of > the buckets for testing). > >> $ ~/riak/riak1/bin/riak-admin erl-reload >> Module precommit not yet loaded, skipped. >> Module rutils not yet loaded, skipped. > > > > On Wed, May 11, 2016 at 2:05 PM, Douglas Rohrer wrote: > >> As to the SASL logging, unfortunately it's not "on by default" and the >> setting in riak.conf, as you found out, doesn't work correctly. However, >> you can enable SASL via adding a setting to your advanced.config: >> >> {sasl,[{sasl_error_logger,tty}]} %% Enable TTY output for the SASL app >> {sasl,[{sasl_error_logger,{file, "/path/to/log"}}]} %% Enable SASL and >> output to "/path/to/log" file >> >> We're evaluating if we shouldn't just remove the sasl setting from >> riak.conf altogether, as you're the first person (that we know of) since >> 2012 that has tried to turn it on and noticed this bug. >> >> Doug >> >> On Wed, May 11, 2016 at 10:14 AM Luke Bakken wrote: >> >>> Hi Sanket - >>> >>> I'd like to confirm some details. Is this a one-node cluster? Did you >>> install an official package or build from source? >>> >>> Thanks - >>> -- >>> Luke Bakken >>> Engineer >>> lbak...@basho.com >>> >>> >>> On Tue, May 10, 2016 at 6:49 PM, Sanket Agrawal >>> wrote: >>> > One more thing - I set up the hooks by bucket, not bucket type. The >>> > documentation for 2.1.4 says that hooks are defined on the bucket >>> level. >>> > Here is how I set up precommit hook (derived from "Riak Handbook" p95): >>> > >>> > curl -X PUT localhost:8098/types/test_kv_wo/buckets/uuid_log/props -H >>> > 'Content-Type: application/json' -d '{ "props": { "precommit": [{"mod": >>> > "precommit", "fun": "pre_uuid"}]}}' -v >>> > >>> > >>> > On Tue, May 10, 2016 at 9:15 PM, Sanket Agrawal < >>> sanket.agra...@gmail.com> >>> > wrote: >>> >> >>> >> I just set up a precommit hook function in dev environment (KV 2.1.4) >>> >> which doesn't seem to be triggering off at all. The object is being >>> stored >>> >> in the bucket, but the precommit logic is not kicking off.
I checked >>> couple >>> >> of things as listed below but came up with no error - so, it is a >>> >> head-scratcher why precommit hook is not triggering: >>> >>> >>> >>> - Verify precommit is set in bucket properties - snippet from curl >>> query >>> >>> for bucket props below: >>> >>> "precommit":[{"mod":"precommit","fun":"pre_uuid"}] >>> >>> >>> >>> - check there is no error in logs >>> >>> >>> >>> - check riak-console for commit errors: >>> >>> $ ./riak1/bin/riak-admin status|grep commit >>> >>> postcommit_fail : 0 >>> >>> precommit_fail : 0 >>> >>> >>> >>> - Run the precommit function manually on Riak console itself with a >>> riak >>> >>> object (that the hook failed to trigger on), and verify it works >>> >> >>> >> >>> >> >>> >> Also, there is no sasl-error.log. "sasl = on" doesn't work in 2.1.4 >>> >> because it fails with bad_config error. So, I am assuming sasl >>> logging is >>> >> enabled by default. >>> >> >>> >> Here is what precommit function does: >>> >
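[For anyone debugging a similar hook problem, one quick check from an attached console is whether the node can find and load the hook module at all. The module name precommit and the node name below simply follow the thread; substitute your own. If code:which/1 returns non_existing, the .beam is not on the code path configured (for example, via riak_kv's add_paths setting in advanced.config):

    $ riak attach
    (riak1@127.0.0.1)1> code:which(precommit).  %% path to precommit.beam, or non_existing
    (riak1@127.0.0.1)2> l(precommit).           %% attempt to (re)load the module
    (riak1@127.0.0.1)3> m(precommit).           %% show exports and compile info once loaded
]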
[ANN] Basho move Lager to erlang-lager organization on Github
Recognizing that Lager has long since become an important open-source tool for Erlang developers, the team at Basho is happy to announce we have created the Erlang-Lager organization on Github to open up Lager and encourage broader community involvement: https://github.com/erlang-lager/lager

The primary maintainers will continue to be Mark Allen and John Daily. Please reach out to them to coordinate your involvement going forward.

You’ll notice the original lager repo is owned by the organization and we’ve forked a copy back to Basho. All your existing forks will work just fine, though you may need to update your remote URLs (if you were previously pushing directly to the Basho repo):

➜ lager git:(develop) git remote -v
upstream https://github.com/basho/lager.git (fetch)
upstream https://github.com/basho/lager.git (push)
➜ lager git:(develop) git remote set-url upstream https://github.com/erlang-lager/lager.git
➜ lager git:(develop) git remote -v
upstream https://github.com/erlang-lager/lager.git (fetch)
upstream https://github.com/erlang-lager/lager.git (push)

"upstream" may be different depending on your personal setup. You will also want to update any existing build tools (rebar, mix, erlang.mk, etc.) in your projects that point to the Basho clone of the repo to instead use the erlang-lager organization's repo.

I want to thank Mark and John for being willing to continue their excellent work on Lager for the wider community, along with all of our other outside contributors. We all believe a larger maintainer base on Lager will continue to improve Lager and our community as a whole.

Please share with any interested parties that may not see this announcement.

Best,

Doug Rohrer
Principal Engineer
Basho
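[For projects that consume Lager as a dependency, the corresponding change is to repoint the dep at the new organization. A rebar3-style sketch only - the branch shown is an example; pin whatever tag or version you actually use:

    %% rebar.config
    {deps, [
      {lager, {git, "https://github.com/erlang-lager/lager.git", {branch, "master"}}}
    ]}.
]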
Re: Handoffs are too slow after netsplit
Andrey: It's waiting for 60 seconds, literally... See https://github.com/basho/riak_core/search?utf8=%E2%9C%93&q=vnode_inactivity_timeout - handoff is not initiated until a vnode has been inactive for the specified inactivity period. For demonstration purposes, if you want to reduce this time, you could lower the riak_core vnode_inactivity_timeout, which can be set in advanced.config.

Also note that, depending on the backend you use, other settings that trigger vnode activity more frequently than the inactivity timeout can actually prevent handoff completely - see http://docs.basho.com/riak/kv/2.2.0/setup/planning/backend/bitcask/#sync-strategy, for example.

Hope this helps.

Doug

On Thu, Feb 23, 2017 at 6:40 AM Andrey Ershov wrote:

Hi, guys! I'd like to follow up on handoffs behaviour after netsplit. The problem is that right after network partition is healed, "riak-admin transfers" command says that there are X partitions waiting transfer from one node to another, and Y partitions waiting transfer in the opposite direction. What are they waiting for? Active transfers section is always empty. It takes about 1 minute for transfer to occur. I've increased transfer_limit to 100 and it does not help.

Also I've tried to attach to Erlang VM and execute riak_core_vnode_manager:force_handoff() on each node. This command returns 'ok'. But seems that it does not work right after network is healed. After some time 30-60 s, force_handoff() works as expected, but actually it's the same latency as in auto handoff case. So what is it waiting for? Any ideas?

I'm preparing real-time coding demo to be shown on the conference. So it's too much time to wait for 1 minute for handoff to occur just for a couple of keys...

--
Thanks,
Andrey
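[For reference, lowering the inactivity timeout for a demo would look roughly like this in advanced.config. The 10000 ms value is purely an illustration; the default corresponds to the 60 seconds mentioned above, and setting it very low can cause unnecessary handoff activity:

    %% advanced.config
    [
      {riak_core, [
        {vnode_inactivity_timeout, 10000}  %% milliseconds
      ]}
    ].
]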
Re: Kafka Connector For RiakTS?
This is, currently, a private repo. I will look into the viability of making it public, but no promises at this point. Doug > On Apr 6, 2017, at 9:13 AM, Andrei Zavada > wrote: > > Can you please retry? I can confirm it didn't work a few hours ago, > but it is now. > > On Thu, Apr 6, 2017 at 12:33 PM, Grigory Fateyev wrote: >> 404 Error >> >> 2017-04-06 12:04 GMT+03:00 Andrei Zavada : >>> >>> Hi Joe, >>> >>> Sorry for letting your question remain unanswered for so long. There >>> is indeed a tool you might find useful: >>> https://github.com/basho-labs/kafka-connect-riak, but please be aware >>> that it is provided with no official support from Basho. >>> >>> Regards, >>> Andrei >>> >>> On Wed, Mar 29, 2017 at 9:37 PM, Joe Olson wrote: Is there a Kafka Connect (https://www.confluent.io/product/connectors/) connector for RiakTS? ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com >>> >>> ___ >>> riak-users mailing list >>> riak-users@lists.basho.com >>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com >> >> >> >> ___ >> riak-users mailing list >> riak-users@lists.basho.com >> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com >> > > ___ > riak-users mailing list > riak-users@lists.basho.com > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
Re: Siblings on first write to a key
This sounds like an issue our Riak CS team ran into quite a while ago, which involved “slow nodes” and coordination retry. Take a look at https://github.com/basho/riak_kv/issues/1188 and see if it makes sense to you, but it certainly sounds like what’s happening.

The basic flow of the issue occurs when one node in the preflist is down and you write to a node _not in the preflist_, at which point the following happens (better formatted in the issue above, btw):

- The client sends the put to node A, which computes the preflist (P, Q and R) and redirects to R, which is frozen.
- After a 3-second timeout, A computes a new preflist excluding R (P, Q and S) and redirects to S.
- S, with no knowledge about R's state at that point, computes the preflist as P, Q and R and redirects to R as well (what happens to that request is unclear).
- After its own 3-second timeout, S computes a new preflist excluding R (P, Q and S), takes over as coordinator, and executes the put.
- Further 3-second timeouts and preflist recomputations can continue from there.

So, it’s possible for a slow/down node (node R in this case) to eventually cause two _other nodes_ to each write a sibling, even on a new key. In fact, depending on the number of nodes in the system and your luck, you could end up writing more than one sibling on a fresh write in this case. Given your comment about a network issue potentially being a factor, and the 3-second timing you noted (the default for the failure timeout), this increases the likelihood that this was, in fact, the issue.

A fix for this issue has been worked on and tested, but is not yet incorporated into a version of Riak for distribution. You can, however, disable the coordinator retry logic as noted in the issue I referenced above, or increase the timeout if your cluster is running slowly in general by setting `put_coordinator_failure_timeout` under `riak_kv` in your `advanced.config` file (see http://docs.basho.com/riak/kv/2.2.3/configuring/reference/#advanced-configuration for the general format of advanced.config if you’re not familiar).

Hope this helps.

Doug Rohrer

On 4/18/17, 8:28 AM, "riak-users on behalf of Daniel Abrahamsson" wrote:

Hi Magnus,

This cluster has been running in production for a few months. Key generation is based on flake (https://github.com/boundary/flake); we have never experienced a collision in the 3+ years we have been using it heavily in production. However, I will look into that possibility as well.

I just noticed that one of the Riak nodes logged this at the time:

2017-04-13 17:42:40.567 [error] <0.3624.28>@riak_api_pb_server:handle_info:331 Unrecognized message {30320806,{ok,{r_object,<<"session">>,<<".12011742tWzDvu8mk5WAdfYihfV_T3DcnJ5VDyXC0c">>,[{r_content,{dict,3,16,16,8,80,48,{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]},{{[],[],[],[],[],[],[],[],[],[],[[<<"X-Riak-VTag">>,53,114,86,115,108,71,120,112,73,55,108,118,114,100,105,114,107,104,50,66,105,119]],[[<<"index">>]],[],[[<<"X-Riak-Last-Modified">>|{1492,105357,453143}]],[],[]}}},<<... (actual value removed).

I also have another example (from the same cluster) where there is a *single* writer to a key, but after a few writes/updates, it also got a sibling error. Also at that time, the write+read took significantly longer than normal. I'll check if we had any "unrecognized messages" in the Riak logs at that time as well.

To answer your second question, we are talking to the riak cluster over protocol buffers, using the official Erlang client.
//Daniel

On Tue, Apr 18, 2017 at 1:51 PM, Magnus Kessler wrote:
> On 18 April 2017 at 08:20, Daniel Abrahamsson wrote:
>>
>> I've run into a case where I got a sibling error/response on the first
>> ever write to a key. I would like to understand how this could happen.
>> Normally when you get siblings, it is because you have written a value
>> with an out-of-date vclock. But since this is the first write, there
>> is no vclock. Could someone shed some light on this for me?
>>
>> It is worth mentioning that it took 3 seconds for Riak to deliver
>> the response, so it is possible there was some kind of network issue
>> at the time.
>>
>> Here are some details about my set
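[For reference, the advanced.config change Doug describes for the coordinator retry timeout would look roughly like this. The 10000 value is only an illustration, assumed to be in milliseconds as with other Riak timeouts; the default corresponds to the 3 seconds mentioned above:

    %% advanced.config
    [
      {riak_kv, [
        {put_coordinator_failure_timeout, 10000}
      ]}
    ].
]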