I tried quite hard to get Riak to work reliably in a Docker container, in a long-term-use kind of way. Riak would never shut down cleanly, though, so at startup there were always lots of lock files left around that had to be deleted first.
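
A rough Python sketch of the kind of container entrypoint this points at: clear leftover lock files before starting, then wait for the container's stop signal and shut Riak down with its own stop command. Everything here is an assumption rather than something taken from a working setup: the function names are made up for illustration, the `LOCK` and `*.lock` filename patterns are guesses at what gets left behind, and deleting lock files is only safe when no Riak process is still running against that data directory.

```python
import fnmatch
import os
import signal
import subprocess
import time

# Assumed patterns (not confirmed in this thread): LevelDB-style "LOCK" files
# and "*.lock" files. Adjust to whatever your node actually leaves behind.
LOCK_PATTERNS = ("LOCK", "*.lock")


def remove_stale_locks(data_dir, patterns=LOCK_PATTERNS):
    """Delete files under data_dir whose names match a pattern.

    Returns the sorted list of removed paths. Only call this while no
    Riak process is using data_dir.
    """
    removed = []
    for dirpath, _dirnames, filenames in os.walk(data_dir):
        for name in filenames:
            if any(fnmatch.fnmatchcase(name, p) for p in patterns):
                path = os.path.join(dirpath, name)
                os.remove(path)
                removed.append(path)
    return sorted(removed)


def run_until_stopped(start_cmd, stop_cmd, poll_seconds=1.0):
    """Run start_cmd, wait for SIGTERM or SIGINT, then run stop_cmd.

    Must be called from the main thread, because Python only allows
    signal handlers to be installed there. Returns stop_cmd's exit code.
    """
    stop_requested = False

    def _on_signal(signum, frame):
        # Only flip a flag here; the real work happens in the loop below.
        nonlocal stop_requested
        stop_requested = True

    signal.signal(signal.SIGTERM, _on_signal)
    signal.signal(signal.SIGINT, _on_signal)

    # Assumes start_cmd returns once the node has been launched in the
    # background (hypothetical example: ["riak", "start"]).
    subprocess.run(start_cmd, check=True)

    # Poll the flag, so shutdown begins within poll_seconds of the signal.
    while not stop_requested:
        time.sleep(poll_seconds)

    return subprocess.run(stop_cmd).returncode
```

An entrypoint built on this might call `remove_stale_locks("/var/lib/riak")` and then `run_until_stopped(["riak", "start"], ["riak", "stop"])`. By default `docker stop` sends SIGTERM and follows up with SIGKILL after a grace period, so the stop command has to finish inside that window.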
Riak is not well-behaved after a rough shutdown, whether in a Docker container or on bare metal. It tends to require sysadmin intervention to clean things up.

If you're running it in a Docker container, you need to figure out a way to capture the incoming SIGTERM and then use that to shut Riak down cleanly. I never got that far. I had a start-up script that cleaned out lock files and hash trees and the like, but even after all that, the Dockerised Riak proved problematic. (And getting all the Erlang/OTP clustering networking to work was also painful.)

Good luck,
Toby

On Thu, 16 Feb 2017 at 10:03 Jon Brisbin <jbris...@basho.com> wrote:

> I haven't tried CS in a container yet. Could you provide the Dockerfiles and compose files or the commands you use to start the services?
>
> jb
>
> On Wed, Feb 15, 2017 at 4:49 PM Jean-Marc Le Roux <jeanmarc.ler...@aerys.in> wrote:
>
> Hi,
>
> inspecting the logs further, I get this in /etc/riak/console.log even before running riak-admin repair-2i:
>
> 2017-02-15 23:41:12.441 [warning] <0.714.0> Hintfile '/var/lib/riak/bitcask/205523667749658222872393179600727299639115513856/2.bitcask.hint' invalid
> 2017-02-15 23:41:12.441 [warning] <0.702.0> Hintfile '/var/lib/riak/bitcask/22835963083295358096932575511191922182123945984/2.bitcask.hint' invalid
> 2017-02-15 23:41:12.441 [warning] <0.716.0> Hintfile '/var/lib/riak/bitcask/251195593916248939066258330623111144003363405824/2.bitcask.hint' invalid
> 2017-02-15 23:41:12.441 [warning] <0.717.0> Hintfile '/var/lib/riak/bitcask/296867520082839655260123481645494988367611297792/2.bitcask.hint' invalid
> 2017-02-15 23:41:12.441 [warning] <0.700.0> Hintfile '/var/lib/riak/bitcask/91343852333181432387730302044767688728495783936/2.bitcask.hint' invalid
> 2017-02-15 23:41:12.441 [warning] <0.715.0> Hintfile '/var/lib/riak/bitcask/228359630832953580969325755111919221821239459840/2.bitcask.hint' invalid
> 2017-02-15 23:41:12.442 [warning] <0.697.0> Hintfile '/var/lib/riak/bitcask/68507889249886074290797726533575766546371837952/2.bitcask.hint' invalid
> 2017-02-15 23:41:12.442 [warning] <0.712.0> Hintfile '/var/lib/riak/bitcask/159851741583067506678528028578343455274867621888/2.bitcask.hint' invalid
> 2017-02-15 23:41:12.442 [warning] <0.719.0> Hintfile '/var/lib/riak/bitcask/342539446249430371453988632667878832731859189760/2.bitcask.hint' invalid
>
> All of this is very surprising since I started riak-cs and riak properly.
>
> Then at the end of console.log:
>
> 2017-02-15 23:41:13.651 [info] <0.481.0>@riak_core:wait_for_service:498 Wait complete for service riak_kv (10 seconds)
> 2017-02-15 23:41:13.652 [info] <0.678.0>@riak_core:wait_for_service:498 Wait complete for service riak_kv (10 seconds)
> 2017-02-15 23:41:13.668 [info] <0.7.0> Application yokozuna started on node 'riak@127.0.0.1'
> 2017-02-15 23:41:13.672 [info] <0.7.0> Application cluster_info started on node 'riak@127.0.0.1'
> 2017-02-15 23:41:13.678 [info] <0.201.0>@riak_core_capability:process_capability_changes:555 New capability: {riak_control,member_info_version} = v1
> 2017-02-15 23:41:13.680 [info] <0.7.0> Application riak_control started on node 'riak@127.0.0.1'
> 2017-02-15 23:41:13.680 [info] <0.7.0> Application erlydtl started on node 'riak@127.0.0.1'
> 2017-02-15 23:41:13.687 [info] <0.7.0> Application riak_auth_mods started on node 'riak@127.0.0.1'
> 2017-02-15 23:41:17.714 [info] <0.474.0>@riak_core_throttle:maybe_log_throttle_change:372 Changing throttle for riak_kv/aae_throttle from undefined to 0 based on load factor 0
> 2017-02-15 23:41:32.719 [info] <0.2388.0>@riak_kv_index_hashtree:build_or_rehash:1055 Starting AAE tree build: 159851741583067506678528028578343455274867621888
> 2017-02-15 23:42:02.186 [info] <0.2388.0>@riak_kv_index_hashtree:handle_fold_keys_result:629 Finished AAE tree build: 159851741583067506678528028578343455274867621888
>
> I assume it means riak is properly started.
> So I start stanchion, then riak-cs. But I still have the exact same error...
>
> Regards,
>
> 2017-02-15 22:16 GMT+01:00 Jean-Marc Le Roux <jeanmarc.ler...@aerys.in>:
>
> Forgot to mention ACLs are alright AFAIK:
>
> root@b4394bf1de78:/var/lib/riak# ls -la
> total 52
> drwxr-xr-x. 10 riak riak  179 Feb  9 23:43 .
> drwxr-xr-x.  1 root root   95 Feb 15 20:48 ..
> -r--------.  1 riak riak   20 Feb  9 01:00 .erlang.cookie
> drwxrwxr-x. 67 riak riak 8192 Feb 15 21:31 anti_entropy
> drwxrwxr-x. 66 riak riak 8192 Feb  9 23:42 bitcask
> drwxrwxr-x.  3 riak riak   40 Feb  9 23:42 cluster_meta
> drwxrwxr-x.  2 riak riak  225 Feb 15 22:09 generated.configs
> drwxrwxr-x.  2 riak riak 8192 Feb 15 22:09 kv_vnode
> drwxrwxr-x. 66 riak riak 8192 Feb  9 23:42 leveldb
> drwxrwxr-x.  2 riak riak    6 Feb 15 22:14 riak_kv_exchange_fsm
> drwxr-xr-x.  2 riak riak  186 Feb 15 22:09 ring
>
> 2017-02-15 22:13 GMT+01:00 Jean-Marc Le Roux <jeanmarc.ler...@aerys.in>:
>
> Hi,
>
> I'll try to send the log archive ASAP.
> Here is what I get in /var/log/riak/error.log after running riak-admin repair-2i:
>
> 2017-02-15 22:09:06.535 [error] <0.3287.0>@riak_kv_2i_aae:repair_partition:297 Failed to acquire hashtree lock on partition 1255977969581244695331291653115555720016817029120
> 2017-02-15 22:09:06.535 [error] <0.3288.0>@riak_kv_2i_aae:repair_partition:297 Failed to acquire hashtree lock on partition 1278813932664540053428224228626747642198940975104
> 2017-02-15 22:09:06.535 [error] <0.3289.0>@riak_kv_2i_aae:repair_partition:297 Failed to acquire hashtree lock on partition 479555224749202520035584085735030365824602865664
> 2017-02-15 22:09:06.535 [error] <0.3290.0>@riak_kv_2i_aae:repair_partition:297 Failed to acquire hashtree lock on partition 502391187832497878132516661246222288006726811648
> 2017-02-15 22:09:06.535 [error] <0.3291.0>@riak_kv_2i_aae:repair_partition:297 Failed to acquire hashtree lock on partition 1118962191081472546749696200048404186924073353216
>
> I tried to remove all "LOCK" files in /var/lib/riak but to no avail...
> I'm guessing there is something here...
>
> Any idea?
>
> 2017-02-09 17:37 GMT+01:00 Luke Bakken <lbak...@basho.com>:
>
> Hi Jean-Marc -
>
> Can you provide a complete archive of the log directory? I wonder if another file might have more information.
>
> --
> Luke Bakken
> Engineer
> lbak...@basho.com
>
> On Thu, Feb 9, 2017 at 1:58 AM, Jean-Marc Le Roux <jeanmarc.ler...@aerys.in> wrote:
> >
> > Hello,
> >
> > here is the original github issue:
> >
> > https://github.com/basho/riak_cs/issues/1329
> >
> > I'm using riak-cs 2.1.1-1.el6 with stanchion 1.5.0-1.el6 on CentOS 6.8 in a Docker container.
> > To make the data persistent, the following directories are mounted from outside the container:
> >
> > /var/log
> > /var/lib/riak/
> >
> > Everything works fine except when I remove/reimport the container.
> > Even when it's the same container.
> > The riak data is here in /var/lib/riak (bitcask and leveldb stuff). ACLs look fine on those files.
> >
> > Riak starts. Stanchion starts. But riak-cs won't start.
> > With a riak-cs console, it looks like the problem is here:
> >>
> >> (riak-cs@127.0.0.1)1> [os_mon] memory supervisor port (memsup): Erlang has closed
> >>
> >> =INFO REPORT==== 18-Jan-2017::09:38:31 ===
> >> alarm_handler: {clear,system_memory_high_watermark}
> >> [os_mon] cpu supervisor port (cpu_sup): Erlang has closed
> >> {"Kernel pid terminated",application_controller,"{application_start_failure,riak_cs,{notfound,{riak_cs_app,start,[normal,[]]}}}"}
> >
> > var/log/riak-cs/access.log.2017_01_18_09 is empty.
> > Here is what /var/log/riak-cs/crash.log says:
> >>
> >> 2017-01-18 09:38:31 =CRASH REPORT====
> >> crasher:
> >>   initial call: application_master:init/4
> >>   pid: <0.148.0>
> >>   registered_name: []
> >>   exception exit: {{notfound,{riak_cs_app,start,[normal,[]]}},[{application_master,init,4,[{file,"application_master.erl"},{line,133}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,239}]}]}
> >>   ancestors: [<0.147.0>]
> >>   messages: [{'EXIT',<0.149.0>,normal}]
> >>   links: [<0.147.0>,<0.7.0>]
> >>   dictionary: []
> >>   trap_exit: true
> >>   status: running
> >>   heap_size: 376
> >>   stack_size: 27
> >>   reductions: 119
> >>   neighbours:
>
> --
> *Jean-Marc Le Roux*
>
> Founder and CEO of Aerys (http://aerys.in)
>
> Blog: http://blogs.aerys.in/jeanmarc-leroux
> Cell: (+33)6 20 56 45 78
> Phone: (+33)9 72 40 17 58
_______________________________________________
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com