Thank you all for your kind and quick answers. However, even on a 3-node or 5-node cluster we're still seeing memory bloat (only notably more slowly, as the load is distributed across more machines).
It's important to stress that this is a "read-append"-only cluster. The data never expires, and from the moment the cluster is up we keep adding data in the form of S3 PUTs (of roughly 9 MB objects) until we reach around 300K PUTs. This is also why merges don't happen (no stale data).

Has anyone come across this situation in the past? Does Riak even fit a use case like this?

Regards,

Idan Shinberg
System Architect
Idomoo Ltd.

Mob +972.54.562.2072
email idan.shinb...@idomoo.com
web www.idomoo.com

On Tue, Aug 20, 2013 at 11:32 AM, Erik Søe Sørensen <e...@trifork.com> wrote:
> Your max file size is (far!) less than your small file size threshold -
> which means that at each merge, *all* of the files will participate in the
> merge. No wonder you need a lot of simultaneously open files... and long
> merge times too, of course.
> Try changing these parameters.
>
> -------- Original message --------
> From: Idan Shinberg <idan.shinb...@idomoo.com>
> Date:
> To: riak-users <riak-users@lists.basho.com>
> Cc: Arik Katsav <a...@idomoo.com>, Assaf Fogel <as...@idomoo.com>
> Subject: Riak Memory Bloat issues with RiakCS/BitCask
>
> Hi all
>
> We have a ~300 GB Riak single-node cluster.
> This seemed to work fine (merging worked well) until an open-files/open-ports
> limit was reached (we've since raised both limits to 64K).
> That error caused a crash that left corrupted hint files. We deleted the
> hint files (and their corresponding data files) to allow Riak a clean
> start (no errors upon start).
> However, merges have not really been working (taking forever to complete)
> since then, therefore causing:
>
> * Huge bloat on disk (the data is around 150K objects of roughly 8 MB
>   each, but the Riak storage used has already more than quadrupled in
>   size, to around 1.2 TB)
> * Huge bloat in memory, which eventually kills Riak itself (OOM killer)
>
> We're not doing anything complex, just using Riak and Riak CS to emulate
> S3 access (and only that) for roughly 15 client writes per minute.
>
> Our merge settings (uber-low, but they worked correctly up until a few
> days ago):
>
> {riak_kv, [
>     %% storage_backend specifies the Erlang module defining the storage
>     %% mechanism that will be used on this node.
>     {add_paths, ["/usr/lib64/riak-cs/lib/riak_cs-1.3.1/ebin"]},
>     {storage_backend, riak_cs_kv_multi_backend},
>     {multi_backend_prefix_list, [{<<"0b:">>, be_blocks}]},
>     {multi_backend_default, be_default},
>     {multi_backend, [
>         {be_default, riak_kv_eleveldb_backend, [
>             {max_open_files, 50},
>             {data_root, "/var/lib/riak/leveldb"}
>         ]},
>         {be_blocks, riak_kv_bitcask_backend, [
>             {max_file_size, 16#2000000},          %% 32 MB
>
>             %% Trigger a merge if any of the following are true:
>             {frag_merge_trigger, 10},             %% fragmentation >= 10%
>             {dead_bytes_merge_trigger, 8388608},  %% dead bytes > 8 MB
>
>             %% Conditions that determine if a file will be examined
>             %% during a merge:
>             {frag_threshold, 5},                  %% fragmentation >= 5%
>             {dead_bytes_threshold, 2097152},      %% dead bytes > 2 MB
>             {small_file_threshold, 16#80000000},  %% file is < 2 GB
>
>             {data_root, "/var/lib/riak/bitcask"},
>             {log_needs_merge, true}
>         ]}
>     ]},
>
> As you've noticed, log_needs_merge is set to true, and we do get our logs
> filled with needs_merge messages such as this one:
> 2013-08-19 00:09:49.043 [info] <0.17972.0>
> "/var/lib/riak/bitcask/388211372416021087647853783690262677096107081728"
> needs_merge:
> [{"/var/lib/riak/bitcask/388211372416021087647853783690262677096107081728/1153.bitcask.data",[{small_file,20506434}]},
>  {"/var/lib/riak/bitcask/388211372416021087647853783690262677096107081728/1152.bitcask.data",[{small_file,33393237}]},
>  {"/var/lib/riak/bitcask/388211372416021087647853783690262677096107081728/1151.bitcask.data",[{small_file,33123254}]},
>  {"/var/lib/riak/bitcask/388211372416021087647853783690262677096107081728/1150.bitcask.data",[{small_file,32505520}]},
>  {"/var/lib/riak/bitcask/388211372416021087647853783690262677096107081728/1149.
> ...
>
> Yet only a single merge happened (and only around 20 minutes after we
> started putting pressure on Riak):
>
> 2013-08-19 00:17:29.456 [info] <0.18964.14> Merged
> {["/var/lib/riak/bitcask/388211372416021087647853783690262677096107081728/712.bitcask.data",
>   "/var/lib/riak/bitcask/388211372416021087647853783690262677096107081728/711.bitcask.data",
>   "/var/lib/riak/bitcask/388211372416021087647853783690262677096107081728/710.bitcask.data",
>   "/var/lib/riak/bitcask/388211372416021087647853783690262677096107081728/709.bitcask.data",
>   "/var/lib/riak/bitcask/388211372416021087647853783690262677096107081728/708.bitcask.data",
>   "/var/lib/riak/bitcask/388211372416021087647853783690262677096107081728/707.bitcask.data",
> ...
>   "/var/lib/riak/bitcask/388211372416021087647853783690262677096107081728/697.bitcask.data",
>   "/var/lib/riak/bitcask/388211372416021087647853783690262677096107081728/696.bitcask.data",
>   "/var/lib/riak/bitcask/388211372416021087647853783690262677096107081728/695.bitcask.data",
>   "/var/lib/riak/bitcask/388211372416021087647853783690262677096107081728/694.bitcask.data",
>   "/var/lib/riak/bitcask/388211372416021087647853783690262677096107081728/693.bitcask.data",
>   "/var/lib/riak/bitcask/388211372416021087647853783690262677096107081728/692.bitcask.data",
>   "/var/lib/riak/bitcask/388211372416021087647853783690262677096107081728/691.bitcask.data",
>   "/var/lib/riak/bitcask/38821137241602108...",...],...} in 1325.611982 seconds.
>
> Is it reasonable for a merge to take more than 20 minutes?
> Especially given that Riak's memory usage is bloating much faster?
> Will scaling the cluster from a single node to a 3-node cluster ease the
> problem?
>
> As for the server and usage specs:
>
> - A virtual machine with around 8 virtual cores
> - 12 GB of RAM
> - 8 TB of storage composed of 4 x 2 TB disks in RAID 10 (4 TB available
>   storage)
> - ~150K keys, several tens of bytes long (using Riak CS for S3 storage)
> - ~8 MB value size for each key (raw file)
> - ~22000 open files (mostly hint files) held by Riak
> - Replication factor of 1
> - Ring size of 64
>
> I'll provide the logs if needed, although I doubt they'll prove useful.
>
> Any ideas/advice will be appreciated.
>
> Regards,
>
> Idan Shinberg
>
> System Architect
>
> Idomoo Ltd.
>
> Mob +972.54.562.2072
>
> email idan.shinb...@idomoo.com
>
> web www.idomoo.com
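To make Erik's point concrete against the quoted config: max_file_size is 16#2000000 (32 MB) while small_file_threshold is 16#80000000 (2 GB), so every data file counts as "small" and every merge drags the entire partition back in. A minimal sketch of the be_blocks section with that relationship inverted (the swapped values below are illustrative assumptions, not tuned recommendations):

```erlang
%% Sketch only - values are assumptions for illustration, not a tested
%% recommendation. The invariant that matters is that small_file_threshold
%% sits well BELOW max_file_size, so only genuinely small files are
%% candidates for being swept into a merge.
{be_blocks, riak_kv_bitcask_backend, [
    {max_file_size, 16#80000000},         %% 2 GB per data file
    {frag_merge_trigger, 10},             %% fragmentation >= 10%
    {dead_bytes_merge_trigger, 8388608},  %% dead bytes > 8 MB
    {frag_threshold, 5},                  %% fragmentation >= 5%
    {dead_bytes_threshold, 2097152},      %% dead bytes > 2 MB
    {small_file_threshold, 16#2000000},   %% only files < 32 MB are "small"
    {data_root, "/var/lib/riak/bitcask"},
    {log_needs_merge, true}
]}
```

With thresholds like these, a needs_merge decision would only pull in files that are fragmented, carry dead bytes, or are genuinely small, instead of every file in the partition.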
_______________________________________________
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com