I turned the triggers and thresholds down, to:

    {max_file_size, 805306368},            %% 768 MB
    {dead_bytes_merge_trigger, 134217728}, %% dead bytes > 128 MB
    {dead_bytes_threshold, 33554432}       %% dead bytes > 32 MB
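(For reference, those tuples sit in the bitcask section of each node's app.config, which now reads roughly like the excerpt below; the data_root shown is just the packaged default and may differ on other installs:)

    {bitcask, [
        {data_root, "/var/lib/riak/bitcask"},
        {max_file_size, 805306368},            %% roll over data files at 768 MB
        {dead_bytes_merge_trigger, 134217728}, %% trigger a merge once a file holds > 128 MB of dead bytes
        {dead_bytes_threshold, 33554432}       %% include files with > 32 MB of dead bytes in that merge
    ]},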
And restarted the nodes; however, after 24 hours, disk utilisation remains the same, i.e. for about 6 GB of files in CS, 600 GB is on disk. (Previously, when we had >100 GB in CS, we had terabytes on disk.)

I do wonder if we hit some kind of issue with Riak CS earlier in the cluster's life, and have somehow ended up with a lot of "dead" bytes in there.

This data is only ever written by Riak CS, so it should take responsibility for siblings and their resolution, yes?

If I write something to download the files and then re-upload them to the same paths (see the rough s3cmd sketch at the very end of this message), would that cause Riak CS to fix up any issues around siblings or duplicated data?

Cheers,
Toby

On 22 January 2015 at 03:20, Luke Bakken <lbak...@basho.com> wrote:
> You should be able to help with disk usage by "turning down" the
> trigger and threshold values described here:
>
> http://docs.basho.com/riak/latest/ops/advanced/backends/bitcask/
>
> Your cluster will merge more data, which should help with disk usage.
> If your typical use is to create and delete objects frequently, this
> will help.
>
> --
> Luke Bakken
> Engineer
> lbak...@basho.com
>
> On Wed, Jan 21, 2015 at 4:40 AM, Toby Corkindale <t...@dryft.net> wrote:
>> On 21 January 2015 at 15:22, Luke Bakken <lbak...@basho.com> wrote:
>>> Hi Toby -
>>>
>>> Are you using the stock bitcask configuration for merging?
>>
>> Hi Luke,
>> Yes, pretty much.
>>
>>> On Tue, Jan 20, 2015 at 5:07 PM, Toby Corkindale <t...@dryft.net> wrote:
>>>> Hi Kota,
>>>> I had a bit of an off-list chat about this a while ago, plus continued
>>>> to investigate locally, and eventually achieved some faster speeds,
>>>> around 15 MByte/sec writes.
>>>>
>>>> Things that were changed:
>>>> * Adjusted Riak CS GC to be spread out over the cluster much more.
>>>> * Tweaked up the put buffers and concurrency further.
>>>> * Moved most of the files out of CS and into Amazon S3+Glacier.
>>>> * Switched from nginx to haproxy.
>>>> * Simplified firewalling for internal clients.
>>>>
>>>> Each of those changes made a small to modest difference on its own,
>>>> but combined they added up to a quite noticeable improvement.
>>>>
>>>> I did notice something odd though -- despite moving most of the data
>>>> out of the cluster, the disk space in use by Riak is still very large
>>>> compared to the amount stored. We moved more than 90% of the
>>>> data out of the cluster, yet the actual disk space used only halved.
>>>> For every gigabyte of file stored in CS, dozens of gigabytes are
>>>> actually on disk!
>>>>
>>>> Either the garbage collection algorithm is very, very lazy, or
>>>> something has gone a bit wrong in the past, which might explain
>>>> part of the performance problems.
>>>>
>>>> We're going to look at redeploying a new, fresh cluster based on Riak
>>>> 2 in the not-too-distant future, once Riak CS looks like it's approved
>>>> for use there, and maybe that'll clear all of this up.
>>>>
>>>> Toby

--
Turning and turning in the widening gyre
The falcon cannot hear the falconer
Things fall apart; the center cannot hold
Mere anarchy is loosed upon the world
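P.S. A rough sketch of the download-and-re-upload idea mentioned above, using s3cmd against our Riak CS endpoint. The bucket and key are placeholders, and it assumes s3cmd is already configured to point at the CS proxy rather than Amazon:

    # fetch the object, then write it straight back to the same key via the S3 API
    s3cmd get s3://my-bucket/some/path/file.bin file.bin
    s3cmd put file.bin s3://my-bucket/some/path/file.bin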