Hi Julien,

You will need to update the app.config file and restart the servers for the changes to take effect.
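The Bitcask settings live in the bitcask section of app.config on each node. As a rough sketch, the merge-related entries look like the following (the values shown are illustrative only; see the Bitcask documentation linked below for the full list of parameters and their defaults):

    {bitcask, [
        %% where the Bitcask data files live
        {data_root, "/var/lib/riak/bitcask"},

        %% close a data file once it reaches this size (default is 2GB);
        %% a smaller value lets closed files be merged, and the space held
        %% by deleted or overwritten values reclaimed, sooner
        {max_file_size, 536870912},          %% 512MB, illustrative only

        %% merge a closed file once at least this percentage of its keys
        %% are dead (deleted or superseded)
        {frag_merge_trigger, 60},

        %% ... or once it holds at least this many bytes of dead data
        {dead_bytes_merge_trigger, 536870912}
    ]},

Merging only ever runs over closed files, which is why a smaller max_file_size lets the space held by deleted and overwritten values come back sooner.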
Best regards,

Christian

On 17 May 2013, at 14:05, Julien Genestoux <julien.genest...@gmail.com> wrote:

> Great! Thanks Christian, is that something I can change at runtime or do I have to stop the server?
> Also, would it make sense to change the backend if we have a lot of deletes?
>
> Thanks,
>
> On Fri, May 17, 2013 at 2:45 PM, Christian Dahlqvist <christ...@basho.com> wrote:
> Hi Julien,
>
> I believe from an earlier email that you are using Bitcask as a backend. It works with immutable append-only files, and data that is deleted or overwritten stays in those files and takes up disk space until the file is closed and can be merged. The max file size is 2GB by default, but this and the other parameters that determine how and when closed files are merged can be tuned. Please see http://docs.basho.com/riak/latest/tutorials/choosing-a-backend/Bitcask/ for further details.
>
> If you wish to reduce the amount of disk space used, you may want to set a smaller max file size so that merging can occur more frequently.
>
> Best regards,
>
> Christian
>
> On 17 May 2013, at 13:06, Julien Genestoux <julien.genest...@gmail.com> wrote:
>
>> Christian, All,
>>
>> Our servers still have not died... but we see another strange behaviour: our data store needs a lot more space than we expect.
>>
>> Based on the status command, the average size of our objects (node_get_fsm_objsize_mean) is about 1500 bytes.
>> We have 2 buckets, and both of them have an n value of 3.
>>
>> When we count the values in each of the buckets (using the following mapreduce):
>>
>> curl -XPOST http://192.168.134.42:8098/mapred -H 'Content-Type: application/json' -d '{"inputs":"BUCKET","query":[{"reduce":{"language":"erlang","module":"riak_kv_mapreduce","function":"reduce_count_inputs","arg":{"do_prereduce":true}}}],"timeout": 100000}'
>>
>> we get 194556 for one and 1572661 for the other (these numbers are consistent with what we expected), so if our math is right, we need a total of
>> 3 * (194556 + 1572661) * 1500 bytes = 7.4 GB of disk.
>>
>> Now, though, when I inspect the storage actually occupied on our hard drives, we see something weird (this is the du output):
>>
>> riak1. 2802888 /var/lib/riak
>> riak2. 4159976 /var/lib/riak
>> riak5. 4603312 /var/lib/riak
>> riak3. 4915180 /var/lib/riak
>> riak4. 37466784 /var/lib/riak
>>
>> As you can see, not all nodes have the same "size". What's even weirder is that until a couple of hours ago, they were all growing "together" and were close to what the riak4 node shows. Could this be due to the "delete" policy? It turns out that we delete a lot of items (is there a way to get the list of commands sent to a node/cluster?).
>>
>> Thanks!
>>
>> On Wed, May 15, 2013 at 11:29 PM, Julien Genestoux <julien.genest...@gmail.com> wrote:
>> Christian, all,
>>
>> Not sure what kind of magic happened, but no server has died in the last 2 days... and counting.
>> We have not changed a single line of code, which is quite odd...
>> I'm still monitoring everything and hope (sic!) for a failure soon so we can fix the problem!
>>
>> Thanks
>>
>> --
>> Got a blog? Make following it simple: https://www.subtome.com/
>>
>> Julien Genestoux,
>> http://twitter.com/julien51
>>
>> +1 (415) 830 6574
>> +33 (0)9 70 44 76 29
>>
>> On Tue, May 14, 2013 at 12:31 PM, Julien Genestoux <julien.genest...@gmail.com> wrote:
>> Thanks Christian.
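As a side note on the node_get_fsm_objsize_mean figure quoted above: it comes from Riak's per-node statistics, which can be read with riak-admin status on a node or over HTTP from the /stats resource. A quick sketch, using the same host and port as the curl call above (assuming the HTTP interface is listening there):

    # on the node itself
    riak-admin status | grep node_get_fsm_objsize

    # or over HTTP (the /stats resource returns JSON)
    curl -s http://192.168.134.42:8098/stats | python -mjson.tool | grep node_get_fsm_objsize

The _95, _99 and _100 variants of the same statistic are worth checking too, since a handful of very large objects will barely move the mean.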
>> We do indeed use mapreduce, but it's a fairly simple function: we retrieve a first object whose value is an array of at most 10 ids, and then we fetch the values for those 10 ids.
>> However, this mapreduce job is quite rare (maybe 10 times a day at most at this point...), so I don't think that's our issue.
>> I'll try to run the cluster without any calls to it to see if that's better, but I'd be very surprised. Also, we were doing this already even before we allowed multiple values, and the cluster was stable back then.
>> We do not do key listing or anything like that.
>>
>> I'll try looking at the statistics too.
>>
>> Thanks,
>>
>> On Tue, May 14, 2013 at 11:50 AM, Christian Dahlqvist <christ...@basho.com> wrote:
>> Hi Julien,
>>
>> The node appears to have crashed due to an inability to allocate memory. How are you accessing your data? Are you running any key listing or large MapReduce jobs that could use up a lot of memory?
>>
>> In order to ensure that you are efficiently resolving siblings, I would recommend you monitor the statistics in Riak (http://docs.basho.com/riak/latest/cookbooks/Statistics-and-Monitoring/). Specifically, look at the node_get_fsm_objsize_* and node_get_fsm_siblings_* statistics in order to identify objects that are very large or have lots of siblings.
>>
>> Best regards,
>>
>> Christian
>>
>> On 13 May 2013, at 16:44, Julien Genestoux <julien.genest...@gmail.com> wrote:
>>
>>> Christian, All,
>>>
>>> Bad news: my laptop is completely dead. Good news: I have a new one, and it's now fully operational (backups FTW!).
>>>
>>> The log files have finally been uploaded: https://www.dropbox.com/s/j7l3lniu0wogu29/riak-died.tar.gz
>>>
>>> I have attached our config to this mail.
>>>
>>> The machine is a virtual Xen instance at Linode with 4GB of memory. I know it's probably not the very best setup, but 1) we're on a budget and 2) we assumed it would fit our needs quite well.
>>>
>>> To put things in more detail: initially we did not use allow_mult and things worked fine for a couple of days. As soon as we enabled allow_mult, we were not able to run the cluster for more than 5 hours without seeing failing nodes, which is why I'm convinced we must be doing something wrong. The question is: what?
>>>
>>> Thanks
>>>
>>> On Sun, May 12, 2013 at 8:07 PM, Christian Dahlqvist <christ...@basho.com> wrote:
>>> Hi Julien,
>>>
>>> I was not able to access the logs at the link you provided.
>>>
>>> Could you please attach a copy of your app.config file so we can get a better understanding of the configuration of your cluster? Also, what is the specification of the machines in the cluster?
>>>
>>> How much data do you have in the cluster and how are you querying it?
>>>
>>> Best regards,
>>>
>>> Christian
>>>
>>> On 12 May 2013, at 19:11, Julien Genestoux <julien.genest...@gmail.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> We are running a cluster of 5 servers, or at least trying to, because nodes seem to be dying 'randomly' without us knowing why. We don't have a great Erlang guy aboard, and the error logs are not that verbose.
>>>> So I've just .tgz'd the whole log directory and I was hoping somebody could give us a clue.
>>>> It's here: https://www.dropbox.com/s/z9ezv0qlxgfhcyq/riak-died.tar.gz (might not be fully uploaded to Dropbox yet!)
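Relatedly, for the sibling statistics mentioned above: if a particular key is suspected of accumulating siblings, its current siblings can also be inspected directly over HTTP. A rough sketch, where BUCKET and KEY are placeholders and the host and port match the earlier examples:

    # with allow_mult set, a plain GET on an object that has siblings
    # returns "300 Multiple Choices" and a list of sibling vtags
    curl -i http://192.168.134.42:8098/buckets/BUCKET/keys/KEY

    # asking for multipart/mixed returns every sibling body in one response,
    # which shows how many siblings there are and how large they get
    curl -s http://192.168.134.42:8098/buckets/BUCKET/keys/KEY -H 'Accept: multipart/mixed'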
>>>>
>>>> I've looked at the archives, and some people said their server was dying because an object's size was too big for the memory allocation to succeed. Maybe that's what we're seeing?
>>>>
>>>> As one of our buckets is set with allow_mult, I am tempted to think that some object's size may be exploding.
>>>> However, we do actually try to resolve conflicts in our code. Any idea how to confirm and then debug whether we have an issue there?
>>>>
>>>> Thanks a lot for your precious help...
>>>>
>>>> Julien
>>>>
>>>> _______________________________________________
>>>> riak-users mailing list
>>>> riak-users@lists.basho.com
>>>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>>>
>>> <app.config>
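On the question of confirming whether the crashes really are memory-allocation failures: when the Erlang VM dies because it cannot allocate memory, it normally writes an erl_crash.dump whose first lines record the reason. A quick check, assuming the default log locations of a packaged Riak install (paths may differ on your machines):

    # the "Slogan" line near the top of a crash dump records why the VM died,
    # e.g. "eheap_alloc: Cannot allocate N bytes of memory"
    head -n 5 /var/log/riak/erl_crash.dump

    # search the crash dump and the Riak logs for allocation failures
    grep -i "cannot allocate" /var/log/riak/erl_crash.dump /var/log/riak/*.log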
_______________________________________________
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com