riak crash
Hi all,

My Riak node crashed and I can't figure out why. Any help would be appreciated.

I'm using Riak 1.3.1, running a single node on CentOS 6.3 (availability is not critical right now). The node uses the Bitcask backend with the default configuration, except for the expiry_secs property, which is set to 15552000. The properties of the single bucket used are:

{"props":{"allow_mult":false,"basic_quorum":false,"big_vclock":50,"chash_keyfun":{"mod":"riak_core_util","fun":"chash_std_keyfun"},"dw":"quorum","last_write_wins":false,"linkfun":{"mod":"riak_kv_wm_link_walker","fun":"mapreduce_linkfun"},"n_val":2,"name":"c","notfound_ok":true,"old_vclock":86400,"postcommit":[],"pr":0,"precommit":[],"pw":0,"r":"one","rw":"quorum","small_vclock":50,"w":"one","young_vclock":20}}

There is enough disk space and RAM on the server. At the moment of the crash, most partitions were about 1.1 GB in size, but a few of them were smaller.

Attaching error.log and crash.log (riak-logs.tar.gz).

Thanks in advance.

Alexander

___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
setting n_val
Hi all,

I have a question regarding setting the n_val. The documentation (http://docs.basho.com/riak/latest/tutorials/fast-track/Tunable-CAP-Controls-in-Riak/) states:

> n_val must be greater than 0 and less than or equal to the number of actual nodes in your cluster to get all the benefits of replication.

and:

> we advise against modifying the n_val of a bucket after its initial creation as this may result in failed reads, because the new value may not be replicated to all the appropriate partitions.

This seems contradictory to me. Which value should I set if I'm setting up one node right now but planning to add a second one later? Setting n_val=2 means the value will be greater than the actual number of nodes, but setting n_val=1 is also not advisable, since I will have to change it later to n_val=2 (I'm planning to have two replicas in the end).

I'm also concerned about performance in the case of one node and n_val=2. Will it degrade since both replicas are stored on the same server?
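For what it's worth, n_val is an ordinary bucket property, so whichever value is chosen can be set (or inspected) over Riak's HTTP interface. A minimal sketch in Python of building that request -- the host/port and bucket name are assumptions for illustration, not taken from the thread:

```python
import json

RIAK_URL = "http://127.0.0.1:8098"  # assumed local node; adjust to your host/port

def props_request(bucket, props):
    """Build the URL and JSON body for a PUT to the bucket-properties endpoint."""
    url = f"{RIAK_URL}/riak/{bucket}"
    body = json.dumps({"props": props})
    return url, body

# Pin n_val at the eventual target so no re-replication is needed later:
url, body = props_request("c", {"n_val": 2})
print(url)   # http://127.0.0.1:8098/riak/c
print(body)  # {"props": {"n_val": 2}}
# Send it with e.g.: curl -X PUT -H 'Content-Type: application/json' -d "$BODY" "$URL"
```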
Re: setting n_val
John, Eric, thank you for your answers. I understand that running Riak on one node is a bad idea, but having 5 nodes in a cluster is too expensive for us right now. Which settings should I tune to get decent performance on a 3-node cluster? 100% availability is not necessary for us, but a small response time for get requests is a must-have. I think I can set n_val=2, r=1 and w=1 to achieve this. Is there anything else?

On 10 June 2013 20:25, Eric Redmond wrote:
> Alexander,
>
> The simplest answer is that we never recommend running Riak on one node.
> The recommended minimum is 5, but you could possibly get away with 3 (the
> default repl value).
>
> There is a blog post about this from last year, explaining why:
> http://basho.com/why-your-riak-cluster-should-have-at-least-five-nodes/
>
> Eric
>
> On Jun 10, 2013, at 9:01 AM, Alexander Ilyin wrote:
>
> [...]
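One note on the n_val=2, r=1, w=1 plan: r and w don't have to be baked into the bucket; they can also be overridden per request, which over HTTP is just a query parameter. A sketch of building such URLs -- the host, bucket, and key names here are made up for illustration:

```python
from urllib.parse import urlencode

RIAK_URL = "http://127.0.0.1:8098"  # assumed node address

def object_url(bucket, key, **quorum):
    """URL for a key with per-request quorum overrides, e.g. r=1 for fast reads."""
    qs = urlencode(sorted(quorum.items()))
    return f"{RIAK_URL}/riak/{bucket}/{key}" + (f"?{qs}" if qs else "")

print(object_url("c", "user42", r=1))        # read succeeds once one replica answers
print(object_url("c", "user42", w=1, dw=0))  # write acks after one vnode write
```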
Re: setting n_val
Dmitri, thanks a lot for such a comprehensive answer! I actually can afford some downtime, but retries are not an option for me due to response-time requirements. Of course we're going to add nodes in the future, but for now we have to deal with what we have.

On 10 June 2013 20:42, Dmitri Zagidulin wrote:
> Alexander,
>
> Your question about n_val on a one-node server is very valid (and also the
> question of how you migrate to a larger n_val when you grow your cluster).
>
> As an aside -- as John mentioned, Riak is designed from the ground up to
> be run on multi-node clusters, so you have to keep that in mind when
> choosing to run on just one node (when you're a startup on a budget, just
> testing out an application, etc.), in terms of expected performance.
>
> Anyway, you have two options:
>
> 1) Start with the eventual n_val in mind (n=3, or n=2 as in your case) and
> live with slow performance on a single node. (The upside: no migration is
> required when adding new nodes, as the extra replicas will be moved to the
> appropriate new machines.)
>
> 2) Start with n_val=1 on a single node. The benefit of this is faster
> performance (fewer replicas to deal with, which don't help when you're on
> one node). The drawback is that you need some sort of migration strategy
> when expanding your Riak deployment to more nodes. It doesn't have to be
> complicated, but you will have to deal with the fact that, if you have a
> set of data with n=1 and then increase n to 2 in the app config when you
> add a new node, half of the vnodes are going to be missing data for any
> given request. This is not disastrous, but you do need to either rely on
> read-repair to create the missing replicas, or do it yourself.
> Here are the options as I see them:
>
> a) If you can afford downtime when adding a new node, you could back up
> the contents of your one-node cluster, then add the new node and raise
> n_val to 2, and then restore to the new cluster. The writes from the
> restore are going to create the new number of replicas (n_val=2). You can
> use logical backup tools like 'riak-admin backup' or Riak Data Migrator.
> (Backing up and restoring the data directory won't work for adding a new
> node.)
>
> b) You can add retries to your application logic. If you get a Not Found
> (and you know that the value is supposed to be there), you can re-try the
> GET. (After the first GET and a 404, read-repair takes place to fill in
> missing values, so the second read should be fine.)
>
> c) If you're on 1.3+ and have Active Anti-Entropy enabled, you can
> increase n_val to 2 and wait for AAE to fill in the missing replicas (this
> should probably still be paired with option b, as you'll need to retry
> some reads while it's working).
>
> The few times that I've built single-node apps (during hackathons, etc.),
> I usually go with option 2(a) and backup/restore the data. But your use
> case requirements may differ.
>
> Dmitri
>
> On Mon, Jun 10, 2013 at 12:01 PM, Alexander Ilyin wrote:
>
>> [...]
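Option (b) above -- retrying a GET so that read-repair can fill in the missing replica -- is only a few lines of application logic. A hedged sketch; `fetch` stands in for whatever client call is actually in use, with an assumed contract of returning None on not-found:

```python
import time

def get_with_retry(fetch, key, retries=2, delay=0.05):
    """Retry a read that comes back empty while replicas may still be missing.

    After raising n_val, the first not-found on an empty vnode triggers
    read-repair, so a short retry often finds the value on the next attempt.
    """
    for attempt in range(retries + 1):
        value = fetch(key)       # returns None on not-found (assumed contract)
        if value is not None:
            return value
        if attempt < retries:
            time.sleep(delay)    # give read-repair a moment to write the replica
    return None

# Simulated client: misses once, then the repaired replica is found.
seen = []
def fake_fetch(key):
    seen.append(key)
    return None if len(seen) < 2 else "value"

print(get_with_retry(fake_fetch, "user42", delay=0))  # value
```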
memory consumption
Hi,

I have a few questions about Riak memory usage. We're using Riak 1.3.1 on a 3-node cluster. According to the Bitcask capacity calculator (http://docs.basho.com/riak/1.3.1/references/appendices/Bitcask-Capacity-Planning/), Riak should use about 30 GB of RAM for our data. Actually, it uses about 45 GB and I can't figure out why. I'm looking at the %MEM column in top on each node for the beam.smp process. We have:

* about 235,000,000 keys
* one bucket with a 1-byte name
* an average key size of about 24 bytes
* n_val = 2

Bucket properties:
{"props":{"allow_mult":false,"basic_quorum":false,"big_vclock":50,"chash_keyfun":{"mod":"riak_core_util","fun":"chash_std_keyfun"},"dw":"quorum","last_write_wins":false,"linkfun":{"mod":"riak_kv_wm_link_walker","fun":"mapreduce_linkfun"},"n_val":2,"name":"c","notfound_ok":true,"old_vclock":86400,"postcommit":[],"pr":0,"precommit":[],"pw":0,"r":"one","rw":"quorum","small_vclock":50,"w":"one","young_vclock":20}}

Disk usage is also about 1.5 times more than I expected (270 GB instead of 180 GB). I rechecked that I have n_val=2 (not 3); it seems all right. Why could this happen?

The second question is about performance degradation when Riak uses almost all available memory on a node. We see that the 95th/99th put percentiles are twice as large on nodes which don't have much free RAM. How much free memory should I have to keep performance high?

And the last question is about the memory_total metric. riak-admin status returns a value which is less than the actual memory consumption as seen in top. According to the memory_total description (http://docs.basho.com/riak/1.3.1/references/appendices/Inspecting-a-Node/), they should be equal. Why are they not?

Alexander
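The capacity calculator's model is simple enough to reproduce by hand; the only contested input is the per-key overhead constant. A sketch with this thread's numbers (the 40-byte overhead is the figure documented in the FAQs; ~91 bytes is a measured figure discussed elsewhere in this thread -- neither is authoritative):

```python
def keydir_ram_bytes(keys, avg_key_len, bucket_len, n_val, per_key_overhead):
    """Total cluster RAM for the Bitcask keydir: each of the n_val replicas
    of a key holds an in-memory entry of (overhead + key + bucket) bytes."""
    return n_val * keys * (per_key_overhead + avg_key_len + bucket_len)

# This thread's numbers: 235M keys, 24-byte keys, 1-byte bucket name, n_val=2.
for overhead in (40, 91):  # 40 = documented figure; ~91 = measured in practice
    gb = keydir_ram_bytes(235_000_000, 24, 1, 2, overhead) / 1e9
    print(f"overhead {overhead}B -> ~{gb:.1f} GB")
```

With the 40-byte figure this lands near the ~30 GB expectation; with ~91 bytes it overshoots to ~55 GB, so the observed 45 GB falls between the two models.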
Re: memory consumption
Evan,

The news about a per-key overhead of 91 bytes is quite frustrating. When we were choosing a key-value storage, per-key metadata size was a crucial point for us. We have a simple use case but a lot of data (hundreds of millions of items), so we were looking for ways to reduce memory consumption. Here (http://docs.basho.com/riak/1.3.1/cookbooks/faqs/basics-faq/#is-there-a-limit-on-how-much-data-can-be-stored-in) and here (http://docs.basho.com/riak/1.3.1/cookbooks/faqs/developing-faq/#does-the-bucket-name-impact-key-storage-size) a value of 40 bytes is stated. The 22 bytes in the RAM calculator (http://docs.basho.com/riak/1.3.1/references/appendices/Bitcask-Capacity-Planning/) seemed like a mistake, because the following example obviously uses a value of 40.

Anyway, thanks for your response.

On 4 August 2013 04:39, Evan Vigil-McClanahan wrote:
> Some responses inline.
>
> On Fri, Aug 2, 2013 at 3:11 AM, Alexander Ilyin wrote:
> > Hi,
> >
> > I have a few questions about Riak memory usage.
> > We're using Riak 1.3.1 on a 3-node cluster. According to the bitcask
> > capacity calculator
> > (http://docs.basho.com/riak/1.3.1/references/appendices/Bitcask-Capacity-Planning/)
> > Riak should use about 30Gb of RAM for our data. Actually, it uses about
> > 45Gb and I can't figure out why. I'm looking at the %MEM column in top on
> > each node for the beam.smp process.
>
> I've recently done some research on this and have filed bugs against
> the calculator; it's a bit wrong and has been that way for a while:
>
> https://github.com/basho/basho_docs/issues/467
>
> The numbers there look a bit closer to what you're seeing.
>
> The good news is that I am looking into reducing memory consumption
> this development cycle, and our next release should see some
> improvements on that front. The bad news is that it may be a while.
> If you want to watch the bitcask repo on GitHub to see when these
> changes go in, it's usually pretty easy to build a new bitcask and
> replace the one that you're running.
>
> > Disk usage is also about 1.5 times more than I expected (270Gb instead
> > of 180Gb). I rechecked that I have n_val=2 (not 3), it seems alright.
> > Why could this happen?
>
> There is definitely some overhead on the stored values, especially
> when you're using bitcask. How big are your values? Overheads, if I
> recall correctly, run to a few hundred bytes, but I'll have to ask
> some people to refresh my memory.
>
> > The second question is about performance degradation when Riak uses
> > almost all available memory on the node. We see that the 95/99 put
> > percentiles are twice as large for nodes which don't have much free RAM.
> > How much free memory should I have to keep performance high?
>
> I don't have a good answer for this; when I was working as a CSE we
> generally urged people to start adding nodes when their most limited
> resource (memory, disk, cpu, etc.) was 70-80% utilized (as a grossly
> oversimplified rule of thumb).
>
> > And the last question is about the memory_total metric. riak-admin status
> > returns a value which is less than the actual memory consumption as seen
> > in top. According to the memory_total description
> > (http://docs.basho.com/riak/1.3.1/references/appendices/Inspecting-a-Node/)
> > they should be equal. Why are they not?
>
> Top factors in OS/libc overheads that memory_total cannot see. I'll
> check out the docs and get them amended if they're wrong.
Re: memory consumption
So if you succeed with all your patches, the memory overhead will decrease by 22 (=16+4+2) bytes, am I right?

On 5 August 2013 16:38, Evan Vigil-McClanahan wrote:
> Before I'd done the research, I too thought that the overheads were
> much lower, near to what the calculator said, but not too far off.
>
> There are a few things that I plan on addressing this release cycle:
> - 16b per-allocation overhead from using enif_alloc. This allows us
> a lot of flexibility about which allocator to use, but I suspect that
> since allocation speed isn't a big bitcask bottleneck, this overhead
> simply isn't worth it.
> - 13b per-value overhead from naive serialization of the bucket/key
> value. I have a branch that reduced this by 11 bytes.
> - 4b per-value overhead from a single bit flag that is stored in an
> int. No patch for this thus far.
>
> Additionally, I've found that running with tcmalloc using LD_PRELOAD
> reduces the cost of bitcask's many allocations, but a) I've never
> done so in production and b) they say that it never releases memory,
> which is worrying, although the paging system theoretically should
> take care of it fairly easily as long as the page usage isn't insane.
>
> My original notes looked like this:
>
> 1) ~32 bytes for the OS/malloc + khash overhead @ 50M keys
> (amortized, so bigger for fewer keys, smaller for more keys).
> 2) + 16 bytes of erlang allocator overhead
> 3) + 22 bytes for the NIF C structure
> 4) + 8 bytes for the entry pointer stored in the khash
> 5) + 13 bytes of kv overhead
>
> tcmalloc does what it can for line 1.
> My patches do what I can for lines 2, 3, and 5.
>
> 4 isn't amenable to anything other than a change in the way the keydir
> is stored, which could also potentially help with 1 (fewer
> allocations, etc). That, unfortunately, is not very likely to happen
> soon.
>
> So things will get better relatively soon, but there are some
> architectural limits that will be harder to address.
>
> On Mon, Aug 5, 2013 at 1:49 AM, Alexander Ilyin wrote:
> > [...]
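The ~91-byte figure quoted earlier in this thread is just the sum of Evan's five line items:

```python
# Per-key keydir overheads from Evan's notes (bytes); the malloc/khash term
# is amortized at ~50M keys, so the true total varies with key count.
overheads = {
    "OS/malloc + khash (amortized @ 50M keys)": 32,
    "erlang allocator": 16,
    "NIF C structure": 22,
    "khash entry pointer": 8,
    "bucket/key serialization": 13,
}
total = sum(overheads.values())
print(total)  # 91 -- before counting the key and bucket bytes themselves
```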
Re: memory consumption
Ok, thank you! Looking forward to the next release.

On 6 August 2013 17:32, Evan Vigil-McClanahan wrote:
> 11 + 4 + 16, so 31.
>
> 18 bytes there are the actual data, so that can't go away. Since the
> allocation sizes are going to be word-aligned, the least overhead
> there is going to be word-aligning the entire structure, i.e. where
> (key_len + bucket_len + 2 + 18) % 8 == 0, but that sort of
> optimization only works with fixed-length keys.
>
> Khash overheads are expected to be small-ish, but are
> under-researched, at least by me. I suspect most of the overhead is
> coming from the allocator. So moving to tcmalloc is a possible win
> there, because it does a better job of keeping amortized per-allocation
> overheads low for small allocations than libc's malloc, but of course
> with the caveats mentioned in my last email (tl;dr: test
> *exhaustively*, because we don't and likely won't).
>
> Another possible improvement would be to move to a fixed-length
> structure (that points to an allocated location for oversized
> key-bucket binaries), but that has a very bad pathological case where
> someone selects all keys larger than your fixed size, where you have
> the fixed len - 8 as an additional overhead.
>
> On Tue, Aug 6, 2013 at 2:56 AM, Alexander Ilyin wrote:
> > [...]
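To make Evan's correction explicit: the planned patches target the 16-byte allocator item, 11 of the 13 serialization bytes, and the 4-byte flag, so the expected saving is 31 bytes per key, not 22:

```python
measured = 91              # per-key overhead from Evan's earlier notes
savings = 16 + 11 + 4      # enif_alloc, serialization branch, flag-in-an-int
print(savings)             # 31
print(measured - savings)  # 60 bytes of per-key overhead left after the patches
```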
key expiration and memory
Hi,

We're using Riak 1.3.1 with the Bitcask storage engine on a 4-node cluster. Properties of the bucket used:

{"props":{"allow_mult":false,"basic_quorum":false,"big_vclock":50,"chash_keyfun":{"mod":"riak_core_util","fun":"chash_std_keyfun"},"dw":"quorum","last_write_wins":false,"linkfun":{"mod":"riak_kv_wm_link_walker","fun":"mapreduce_linkfun"},"n_val":2,"name":"c","notfound_ok":true,"old_vclock":86400,"postcommit":[],"pr":0,"precommit":[],"pw":0,"r":"one","rw":"quorum","small_vclock":50,"w":"one","young_vclock":20}}

3 nodes crashed this morning almost simultaneously; the fourth node had crashed earlier in the night. It's hard to figure out the reason for the crash by looking into the logs, but I suppose it is due to insufficient RAM available on the servers (this mailing list doesn't like big attachments, but I can send logs if it would help).

Actually, I have been dealing with Riak memory usage for some time already. Some time ago I figured out that memory usage is much higher than is promised in the documentation (http://riak-users.197444.n3.nabble.com/memory-consumption-td4028674.html). After that I tried to reduce the number of keys stored by changing the expiry_secs setting from 2 months to 1 month. I expected memory usage to drop in half, because new keys have been put into Riak at approximately the same rate for the last two months, and Bitcask stores all keys in memory. But that didn't happen: memory usage dropped a little but grew again in a few days.

So the questions are:
* Is there a cheap way to figure out the number of keys stored in a bucket?
* How long does it take to purge old keys from the storage?
* Is it possible to see how many keys were expired during the last minute/hour/day?
* Are there other ways to reduce memory usage that I'm missing, except for adding new servers?

Thanks in advance.
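On the first question: the closest thing available is a streaming list-keys over HTTP (GET /riak/<bucket>?keys=stream), which returns a series of JSON documents, each carrying a "keys" array. A sketch of the counting side, assuming the response has already been split into those documents; note that list-keys still touches every key, so it is cheap only relative to the non-streaming version:

```python
import json

def count_streamed_keys(documents):
    """Sum key counts across the JSON documents of a streaming list-keys reply."""
    total = 0
    for doc in documents:
        total += len(json.loads(doc).get("keys", []))
    return total

# Example with a fake two-chunk response:
print(count_streamed_keys(['{"keys":["a","b"]}', '{"keys":["c"]}']))  # 3
```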
Re: key expiration and memory
Dmitri,

Yes, I have AAE turned on, and this discussion seems relevant to my case. As far as I understand, I have to wait for every tree to expire after changing the expiry_secs setting (which should take less than a week). But according to 'riak-admin aae-status', the entropy tree built time is less than 3 days for every partition, while the setting change was made earlier. And only one node has actually dropped its memory consumption by now. Am I missing something?

Maybe turning AAE off (at least for some time) will help me reduce memory consumption, since no trees containing expired keys would be built at all?

On 28 August 2013 19:12, Dmitri Zagidulin wrote:
> Alexander, quick question - do you have Active Anti-Entropy turned on? If
> yes, check out this discussion:
> http://riak-users.197444.n3.nabble.com/Active-Anti-Entropy-with-Bitcask-Key-Expiry-td4027688.html
> (might shed more light on the matter).
>
>> is there a cheap way to figure out the number of keys stored in a bucket?
> No. You can do a streaming list keys, which is less expensive than listing
> keys used to be, but I'm not sure that counts as cheap.
>
>> how long does it take to purge old keys from the storage?
> This depends on how often Bitcask merges are performed. The expired keys are
> actually removed from disk when merging/compaction occurs (though until
> then, they will return as not found to client requests).
>
>> is it possible to see how many keys were expired for the last
>> minute/hour/day?
> I don't think so - Bitcask doesn't send any notification to Riak when it
> expires the keys, and I don't think it keeps any records of its own.
>
> On Wed, Aug 28, 2013 at 4:56 AM, Alexander Ilyin wrote:
>> [...]
Re: key expiration and memory
Hi all,

We are still experiencing problems with memory consumption; we're even thinking of switching to another storage :(

Some time ago I turned off AAE, and virtual memory usage on the servers dropped by 25%. I was surprised, because AAE is turned on by default and the docs don't say that it is such an expensive feature in terms of memory. But physical memory usage was still close to what is available on the servers, so I decided to reduce expiry_secs. That didn't affect physical memory consumption, although one merge process has already finished.

Questions:
* Why didn't memory usage drop after reducing expiry_secs (with AAE turned off)?
* How can I ensure that the merge process has enough time to delete old keys? Maybe it deletes them more slowly than new keys are added?
* I estimate that the quantity of new keys put into Riak is equal to the old keys which should be expired; why does memory constantly grow in this situation?

In general, I want to know how to predict Riak memory consumption (at least approximately), because we aren't ready to buy new servers without knowing it.

On 29 August 2013 14:44, Alexander Ilyin wrote:
> [...]
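For the prediction question, a rough steady-state model may help: with a constant put rate and a fixed expiry window, the live key count plateaus at rate x window, and the keydir RAM floor follows from the per-key cost. A sketch with made-up inputs (the ~116 B/key figure is overhead plus key/bucket bytes from the earlier memory-consumption thread; the put rate is hypothetical):

```python
def keydir_floor_per_node(puts_per_sec, expiry_secs, bytes_per_key, n_val, nodes):
    """Lower bound on per-node keydir RAM once inserts and expiries balance.

    Expired entries leave RAM only after a Bitcask merge, so real usage
    sits above this floor between merges -- which may explain growth even
    when the in-rate equals the expiry rate.
    """
    live_keys = puts_per_sec * expiry_secs      # steady-state live key count
    return live_keys * bytes_per_key * n_val / nodes

# Hypothetical: 50 puts/s, 30-day expiry, ~116 B/key, n_val=2, 4 nodes.
gb = keydir_floor_per_node(50, 30 * 24 * 3600, 116, 2, 4) / 1e9
print(f"~{gb:.1f} GB per node at steady state")
```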
Re: key expiration and memory
Justin,

app.config is in the attachment.

On 4 September 2013 20:13, Justin Shoffstall wrote:
> Alexander,
>
>> * How can I ensure that the merge process has enough time to delete old
>> keys? Maybe it deletes them more slowly than new keys are added?
>
> Would you please attach the app.config files from one of your nodes? I'm
> particularly interested in the bitcask configuration section.
>
> Justin Shoffstall
> jshoffst...@basho.com
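For readers following along: the bitcask section Justin is asking about holds the knobs that control when merges run, and thus when expired keys actually leave disk and the keydir. The fragment below is an illustrative sketch using what I understand to be the 1.3-era defaults, not a recommendation; compare it against your own app.config:

```erlang
%% bitcask section of app.config (illustrative values only)
{bitcask, [
    {data_root, "/var/lib/riak/bitcask"},
    {expiry_secs, 2592000},                 %% 30-day key expiry
    {merge_window, always},                 %% or {0, 5}: merge only between 0:00 and 5:00
    {frag_merge_trigger, 60},               %% merge when a file is 60% fragmented...
    {dead_bytes_merge_trigger, 536870912},  %% ...or holds 512 MB of dead data
    {frag_threshold, 40},                   %% include files >40% fragmented in a merge
    {dead_bytes_threshold, 134217728},      %% ...or with >128 MB of dead data
    {small_file_threshold, 10485760}        %% ...or smaller than 10 MB
]}
```

If merge_window is restricted, or the triggers are never reached, expired keys can linger well past expiry_secs, which would be consistent with the behaviour described earlier in this thread.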