Damon L. Chesser wrote:
> Patrick Ouellette wrote:
> >Rephrase the question.  Ask what the intended use of the machine is,
> >what response times the users expect, how much real RAM is there, and
> >what applications/services will run off the machine.
This is a good response.  This would be necessary for me to answer the
question.  But understanding is still needed to answer well.

> >Then tell the interviewer how each parameter you've asked about would
> >influence your decision on how much swap was enough.

As an interviewer I would then propose several different cases and see
how the thought process went in each of them.  The person being
interviewed would still be in the hot seat to answer, but they would
have gotten good points for knowing that it was a discussion session and
not a strict question-and-answer session.  Personally I am looking for
someone with good general problem-solving skills and not necessarily
someone who can solve the particular problems I am posing.  No one knows
all of the answers.  Making use of available resources is important, and
don't forget that at that moment your interviewer is one of your
available resources.

> >How much is enough?  As much as the system needs to run and not kill
> >processes due to lack of memory (real + swap).
> >
> >I've run machines with 1Gig or more RAM with NO SWAP.  I've also run
> >machines with 4Gig of RAM and 16Gig of swap (BIG datasets).

Each in their own place.  But why?  What is the formula?  Unfortunately,
while there are some formulas, there is also a need for judgement.
There is an old adage: "Good judgement comes from experience.
Experience comes from bad judgement."

The old rule was twice as much swap as RAM.  Then, with Linux on small
disks, that seemed harsh and we started to use less swap.  How was using
less swap possible on Linux when it wasn't on Unix before?  Now the old
rule is back again.  But why?  I know why in my own case, because it
comes from a bad experience where I set up servers for large-memory
applications and problems ensued.  Keep reading...  I will get to why in
a moment.  :-)

> "At a bare minimum, you need an appropriately-sized root partition, and
> a swap partition equal to twice the amount of RAM"

For an enterprise-quality server where reliability matters, I fully
agree.  Yes, you want twice your RAM.  But you also want to set
vm.overcommit_memory=2 for reliability.

> That is RHEL's take on the issue.  Still looking for other sources.
> Interesting, I think.

Notice that RHEL includes "Enterprise" in their name and they are
catering to the enterprise market.  The problem isn't that the system
will slow down to an unusable state if it actually starts using that
much swap.  It would.  But that isn't the problem.  The problem is that
if there isn't enough swap, Linux will find itself unable to fulfil its
memory commitments and will invoke the OOM (out-of-memory) Killer to
kill processes until it can meet those commitments.  The OOM Killer
makes systems unreliable, and avoiding it is necessary for highly
reliable systems such as those used in the enterprise environment.

By default Linux will overcommit memory.  The malloc(3) library call
can't fail.  The fork(2) system call can't fail.  But this means that if
there isn't enough memory later, Linux may need to kill something.  That
works much like a 'kill -9': the process simply stops running.  The
process has no opportunity to log the event to a log file.  It just
disappears.

How does the Linux kernel choose which process dies?  It guesses, based
upon a set of heuristics.  Sometimes it guesses right.  Sometimes it
guesses wrong.  I have personally had it kill my X server while I was
using it.  That wasn't very pleasant.  After learning about this I now
always disable overcommit.
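(A quick aside: you can see what your own system is doing here by
reading the policy and the kernel's commit accounting straight out of
/proc.  A minimal sketch, assuming the standard /proc layout of the 2.6
kernels; the numbers you see will of course differ:

  # 0 = heuristic overcommit (the Linux default), 2 = strict accounting
  cat /proc/sys/vm/overcommit_memory

  # under strict accounting the commit limit is roughly
  # swap + overcommit_ratio percent of RAM; the ratio defaults to 50
  cat /proc/sys/vm/overcommit_ratio

  # CommitLimit is that ceiling; Committed_AS is what has already been
  # promised to processes
  grep -i commit /proc/meminfo

Comparing Committed_AS with CommitLimit shows how much memory has been
promised relative to what strict accounting would allow.)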
Disabling overcommit restores the traditional Unix process model.
Programs that ask for memory (through the malloc() interface or fork(),
the program-level interfaces that allocate memory) will see failures if
there isn't enough memory.  This is a Good Thing, as it allows
applications to handle the error, log it, and generally deal with the
problem.

Back to the original question: configure virtual memory to be twice RAM?
Or twice the maximum total size of all processes?  Really it is the
latter.  But RAM and swap together set the system's capacity, so we tune
them together to set the maximum capability of the system even if we are
not using all of it.

I learned about this because I didn't know it when I set up a pool of
compute servers to be used for some large, memory-consuming simulations.
They all had 16G of RAM and a 64-bit process space.  I knew that if they
started swapping they would run too slowly for us, and the plan was to
avoid swapping.  If they don't swap they don't need swap space, so I
didn't configure any on them.  I came from a Unix background and didn't
expect this different behavior when running on Linux.

We started to see processes randomly stop running.  Nothing ever
appeared in the log files for those processes.  Large-memory
applications never recorded being out of memory.  They just stopped
running.  A number of folks started cursing Linux for being unreliable,
and at that moment, for us, it was very unreliable!  It was from that
experience that I learned about the Linux memory overcommit and OOM
killer behavior.  With that knowledge I knew I needed to rebuild the
server pool with overcommit off and enough VM.

Summary: In my mind an Important Aspect of system configuration is
Reliability, and to get Reliability you need to turn off Linux kernel
memory overcommit in order to avoid the dreaded and often misguided OOM
Killer.  That is the important point, but it affects the amount of
virtual memory a system requires, and that in turn affects the need for
having swap space configured.  With overcommit turned off, the rule of
twice the amount of RAM is required again.  (In other words, you don't
need swap / VM equal to twice RAM if vm.overcommit_memory=0, the Linux
default.  But then you must deal with the OOM killer behavior.)

[ Sidebar: This is how you disable overcommit on a system.  It can be
done temporarily with the following command:

  sudo sysctl -w vm.overcommit_memory=2

HOWEVER!  If you do not have enough virtual memory / swap space at the
time you activate this setting, it may leave your system unable to
fork() and malloc() from that moment on.  If the system becomes unusable
due to fork failures you may need to reboot.  You must have enough
virtual memory configured to handle all of your expected simultaneously
running applications.  Be careful that you can reboot your server
remotely if you do not have physical access.  You have been warned!

Since the above 'sysctl' command is only a temporary configuration, a
reboot will reset the system to the default, which allows memory
overcommit and enables the out-of-memory killer, so a reboot will rescue
the system if it does get stuck.  If you decide that keeping memory
overcommit disabled is a good configuration, it may be set in
/etc/sysctl.conf so that it takes effect at boot time.  This is what I
recommend and I always configure my servers this way. ]

For more information on this topic, search the web for "Linux kernel out
of memory killer" and "memory overcommit" and you will find much
discussion about it.
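(For concreteness, and continuing the sidebar above, the boot-time form
of the setting is just a line or two in /etc/sysctl.conf.  A minimal
sketch, not a drop-in recipe; the ratio line only restates the kernel's
default of 50 and can be tuned or omitted:

  # /etc/sysctl.conf: account memory strictly, never overcommit
  vm.overcommit_memory = 2
  # percentage of RAM counted toward the commit limit, on top of swap
  vm.overcommit_ratio = 50

  # load the file now rather than waiting for a reboot
  sudo sysctl -p

The same caveat as in the sidebar applies: make sure the swap space is
in place before the setting is.)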
Start here:

  http://lwn.net/Articles/104179/  (a must-read article)
  http://linux-mm.org/OverCommitAccounting
  http://www.redhat.com/magazine/001nov04/features/vm/
  http://lists.debian.org/debian-user/2007/08/msg00022.html  [1]

Bob

[1] Full disclosure: This is a previous posting of mine on this topic.