> From: Christopher Schultz [mailto:[EMAIL PROTECTED]
> Subject: Re: Recovery from OutOfMemoryError?
(Sorry for not responding sooner. Went out to dinner and to see the Spider Pig movie :-)

> Actually, my past experience has been that it's the GC
> thread that OOMEs, not a worker thread.

Assuming we're talking about a current HotSpot-based JVM, the threads doing GCs cannot get OOMEs, since they are dedicated to doing just GC operations, and never do any object allocations themselves. On older JVMs (and some from other vendors), the thread that initially encounters an allocation failure also does the GC; if the GC fails to recover enough memory, it can generate an OOME for itself.

> It has always been my understanding that a JVM that suffers an OOME
> is all but done for.

The JVM itself doesn't care about any exceptions thrown at the application. There are certainly a ton of applications that handle such error conditions very badly, and hang themselves up by doing such things as trying to display messages rather than nulling out now useless references. Some of the stress-testing of our JVM involves running apps designed to provoke OOMEs; these readily recover and keep on truckin'.

> The OP would seem to corroborate this claim, since it sounds like his
> whole app server becomes unresponsive once he gets an OOME (hence the
> early morning phone calls).

The supposed timing of the phone calls leaves me somewhat skeptical; what are they running where the peak load occurs at 3 AM?

> If your assertion (OOMEs can be ignored, since only one allocation
> fails and the rest of the VM is fine) were true, then the OP would
> not be getting any calls in the middle of the night: the user would
> simply re-try the request and (hopefully) get a result the second time.

That's not what I said at all. Each logical module should be designed to handle such situations, typically by discarding what has been done up to the point of failure, and then returning an error to its caller.
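As a rough illustration of that pattern (discard the partial work, then return an error to the caller), here is a minimal sketch. Everything in it is hypothetical: the class OomeRecoverySketch, the Result type, and the handle() method are made up for this example and are not taken from Tomcat or any real application. It also assumes the module's partial work is reachable only through one local reference, which real code would have to verify for itself.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical module; none of these names come from Tomcat or the thread above.
final class OomeRecoverySketch {

    /** Hypothetical result type handed back to the caller. */
    static final class Result {
        final boolean ok;
        final String message;

        Result(boolean ok, String message) {
            this.ok = ok;
            this.message = message;
        }
    }

    // Built once at class-load time so the failure path allocates nothing.
    static final Result OUT_OF_MEMORY = new Result(false, "insufficient memory");

    /** Builds `chunks` buffers of `chunkLength` longs, or reports failure. */
    static Result handle(int chunks, int chunkLength) {
        List<long[]> partial = new ArrayList<>();
        try {
            for (int i = 0; i < chunks; i++) {
                partial.add(new long[chunkLength]); // may throw OutOfMemoryError
            }
            return new Result(true, "built " + partial.size() + " chunks");
        } catch (OutOfMemoryError e) {
            // Discard the partial work first so the buffers become collectable.
            // No logging or message formatting here: both would allocate.
            partial.clear();
            partial = null;
            return OUT_OF_MEMORY;
        }
    }

    public static void main(String[] args) {
        System.out.println(handle(4, 1024).message);
        // Stand-in for heap exhaustion: an array of this length is expected
        // to be refused by the JVM, which sends this call down the failure path.
        System.out.println(handle(1, Integer.MAX_VALUE).message);
        // The module is still usable after the failed request.
        System.out.println(handle(4, 1024).message);
    }
}
```

The preallocated OUT_OF_MEMORY result is a design choice for the sketch: when memory is already short, the recovery path should release references before it does anything that needs a new object.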
What is likely to have happened instead in the OP's case is that the app encountering the OOME had no provision at all for error recovery, and simply quit, leaving many now useless objects around with live references to them. It may have even made matters worse by trying to generate an error message of some sort.

 - Chuck

---------------------------------------------------------------------
To start a new topic, e-mail: users@tomcat.apache.org
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]