-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

André,

On 10/9/2009 5:29 PM, André Warnier wrote:
> Caldarale, Charles R wrote:
> Being ever eager to learn, I have been following this thread with interest.
> 
> Do I understand this correctly, if I draw the following conclusions :
> 
> - the Heap is a global structure, managed by the JVM which runs Tomcat

Yes. The heap is not segmented by ClassLoader or anything like that. The
only thing that separates web apps from each other is the ClassLoader
hierarchy.

> - webapps create (instantiate) objects by using classes, which are
> pieces of code which (among other things) create objects. Such objects
> are allocated on the Heap.

Hmm.... classes are data declarations + code which operates on that data.
Any code can create objects, and objects are instances of classes. "The
heap" is an entity whose definition can get very blurry when it gets
down to the dirty details of things. It's better to think of "the heap"
as a concept that has certain behaviors. Specifically, objects allocated
on the heap survive until they are explicitly removed (by the garbage
collector, so it's kind of implicit) rather than being created or
removed automatically when methods begin or end (which is what happens
when objects are allocated on the stack rather than the heap).

I'll head Chuck off at the pass and state that, in Java, all objects
always behave as if they are allocated in "the heap" and never as if
they were allocated in "the stack". Local primitive values (of which
object references, or pointers if you prefer, are one kind) declared in
methods always behave as if they were allocated on the stack.
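
A minimal sketch of that "behaves as if" distinction (the class and
method names are made up for illustration): the local int and the local
reference go away when the method returns, but the object the reference
pointed to is still there for the caller.

```java
public class HeapVsStackDemo {

    static StringBuilder makeBuffer() {
        int local = 42;                          // primitive local: gone when the method returns
        StringBuilder sb = new StringBuilder();  // the reference is local, the object is not
        sb.append(local);
        return sb;                               // hand the reference back to the caller
    }

    public static void main(String[] args) {
        StringBuilder survivor = makeBuffer();   // makeBuffer's frame no longer exists here
        survivor.append("!");                    // ...but the object it created still does
        System.out.println(survivor);            // prints 42!
    }
}
```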

The reality is that the JVM and, more specifically, the JIT, are free to
do all kinds of crazy things like allocate objects in the stack, heap,
or a combination of the two in order to accomplish their tasks in the
most efficient way possible. Objects routinely move around within the
heap (which may be surprising to some C programmers out there) for
various reasons. The JVM/JIT can basically do anything it wants as long
as the "behave as if" rules are followed.

As I continue, I'll be describing the "behaves as if" behavior rather
than the reality, which is always a bit cloudy and which you usually
don't have to worry about.

> - instances (copies) of classes are loaded into JVM memory (where?) on
> an "as-needed" base, for example the first time a webapp invokes some
> piece of code in the class

The class itself is loaded on demand by the ClassLoader. When a piece of
code references either the Class itself, a static member of the class,
or tries to create an instance of that class, then the ClassLoader loads
it (unless it was already loaded). When the class is loaded, the code is
physically loaded into memory and possibly compiled immediately
(depending upon JVM settings), and objects are created to describe the
class (i.e. you'll get a java.lang.Class object and also objects
representing Method, Field, Constructor, etc. if appropriate). The
Class, Method, etc. objects are loaded (as all objects are) into the
heap. The code itself is stored elsewhere and is inaccessible to regular
Java code.
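
Here's a rough way to watch the on-demand behavior from plain Java. The
names (LazyLoadDemo, Helper, greet) are hypothetical. Strictly speaking,
a static initializer fires at class *initialization*, which is a step
after loading, so treat this as a visible stand-in for "nothing happens
until first use" rather than a precise probe of the ClassLoader.

```java
import java.lang.reflect.Method;

public class LazyLoadDemo {
    // Flipped by Helper's static initializer, so we can see when it runs.
    static boolean helperInitialized = false;

    static class Helper {
        static { helperInitialized = true; }
        void greet() { }
    }

    static void useHelper() {
        new Helper(); // creating an instance is a "first real use" of the class
    }

    public static void main(String[] args) throws Exception {
        System.out.println("before first use: " + helperInitialized); // false
        useHelper();
        System.out.println("after first use:  " + helperInitialized); // true

        // The descriptive objects mentioned above are ordinary heap objects.
        Class<?> helperClass = Helper.class;
        Method greet = helperClass.getDeclaredMethod("greet");
        System.out.println(helperClass.getName() + " declares " + greet.getName());
    }
}
```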

PermGen is a place where Class objects often end up because they tend to
live so long. But, there's no requirement that they be loaded directly
into that heap segment, nor that they remain there.

> - a class instance can be loaded from
>     - either a location private/belonging to a particular webapp
> (WEB-INF/classes/*.class or WEB-INF/lib/*.jar)
>     - or a location common to all webapps, such as
> Tomcat_dir/shared/classes or Tomcat_dir/shared/lib/*.jar

Technically speaking, the class instances are not loaded from anywhere.
The classes are loaded and instances are created. A class is like a
recipe for a cake, and an instance is like an individual cake. The class
can be loaded from anywhere. It doesn't even need to come from a file
per se: there are compilers that compile directly to bytecode while the
JVM is running. ANTLR can be used to compile arbitrary source languages
(as long as you've got the grammar) directly into bytecode consumable by
a ClassLoader. I think there's also an Apache commons project that lets
you hand-assemble Java bytecode and stuff it into a running ClassLoader
if you really want to do that.

> - The JVM "remembers" where a class instance was loaded from, so that
> for example an instance of class A loaded from
> webapp-1/WEB-INF/lib/abc.jar is distinct from an instance of class A
> loaded from webapp-2/WEB-INF/lib/abc.jar, and both are different from an
> instance of class A loaded from Tomcat_dir/shared/lib/abc.jar

Forget the word 'instance' here because it's confusing. The definition
of a class as far as the JVM is concerned is ClassLoader +
fully-qualified class name. That means that if webappA's ClassLoader
loaded, say, foo.bar.Baz, and so did webappB's ClassLoader (even if they
came from the same JAR file), they are distinct. I'm not entirely sure
if the JVM remembers the source JAR/.class file where the class itself
came from, but the ClassLoader hierarchy keeps them separate.
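
To make "class = ClassLoader + fully-qualified name" concrete, here is a
small sketch. Everything in it is an assumption for illustration:
WebappLoader is a stand-in for a real webapp ClassLoader (it is not
Tomcat's), and the bytes for an empty foo.bar.Baz are assembled in
memory only so the example needs no JAR on disk (which also shows that a
class need not come from a file at all).

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public class LoaderIdentityDemo {

    /** Builds the bytes of an empty public class (no fields, no methods). */
    static byte[] emptyClassBytes(String internalName) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buffer);
            out.writeInt(0xCAFEBABE);                           // magic number
            out.writeShort(0);                                  // minor version
            out.writeShort(52);                                 // major version 52 (Java 8)
            out.writeShort(5);                                  // constant pool count (entries 1..4)
            out.writeByte(7); out.writeShort(2);                // #1 Class -> #2
            out.writeByte(1); out.writeUTF(internalName);       // #2 Utf8: this class
            out.writeByte(7); out.writeShort(4);                // #3 Class -> #4
            out.writeByte(1); out.writeUTF("java/lang/Object"); // #4 Utf8: superclass
            out.writeShort(0x0021);                             // ACC_PUBLIC | ACC_SUPER
            out.writeShort(1);                                  // this_class  = #1
            out.writeShort(3);                                  // super_class = #3
            out.writeShort(0);                                  // no interfaces
            out.writeShort(0);                                  // no fields
            out.writeShort(0);                                  // no methods
            out.writeShort(0);                                  // no attributes
            return buffer.toByteArray();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Hypothetical stand-in for a webapp ClassLoader: defines foo.bar.Baz itself. */
    static class WebappLoader extends ClassLoader {
        @Override
        protected Class<?> findClass(String name) throws ClassNotFoundException {
            if (!name.equals("foo.bar.Baz")) {
                throw new ClassNotFoundException(name);
            }
            byte[] bytes = emptyClassBytes("foo/bar/Baz");
            return defineClass(name, bytes, 0, bytes.length);
        }
    }

    static Class<?> loadBaz(ClassLoader loader) {
        try {
            return loader.loadClass("foo.bar.Baz");
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Class<?> fromA = loadBaz(new WebappLoader()); // "webappA"
        Class<?> fromB = loadBaz(new WebappLoader()); // "webappB"

        System.out.println(fromA.getName().equals(fromB.getName())); // true: same name
        System.out.println(fromA == fromB);                          // false: distinct classes
        System.out.println(fromA.isAssignableFrom(fromB));           // false: not interchangeable
    }
}
```

That last "false" is the root of the ClassCastException you get when an
object of the "same" class crosses from one loader's world into the
other's.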

> - it would be a bad idea anyway to have abc.jar located in a 
> webapp-x/WEB-INF/lib and simultaneously in Tomcat_dir/shared/lib. 
> (Why this is a bad idea is not very clear to me if the above holds
> true, but I trust previous communications here saying that it is a
> bad idea)

Practically speaking, yes. Theoretically, this shouldn't be a problem if
those classes are exclusively used within the webapp, because the
webapp's ClassLoader should load abc.jar and all its classes and they
will always be used within that webapp. The problem arises when the
webapp requests an object from, say, Tomcat's internals: the classes
owning the instances will be different (remember: class = ClassLoader +
FQCN) and you'll get a ClassCastException. There is essentially no way
to resolve the impedance mismatch between the "same" class loaded twice
in two different ClassLoaders unless you do a ton of reflection-based
coding, and everyone hates that crap.

> - an object always holds a reference to the class it was created from

Yes!

> - a class instance generally does not, but can, keep a reference to the
> objects created from it. Class instances which create a singleton object
> perforce keep a reference to it.

I'm unable to discern in this context what you mean by "instance".
Classes are not created from classes. java.lang.Class objects represent
classes, and they have references to the ClassLoader that loaded them.
Classes with singleton objects (by which I presume you mean that they
have a single, static member of their own type) only coincidentally have
a reference to the object instance created by that Class. The JVM does
not maintain a list of instances that came from a particular class (for
instance, you can't ask java.lang.String for a list of all String
objects allocated in the JVM). That is, the references only go one way:
from the object (instance) to the defining class.
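
In code, the one-way chain looks roughly like this (ReferenceChainDemo
is just a made-up name). You can walk from an object to its Class to its
ClassLoader, but java.lang.Class has no method for going the other way
and listing instances.

```java
public class ReferenceChainDemo {

    public static void main(String[] args) {
        Object obj = new ReferenceChainDemo();

        Class<?> cls = obj.getClass();              // object -> defining class
        ClassLoader loader = cls.getClassLoader();  // class  -> loader that defined it
        System.out.println(cls.getName() + " was loaded by " + loader);

        // Core classes come from the bootstrap loader, which is reported as null.
        System.out.println("String was loaded by " + String.class.getClassLoader());
    }
}
```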

> - a class instance can be unloaded from memory when
>     - the webapp which loaded it is itself unloaded, and all objects of
>       that class created by (or belonging to) that webapp are thus
>       destroyed
>     - AND the class instance does not contain any reference to any
>       other object(s) created by (an)other webapp(s)

The webapp itself is irrelevant. A java.lang.Class object (and therefore
any code, static members, etc.) can only be dumped from memory when the
ClassLoader that loaded it is not referenced by any live object and no
live objects are referring to it.

There is an important implication buried in that last statement: if you
store a single object in some semi-global scope, that can keep /all/
Class objects, etc., from being unloaded from memory even after the
webapp has been undeployed. This is because that one object contains a
reference to the Class that defines it, that Class contains a reference
to the ClassLoader that loaded it, and the ClassLoader contains a list
of all Classes it ever loaded. So, that one object can keep the whole
tree of objects loaded in the heap forever. So, remember to clean up
your memory messes: garbage collection isn't a silver bullet.
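
A hedged sketch of that leak, with everything named here being an
assumption for illustration: the static "registry" field plays the
semi-global scope, an anonymous ClassLoader plays the webapp's loader,
and java.lang.reflect.Proxy is used only as a convenient way to get an
object whose class was defined by that loader. While the registry holds
the object, the loader cannot be collected. Whether it actually goes
away after the cleanup depends on the garbage collector, so no result is
promised for the second line.

```java
import java.lang.ref.WeakReference;
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;

public class LoaderLeakDemo {
    // Stands in for a "semi-global" registry that outlives the webapp.
    static Object registry;

    /** "Deploys" a fake webapp, leaks one of its objects, and returns a weak handle on its loader. */
    static WeakReference<ClassLoader> deployAndLeak() {
        ClassLoader webappLoader = new ClassLoader() { };  // hypothetical webapp loader
        InvocationHandler noOp = (proxy, method, methodArgs) -> null;

        // The proxy's class is defined by webappLoader, so:
        // registry -> object -> Class -> webappLoader
        registry = Proxy.newProxyInstance(
                webappLoader, new Class<?>[] { Runnable.class }, noOp);

        return new WeakReference<>(webappLoader);
    }

    public static void main(String[] args) {
        WeakReference<ClassLoader> loaderRef = deployAndLeak();

        System.gc();
        // Still strongly reachable through the registry, so this is always true.
        System.out.println("loader pinned while object is held: " + (loaderRef.get() != null));

        registry = null;   // clean up the mess
        System.gc();       // only a request; collection is not guaranteed
        System.out.println("loader still present after cleanup: " + (loaderRef.get() != null));
    }
}
```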

> In other words, if a class instance was loaded from a jar in
> Tomcat_dir/shared/lib, then
>     - that class instance would be shared by all webapps referencing the
> class, and would only be allocated once in memory (?) (thus saving
> memory space)

Yes, if you remove the word "instance" from that whole section.

>     - but that class instance could not be unloaded (and maybe replaced
> by another better version) until all objects created by it, on behalf of
> any webapp, have been destroyed. In the practice, this could mean that
> it is only possible to unload and reload this class instance by stopping
> and restarting the entire JVM (and Tomcat).

This is true only because Tomcat offers no way to discard and re-load
its shared ClassLoader. The JVM would be perfectly happy to discard all
of Tomcat's shared libraries and re-load them assuming Tomcat discarded
its own ClassLoader and then re-created it.

> Thus, if one is confident that all webapps are compatible with the same
> version of some classes, and if these classes do not contain class-level
> variables or allocate singleton objects whose common usage by different
> webapps may lead to trouble, and if one never intends to unload/reload a
> single webapp at a time and always brings down and restarts the whole
> Tomcat at once, one might as well put the classes in Tomcat_dir/shared.

Yes. The problem is that those requirements are often difficult to
manage and maintain. For instance, if you have decided to share your
JDBC library among all webapps and then one webapp requires a newer
version but the other ones require an older version, you are screwed and
have to undo all that merging of libraries in order to separate-out the
versions. (This was a bad example, because Tomcat basically requires
that your JDBC library be in a shared ClassLoader because Tomcat itself
creates DataSources using its own ClassLoaders and not with the webapp's
ClassLoader, but the point is the same).

> And if in doubt about any of the above, put them in each webapp's
> WEB-INF and buy more RAM if necessary.

My recommendation is that if your webapp needs a library in order to
run, you ought to include that library in the WAR file (or exploded WAR
directory structure). Webapps are supposed to be self-contained, so you
can simply deploy a WAR file into an app server. If you first have to
figure out what libraries are missing and install them in a shared
location, you've lost a great amount of convenience.

That's just my perspective.

- -chris
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (MingW32)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAkrPtfUACgkQ9CaO5/Lv0PC2xwCfZSQl995XyX20CYZBPBL36QC2
3OcAn3hEYCygNBJhiC5uaCfDhh1RB2zX
=a/dG
-----END PGP SIGNATURE-----
