-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Kyle,

On 9/7/12 12:19 PM, kharp...@oreillyauto.com wrote:
> Chris:
>> Assembling the sessions into a Collection is likely to be very
>> fast, since it's just copying references around: the size of the
>> individual sessions should not matter. Of course, pushing all
>> those bytes to the other servers...
> 
>> Perhaps Tomcat does something like serialize the session to a
>> big binary structure and then sends that (which sounds insane --
>> streaming binary data is how that should be done -- but I haven't
>> checked the code to be sure).
> 
> It appears that tomcat is serializing all the data into a singular 
> structure, rather than a collection of references.

:(

> Watching VisualVM plot heap usage during replication, it nearly
> doubles (in my test env, this was the only thing running so that
> makes sense).

That certainly sounds like Tomcat is chewing through a lot of heap
space. Without understanding the implementation, I can't comment on
whether or not that is really necessary, but I have to imagine that
streaming serialization (or at least buffered serialization, where one
session is serialized at a time and then streamed) would be superior.
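
To make the difference concrete, here is a rough Java sketch of the
two approaches. This is not Tomcat's code (which I haven't read): the
class and method names are made up, and any Serializable object stands
in for a session. A real wire format would also need framing (a length
prefix per session, say) so the receiver can split the stream again.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.OutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not how Tomcat is known to do it.
class SessionStreamingSketch {

    // "One big structure": every session is serialized into a single
    // byte[], so the heap briefly holds the live sessions plus a full
    // serialized copy of all of them.
    static byte[] serializeAll(List<? extends Serializable> sessions)
            throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(new ArrayList<Serializable>(sessions));
        }
        return buffer.toByteArray();
    }

    // "Buffered streaming": serialize one session, push it to the sink,
    // then let the buffer go. Returns the size of the largest buffer
    // that was held at any one time.
    static int streamOneAtATime(List<? extends Serializable> sessions,
                                OutputStream sink) throws IOException {
        int largest = 0;
        for (Serializable session : sessions) {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
                out.writeObject(session);
            }
            largest = Math.max(largest, buffer.size());
            buffer.writeTo(sink);
        }
        return largest;
    }
}
```

With the second approach the extra heap needed is bounded by the
largest single session rather than by the sum of all of them, which
is why I'd expect it to avoid the doubling you are seeing.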

> If you're sure Tomcat is only making references, then I'd propose
> there is a problem with the JVM dereferencing the collection
> elements and double-counting the memory used.

It's very unlikely that the JVM is making that kind of mistake. Also,
I haven't looked at a single line of Tomcat's session-distribution
code so I'm not in a position to make accurate assertions about its
implementation.

> Either way, it's enough to make the JVM report a doubling of heap
> usage and a raise to the heap allocation.  As soon as replication
> is done, heap use goes back to normal.  I've attached a screenshot
> to the zip file.

Sounds like your analysis is reasonable. I'll look at your data and
make further comments.

In the meantime, I've heard over the years one particular thing from
people I feel know what they are talking about: don't use HttpSession.

It's not that the servlet spec's session-management isn't useful, it's
that it is very hard to make it scale properly when using most
container-managed clustering implementations. First, it almost always
uses Java Serialization which is not terribly efficient. Second, it is
very coarse-grained: you replicate the entire session or the whole
thing fails (unless you use some vendor-specific tricks like Rainer's
suggestion of suppressing certain attributes). If you have a lot of
data in the session that doesn't *need* to be replicated (like caches,
etc.) then you might be wasting a lot of time.
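
As a sketch of what finer-grained replication could look like if you
did it by hand, the snippet below copies only the attributes worth
sending. The class name, method name and the idea of a "skip" set are
my own assumptions for illustration; this is not a Tomcat API.

```java
import java.io.Serializable;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical helper; the names are invented for this example.
class ReplicationFilterSketch {

    // Returns only the attributes that should go over the wire:
    // anything named in 'skip' (caches and the like) is left out, and
    // so is anything that is not Serializable to begin with.
    static Map<String, Serializable> replicableView(
            Map<String, Object> attributes, Set<String> skip) {
        Map<String, Serializable> result = new LinkedHashMap<>();
        for (Map.Entry<String, Object> entry : attributes.entrySet()) {
            if (skip.contains(entry.getKey())) {
                continue; // explicitly excluded from replication
            }
            Object value = entry.getValue();
            if (value instanceof Serializable) {
                result.put(entry.getKey(), (Serializable) value);
            }
        }
        return result;
    }
}
```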

So, what to use instead? Memcached comes up often as a similar
solution to the problem of clustered temporary data storage. Why
memcached instead of HttpSession clustering? I think about half of the
answer is that when you have to manually push your own data to a
remote cache (instead of allowing the container to handle the
details), you tend to get very frugal with the amount of data you end
up pushing.

We don't use distributed sessions in our product even though (almost)
everything in the session is serializable. Even so, I've thought of
writing a wrapper for certain objects (like caches) that we store in
the session that never need to be replicated (at least not in their
complete forms). I haven't done it because, well, we don't need it.
Sticky sessions with failover is good enough for us, and we don't have
to saturate our network with session-replication chatter.
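
For what it's worth, the wrapper I have in mind would be something
like the sketch below. I have not written or tested this against a
real cluster, and the names (NonReplicated, NonReplicatedDemo) are
invented here. It relies only on standard Java Serialization skipping
transient fields.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical wrapper: the wrapper itself is Serializable, but the
// wrapped value is transient, so it is never written to the stream.
class NonReplicated<T> implements Serializable {
    private static final long serialVersionUID = 1L;

    private transient T value;

    NonReplicated(T value) {
        this.value = value;
    }

    // Returns null on a node that received this wrapper via
    // serialization; callers must be ready to rebuild the value.
    T get() {
        return value;
    }

    void set(T value) {
        this.value = value;
    }
}

// Small helpers to show the effect of a serialization round trip.
class NonReplicatedDemo {
    static byte[] serialize(Serializable object) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(object);
        }
        return buffer.toByteArray();
    }

    static Object deserialize(byte[] bytes)
            throws IOException, ClassNotFoundException {
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        }
    }
}
```

The cache stays usable on the node that built it, and the replicated
copy carries only an empty shell that the receiving node would have
to repopulate on demand.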

- -chris
-----BEGIN PGP SIGNATURE-----
Version: GnuPG/MacGPG2 v2.0.17 (Darwin)
Comment: GPGTools - http://gpgtools.org
Comment: Using GnuPG with Mozilla - http://www.enigmail.net/

iEYEARECAAYFAlBKN9oACgkQ9CaO5/Lv0PB3LACgsrVWsuWWkb0ckfIPeiNUMoq4
8fcAoIb0FQU/2EsET1AmIHGkX20si4lG
=xKjd
-----END PGP SIGNATURE-----

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
For additional commands, e-mail: users-h...@tomcat.apache.org
