Recently I've been using an adapted version of CrawlerSessionManagerValve
from Tomcat 7 on Tomcat 6 (JBossWeb 2.1.4, to be precise). It works well.
However, I observed many HTTP sessions being created even though
CrawlerSessionManagerValve logged successful bot detection. I eventually
found that the problem lies in the way CrawlerSessionManagerValve removes
expired session infos. See
http://grepcode.com/file/repo1.maven.org/maven2/org.apache.tomcat/tomcat-catalina/7.0.14/org/apache/catalina/valves/CrawlerSessionManagerValve.java#CrawlerSessionManagerValve.backgroundProcess%28%29.
The problem occurs when a bot pauses between requests for at least
"sessionInactiveInterval" seconds but less than
"sessionInactiveInterval" + 60 seconds. In this situation the real HTTP
session has already been destroyed, but the corresponding SessionInfo
still exists. CrawlerSessionManagerValve detects the bot and finds the
SessionInfo. A new HTTP session is created, but
CrawlerSessionManagerValve only updates the sessionInfo lastAccessed
property (because sessionInfo != null) instead of recording the new
session.
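
To make the timing window concrete, here is a minimal sketch in plain Java.
It is a simplified model, not the Tomcat source: the class and method names
are made up for illustration, and the exact boundaries are approximate
because both the container's session expiry and the valve's cleanup run on
a periodic background thread.

```java
// Simplified model of the timing window; not the Tomcat source.
class StaleWindowSketch {
    // The extra 60 seconds the valve adds before dropping a SessionInfo.
    static final int GRACE_SECONDS = 60;

    // Assumption: the container destroys the HTTP session once it has been
    // idle for sessionInactiveInterval seconds.
    static boolean httpSessionDestroyed(long idleSeconds, int sessionInactiveInterval) {
        return idleSeconds >= sessionInactiveInterval;
    }

    // Assumption: the cached SessionInfo survives until the idle time
    // reaches (sessionInactiveInterval + 60) * 1000 milliseconds.
    static boolean sessionInfoCached(long idleSeconds, int sessionInactiveInterval) {
        long expireAfterMillis = (sessionInactiveInterval + GRACE_SECONDS) * 1000L;
        return idleSeconds * 1000L < expireAfterMillis;
    }

    // True when a returning bot would find a SessionInfo whose HTTP session
    // is already gone.
    static boolean hitsStaleInfo(long idleSeconds, int sessionInactiveInterval) {
        return httpSessionDestroyed(idleSeconds, sessionInactiveInterval)
                && sessionInfoCached(idleSeconds, sessionInactiveInterval);
    }

    public static void main(String[] args) {
        int interval = 60; // seconds, chosen only for the demonstration
        for (long idle : new long[] {30, 60, 90, 119, 120, 300}) {
            System.out.println("idle " + idle + "s -> stale SessionInfo: "
                    + hitsStaleInfo(idle, interval));
        }
    }
}
```

Only pauses that fall inside the window hit a stale entry; shorter pauses
reuse the live session and longer ones find no SessionInfo at all.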

The problematic code is in backgroundProcess(), where expireTime is
computed from (sessionInactiveInterval + 60) * 1000. The extra 60
seconds may have a purpose, but they can cause the creation of hundreds
of useless HTTP sessions. In my opinion, backgroundProcess() is not the
right place to remove expired session infos. I think a more suitable
solution is to make CrawlerSessionManagerValve implement
org.apache.catalina.SessionListener and remove the session info on
SESSION_DESTROYED_EVENT. I modified the code accordingly, and
CrawlerSessionManagerValve now seems to work even better.
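
For illustration, the listener idea could look roughly like the sketch
below. It is not my actual patch and it does not compile against Tomcat:
StubSessionListener and StubSessionEvent are local stand-ins that only
imitate the shape of org.apache.catalina.SessionListener and SessionEvent,
and the map, the method names and the event string are assumptions made up
for the example.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Local stand-in imitating org.apache.catalina.SessionListener.
interface StubSessionListener {
    void sessionEvent(StubSessionEvent event);
}

// Local stand-in imitating a session event; the real type carries more.
class StubSessionEvent {
    // Named after the Tomcat constant; the string value here is illustrative.
    static final String SESSION_DESTROYED_EVENT = "destroySession";

    final String type;
    final String sessionId;

    StubSessionEvent(String type, String sessionId) {
        this.type = type;
        this.sessionId = sessionId;
    }
}

// Hypothetical valve-like class: entries are dropped when the session is
// destroyed, not by a timer with an extra grace period.
class ListenerCleanupSketch implements StubSessionListener {
    // Crawler key (for example the client IP) -> id of the shared session.
    private final Map<String, String> crawlerSessionIds = new ConcurrentHashMap<>();

    void remember(String crawlerKey, String sessionId) {
        crawlerSessionIds.put(crawlerKey, sessionId);
    }

    String lookup(String crawlerKey) {
        return crawlerSessionIds.get(crawlerKey);
    }

    @Override
    public void sessionEvent(StubSessionEvent event) {
        // Forget every crawler entry that pointed at the destroyed session.
        if (StubSessionEvent.SESSION_DESTROYED_EVENT.equals(event.type)) {
            crawlerSessionIds.values().removeIf(id -> id.equals(event.sessionId));
        }
    }

    public static void main(String[] args) {
        ListenerCleanupSketch valve = new ListenerCleanupSketch();
        valve.remember("192.0.2.10", "S1");
        valve.sessionEvent(new StubSessionEvent(
                StubSessionEvent.SESSION_DESTROYED_EVENT, "S1"));
        System.out.println("entry after destroy: " + valve.lookup("192.0.2.10"));
    }
}
```

With this approach the cached entry and the real session disappear together,
so a returning bot never finds a SessionInfo for a session that no longer
exists. In a real valve the listener would have to be registered on each
crawler session when it is created.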
Martin Kouba