Recently I've been using code adapted from Tomcat 7's CrawlerSessionManagerValve on Tomcat 6 (JBossWeb 2.1.4, to be precise). It works well. However, I observed that a lot of HTTP sessions were created even though CrawlerSessionManagerValve logged successful bot detection. I finally found that the problem lies in the way CrawlerSessionManagerValve removes expired session infos. See http://grepcode.com/file/repo1.maven.org/maven2/org.apache.tomcat/tomcat-catalina/7.0.14/org/apache/catalina/valves/CrawlerSessionManagerValve.java#CrawlerSessionManagerValve.backgroundProcess%28%29.

The problem occurs when a bot pauses between requests for at least "sessionInactiveInterval" but less than "sessionInactiveInterval" + 60 seconds. In this situation the real HTTP session has already been destroyed, but the corresponding SessionInfo still exists. CrawlerSessionManagerValve detects the bot and finds the SessionInfo. A new HTTP session is created, but CrawlerSessionManagerValve only updates the SessionInfo's lastAccessed property (because sessionInfo != null).
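
To make the timing window concrete, here is a minimal sketch in plain Java. It is not the valve's code: the class and method names are made up for illustration, and it assumes the HTTP session expires exactly at sessionInactiveInterval and the SessionInfo is dropped exactly at sessionInactiveInterval + 60 seconds (in a real container both depend on when the background thread runs).

```java
// Simplified model of the window described above; names are hypothetical.
class CrawlerWindowSketch {

    /** Age in milliseconds after which the tracking entry is considered expired. */
    static long infoExpiryMillis(int sessionInactiveIntervalSeconds) {
        return (sessionInactiveIntervalSeconds + 60) * 1000L;
    }

    /**
     * True when the pause is long enough for the HTTP session to be gone
     * but still too short for the tracking entry to be removed.
     */
    static boolean inStaleWindow(long pauseMillis, int sessionInactiveIntervalSeconds) {
        long sessionTimeoutMillis = sessionInactiveIntervalSeconds * 1000L;
        return pauseMillis >= sessionTimeoutMillis
                && pauseMillis < infoExpiryMillis(sessionInactiveIntervalSeconds);
    }

    public static void main(String[] args) {
        int interval = 60; // seconds, example value only
        for (long pauseSeconds : new long[] {30, 90, 150}) {
            System.out.println(pauseSeconds + "s pause, stale window: "
                    + inStaleWindow(pauseSeconds * 1000L, interval));
        }
    }
}
```

With a 60 second interval, only the 90 second pause falls into the window where a new HTTP session would be created while the old SessionInfo is still found.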

The problematic code is in the backgroundProcess() method, where expireTime is computed from (sessionInactiveInterval + 60) * 1000. Those extra 60 seconds may have a purpose, but they can cause the creation of hundreds of useless HTTP sessions. In my opinion backgroundProcess() is not the right place to remove expired session infos. I think a more suitable solution is to make CrawlerSessionManagerValve implement org.apache.catalina.SessionListener and remove the expired session info on SESSION_DESTROYED_EVENT. I modified the code this way, and CrawlerSessionManagerValve now seems to work even better.
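
The idea of listener-based cleanup can be sketched as follows. This is not my modified valve and it does not use the Catalina classes: DestroyListener and FakeSession are hypothetical stand-ins for org.apache.catalina.SessionListener and the container session, so the example runs on its own. In the real valve the callback would check for SESSION_DESTROYED_EVENT instead.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Self-contained sketch; all types here are illustrative stand-ins.
class ListenerCleanupSketch {

    /** Stand-in for a session listener that is told when a session is destroyed. */
    interface DestroyListener {
        void sessionDestroyed(String sessionId);
    }

    /** Stand-in for a container session that notifies its listeners on expiry. */
    static final class FakeSession {
        final String id;
        private final List<DestroyListener> listeners = new ArrayList<>();

        FakeSession(String id) { this.id = id; }

        void addListener(DestroyListener listener) { listeners.add(listener); }

        void expire() {
            for (DestroyListener listener : listeners) {
                listener.sessionDestroyed(id);
            }
        }
    }

    /** Tracks one session per crawler IP and forgets it as soon as that session dies. */
    static final class CrawlerTracker implements DestroyListener {
        private final Map<String, String> ipToSessionId = new ConcurrentHashMap<>();
        private final Map<String, String> sessionIdToIp = new ConcurrentHashMap<>();

        void register(String clientIp, FakeSession session) {
            ipToSessionId.put(clientIp, session.id);
            sessionIdToIp.put(session.id, clientIp);
            session.addListener(this);
        }

        String sessionIdFor(String clientIp) {
            return ipToSessionId.get(clientIp);
        }

        @Override
        public void sessionDestroyed(String sessionId) {
            String clientIp = sessionIdToIp.remove(sessionId);
            if (clientIp != null) {
                // Remove only if the IP is still mapped to the destroyed session.
                ipToSessionId.remove(clientIp, sessionId);
            }
        }
    }

    public static void main(String[] args) {
        CrawlerTracker tracker = new CrawlerTracker();
        FakeSession session = new FakeSession("S1");
        tracker.register("10.0.0.1", session);
        System.out.println("before expiry: " + tracker.sessionIdFor("10.0.0.1"));
        session.expire();
        System.out.println("after expiry: " + tracker.sessionIdFor("10.0.0.1"));
    }
}
```

Because the entry disappears at the moment the session is destroyed, there is no interval in which a stale entry can be found for a bot whose session no longer exists.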

Martin Kouba
