Via this post I found Fiddler, which is a wonderfully easy way to see what IE is requesting. No need to set up a separate proxy; you can see the requests as they happen. Simple install, which puts a Fiddler button on the IE toolbar. Very useful for debugging. I haven’t tried it with IE7 and there was no mention of it on the Fiddler pages.
Destroying robot generated Tomcat sessions
A large effort goes into creating sites that are crawlable by robots, such as those run by Google, Yahoo! and other search engines. However, these programs can create a large number of sessions if the site is based on servlet technology. Per the servlet spec (the 2.3 specification, page 50), if a client never joins a session, a new session will be created for each request that asks for one.
A session is considered new when it is only a prospective session and has not been established. Because HTTP is a request-response based protocol, an HTTP session is considered to be new until a client joins it. A client joins a session when session tracking information has been returned to the server indicating that a session has been established. Until the client joins a session, it cannot be assumed that the next request from the client will be recognized as part of a session. The session is considered to be new if either of the following is true:
- The client does not yet know about the session
- The client chooses not to join a session.
These conditions define the situation where the servlet container has no mechanism by which to associate a request with a previous request.
Since all these extra sessions take up memory and are long-lived, a client asked me to look into a way to invalidate them. (I’m not the first person to run into this problem.) The easiest way to do that was to build a filter that examined the User-Agent HTTP header; here’s a nice list of User-Agent values. If the client was any of the robots, we could safely invalidate the session. For some reason, with Tomcat 4.1, I needed to call session.isNew(); before calling session.invalidate(); otherwise the session wasn’t destroyed. The filter was placed at the end of the request chain, as outlined in this article, by calling chain.doFilter(request, response); before the invalidation filter looked at the request or response.
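The User-Agent check at the heart of such a filter can be sketched roughly as follows. This is not the filter I shipped, just a minimal illustration: the class name RobotUserAgents, the isRobot method and the three robot tokens are all made up for this example, and a real list would come from a maintained set of User-Agent values. The trailing comment shows how the check might be wired into doFilter, assuming the standard javax.servlet API.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

/**
 * Decides whether a User-Agent header value looks like a search engine robot.
 * The class name and token list are hypothetical, for illustration only.
 */
final class RobotUserAgents {

    // Illustrative lower-cased substrings; extend from a maintained User-Agent list.
    private static final List<String> ROBOT_TOKENS =
            Arrays.asList("googlebot", "slurp", "msnbot");

    private RobotUserAgents() {
        // static helper, not meant to be instantiated
    }

    /** Returns true if the header contains any robot token, ignoring case. */
    static boolean isRobot(String userAgent) {
        if (userAgent == null) {
            return false; // no header sent: treat as an ordinary client
        }
        String lower = userAgent.toLowerCase(Locale.ROOT);
        for (String token : ROBOT_TOKENS) {
            if (lower.contains(token)) {
                return true;
            }
        }
        return false;
    }

    // Inside a javax.servlet.Filter, the check would run after the rest of the chain:
    //
    //   chain.doFilter(request, response);
    //   HttpServletRequest httpRequest = (HttpServletRequest) request;
    //   HttpSession session = httpRequest.getSession(false); // do not create a session here
    //   if (session != null
    //           && RobotUserAgents.isRobot(httpRequest.getHeader("User-Agent"))) {
    //       session.isNew();      // the call that turned out to be needed on Tomcat 4.1
    //       session.invalidate();
    //   }
}
```

Using getSession(false) in the filter means the filter itself never creates a session; it only throws away one that the rest of the chain created for a robot.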
I haven’t seen any performance problems with creating a session and then throwing it away, probably because Java is so good at garbage collecting short-lived objects. If I did, conditionally disabling session participation in a JSP might be an option to pursue.