Components and Concepts
SimplePageCachingFilter
The SimplePageCachingFilter is a simple caching filter suitable for caching compressible HTTP responses such as HTML, XML or JSON. It uses a singleton CacheManager created with the default factory method. Override to use a different CacheManager.
It is suitable for:
complete responses (i.e. not fragments).
A content type suitable for gzipping. For example, text or text/html.
Keys
Pages are cached based on their key. The key for this cache is the URI followed by the query string. An example is /admin/SomePage.jsp?id=1234&name=Beagle. This key technique is suitable for a wide range of uses. It is independent of host name and port number, so will work well in situations where there are multiple domains which get the same content, or where users access based on different port numbers. A problem can occur with tracking software, where unique ids are inserted into request query strings. Because each request generates a unique key, there will never be a cache hit. For these situations it is better to parse the request parameters and override calculateKey(javax.servlet.http.HttpServletRequest) with an implementation that takes account of only the significant ones.
Configuring the Cache Name
A cache entry in ehcache.xml should be configured with the name of the filter. Names can be set using the init-param cacheName, or by sub-classing this class and overriding the name.
Concurrent Cache Misses
A cache miss will cause the filter chain, upstream of the caching filter to be processed. To avoid threads requesting the same key to do useless duplicate work, these threads block behind the first thread. The thread timeout can be set to fail after a certain wait by setting the init-param blockingTimeoutMillis. By default threads wait indefinitely. In the event upstream processing never returns, eventually the web server may get overwhelmed with connections it has not responded to. By setting a timeout, the waiting threads will only block for the set time, and then throw a net.sf.ehcache.constructs.blocking.LockTimeoutException. Under either scenario an upstream failure will still cause a failure.
Gzipping
Significant network efficiencies, and page loading speedups, can be gained by gzipping responses. Whether a response can be gzipped depends on:
Whether the user agent can accept GZIP encoding. This feature is part of HTTP1.1.
If a browser accepts GZIP encoding it will advertise this by including in its HTTP header: All common browsers except IE 5.2 on Macintosh are capable of accepting gzip encoding. Most search engine robots do not accept gzip encoding.
Whether the user agent has advertised its acceptance of gzip encoding. This is on a per request basis. If they will accept a gzip response to their request they must include the following in the HTTP request header:
Accept-Encoding: gzip
Responses are automatically gzipped and stored that way in the cache. For requests which do not accept gzip encoding the page is retrieved from the cache, ungzipped and returned to the user agent. The ungzipping is high performance.
Caching Headers
The SimpleCachingHeadersPageCachingFilter extends SimplePageCachingFilter to provide the HTTP cache headers: ETag, Last-Modified and Expires. It supports conditional GET. Because browsers and other HTTP clients have the expiry information returned in the response headers, they do not even need to request the page again. Even once the local browser copy has expired, the browser will do a conditional GET. So why would you ever want to use SimplePageCachingFilter, which does not set these headers?
The answer is that in some caching scenarios you may wish to remove a page before its natural expiry. Consider a scenario where a web page shows dynamic data. Under Ehcache the Element can be removed at any time. However if a browser is holding expiry information, those browsers will have to wait until the expiry time before getting updated. The caching in this scenario is more about defraying server load rather than minimizing browser calls.
Init-Params
The following init-params are supported:
cacheName - the name in ehcache.xml used by the filter.
blockingTimeoutMillis - the time, in milliseconds, to wait for the filter chain to return with a response on a cache miss. This is useful to fail fast in the event of an infrastructure failure.
varyHeader - set to true to set Vary:Accept-Encoding in the response when doing Gzip. This header is needed to support HTTP proxies however it is off by default.
<init-param>
<param-name>varyHeader</param-name>
<param-value>true</param-value>
</init-param>
Re-entrance
Care should be taken not to define a filter chain such that the same CachingFilter class is reentered. The CachingFilter uses the BlockingCache. It blocks until the thread which did a get which results in a null does a put. If reentry happens a second get happens before the first put. The second get could wait indefinitely. This situation is monitored and if it happens, an IllegalStateException will be thrown.
SimplePageFragmentCachingFilter
The SimplePageFragmentCachingFilter does everything that SimplePageCachingFilter does, except it never gzips, so the fragments can be combined. There is a variant of this filter which sets browser caching headers, because that is only applicable to the entire page.