Hibernate Second-Level Cache

IMPORTANT NOTICES - PLEASE READ

Users of Ehcache and/or Terracotta Ehcache for Hibernate prior to Ehcache 2.0 should read:

Upgrade Notes for Ehcache versions prior to 2.0. These instructions are for Hibernate 3.

For older instructions on how to use Hibernate 2.1, please refer to: Guide for Version 1.1

Overview

Ehcache easily integrates with the Hibernate Object/Relational persistence and query service. Gavin King, the maintainer of Hibernate, is also a committer to the Ehcache project. This ensures Ehcache will remain a first class cache for Hibernate. Configuring Ehcache for Hibernate is simple. The basic steps are:

  • Download and install Ehcache into your project
  • Configure Ehcache as a cache provider in your project's Hibernate configuration.
  • Configure second-level caching in your project's Hibernate configuration.
  • Configure Hibernate caching for each entity, collection, or query you wish to cache.
  • Configure ehcache.xml as necessary for each entity, collection, or query configured for caching.

For more information regarding cache configuration in Hibernate see the Hibernate documentation.

Downloading and Installing Ehcache

The Hibernate provider is in the ehcache-core module. Download the latest version of the Ehcache core module. For Terracotta clustering, download a full Ehcache distribution.

Maven

Dependency versions vary with the specific kit you intend to use. Since kits are guaranteed to contain compatible artifacts, find the artifact versions you need by downloading a kit. Configure or add the following repository to your build (pom.xml):

<repository>
   <id>terracotta-releases</id>
   <url>http://www.terracotta.org/download/reflector/releases</url>
   <releases><enabled>true</enabled></releases>
   <snapshots><enabled>false</enabled></snapshots>
</repository>

Configure or add the the ehcache core module defined by the following dependency to your build (pom.xml):

<dependency>
   <groupId>net.sf.ehcache</groupId>
   <artifactId>ehcache-core</artifactId>
   <version>${ehcacheVersion}</version>
</dependency>

If you are configuring Hibernate and Ehcache for Terracotta clustering, add the following dependencies to your build (pom.xml):

<dependency>
   <groupId>net.sf.ehcache</groupId>
   <artifactId>ehcache-terracotta</artifactId>
   <version>${ehcacheVersion}</version>
</dependency>
<dependency>
   <groupId>org.terracotta</groupId>
   <artifactId>terracotta-toolkit-${toolkitAPIversion}-runtime</artifactId>
   <version>${toolkitVersion}</version>
</dependency>

Configure Ehcache as the Second-Level Cache Provider

To configure Ehcache as a Hibernate second-level cache, set the region factory property (for Hibernate 3.3 and above) or the factory class property (Hibernate 3.2 and below) to one of the following in the Hibernate configuration. Hibernate configuration is configured either via hibernate.cfg.xml, hibernate.properties or Spring. The format given is for hibernate.cfg.xml.

Hibernate 3.3 and higher

ATTENTION HIBERNATE 3.2 USERS: Make sure to note the change to BOTH the property name and value.

Use:

<property name="hibernate.cache.region.factory_class">
         net.sf.ehcache.hibernate.EhCacheRegionFactory</property>

for instance creation, or

<property name="hibernate.cache.region.factory_class">
         net.sf.ehcache.hibernate.SingletonEhCacheRegionFactory</property>

to force Hibernate to use a singleton of Ehcache CacheManager.

Hibernate 3.0 - 3.2

Use:

<property name="hibernate.cache.provider_class">
         net.sf.ehcache.hibernate.EhCacheProvider</property>

for instance creation, or

<property name="hibernate.cache.provider_class">
         net.sf.ehcache.hibernate.SingletonEhCacheProvider</property>

to force Hibernate to use a singleton Ehcache CacheManager.

Enable Second-Level Cache and Query Cache Settings

In addition to configuring the second-level cache provider setting, you will need to turn on the second-level cache (by default it is configured to off - 'false' - by Hibernate). This is done by setting the following property in your hibernate config:

<property name="hibernate.cache.use_second_level_cache">true</property>

You may also want to turn on the Hibernate query cache. This is done by setting the following property in your hibernate config:

<property name="hibernate.cache.use_query_cache">true</property>

Optional

The following settings or actions are optional.

Ehcache Configuration Resource Name

The configurationResourceName property is used to specify the location of the ehcache configuration file to be used with the given Hibernate instance and cache provider/region-factory. The resource is searched for in the root of the classpath. It is used to support multiple CacheManagers in the same VM. It tells Hibernate which configuration to use. An example might be "ehcache-2.xml". When using multiple Hibernate instances it is therefore recommended to use multiple non-singleton providers or region factories, each with a dedicated Ehcache configuration resource.

net.sf.ehcache.configurationResourceName=/name_of_ehcache.xml

Set the Hibernate cache provider programmatically {#Set the Hibernate cache provider programmatically}

The provider can also be set programmatically in Hibernate by adding necessary Hibernate property settings to the configuration before creating the SessionFactory:

Configuration.setProperty("hibernate.cache.region.factory_class",
                     "net.sf.ehcache.hibernate.EhCacheRegionFactory")

Putting it all together

If you are using Hibernate 3.3 and enabling both second-level caching and query caching, then your hibernate config file should contain the following:

<property name="hibernate.cache.use_second_level_cache">true</property>
<property name="hibernate.cache.use_query_cache">true</property>
<property name="hibernate.cache.region.factory_class">net.sf.ehcache.hibernate.EhCacheRegionFactory</property>

An equivalent Spring configuration file would contain:

<prop key="hibernate.cache.use_second_level_cache">true</prop>
<prop key="hibernate.cache.use_query_cache">true</prop>
<prop key="hibernate.cache.region.factory_class">net.sf.ehcache.hibernate.EhCacheRegionFactory</prop>

Configure Hibernate Entities to use Second-Level Caching

In addition to configuring the Hibernate second-level cache provider, Hibernate must also be told to enable caching for entities, collections, and queries. For example, to enable cache entries for the domain object com.somecompany.someproject.domain.Country there would be a mapping file something like the following:

<hibernate-mapping>
<class
name="com.somecompany.someproject.domain.Country"
table="ut_Countries"
dynamic-update="false"
dynamic-insert="false"
>
...
</class>
</hibernate-mapping>

To enable caching, add the following element.

<cache usage="read-write|nonstrict-read-write|read-only" />

For example:

<hibernate-mapping>
<class
name="com.somecompany.someproject.domain.Country"
table="ut_Countries"
dynamic-update="false"
dynamic-insert="false"
>
 <cache usage="read-write" />
...
</class>
</hibernate-mapping>

This can also be achieved using the @Cache annotation, e.g.

@Entity
@Cache(usage = CacheConcurrencyStrategy.READ_WRITE)
public class Country {
...
}

Definition of the different cache strategies

read-only

Caches data that is never updated.

nonstrict-read-write

Caches data that is sometimes updated without ever locking the cache. If concurrent access to an item is possible, this concurrency strategy makes no guarantee that the item returned from the cache is the latest version available in the database. Configure your cache timeout accordingly!

read-write

Caches data that is sometimes updated while maintaining the semantics of "read committed" isolation level. If the database is set to "repeatable read", this concurrency strategy almost maintains the semantics. Repeatable read isolation is compromised in the case of concurrent writes.

Configure

Because ehcache.xml has a defaultCache, caches will always be created when required by Hibernate. However more control can be exerted by specifying a configuration per cache, based on its name. In particular, because Hibernate caches are populated from databases, there is potential for them to get very large. This can be controlled by capping their maxElementsInMemory and specifying whether to overflowToDisk beyond that. Hibernate uses a specific convention for the naming of caches of Domain Objects, Collections, and Queries.

Domain Objects

Hibernate creates caches named after the fully qualified name of Domain Objects. So, for example to create a cache for com.somecompany.someproject.domain.Country create a cache configuration entry similar to the following in ehcache.xml.

    <?xml version="1.0" encoding="UTF-8"?>
<ehcache>
 <cache
name="com.somecompany.someproject.domain.Country"
maxElementsInMemory="10000"
eternal="false"
timeToIdleSeconds="300"
timeToLiveSeconds="600"
overflowToDisk="true"
 />
</ehcache>

Hibernate CacheConcurrencyStrategy

read-write, nonstrict-read-write and read-only policies apply to Domain Objects.

Collections

Hibernate creates collection caches named after the fully qualified name of the Domain Object followed by "." followed by the collection field name. For example, a Country domain object has a set of advancedSearchFacilities. The Hibernate doclet for the accessor looks like:

/**
* Returns the advanced search facilities that should appear for this country.
* @hibernate.set cascade="all" inverse="true"
* @hibernate.collection-key column="COUNTRY_ID"
* @hibernate.collection-one-to-many class="com.wotif.jaguar.domain.AdvancedSearchFacility"
* @hibernate.cache usage="read-write"
*/
public Set getAdvancedSearchFacilities() {
return advancedSearchFacilities;
}

You need an additional cache configured for the set. The ehcache.xml configuration looks like:

    <?xml version="1.0" encoding="UTF-8"?>
<ehcache>
 <cache name="com.somecompany.someproject.domain.Country"
maxElementsInMemory="50"
eternal="false"
timeToLiveSeconds="600"
overflowToDisk="true"
/>
 <cache
name="com.somecompany.someproject.domain.Country.advancedSearchFacilities"
maxElementsInMemory="450"
eternal="false"
timeToLiveSeconds="600"
overflowToDisk="true"
/>
</ehcache>

Hibernate CacheConcurrencyStrategy

read-write, nonstrict-read-write and read-only policies apply to Domain Object collections.

Queries

Hibernate allows the caching of query results using two caches. "net.sf.hibernate.cache.StandardQueryCache" and "net.sf.hibernate.cache.UpdateTimestampsCache" in versions 2.1 to 3.1 and "org.hibernate.cache.StandardQueryCache" and "org.hibernate.cache.UpdateTimestampsCache" in version 3.2 are always used.

StandardQueryCache

This cache is used if you use a query cache without setting a name. A typical ehcache.xml configuration is:

<cache
name="org.hibernate.cache.StandardQueryCache"
maxElementsInMemory="5"
eternal="false"
timeToLiveSeconds="120"
overflowToDisk="true"/>

UpdateTimestampsCache

Tracks the timestamps of the most recent updates to particular tables. It is important that the cache timeout of the underlying cache implementation be set to a higher value than the timeouts of any of the query caches. In fact, it is recommend that the the underlying cache not be configured for expiry at all. A typical ehcache.xml configuration is:

<cache
name="org.hibernate.cache.UpdateTimestampsCache"
maxElementsInMemory="5000"
eternal="true"
overflowToDisk="true"/>

Named Query Caches

In addition, a QueryCache can be given a specific name in Hibernate using Query.setCacheRegion(String name). The name of the cache in ehcache.xml is then the name given in that method. The name can be whatever you want, but by convention you should use "query." followed by a descriptive name. E.g.

<cache name="query.AdministrativeAreasPerCountry"
maxElementsInMemory="5"
eternal="false"
timeToLiveSeconds="86400"
overflowToDisk="true"/>

Using Query Caches

For example, let's say we have a common query running against the Country Domain. Code to use a query cache follows:

public List getStreetTypes(final Country country) throws HibernateException {
final Session session = createSession();
try {
   final Query query = session.createQuery(
    "select st.id, st.name"
   + " from StreetType st "
   + " where st.country.id = :countryId "
   + " order by st.sortOrder desc, st.name");
   query.setLong("countryId", country.getId().longValue());
   query.setCacheable(true);
   query.setCacheRegion("query.StreetTypes");
   return query.list();
} finally {
   session.close();
}
}

The query.setCacheable(true) line caches the query. The query.setCacheRegion("query.StreetTypes") line sets the name of the Query Cache. Alex Miller has a good article on the query cache here.

Hibernate CacheConcurrencyStrategy

None of read-write, nonstrict-read-write and read-only policies apply to Domain Objects. Cache policies are not configurable for query cache. They act like a non-locking read only cache.

Demo Apps

We have demo applications showing how to use the Hibernate 3.3 CacheRegionFactory.

Hibernate Tutorial

Check out from the Terracotta Forge.

Examinator

Examinator is our complete application that shows many aspects of caching, all using the Terracotta Server Array. Check out from the Terracotta Forge.

Performance Tips

Session.load

Session.load will always try to use the cache.

Session.find and Query.find

Session.find does not use the cache for the primary object. Hibernate will try to use the cache for any associated objects. Session.find does however cause the cache to be populated. Query.find works in exactly the same way. Use these where the chance of getting a cache hit is low.

Session.iterate and Query.iterate

Session.iterate always uses the cache for the primary object and any associated objects. Query.iterate works in exactly the same way. Use these where the chance of getting a cache hit is high.

How to Scale

Configuring each Hibernate instance with a standalone ehcache will dramatically improve performance. However most production applications use multiple application instances for redundancy and for scalability. Ideally applications are horizontally scalable, where adding more application instances linearly improves throughput. With an application deployed on multiple nodes, using standalone Ehcache means that each instance holds its own data.

On a cache miss on any node, Hibernate will read from the database. This generally results in N reads where N is the number of nodes in the cluster. As each new node gets added database workload goes up. Also, when data is written in one node, the other nodes are unaware of the data write, and thus subsequent reads of this data on other nodes will result in stale reads.

The solution is to turn on distributed caching or replicated caching.. Ehcache comes with native cache distribution using the following mechanism:

  • Terracotta

Ehcache supports the following methods of cache replication:

  • RMI
  • JGroups
  • JMS replication

Selection of the distributed cache or replication mechanism may be made or changed at any time. There are no changes to the application. Only changes to ehcache.xml file are required. This allows an application to easily scale as it grows without expensive re-architecting.

Configuring Ehcache for distributed caching using Terracotta

Ehcache provides built-in support for Terracotta distributed caching. The following are the key considerations when selecting this option:

  • Simple snap-in configuration with one line of configuration
  • Simple to scale up to as much performance as you need -- no application changes required
  • Wealth of "CAP" configuration options allow you to configure your cache for whatever it needs - fast, coherent, asynchronous updates, dirty reads etc.
  • The fastest coherent option for caches with reads and writes
  • Store as much data as you want - 20GB -> 1TB
  • Commercial products and support available from http://www.terracotta.org

Configuring Terracotta distributed caching is described in the Terracotta Documentation. A sample cache configuration is provided here:

    <?xml version="1.0" encoding="UTF-8"?>
<ehcache>
 <terracottaConfig url="localhost:9510" />
 <cache
name="com.somecompany.someproject.domain.Country"
maxElementsInMemory="10000"
eternal="false"
timeToIdleSeconds="300"
timeToLiveSeconds="600"
overflowToDisk="true">
<terracotta/>
 </cache>
</ehcache>

Configuring Replicated Caching using RMI, JGroups, or JMS

Ehcache can use JMS, JGroups or RMI as a cache replication scheme. The following are the key considerations when selecting this option:

  • The consistency is weak. Nodes might be stale, have different versions or be missing an element that other nodes have. Your application should be tolerant of weak consistency.
  • session.refresh() should be used to check the cache against the database before performing a write that must be correct. This can have a performance inmpact on the database.
  • Each node in the cluster stores all data, thus the cache size is limited to memory size, or disk if disk overflow is selected.

Configuring for RMI Replication

RMI configuration is described in the Ehcache User Guide - RMI Distributed Caching. A sample cache configuration (using automatic discovery) is provided here:

    <?xml version="1.0" encoding="UTF-8"?>
<ehcache>
 <cacheManagerPeerProviderFactory
  class="net.sf.ehcache.distribution.RMICacheManagerPeerProviderFactory"
properties="peerDiscovery=automatic, multicastGroupAddress=230.0.0.1,
  multicastGroupPort=4446, timeToLive=32"/>
 <cache
name="com.somecompany.someproject.domain.Country"
maxElementsInMemory="10000"
eternal="false"
timeToIdleSeconds="300"
timeToLiveSeconds="600"
overflowToDisk="true">
<cacheEventListenerFactory
       class="net.sf.ehcache.distribution.RMICacheReplicatorFactory"/>
 </cache>
</ehcache>

Configuring for JGroups Replication

Configuraging JGroups replication is described in the Ehcache User Guide - Distributed Caching with JGroups. A sample cache configuration is provided here:

    <?xml version="1.0" encoding="UTF-8"?>
<ehcache>
 <cacheManagerPeerProviderFactory class="net.sf.ehcache.distribution.jgroups
.JGroupsCacheManagerPeerProviderFactory"
properties="connect=UDP(mcast_addr=231.12.21.132;mcast_port=45566;ip_ttl=32;
mcast_send_buf_size=150000;mcast_recv_buf_size=80000):
PING(timeout=2000;num_initial_members=6):
MERGE2(min_interval=5000;max_interval=10000):
FD_SOCK:VERIFY_SUSPECT(timeout=1500):
pbcast.NAKACK(gc_lag=10;retransmit_timeout=3000):
UNICAST(timeout=5000):
pbcast.STABLE(desired_avg_gossip=20000):
FRAG:
pbcast.GMS(join_timeout=5000;join_retry_timeout=2000;
shun=false;print_local_addr=true)"
propertySeparator="::"
/>
 <cache
name="com.somecompany.someproject.domain.Country"
maxElementsInMemory="10000"
eternal="false"
timeToIdleSeconds="300"
timeToLiveSeconds="600"
overflowToDisk="true">
<cacheEventListenerFactory
class="net.sf.ehcache.distribution.jgroups.JGroupsCacheReplicatorFactory"
properties="replicateAsynchronously=true, replicatePuts=true,
replicateUpdates=true, replicateUpdatesViaCopy=false,
replicateRemovals=true" />
 </cache>
</ehcache>

Configuring for JMS Replication

Configuring JMS replication is described in the Ehcache User Guide - JMS Distributed Caching. A sample cache configuration (for ActiveMQ) is provided here:

    <?xml version="1.0" encoding="UTF-8"?>
<ehcache>
 <cacheManagerPeerProviderFactory
       class="net.sf.ehcache.distribution.jms.JMSCacheManagerPeerProviderFactory"
       properties="initialContextFactoryName=ExampleActiveMQInitialContextFactory,
           providerURL=tcp://localhost:61616,
           topicConnectionFactoryBindingName=topicConnectionFactory,
           topicBindingName=ehcache"
       propertySeparator=","
       />
 <cache
name="com.somecompany.someproject.domain.Country"
maxElementsInMemory="10000"
eternal="false"
timeToIdleSeconds="300"
timeToLiveSeconds="600"
overflowToDisk="true">
<cacheEventListenerFactory
     class="net.sf.ehcache.distribution.jms.JMSCacheReplicatorFactory"
     properties="replicateAsynchronously=true,
                  replicatePuts=true,
                  replicateUpdates=true,
                  replicateUpdatesViaCopy=true,
                  replicateRemovals=true,
                  asynchronousReplicationIntervalMillis=1000"
      propertySeparator=","/>
 </cache>
</ehcache>

FAQ

Should I use the provider in the Hibernate distribution or in Ehcache?

Since Hibernate 2.1, Hibernate has included an Ehcache CacheProvider. That provider is periodically synced up with the provider in the Ehcache Core distribution. New features are generally added in to the Ehcache Core provider and then the Hibernate one.

What is the relationship between the Hibernate and Ehcache projects?

Gavin King and Greg Luck cooperated to create Ehcache and include it in Hibernate. Since 2009 Greg Luck has been a committer on the Hibernate project so as to ensure Ehcache remains a first-class 2nd level cache for Hibernate.

Does Ehcache support the new Hibernate 3.3 2nd level caching SPI?

Yes. Ehcache 2.0 supports this new API.

Does Ehcache support the transactional strategy?

Yes. It was introduced in Ehcache 2.1.

Is Ehcache Cluster Safe?

hibernate.org maintains a table listing the providers. While ehcache works as a distributed cache for Hibernate, it is not listed as "Cluster Safe". What this means is that `Hibernate's lock and unlock methods are not implemented. Changes in one node will be applied without locking. This may or may not be a noticeable problem. In Ehcache 1.7 when using Terracotta, this cannot happen as access to the clustered cache itself is controlled with read locks and write locks. In Ehcache 2.0 when using Terracotta, the lock and unlock methods tie-in to the underlying clustered cache locks. We expect Ehcache 2.0 to be marked as cluster safe in new versions of the Hibernate documentation.

How are Hibernate Entities keyed?

Hibernate identifies cached Entities via an object id. This is normally the primary key of a database row.

Can you use Identity mode with the Terracotta Server Array

You cannot use identity mode clustered cache with Hibernate. If the cache is exclusively used by Hibernate we will convert identity mode caches to serialization mode. If the cache cannot be determined to be exclusively used by Hibernate (i.e. generated from a singleton cache manager) then an exception will be thrown indicating the misconfigured cache. Serialization mode is in any case the default for Terracotta clustered caches.

I get org.hibernate.cache.ReadWriteCache - An item was expired by the cache while it was locked error messages. What is it?

Soft locks are implemented by replacing a value with a special type that marks the element as locked, thus indicating to other threads to treat it differently to a normal element. This is used in the Hibernate Read/Write strategy to force fall-through to the database during the two-phase commit - since we don't know exactly what should be returned by the cache while the commit is in process (but the db does). If a soft-locked Element is evicted by the cache during the 2 phase commit, then once the 2 phase commit completes the cache will fail to update (since the soft-locked Element was evicted) and the cache entry will be reloaded from the database on the next read of that object. This is obviously non-fatal (we're a cache failure here so it should not be a problem). The only problem it really causes would I imagine be a small rise in db load. So, in summary the Hibernate messages are not problematic. The underlying cause is the probabilistic evictor can theoretically evict recently loaded items. This evictor has been tuned over successive ehcache releases. As a result this warning will happen most often in 1.6, less often in 1.7 and very rarely in 1.8. You can also use the deterministic evictor to avoid this problem. Specify the java -Dnet.sf.ehcache.use.classic.lru=true system property to turn on classic LRU which contains a deterministic evictor.

I get java.lang.ClassCastException: org.hibernate.cache.ReadWriteCache$Item incompatible with net.sf.ehcache.hibernate.strategy.AbstractReadWriteEhcacheAccessStrategy$Lockable

This is the tell-tale error you get if you are:

  • using a Terracotta cluster with Ehcache
  • you have upgraded part of the cluster to use net.sf.ehcache.hibernate.EhCacheRegionFactory but part of it is still using the old SPI of EhCacheProvider.
  • you are upgrading a Hibernate version Ensure you have changed all nodes and that you clear any caches during the upgrade.

Are compound keys supported?

For standalone caching yes. With Terracotta a larger amount of memory is used.

Why do I not see replicated data when using nonstrict mode?

You may think that Hibernate's <nonstrict> mode is just like <read-write> but with dirty reads. The truth is far more complex than that. Suffice to say, in <nonstrict> mode, Hibernate puts the object in the appropriate cache but then IMMEDIATELY removes it. The PUT and the REMOVE are BOTH replicated by ehcache so the net effect of that is the new object is copied to remote cache but then it's immediately followed by a replicated remove so the next time you try get the object it's not in cache and hibernate goes back to the DB. So, practically there is no point using nonstrict mode with replicated or distributed caches. If you want the updated entry to be replicated or distributed use <readwrite> or <transactional>.