JBoss.org Community Documentation

8.3.4. S3CacheLoader

The S3CacheLoader uses the Amazon S3 (Simple Storage Solution) for storing cache data. Since Amazon S3 is remote network storage and has fairly high latency, it is really best for caches that store large pieces of data, such as media or files. But consider this cache loader over the JDBC or file system based cache loaders if you want remotely managed, highly reliable storage. Or, use it for applications running on Amazon's EC2 (Elastic Compute Cloud).

If you're planning to use Amazon S3 for storage, consider using it with JBoss Cache. JBoss Cache itself provides in-memory caching for your data to minimize the amount of remote access calls, thus reducing the latency and cost of fetching your Amazon S3 data. With cache replication, you are also able to load data from your local cluster without having to remotely access it every time.

Note that Amazon S3 does not support transactions. If transactions are used in your application then there is some possibility of state inconsistency when using this cache loader. However, writes are atomic, in that if a write fails nothing is considered written and data is never corrupted.

Data is stored in keys based on the Fqn of the Node and Node data is serialized as a java.util.Map using the CacheSPI.getMarshaller() instance. Read the javadoc on how data is structured and stored. Data is stored using Java serialization. Be aware this means data is not readily accessible over HTTP to non-JBoss Cache clients. Your feedback and help would be appreciated to extend this cache loader for that purpose.

With this cache loader, single-key operations such as Node.remove(Object) and Node.put(Object, Object) are the slowest as data is stored in a single Map instance. Use bulk operations such as Node.replaceAll(Map) and Node.clearData() for more efficiency. Try the cache.s3.optimize option as well.