Distributed cache

WebSphere® eXtreme Scale is most often used as a shared cache that gives multiple components transactional access to data where a traditional database would otherwise be used. The shared cache eliminates the need to configure a database.

The cache is coherent because all of the clients see the same data in the cache. Each piece of data is stored on exactly one server in the cache, which prevents wasteful copies of records that could potentially contain different versions of the data. Through data partitioning, each process holds a unique subset of the total data set. A coherent cache can therefore hold more data as more servers are added to the grid, and it scales linearly as the grid grows in size: larger grids can both hold more data and service more requests for that data. Because clients access data from this grid with remote procedure calls, it is also known as a remote cache (or far cache). Coherency also eliminates the need to push invalidation data around the grid because there is no stale data. The coherent cache holds only the latest copy of each piece of data.
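The partitioning idea can be illustrated with a minimal Java sketch. It is not the eXtreme Scale API, and the product's actual partition-placement algorithm is not described in this topic. The class name PartitionedGridSketch, its methods, and the hash-modulo routing rule are assumptions made for illustration only. The sketch shows how routing every key to exactly one partition leaves a single, latest copy of each entry.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: each key is routed to exactly one in-memory partition. */
final class PartitionedGridSketch {
    // Each map stands in for the data held by one server process (assumption).
    private final List<Map<String, String>> partitions = new ArrayList<>();

    PartitionedGridSketch(int partitionCount) {
        if (partitionCount <= 0) {
            throw new IllegalArgumentException("partitionCount must be positive");
        }
        for (int i = 0; i < partitionCount; i++) {
            partitions.add(new HashMap<>());
        }
    }

    /** Hash-modulo routing (assumed rule): the same key always maps to the same partition. */
    int partitionFor(String key) {
        return Math.floorMod(key.hashCode(), partitions.size());
    }

    void put(String key, String value) {
        partitions.get(partitionFor(key)).put(key, value);
    }

    String get(String key) {
        return partitions.get(partitionFor(key)).get(key);
    }

    /** Counts how many partitions hold the key; 0 or 1 by construction. */
    int copiesOf(String key) {
        int copies = 0;
        for (Map<String, String> partition : partitions) {
            if (partition.containsKey(key)) {
                copies++;
            }
        }
        return copies;
    }

    public static void main(String[] args) {
        PartitionedGridSketch grid = new PartitionedGridSketch(4);
        grid.put("customer:42", "v1");
        grid.put("customer:42", "v2"); // overwrites the entry in the same partition
        System.out.println("value  = " + grid.get("customer:42"));
        System.out.println("copies = " + grid.copiesOf("customer:42"));
    }
}
```

Because the owning partition is a function of the key alone, adding partitions in this sketch increases total capacity without creating duplicate copies that would need invalidation.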

If you are running a WebSphere Application Server environment, the TranPropListener plug-in is also available. The TranPropListener plug-in uses the high availability component (HA Manager) of WebSphere Application Server to propagate the changes to each peer ObjectGrid cache instance.

Figure 1. Distributed cache

Near cache

Clients can optionally have a local, in-line cache when eXtreme Scale is used in a distributed topology. This optional cache is called a near cache. It is an independent ObjectGrid on each client that serves as a cache for the remote, server-side cache. The near cache is enabled by default when locking is configured as optimistic or none, and it cannot be used when locking is configured as pessimistic.

Figure 2. Near cache

A near cache is very fast because it provides in-memory access to a subset of the entire cached data set that is stored remotely in the eXtreme Scale servers. The near cache is not partitioned and contains data from any of the remote eXtreme Scale partitions.

WebSphere eXtreme Scale can have up to three cache tiers, as follows.

The transaction tier cache contains all changes for a single transaction. The transaction cache contains a working copy of the data until the transaction is committed. When a client transaction requests data from an ObjectMap, the transaction cache is checked first.
The near cache in the client tier contains a subset of the data from the server tier. When the transaction tier does not have the data, the data is fetched from the near cache, if available, and inserted into the transaction cache.
The grid in the server tier contains the majority of the data and is shared among all clients. The server tier can be partitioned, which allows a large amount of data to be cached. When the client near cache does not have the data, the data is fetched from the server tier and inserted into the client cache. The server tier can also have a Loader plug-in. When the grid does not have the requested data, the Loader is invoked and the resulting data is inserted from the backend data store into the grid.
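The lookup order across the three tiers can be sketched in plain Java as follows. This is a simplified model of the read path only, not the product implementation. The class TieredReadSketch, its fields, and the use of a java.util.function.Function as a stand-in for the Loader plug-in are hypothetical names chosen for illustration. Writes, commits, locking, and remote calls are left out.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Illustrative read path: transaction tier, then near cache, then server tier, then loader. */
final class TieredReadSketch {
    final Map<String, String> transactionTier = new HashMap<>(); // working copies for one transaction
    final Map<String, String> nearCache = new HashMap<>();       // client-side subset of the grid
    final Map<String, String> serverTier = new HashMap<>();      // stands in for the remote grid
    private final Function<String, String> loader;               // stand-in for a Loader plug-in

    TieredReadSketch(Function<String, String> loader) {
        this.loader = loader;
    }

    String get(String key) {
        // 1. The transaction cache is checked first.
        String value = transactionTier.get(key);
        if (value != null) {
            return value;
        }
        // 2. Then the near cache in the client tier.
        value = nearCache.get(key);
        if (value == null) {
            // 3. Then the server tier.
            value = serverTier.get(key);
            if (value == null) {
                // 4. Finally the loader reads from the backend data store.
                value = loader.apply(key);
                if (value == null) {
                    return null; // not found anywhere
                }
                serverTier.put(key, value);
            }
            nearCache.put(key, value);
        }
        transactionTier.put(key, value);
        return value;
    }

    /** Discards the working copies when the transaction ends (reads only in this sketch). */
    void endTransaction() {
        transactionTier.clear();
    }

    public static void main(String[] args) {
        TieredReadSketch tiers = new TieredReadSketch(key -> "db:" + key);
        System.out.println(tiers.get("order:7")); // first read falls through to the loader
        tiers.endTransaction();
        System.out.println(tiers.get("order:7")); // second read is served by the near cache
    }
}
```

Each miss populates the tier above it on the way back, so a repeated read in a later transaction stops at the near cache instead of reaching the server tier or the backend.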

To disable the near cache, set the numberOfBuckets attribute to 0 in the client override eXtreme Scale descriptor configuration. See the topic on map entry locking for details on eXtreme Scale lock strategies. The near cache can also be configured to have a separate eviction policy and different plug-ins using a client override eXtreme Scale descriptor configuration.
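The rules in this section about when a near cache is in effect can be summarized as a small decision function. The following Java sketch is only a restatement of those rules. The class NearCacheRuleSketch and its enum are hypothetical and are not the product's configuration API, and the sketch assumes that a numberOfBuckets value of 0 is the only value that disables the near cache.

```java
/** Illustrative restatement of the near cache rules; not the eXtreme Scale API. */
final class NearCacheRuleSketch {
    /** Hypothetical enum that mirrors the lock strategies named in this topic. */
    enum LockStrategy { NONE, OPTIMISTIC, PESSIMISTIC }

    /**
     * Returns true when a client would have a near cache under the stated rules:
     * never with pessimistic locking, and not when numberOfBuckets is set to 0.
     */
    static boolean nearCacheActive(LockStrategy lockStrategy, int numberOfBuckets) {
        if (lockStrategy == LockStrategy.PESSIMISTIC) {
            return false; // a near cache cannot be used with pessimistic locking
        }
        return numberOfBuckets != 0; // 0 in the client override disables the near cache
    }

    public static void main(String[] args) {
        for (LockStrategy strategy : LockStrategy.values()) {
            System.out.println(strategy + ", buckets=0  -> " + nearCacheActive(strategy, 0));
            System.out.println(strategy + ", buckets=64 -> " + nearCacheActive(strategy, 64));
        }
    }
}
```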

Advantage

Fast response time because all access to the data is local.

Disadvantages

Increases the length of time that data can remain stale.
Must use an evictor to invalidate data to avoid running out of memory.
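To illustrate why an evictor bounds memory use, the following Java sketch caps a client-side cache at a fixed number of entries and discards the least recently used entry when the cap is exceeded. It relies only on java.util.LinkedHashMap. The class LruNearCacheSketch is hypothetical and does not represent the product's evictor plug-ins, which are configured separately, and least-recently-used is just one possible policy.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative size-bounded cache with least-recently-used eviction. */
final class LruNearCacheSketch<K, V> extends LinkedHashMap<K, V> {
    private static final long serialVersionUID = 1L;
    private final int maxEntries;

    LruNearCacheSketch(int maxEntries) {
        super(16, 0.75f, true); // true = iterate in access order, oldest access first
        this.maxEntries = maxEntries;
    }

    /** Called by LinkedHashMap after each insertion; returning true drops the eldest entry. */
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }

    public static void main(String[] args) {
        LruNearCacheSketch<String, String> cache = new LruNearCacheSketch<>(2);
        cache.put("a", "1");
        cache.put("b", "2");
        cache.get("a");      // "a" becomes the most recently used entry
        cache.put("c", "3"); // exceeds the cap, so the least recently used entry is evicted
        System.out.println(cache.keySet());
    }
}
```

An evicted entry is not lost in a distributed topology, because the server tier still holds it. The next read fetches it again at the cost of a remote call.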

When to use

Use when response time is important and stale data can be tolerated.

Embedded cache

eXtreme Scale grids can run within existing processes as embedded eXtreme Scale servers, or they can be managed as external processes. Embedded grids are useful when you are running in an application server, such as WebSphere Application Server. eXtreme Scale servers that are not embedded are started with command-line scripts and run in their own Java™ process.

Figure 3. Embedded cache

Advantages

Simplified administration because there are fewer processes to manage.
Simplified application deployment because the grid uses the client application's class loader.
Supports partitioning and high availability.

Disadvantages

Increased memory footprint in the client process because all of the data is collocated in the process.
Increased CPU utilization for servicing client requests.
More difficult to handle application upgrades because clients use the same application Java archive files as the servers.
Less flexible. Clients and grid servers cannot be scaled independently of each other. When servers are externally defined, you have more flexibility in managing the number of processes.

When to use

Use embedded grids when the client process has plenty of free memory for grid data and potential failover data.

For more information, see Enabling the client invalidation mechanism.