WebSphere® eXtreme Scale is most
often used as a shared cache, to provide transactional access to data
to multiple components where a traditional database would otherwise
be used. The shared cache eliminates the need to configure a database.
The cache is coherent because all of the clients see the same data
in the cache. Each piece of data is stored on exactly one server in
the cache, preventing wasteful copies of records that could potentially
contain different versions of the data. A coherent cache can also
hold more data as more servers are added to the grid, and scales linearly
as the grid grows in size. Because clients access data from this grid
with remote procedure calls, it is also known as a remote cache
(or far cache). Through data partitioning, each process holds a unique
subset of the total data set. Larger grids can both hold more data
and service more requests for that data. Coherency also eliminates
the need to push invalidation data around the grid because there is
no stale data. The coherent cache only holds the latest copy of each
piece of data.
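The single-copy property described above can be illustrated with a minimal sketch. It assumes keys are routed to a partition by hashing; the class name `PartitionedCacheSketch` and its methods are hypothetical, use only the JDK, and do not represent the eXtreme Scale placement algorithm or API.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: routes each key to exactly one partition by hashing. */
final class PartitionedCacheSketch {
    // Each map stands in for the data held by one server process (assumption).
    private final List<Map<String, String>> partitions = new ArrayList<>();

    PartitionedCacheSketch(int numberOfPartitions) {
        for (int i = 0; i < numberOfPartitions; i++) {
            partitions.add(new HashMap<>());
        }
    }

    /** Maps a key to a partition index; floorMod keeps the result non-negative. */
    int partitionFor(String key) {
        return Math.floorMod(key.hashCode(), partitions.size());
    }

    void put(String key, String value) {
        partitions.get(partitionFor(key)).put(key, value);
    }

    String get(String key) {
        return partitions.get(partitionFor(key)).get(key);
    }

    /** Counts how many partitions hold the key; 0 or 1 in this sketch. */
    int copiesOf(String key) {
        int copies = 0;
        for (Map<String, String> partition : partitions) {
            if (partition.containsKey(key)) {
                copies++;
            }
        }
        return copies;
    }

    public static void main(String[] args) {
        PartitionedCacheSketch cache = new PartitionedCacheSketch(4);
        cache.put("customer:42", "Alice");
        cache.put("customer:42", "Alice (updated)");
        System.out.println(cache.get("customer:42"));
        System.out.println(cache.copiesOf("customer:42"));
    }
}
```

Because every read and write for a key goes to the same partition, an update replaces the only copy, so no other process can hold an older version of that entry.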
If you are running in a WebSphere Application Server environment,
the TranPropListener plug-in is also available. The TranPropListener
plug-in uses the high availability component (HA Manager) of
WebSphere Application Server to propagate the changes to each peer
ObjectGrid cache instance.
Figure 1. Distributed cache
Near cache
Clients can optionally have a local, in-line cache when eXtreme Scale
is used in a distributed topology. This optional cache is called a near
cache: an independent ObjectGrid on each client that serves as a cache
for the remote, server-side cache. The near cache is enabled by default
when locking is configured as optimistic or none, and cannot be used
when locking is configured as pessimistic.
A near cache is very fast because it provides in-memory
access to a subset of the entire cached data set that is stored remotely
in the eXtreme Scale servers.
The near cache is not partitioned and contains data from any of the
remote eXtreme Scale partitions.
WebSphere eXtreme Scale can have up to three cache tiers, as follows:
- The transaction tier cache contains all changes for a single transaction.
The transaction cache contains a working copy of the data until the
transaction is committed. When a client transaction requests data
from an ObjectMap, the transaction cache is checked first.
- The near cache in the client tier contains a subset of the data
from the server tier. When the transaction tier does not have the
data, the data is fetched from the near cache if available and inserted
into the transaction cache.
- The grid in the server tier contains the majority of the data
and is shared among all clients. The server tier can be partitioned,
which allows a large amount of data to be cached. When the client
near cache does not have the data, it is fetched from the server tier
and inserted into the client cache. The server tier can also have
a Loader plug-in. When the grid does not have the requested data,
the Loader is invoked and the resulting data is inserted from the
backend data store into the grid.
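The lookup order across the three tiers can be sketched as follows. This is a simplified model that assumes plain in-memory maps for each tier and a function that stands in for a Loader plug-in; the class `TieredLookupSketch` is hypothetical and does not use the ObjectGrid API.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Illustrative only: read-through lookup across three cache tiers. */
final class TieredLookupSketch {
    final Map<String, String> transactionCache = new HashMap<>(); // per-transaction working copy
    final Map<String, String> nearCache = new HashMap<>();        // client tier
    final Map<String, String> serverGrid = new HashMap<>();       // server tier
    private final Function<String, String> loader;                // stands in for a Loader plug-in

    TieredLookupSketch(Function<String, String> loader) {
        this.loader = loader;
    }

    String get(String key) {
        // 1. The transaction cache is checked first.
        String value = transactionCache.get(key);
        if (value != null) {
            return value;
        }
        // 2. Then the near cache in the client tier.
        value = nearCache.get(key);
        if (value == null) {
            // 3. Then the grid in the server tier.
            value = serverGrid.get(key);
            if (value == null) {
                // 4. Finally the loader, which reads the backend data store.
                value = loader.apply(key);
                if (value == null) {
                    return null; // not found in any tier
                }
                serverGrid.put(key, value);
            }
            nearCache.put(key, value);
        }
        // Assumption: data found in a lower tier is copied into the transaction cache.
        transactionCache.put(key, value);
        return value;
    }

    public static void main(String[] args) {
        TieredLookupSketch tiers = new TieredLookupSketch(key -> "row-for-" + key);
        System.out.println(tiers.get("order:7"));
        System.out.println(tiers.nearCache.containsKey("order:7"));
    }
}
```

After the first read of a key, later reads in the same transaction are served from the transaction cache, and reads from later transactions on the same client are served from the near cache without a remote call.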
To disable the near cache, set the numberOfBuckets attribute
to 0 in the client override eXtreme Scale descriptor configuration.
See the topic on map entry locking for details on eXtreme Scale lock
strategies. The near cache can also be configured to have a separate
eviction policy and different plug-ins using a client override eXtreme Scale descriptor configuration.
Advantage
- Fast response time because all access to the data is local.
Disadvantages
- Increases the length of time that data can remain stale.
- Must use an evictor to invalidate data to avoid running out of
memory.
When to use
Use when response time is important and stale data can be tolerated.
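To illustrate why a near cache needs an evictor, the following minimal sketch bounds a client-side cache with a least-recently-used policy. It relies on the JDK `LinkedHashMap` class rather than the eXtreme Scale evictor plug-ins; the class name `LruNearCacheSketch` and the choice of an LRU policy are assumptions made for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative only: a size-bounded cache that evicts the least recently used entry. */
final class LruNearCacheSketch<K, V> extends LinkedHashMap<K, V> {
    private static final long serialVersionUID = 1L;
    private final int maxEntries;

    LruNearCacheSketch(int maxEntries) {
        // The third argument (true) orders entries by access instead of insertion.
        super(16, 0.75f, true);
        this.maxEntries = maxEntries;
    }

    /** Called by LinkedHashMap after each insert; returning true drops the eldest entry. */
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }

    public static void main(String[] args) {
        LruNearCacheSketch<String, String> cache = new LruNearCacheSketch<>(2);
        cache.put("a", "1");
        cache.put("b", "2");
        cache.get("a");      // "a" becomes the most recently used entry
        cache.put("c", "3"); // exceeds the limit, so "b" is evicted
        System.out.println(cache.keySet());
    }
}
```

A size limit of this kind protects memory but does not detect changes made on the servers; a time-based eviction policy would be needed to limit how long an entry can stay stale.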
Embedded cache
eXtreme Scale grids can run within
existing processes as embedded eXtreme Scale servers or can be
managed as external processes. Embedded grids are useful when you
are running in an application server, such as WebSphere Application Server. You can start
eXtreme Scale servers that are not embedded by using command-line
scripts; these servers run in a Java™ process.
Advantages
- Simplified administration because there are fewer processes to manage.
- Simplified application deployment because the grid uses the
client application's classloader.
- Supports partitioning and high availability.
Disadvantages
- Increased memory footprint in the client process because all of
the data is collocated in the process.
- Increased CPU utilization for servicing client requests.
- More difficult to handle application upgrades because clients
use the same application Java archive files as the servers.
- Less flexible. Clients and grid servers cannot be scaled at
different rates. When servers are externally defined, you have
more flexibility in managing the number of processes.