Abstract
This guide to tuning IBM® WebSphere® Application Server
dynamic caching and the data replication service (DRS) can help you
improve the performance of your Web solutions.
Content
Purpose of the document and introduction
This document will guide WebSphere Application Server customers that use
dynamic caching and data replication service (DRS) through the various
tuning options and guidelines that are available. This document refers to
dynamic caching as the WebSphere Application Server component that is
responsible for providing the dynamic caching service of the Web container
for servlet and Java™ Server Pages (JSP) caching, and the object level
caching available through object cache instances. This component also
drives the Distributed MAP (DMAP) API available in the WebSphere
Application Server Enterprise offering. This document assumes that you are
familiar with defining the cache policies using the cachespec.xml file,
configuring the cache service, and the basic usage contexts of dynamic
caching and DRS. This document builds on this base and guides the
administrator and the application architect through guidelines for
defining effective cache policies, and provides a reference for tuning
dynamic cache and DRS in a production environment.
Dynamic caching policies
The caching policy that is set up for the application is critical in
contributing to the savings in response time and providing better end user
experience. Take care to specify the policies so that the correct content
is served out of the cache, and recognize that it is not beneficial to
cache all content indiscriminately. Be aware that caching should be used
as a mechanism for improving performance and scalability of the solution
and not as a fix to mask problems with the application or infrastructure.
The cost of regenerating a response, in terms of CPU cycles that are
needed and the critical resources that are accessed (such as the number of
database queries that are executed), should be weighed against the
reusability of the response within the window of time when the response
will be valid. The reusability of the object should also be considered in
terms of whether the object is specific to a user, session, store, or if
it is a site wide or publicly reusable object.
A worksheet like the one that follows can form the basis of categorizing
the candidate content to be cached and help in determining the
effectiveness of caching:
Object to be cached           | Category                                                            |          |          |
                              | Degrees of connectivity (Dependencies / Variations / Relationships) |          |          |
                              | Average size of object                                              |          |          |
Cost of generating the object | Response generation time                                            |          |          |
                              | Critical resource access                                            |          |          |
                              | I/O wait time                                                       |          |          |
Validity of the object        | Expiration / TTL                                                    |          |          |
                              | Invalidation rate                                                   |          |          |
Popularity                    | Frequency of access                                                 |          |          |
                              | Reusability (User / Session / Store / Public)                       |          |          |
Business Value                | Relative Importance                                                 |          |          |
Dynamic cache tuning is much like tuning any other performance-enhancing
component and is an iterative process. It should begin at application
design, with guidelines from the application architect on what can and
should not be cached. This is typically based on input from the
requirement stage and knowledge of the application scenarios. This process
is further refined through the development, validation, and production
phase of the project. It is invaluable during validation and
pre-production phases of development to monitor and collect data to
understand and rectify the impact of cache policies and tuning on system
behavior.
It is possible that you will need to run some projected workload without
caching to determine values such as the cost of generating the object.
Otherwise, you will have to rely on the intuition and experience of the
application architect.
Guidelines for determining the effectiveness of caching should take into
account the following:
- The cost of generating a response should be greater than
the maximum cache access time, where the maximum cache access time should
factor in overhead for disk access, distribution policy, and so on.
- The shorter the validity period of the object and response, the
less likely it is to be reused. Caching such objects can result in larger
latencies, due to cache misses and cleanup overhead, than simply not
caching them.
- The objects with more popularity and business value should
be assigned a higher relative priority.
- The higher the degrees of connectivity of an object, the
more costly it is to invalidate and evict the object from the cache. Take
this into account when determining where to cache the object in terms of
keeping it in the memory cache, disk cache, or distributing the object
across the cluster.
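The first two guidelines can be checked with a rough expected-latency comparison. The following is a minimal sketch only: the function names, the parameters, and the simple hit/miss model are illustrative assumptions and are not part of dynamic cache. The hit ratio and timings would come from measurements of your own workload.

```python
def expected_latency_ms(hit_ratio, cache_access_ms, generate_ms, miss_overhead_ms=0.0):
    """Expected response time with caching, under an assumed hit/miss model.

    A hit costs one cache access. A miss costs the cache lookup, the
    regeneration of the response, and any extra overhead (cleanup, disk
    access, distribution) that is attributed to the miss.
    """
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    hit_cost = cache_access_ms
    miss_cost = cache_access_ms + generate_ms + miss_overhead_ms
    return hit_ratio * hit_cost + (1.0 - hit_ratio) * miss_cost


def caching_is_worthwhile(hit_ratio, cache_access_ms, generate_ms, miss_overhead_ms=0.0):
    """True when the expected latency with caching beats always regenerating."""
    cached = expected_latency_ms(hit_ratio, cache_access_ms, generate_ms, miss_overhead_ms)
    return cached < generate_ms


if __name__ == "__main__":
    # Hypothetical numbers: 90% hits, 2 ms cache access, 100 ms to regenerate.
    print(caching_is_worthwhile(0.9, 2.0, 100.0, miss_overhead_ms=5.0))
```

An object that is rarely reused (a hit ratio near zero) fails this check, which matches the guideline that short-lived objects can cost more to cache than to regenerate.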
The dynamic cache specification provides attributes that can be used to
declare properties of the cached object such as timeout (in seconds),
priority (LOWEST PRIORITY = 0, HIGHEST PRIORITY = 16), and inactivity, to
affect the treatment of these cached objects.
Memory caching
Dynamic cache accesses and retrieves objects primarily from the memory
cache. This cache keeps references to the cached objects and can be
configured with limits on the number of entries that will be cached in
memory. After the limit of entries that are specified for the memory cache
is reached, adding additional entries in the cache will require that
entries be evicted out of memory. Eviction of entries is based on how
recently the evicted entry was last accessed, and the priority of the
object that is inserted into the cache.
Choosing the size of the memory cache, in terms of the number of entries,
should be done based on how much memory is available for caching. The
average memory, in bytes, that is used by the system to reference a cached
object with its dependency IDs can be computed as the average size of the
object + the average size of the cache ID + k * ( the number of templates
+ dependency IDs that are associated with this object + 128) where k is 4
for 32-bit platforms and 8 for 64-bit platforms. The number of entries
that are specified should be large enough to hold the cache entries that
are associated with the popular or more frequently used categories. The
memory cache, and therefore the memory dedicated for the cache, should be
large enough to not only cache content belonging to categories that have
higher business value, but also enough additional entries to form a
working set in order to minimize the amount of thrashing due to Least
Recently Used (LRU) eviction.
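The per-entry memory formula above can be turned into a small sizing helper. This is a sketch under stated assumptions: the function names and the sample figures are hypothetical, and the formula is read as k * (templates + dependency IDs + 128), with the 128 inside the parentheses as written in the text.

```python
def entry_memory_bytes(avg_object_bytes, avg_cache_id_bytes,
                       templates, dependency_ids, is_64bit=True):
    """Average bytes used to reference one cached object, per the formula above."""
    k = 8 if is_64bit else 4  # 4 for 32-bit platforms, 8 for 64-bit platforms
    return avg_object_bytes + avg_cache_id_bytes + k * (templates + dependency_ids + 128)


def max_entries_for_budget(budget_bytes, per_entry_bytes):
    """Largest whole number of entries that fits in the memory set aside for caching."""
    if per_entry_bytes <= 0:
        raise ValueError("per_entry_bytes must be positive")
    return budget_bytes // per_entry_bytes


if __name__ == "__main__":
    # Hypothetical workload: 10 KB objects, 100-byte cache IDs,
    # 1 template and 3 dependency IDs per object, 64-bit JVM.
    per_entry = entry_memory_bytes(10_000, 100, templates=1, dependency_ids=3)
    print(per_entry, max_entries_for_budget(256 * 1024 * 1024, per_entry))
```

The result is only a starting point for the memory cache size; the working set and the business value of the content still determine how many entries are actually needed.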
The Java Virtual Machine (JVM™) heap settings should also be tuned. The
recommended setting is to size the JVM heap so that 40% of the heap is
free after caching. This tuning involves either increasing the size of the
JVM heap or reducing the size of the in-memory cache (or caching objects
that require less memory). There are trade-offs: for example, a larger JVM
heap can cause longer garbage collection (GC) pauses. The right balance
can only be determined with proper testing.
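As a rough illustration of the 40% guideline, the heap size that leaves the target fraction free can be estimated from the memory in use once the cache is populated. This sketch assumes that "free heap after caching" means free heap measured with the application baseline and the memory cache both resident; the function name and sample numbers are hypothetical.

```python
def required_heap_mb(app_baseline_mb, cache_mb, free_fraction=0.40):
    """Heap size (MB) that leaves free_fraction of the heap free after caching."""
    if not 0.0 <= free_fraction < 1.0:
        raise ValueError("free_fraction must be in [0, 1)")
    used_mb = app_baseline_mb + cache_mb
    return used_mb / (1.0 - free_fraction)


if __name__ == "__main__":
    # Hypothetical: 600 MB application baseline plus a 300 MB memory cache.
    print(required_heap_mb(600, 300))
```

If the estimate exceeds what the platform or acceptable GC pause times allow, the alternative named above is to reduce the number of entries in the memory cache.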
The cache attempts to clean up the expired entries from the memory cache
in the background. By default, the daemon responsible for this cleanup
will wake up every five seconds. This is sufficient for most deployments.
The interval can be set higher for deployments with infrequent
invalidation, for example those that invalidate entries once a day.
Conversely, if the deployment has a lot of automated or trigger-driven
invalidation, the interval should be set lower.
Disk caching
Dynamic cache provides the option to cache content on disk when the
content is evicted from the memory cache. It is highly recommended that
the off-load directory be located on a separate disk or partition that is
dedicated for caching. This enables better response times for the disk
cache through reduced contention for disk space with application data and
code on the file system where WebSphere Application Server is installed.
The partition should be sized to be at least twice the expected volume of
cached content.
The storage and access of objects from disk involves the serialization
and deserialization of objects. This feature comes at a higher cost, and
should be taken into consideration when deciding what content should be
persisted to disk. It is possible to selectively cache content to the disk
through cache policies that are defined in the cachespec.xml file, in
particular the persist-to-disk property.
Disk cache cleanup and tuning
Objects that are in the disk cache are cleaned up when they are
explicitly invalidated through either programmatic or policy-based
invalidations, or when the objects expire. The process of cleaning up
objects from the cache consists of updating the tables that host the
dependency ID to cache ID mappings and template ID to cache ID mappings,
in addition to returning the freed disk space to the internal storage
manager. The available space on the file system does not increase after
the objects are deleted from the cache, because the storage manager
retains the space so that it can be reused by other objects that are
cached to the disk.
The disk cache cleanup is done in the background as a low priority thread
to reduce contention for the disk from active request and response
threads. The time to perform this cleanup, as reported in the logs, tracks
the duration of the scan. With the low priority of the scan, it can take
several minutes.
You can activate the disk cache cleanup once a day at a specified time by
using the com.ibm.ws.cache.CacheConfig.htodCleanupHour system property,
which defaults to 0 (12:00 midnight), or you can specify the cleanup to
run at a specific frequency (in minutes) by using the
com.ibm.ws.cache.CacheConfig.htodCleanupFrequency system property.
To set any system property in the Application Server:
- In the console, click through Application servers -> <your
server> -> Process Definition -> Java Virtual Machine ->
custom properties
- Click the 'New' button and declare the system property as the key and
its value in the value field.
The disk cache cleanup occurs in two phases: scan and delete. In the scan
phase, the algorithm identifies objects that have expired on disk. Since
the cleanup algorithm is looking only for expired entries, cached objects
without an expiration value (an expiration value of 0) will always remain
on disk until explicitly invalidated. The policy of never expiring objects
should be reconsidered if disk space is an issue in the deployment. The
delete phase returns disk space to the internal storage manager and
ensures that all references to the object are correctly purged. Most large
deployments that have a large amount of content on the disk typically
choose to specify that cleanup occurs at a frequency that ranges from 30
minutes to a couple of hours, depending on the average expiration time of
content in the cache.
You can optimize the disk cache cleanup for disk I/O by buffering the
metadata that is associated with cached objects in memory. These auxiliary
buffers can hold the dependency and template information for the objects
so the object deletion time is decreased. Turn on this optimization by
setting the com.ibm.ws.cache.CacheConfig.htodDelayOffload system property
to true. You can tune the memory that is used by this optimization by
setting the com.ibm.ws.cache.CacheConfig.htodDelayOffloadEntriesLimit
system property to a value that specifies the maximum number of cache IDs
that any dependency ID can map to in the auxiliary buffer. Any dependency
ID that maps to more cache IDs than the htodDelayOffloadEntriesLimit value
is not buffered and is written to disk. Large deployments typically set
this property to a value that approximates the total number of entries in
the entire cache for optimal performance.
For more details related to Disk Cache Enhancement, please see the
Technote for Disk Cache Enhancements:
Dynamic_Cache/swg27007969.html
Dynamic Cache replication using DRS
There are three primary replication settings for dynamic cache that
control the amount and type of information, including the object name, the
object value, and invalidation messages, that flows between servers:
- NOT_SHARED
- SHARED_PUSH
- SHARED_PUSH_PULL.
With all share types, object invalidation messages are always sent to
other servers to ensure that outdated information is never served to a
user. In the case of SHARED_PUSH, the cached object and its ID are sent to
all servers in the replication domain at the time that the object is
placed in cache. This makes the object immediately available to the
applications on other servers. It also speeds up application server
performance at the expense of greater network traffic and additional I/O
churn, in the case of objects that are cached on disk. With
SHARED_PUSH_PULL, the cached object is kept local to the server that
created it, but the cache ID is shared with other servers. If a remote
server needs the object, it requests the object by name from the creating
server. With the NOT_SHARED policy, no objects or IDs are shared with
other servers; only invalidation messages are sent.
The NOT_SHARED policy is adequate for most cache deployments. You can use
the SHARED_PUSH_PULL policy to optimize the performance by fetching the
object from another server in the cluster at the cost of additional
latency in the response time for the first miss. The object is cached
locally so subsequent accesses are serviced locally. Use the SHARED_PUSH
policy with care and only for specific objects that meet the criteria of
requiring no additional latency for the first access and have the property
of being infrequently invalidated.
An additional limitation with the SHARED_PUSH policy is that DRS has a
size limit for the batches being replicated. (Dynamic cache batches
updates to cached content and pushes them out to the cluster.) The batch
size defaults to 5 MB and can be changed by setting the system property
MAX_MESSAGE_SIZE to the size required. Set the system property with care,
since it has implications on
how fast DRS can replicate objects. If the maximum update size is
increased then the replication domain time-out also needs to be increased
to allow for the transfer of the larger objects. If an update exceeds this
maximum size, the update is dropped and the objects referenced within the
update will go out of sync with the rest of the cluster members.
In the PUSH replication mode, WebSphere Application Server Dynamic Cache
sends large DRS messages, which frequently cause the JVMs in a clustered
environment to exhaust their heaps, resulting in out-of-memory (OOM)
errors and heap dumps. APAR PK32201 addresses this by making Dynamic Cache
batch these messages, sending only a few cache entries at a time in each
message, which results in smaller objects and alleviates these OOM issues.
This DRS batch size can now be configured using the following custom
properties:
- com.ibm.ws.cache.CacheConfig.cachePercentageWindow:
Specifies a limit on the number of cache entries sent by DRS as a
percentage of the total number of cache entries in memory. Default value:
2% of the number of entries in the cache. Scope: configurable per cache
instance.
- com.ibm.ws.cache.CacheConfig.cacheEntryWindow: Specifies a
limit on the total number of cache entries sent by DRS as a number of
entries. Default value: 50 entries. Scope: configurable per cache
instance.
Before the PK32201 fix, all pushed entries were sent in one DRS message.
Now, they are sent in batches determined by the above two properties,
which default to 2 and 50, respectively. The lesser of the two values is
used to determine the batch update size in the PUSH replication mode.
In most cases the default values for batchUpdateMilliseconds,
cachePercentageWindow and cacheEntryWindow will suffice; however, in
extreme cases the cacheEntryWindow may need to be set as low as 1 or 2
entries.
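The way the two window properties combine can be sketched as follows. This illustrates only the "lesser of the two values" rule stated above; the function name is hypothetical, and the rounding up of the percentage and the floor of one entry per batch are assumptions, since the text does not say how fractional results are handled.

```python
import math


def drs_batch_size(entries_in_memory, cache_percentage_window=2, cache_entry_window=50):
    """Entries per DRS message: the lesser of the percentage and entry windows."""
    # cachePercentageWindow is a percentage of the entries in the memory cache.
    by_percentage = math.ceil(entries_in_memory * cache_percentage_window / 100)
    # Assumption: never report a batch smaller than one entry.
    return max(1, min(by_percentage, cache_entry_window))


if __name__ == "__main__":
    for entries in (10, 2000, 10000):
        print(entries, drs_batch_size(entries))
```

With the defaults, the entry window of 50 becomes the limiting value once the memory cache holds more than 2500 entries, because 2% of the cache then exceeds 50.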
Dynamic Cache has also provided a way to control the frequency of the
updates Dynamic Cache sends to DRS using the
com.ibm.ws.cache.CacheConfig.batchUpdateMilliseconds custom property. This
property specifies the batch update interval in milliseconds. This
property applies to all cache instances irrespective of the replication
mode. Reducing batchUpdateMilliseconds results in Dynamic Cache sending
updates, and processing invalidations and new entries more frequently,
which will reduce the overall DRS payload size. However, reducing
batchUpdateMilliseconds also results in adding extra CPU processing
overhead. Default value: 1000ms
DRS and replicators
In WebSphere Application Server V5.0 and V5.1, DRS uses replicators,
defined in the Internal Replication Domains panel, to replicate objects
across the cluster. Not every application server needs a replicator
defined on the same node. The recommended policy is to have 1 replicator
per 4 application servers: divide the total number of application servers
by 4 and plan to make that many replicators. For example, if there are 16
application servers, 4 replicators are recommended. Since these replicators
would be managing the workload of 4 application servers, it would be best
to configure the replicators on 4 systems that are not a part of the
cluster. This way the replicator systems will be dedicated to replication
of cache and the application servers will be dedicated to servicing
application requests.
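The replicator guideline above reduces to a one-line calculation. In this sketch the function name is hypothetical, and rounding up when the server count is not a multiple of 4 is an assumption; the text only gives the exact case of 16 servers and 4 replicators.

```python
import math


def replicators_needed(app_servers, servers_per_replicator=4):
    """Number of replicators for a cluster, at 1 replicator per 4 application servers."""
    if app_servers < 0 or servers_per_replicator <= 0:
        raise ValueError("counts must be non-negative and the ratio positive")
    # Assumption: round up so that no replicator serves more than the ratio allows.
    return math.ceil(app_servers / servers_per_replicator)


if __name__ == "__main__":
    print(replicators_needed(16))
```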
Use the following instructions to configure replicators if needed.
1. Create a cluster using the cluster wizard.
   a) Check the box to create a replication domain, but
   b) Leave the replicator box unchecked for all members added to the
   cluster.
2. Click Internal Replication Domains >
your_domain > Replicator Entries > New
   a) Fill in the following configuration:
      Replicator name: Any string identifying this replicator
      Available server: Choose one node that will not be in the cluster
      Hostname: This is the hostname of the node
      Replicator port: Choose an unused port - our default is 7974
      Client port: Choose an unused port - our default is 7973
   b) Click OK and Save.
   Note: The replicator and client ports must not be the same.
3. Repeat step 2 for each replicator.
4. Enable cache replication on each Application Server in the cluster:
   a) Click Application Servers > your_server >
   Dynamic Cache Service > Enable cache replication
   b) Ensure the following for "Internal messaging server" configuration:
      Domain: Domain created through cluster wizard
      Replicator: Choose one of the replicators
   c) Click OK and Save.
5. Repeat step 4 for each member in the cluster.
Nuggets from the field
- Make dynamic cache part of design. Talk with business
users and architects.
- Discuss invalidation requirement scenarios and
frequency.
- Test for exceptions and privacy scenarios. Make sure
nothing unexpected is cached.
- Create cache specific test scenarios that are different
than function test.
- Monitor cache statistics in live site. Tuning is a
continuous task.
- Stay connected with latest dynamic cache fixes. Improve
stability, performance, and gain new features.
Cross Reference information
Segment             | Product                      | Component                 | Platform | Version | Edition
Application Servers | WebSphere Application Server | Data Replication Services |          |         |
Application Servers | Runtimes for Java Technology | Java SDK                  |          |         |