With WebSphere® eXtreme Scale,
your architecture can use local in-memory data caching or distributed
client-server data caching.
WebSphere eXtreme Scale requires
minimal additional infrastructure to operate. The infrastructure consists
of scripts to install, start, and stop a Java™ Platform,
Enterprise Edition application on a server.
Cached data is stored in the eXtreme Scale server, and clients
remotely connect to the server.
Distributed caches offer increased performance, availability,
and scalability and can be configured with dynamic topologies, in
which servers are balanced automatically. You can also add
servers without restarting your existing eXtreme Scale servers. You can
create either simple deployments or large, terabyte-sized deployments
in which thousands of servers are needed.
Maps
A map is a container for key-value pairs
that allows an application to store a value indexed by a key. Maps
support indexes over attributes of the key or the value. The query
runtime automatically uses these indexes to determine the most
efficient way to run a query.
A map set is a collection of maps with a common partitioning
algorithm. The data within the maps is replicated based on the policy
defined on the map set. A map set is only used for distributed topologies
and is not needed for local topologies.
A map set can have a schema associated with it. A schema is
the metadata that describes the relationships between each map when
using homogeneous Object types or entities.
eXtreme Scale can store serializable Java objects in each of the maps
using the ObjectMap API. A schema can be defined over the maps to
identify the relationships between the objects in the maps, where each
map holds objects of a single type. Defining a schema for maps is
required to query the contents of the map objects. eXtreme Scale can have multiple
map schemas defined. For more information,
see ObjectMap API.
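As a minimal local (single-JVM) sketch of this approach, assuming the eXtreme Scale client libraries are on the classpath; the grid and map names here are invented for illustration, and exception handling is omitted:

```java
import com.ibm.websphere.objectgrid.ObjectGrid;
import com.ibm.websphere.objectgrid.ObjectGridManager;
import com.ibm.websphere.objectgrid.ObjectGridManagerFactory;
import com.ibm.websphere.objectgrid.ObjectMap;
import com.ibm.websphere.objectgrid.Session;

public class ObjectMapSketch {
    public static void main(String[] args) throws Exception {
        // Create a local, in-JVM grid and define a map on it.
        ObjectGridManager manager = ObjectGridManagerFactory.getObjectGridManager();
        ObjectGrid grid = manager.createObjectGrid("grid");
        grid.defineMap("customers");

        // All map access happens within a session transaction.
        Session session = grid.getSession();
        ObjectMap customers = session.getMap("customers");

        session.begin();
        customers.insert("C001", "Ada Lovelace"); // store a serializable value by key
        session.commit();

        session.begin();
        Object value = customers.get("C001");     // read it back
        session.commit();
        System.out.println(value);
    }
}
```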
eXtreme Scale can also store entities
using the EntityManager API. Each entity is associated with a map.
The schema for an entity map set is automatically discovered using
either an entity descriptor XML file or annotated Java classes. Each entity has a set of key attributes
and set of non-key attributes. An entity can also have relationships
to other entities. eXtreme Scale supports
one-to-one, one-to-many, many-to-one, and many-to-many relationships.
Each entity is physically mapped to a single map in the map set. Entities
allow applications to easily work with complex object graphs that span
multiple maps. A distributed topology can have multiple entity schemas. For more information, see EntityManager API.
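As a sketch of the entity approach, using annotated Java classes; the class and attribute names here are invented for illustration, and exception handling is omitted:

```java
import java.util.Collection;

import com.ibm.websphere.objectgrid.ObjectGrid;
import com.ibm.websphere.objectgrid.ObjectGridManagerFactory;
import com.ibm.websphere.objectgrid.Session;
import com.ibm.websphere.objectgrid.em.EntityManager;
import com.ibm.websphere.projector.annotations.Entity;
import com.ibm.websphere.projector.annotations.Id;
import com.ibm.websphere.projector.annotations.ManyToOne;
import com.ibm.websphere.projector.annotations.OneToMany;

@Entity
class Customer {
    @Id String id;                        // key attribute
    String name;                          // non-key attribute
    @OneToMany Collection<Order> orders;  // one-to-many relationship
}

@Entity
class Order {
    @Id String id;
    @ManyToOne Customer customer;         // many-to-one back reference
}

public class EntitySketch {
    public static void main(String[] args) throws Exception {
        // Registering the annotated classes discovers the schema and
        // creates one map per entity in the map set.
        ObjectGrid grid = ObjectGridManagerFactory.getObjectGridManager()
                .createObjectGrid("grid");
        grid.registerEntities(new Class[] { Customer.class, Order.class });

        Session session = grid.getSession();
        EntityManager em = session.getEntityManager();
        em.getTransaction().begin();
        Customer c = new Customer();
        c.id = "C001";
        c.name = "Ada";
        em.persist(c);                    // stored in the Customer entity's map
        em.getTransaction().commit();
    }
}
```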
Containers, partitions, and shards
The container
is a service that stores application data for the grid. This data
is generally broken into parts, which are called partitions, and hosted
across multiple containers. Each container in turn hosts a subset
of the complete data. A JVM might host one or more containers and
each container can host multiple shards.
Remember: Plan
out the heap size for the containers, which host all of your data.
Configure the heap settings accordingly.
Partitions host a subset of the data in the grid.
eXtreme Scale automatically places
multiple partitions in a single container and spreads the partitions
out as more containers become available.
Important: Choose
the number of partitions carefully before final deployment, because the
number of partitions cannot be changed dynamically. A hash mechanism
is used to locate partitions in the network and there is no way for
ObjectGrid to rehash the entire data set after it has been deployed.
Overestimate the number of partitions.
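The reason the count is fixed can be sketched with a plain hash-routing function; this is a stdlib illustration of the general mechanism, not the actual ObjectGrid algorithm:

```java
public class PartitionRouter {
    // Map a key to one of nPartitions partitions by hashing.
    // Masking with Integer.MAX_VALUE keeps the hash non-negative.
    public static int partitionFor(Object key, int nPartitions) {
        return (key.hashCode() & Integer.MAX_VALUE) % nPartitions;
    }

    public static void main(String[] args) {
        int n = 13;
        // The same key always routes to the same partition...
        System.out.println(partitionFor("customer:42", n));
        // ...but changing the partition count re-maps most keys,
        // which is why the count cannot change after deployment.
        System.out.println(partitionFor("customer:42", n + 1));
    }
}
```

Because every client computes the same function over the same partition count, the grid can locate data without a lookup per key; the price is that the count is baked into the placement of all existing data.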
Shards are instances of partitions and have one of two roles:
primary or replica. The primary shard and its replicas make up the
physical manifestation of the partition. Every partition has several
shards that each host all of the data contained in that partition.
One shard is the primary, and the others are replicas, which are redundant
copies of the data in the primary shard. A primary shard is the only
partition instance that allows transactions to write to the cache.
A replica shard is a "mirrored" instance of the partition. It receives
updates synchronously or asynchronously from the primary shard. The
replica shard only allows transactions to read from the cache. Replicas
are never hosted in the same container as the primary and are not
normally hosted on the same machine as the primary.
To increase the availability of the data, or increase persistence
guarantees, replicate the data. However, replication adds cost to
the transaction and trades performance in return for availability.
With eXtreme Scale, you can
control this cost, because both synchronous and asynchronous replication
are supported, as well as hybrid replication models that combine the
two modes. A synchronous replica shard receives
updates as part of the transaction of the primary shard to guarantee
data consistency. A synchronous replica can double the response time
because the transaction has to commit on both the primary and the
synchronous replica before the transaction is complete. An asynchronous
replica shard receives updates after the transaction commits to limit
impact on performance, but introduces the possibility of data loss
as the asynchronous replica can be several transactions behind the
primary.
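These trade-offs can be illustrated with a toy latency model; the additive model and the numbers are illustrative assumptions, not product measurements:

```java
public class ReplicationCost {
    // Commit latency when the transaction must also commit on the
    // synchronous replicas: each one is on the commit path.
    public static long syncCommitMillis(long primaryMillis, long replicaMillis,
                                        int syncReplicas) {
        return primaryMillis + (long) syncReplicas * replicaMillis;
    }

    // Asynchronous replicas receive updates after the commit,
    // so they add no commit latency (at the risk of data loss).
    public static long asyncCommitMillis(long primaryMillis) {
        return primaryMillis;
    }

    public static void main(String[] args) {
        // One sync replica with the same write cost as the primary
        // roughly doubles the response time.
        System.out.println(syncCommitMillis(10, 10, 1)); // 20
        System.out.println(asyncCommitMillis(10));       // 10
    }
}
```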
Clients
Clients connect to a catalog service,
retrieve a description of the server topology, and communicate directly
to each server as needed. When the server topology changes because
new servers are added or existing servers have failed, the client
is automatically routed to the appropriate server that is hosting
the data. Clients must examine the keys of application data to determine
the partition to which each request is routed. Clients can read data from multiple
partitions in a single transaction. However, clients can update only
a single partition in a transaction. After a client transaction updates
entries in a partition, all further updates in that transaction must go to the same partition.
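The single-partition update rule can be sketched as client-side bookkeeping; this is a simplified illustration, not the actual eXtreme Scale client logic:

```java
public class ClientTxn {
    // Partition of the first updated entry; null until the first update.
    private Integer writePartition = null;

    public static int partitionFor(Object key, int nPartitions) {
        return (key.hashCode() & Integer.MAX_VALUE) % nPartitions;
    }

    // Reads are unrestricted: a transaction may read from any partition.
    public boolean tryRead(Object key, int nPartitions) {
        return true;
    }

    // Updates are pinned to the partition of the first updated entry.
    public boolean tryUpdate(Object key, int nPartitions) {
        int p = partitionFor(key, nPartitions);
        if (writePartition == null) {
            writePartition = p;
        }
        return writePartition == p;
    }

    public static void main(String[] args) {
        ClientTxn txn = new ClientTxn();
        System.out.println(txn.tryUpdate(1, 10));  // true: pins the transaction to partition 1
        System.out.println(txn.tryUpdate(11, 10)); // true: 11 also routes to partition 1
        System.out.println(txn.tryUpdate(2, 10));  // false: partition 2 is a different partition
    }
}
```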
The
possible deployment combinations are included in the following list:
- A catalog service exists in its own grid of Java virtual machines. A single catalog service
can be used to manage multiple instances of eXtreme Scale.
- A container can be started in a JVM by itself or can be
loaded into an arbitrary JVM with
other containers for different ObjectGrid instances.
- A client can exist in any JVM and
communicate with one or more ObjectGrid instances. A client can also
exist in the same JVM as
a container.
Figure 7. Possible topologies
Catalog service
The catalog service hosts logic
that should be idle during a steady state and has little influence
on scalability. The catalog service is built to service hundreds of
containers becoming available simultaneously and runs services to
manage the containers.
Figure 8. Catalog service
The catalog responsibilities consist of the following services:
- Location service
- The location service helps clients locate the containers that host
their applications, and helps containers register hosted applications
with the placement service. The location service runs in all of the
grid members to scale out this function.
- Placement service
- The placement service is the central nervous system for the grid
and is responsible for allocating individual shards to their host
container. The placement service runs as a one-of-N elected service
in the cluster so there is always exactly one instance of the placement
service running. If that instance should stop, another process takes
over. All state of the catalog service is replicated across all servers
hosting the catalog service for redundancy.
- Core group manager
- The core group manager manages peer grouping for health monitoring,
organizes containers into small groups of servers, and automatically
federates the groups of servers. When a container first contacts the
catalog service, the container waits to be assigned to either a new
or an existing group of several Java virtual
machines (JVMs). Each group monitors
the availability of its members through
heartbeating. One of the group members relays availability information
to the catalog service to allow for reacting to failures by reallocation
and route forwarding.
- Administration
- The four stages of administering your WebSphere eXtreme Scale environment are planning,
deploying, managing, and monitoring. See the Administration
Guide for more information on
each stage.
For availability, configure a catalog service grid.
A catalog service grid consists of multiple Java virtual machines, including a master JVM
and a number of backup Java virtual
machines.
Figure 9. Catalog service grid