The container server
stores application data for the data
grid. This data is generally broken into parts, which are called partitions.
Partitions are hosted across multiple shard containers. Each container
server in turn hosts a subset of the complete data. A JVM might host
one or more shard containers and each shard container can host multiple
shards.
Remember: Plan out the
heap size for the container
servers, which host all of your data. Configure the heap settings
accordingly.
Figure 1. Container server
Partitions host a subset
of the data in the grid.
WebSphere® eXtreme Scale automatically places
multiple partitions in a single shard container and spreads the partitions
out as more container servers become available.
Important: Choose
the number of partitions carefully before final deployment because
the number of partitions cannot be changed dynamically. A hash mechanism
is used to locate partitions in the network and eXtreme Scale cannot rehash the
entire data set after it has been deployed. As a general rule, you
can overestimate the number of partitions
Shards are
instances of partitions and have one of two roles: primary
or replica. The primary shard and its replicas make up the physical
manifestation of the partition. Every partition has several shards
that each host all of the data contained in that partition. One shard
is the primary, and the others are replicas, which are redundant copies
of the data in the primary shard. A primary shard is the only partition
instance that allows transactions to write to the cache. A replica
shard is a "mirrored" instance of the partition. It receives updates
synchronously or asynchronously from the primary shard. The replica
shard only allows transactions to read from the cache. Replicas are
never hosted in the same container server as the primary and are not
normally hosted on the same machine as the primary.
To
increase the availability of the data, or increase persistence
guarantees, replicate the data. However, replication adds cost to
the transaction and trades performance in return for availability.
With eXtreme Scale, you can
control the cost as both synchronous and asynchronous replication
is supported, as well as hybrid replication models using both synchronous
and asynchronous replication modes. A synchronous replica shard receives
updates as part of the transaction of the primary shard to guarantee
data consistency. A synchronous replica can double the response time
because the transaction has to commit on both the primary and the
synchronous replica before the transaction is complete. An asynchronous
replica shard receives updates after the transaction commits to limit
impact on performance, but introduces the possibility of data loss
as the asynchronous replica can be several transactions behind the
primary.