6.6.14.10: Establishing database failover support with HACMP
This document describes a high availability database failover
scenario for IBM WebSphere Application Server Advanced Edition using two
AIX systems and a shared external disk that stores the databases
used by WebSphere applications. The scenario uses an IBM product
called High Availability Cluster Multi-Processing (HACMP).
For some steps in the setup procedure, you are referred to
related documents for more detailed instructions. If you need these
documents, see the Related information at the bottom of this
article for links.
A cluster of two AIX systems can be used to build a high
availability database environment. One AIX system acts as the
primary DB2 server. The second system acts as the backup DB2
server, providing standby failover support in case the first
system fails.
The cluster in this scenario demonstrates a two-node hot
standby configuration, a common cascading scenario: resources
move to the second, "hot standby" node if the primary node
fails. Both RS/6000 machines run the DB2 server software and the
HACMP software.
The HACMP software is configured to control the DB2 software that
accesses the databases located on the shared disk. By using a shared
disk, either system can access the same databases, including the
WebSphere administrative database (named "WAS" by default) and any
other databases supporting applications managed by WebSphere
Application Server.
This prevents the primary AIX system and its DB2 server
software from being a single point of failure.
IBM WebSphere Application Server is installed on one or more
other systems that access the DB2 databases remotely over the
network. Initially, it connects through the first DB2 server
system. When that primary system fails, the backup system takes
over as the primary system and assumes the TCP/IP address of the
first system.
When a failure occurs, HACMP will:
- Detect the failure, such as a system, network, or application
failure
- Stop the DB2 server on the first system
- Release the shared resources from the first system (disk
volume groups)
- Assume the service IP address on the standby adapter of the
second system
- Assign the shared resources to the second system
- Start the DB2 server on the second system
In addition to IBM WebSphere Application Server, this scenario requires:
Software:
- Supported AIX version
- HACMP Version 4.3.1
- Supported DB2 version

Hardware:
- 2 RS/6000 workstations
- 4 Token Ring 16/4 adapters
- 1 shared external disk configuration using Serial Storage Architecture (SSA)
- 1 IBM serial null modem cable
See the product prerequisites for information about supported software.
The following procedure demonstrates how to set up a two-node hot
standby HACMP cluster. For more detail, consult the HACMP
for AIX - Installation Guide.
- Install network adapters in the cluster nodes
Install two adapters on each node:
- service/boot adapter
- standby adapter.
- Configure the network settings
Configure the four adapters on the cluster nodes. First,
configure the two standby adapters with the standby TCP/IP
addresses.
Second, configure the two service adapters with the boot
addresses. The service addresses for the same two service adapters
will be configured later by HACMP.
Note that the service and standby adapters must be on the
same physical network but on different subnets. Some details are
shown in the HACMP Test Environment figure in the configuration
overview section.
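For illustration, the adapters can be configured with the chdev
command (or through the smit mktcpip menus). The interface names
and addresses below are placeholders only; substitute your own:
    # On node 1 (tr0 = service/boot adapter, tr1 = standby adapter)
    chdev -l tr0 -a netaddr=192.168.10.1 -a netmask=255.255.255.0 -a state=up
    chdev -l tr1 -a netaddr=192.168.20.1 -a netmask=255.255.255.0 -a state=up
Repeat on node 2 with that node's boot and standby addresses,
keeping each standby adapter on its own subnet.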
- Interconnect the workstations for HACMP
Use the null modem cable to connect the two nodes through their
serial ports. This serial connection will act as a private network
between the two HACMP cluster nodes and will carry the "keep
alive" packets between them without using the public TCP/IP
network.
Test the RS232 network by issuing the following command on each
system, where /dev/ttyx is the tty device defined for the serial
port:
    stty < /dev/ttyx
The stty attributes should be displayed on both systems.
As an alternative to the RS232 connection, if you use a
SCSI device or an SSA shared disk system, you can set up Target
Mode SCSI/SSA connections to provide an alternative serial
network. For details, see the HACMP for AIX - Installation Guide.
- Install shared disk devices
The application data for applications being managed by IBM
WebSphere Application Server needs to be on a shared device that
both nodes can access. It can be a shared disk or a network file
system. The device itself should be mirrored or have data
protection to avoid data corruption.
The configuration described in the configuration overview
depicts an IBM SSA Disk Subsystem for this purpose.
- Define shared volume groups and file systems
Creating the volume groups, logical volumes, and file systems
shared by the nodes in an HACMP cluster requires steps to be
performed on all nodes in the cluster.
In general, define the components on one node and then import
the volume group on the other nodes in the cluster. This ensures
that the ODM definitions of the shared components are the same on
all nodes in the cluster.
Whether to define a non-concurrent access or a concurrent access
volume group depends on how you set up the cluster. In the hot
standby configuration, a shared non-concurrent access volume group
with a Journaled File System is used so that only one node can
access the volume group and the file system at a time. HACMP
switches the resource from one node to the other.
To learn more about defining shared volume groups and file
systems, see the HACMP for AIX - Installation Guide and
your AIX documentation.
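For illustration only (the disk name, volume group name, file
system size, and mount point are assumptions), the shared
non-concurrent volume group might be created on one node and
imported on the other as follows:
    # On node 1: create the volume group and file system on the shared disk
    mkvg -y havg hdisk2
    crfs -v jfs -g havg -m /ha -a size=2097152    # size in 512-byte blocks
    chvg -a n havg       # disable automatic varyon; HACMP controls the volume group
    varyoffvg havg
    # On node 2: import the volume group so the ODM definitions match
    importvg -y havg hdisk2
    chvg -a n havg
    varyoffvg havg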
- Install HACMP software
Use "smit install_latest" to install:
- cluster.adt
- cluster.base
- cluster.clvm
- cluster.cspoc
and related files on both nodes.
HAView, a monitoring tool, is not needed in this configuration.
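To confirm that the filesets installed cleanly on each node, you
can list them with a simple check such as:
    lslpp -l "cluster.*"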
- Install DB2 server
Install DB2 server on both nodes. The DB2 installation path can
be either in a directory shared by both nodes or on a non-shared
file system. When using a non-shared file system, the
installation level must be identical on both nodes.
- Create DB2 instance
Create a DB2 instance for the database. The DB2 instance path,
as with the installation path, can be either on a shared file
system or on a manually mirrored file system. For the
configuration discussed in the overview, the DB2 instance was
created on the shared SSA disk system.
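As a sketch (the instance name, fenced user, and installation
path are examples; the path shown is typical for DB2 Version 6 on
AIX), the instance might be created as root with the db2icrt
command:
    /usr/lpp/db2_06_01/instance/db2icrt -u db2fenc1 db2inst1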
- Confirm that the WAS database exists for the IBM WebSphere
administrative server to use
Make sure the WAS database exists. If not, create one.
In either case, ensure that the application
heap size (APPLHEAPSZ) of the database is set to 256.
To manually create the WAS
database and set the application heap size, execute these
commands:
db2 CREATE DATABASE was
db2 UPDATE DB CFG FOR was USING APPLHEAPSZ 256
If you later need to repeat the installation procedure, be sure
to drop the WAS database before you install again. Use IBM DB2
Control Center or the following command to drop the database:
db2 DROP DATABASE WAS
- Define the cluster topology
Use "smit hacmp" to define clusters, cluster nodes, network
adapters and network modules. In the configuration above,
"cascade" was used so node 1 always has higher priority than node
2.
For details, see the HACMP for AIX - Installation
Guide.
- Configure cluster resources
In HACMP terms, an "application server" is a cluster resource
made highly available by the HACMP software. In the configuration
shown in the overview section, the DB2 instance on the shared disk
is the "application server."
An application server has a start script and a stop script. The
start script starts the application server. The stop script stops
the application server so that the application resource can be
released, allowing the second node to take it over and restart the
application.
Sample start script (start.sh):
    db2start
Sample stop script (stop.sh):
    db2 force application all; db2stop
Sample start usage:
    su - db2inst1 start.sh
Sample stop usage:
    su - db2inst1 stop.sh
Create the start and stop scripts for both cluster nodes.
Configure the application server with the path to them.
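A minimal sketch of the two scripts, assuming the instance owner
is db2inst1 and that HACMP runs the scripts as root (the user
name is an example):
    #!/bin/ksh
    # start.sh - switch to the instance owner and start DB2
    su - db2inst1 -c "db2start"

    #!/bin/ksh
    # stop.sh - force off remaining applications, then stop the instance
    su - db2inst1 -c "db2 force application all; db2stop"
Embedding the su call in the scripts lets HACMP invoke them
directly as root; the samples above instead keep the scripts
minimal and apply su at invocation time.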
- Start the HACMP cluster
Start HACMP on the first node. Start HACMP on the second node
after the start is complete on the first node. Use the /tmp/cm.log
file to monitor the cluster events.
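For example (the smit fastpath is assumed from HACMP 4.x), you
can start cluster services and then watch the log on each node:
    smit clstart
    tail -f /tmp/cm.log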
Now you can install and configure WebSphere Application Server
and a Web server on other systems. You can set up the front-end
systems in a variety of ways. The simplest is to use a single
system to run both the Web server and WebSphere Version 3.02,
which is appropriate for a test environment.
- Install the DB2 client software
- Configure the DB2 client to connect to the remote WAS database
(a sample catalog sketch follows this procedure)
- Install the Web server and WebSphere Application Server.
When the Application Server installation prompts you for
information about the administrative database, specify
the previously configured remote database.
(Alternatively, you
can change the database setting by modifying the
"com.ibm.ejs.sm.adminServer.dbUrl" directive in the admin.config
file).
- Start HACMP on the first node. Monitor the /tmp/cm.log file
for completion messages.
- Start HACMP on the second node.
- On the WebSphere system, start the WebSphere administrative
server. Monitor the /logs/tracefile for the message:
WebSphere Administration server open for e-business
- Start the WebSphere administrative console and any application
servers.
- Initiate a controlled failover on the first node:
- Issue the command:
smit hacmp
- From the menus, select Cluster Services -> Stop Cluster
Services -> Stop now with "Shutdown mode - graceful with
takeover." You can monitor the /tmp/cm.log file on both systems
to watch the progress.
- When the failover is complete, refresh or start a new
administrative console. Check the topology to see that the servers
are functional.
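The DB2 client configuration mentioned earlier in this procedure
can be done with catalog commands like the following sketch; the
node name, host name (the cluster's service address), port, user,
and password are examples only:
    db2 CATALOG TCPIP NODE hanode REMOTE svchost SERVER 50000
    db2 CATALOG DATABASE was AT NODE hanode
    db2 CONNECT TO was USER db2inst1 USING password
If you modify admin.config instead of answering the installation
prompts, the com.ibm.ejs.sm.adminServer.dbUrl directive takes a
JDBC URL; for a cataloged DB2 alias this typically has the form
jdbc:db2:was (the value shown is an assumption for this example).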
During times when there is no active database connection, such as
when the remote database server is stopped, and soon after the
database connection is reestablished, the Java administrative
console (WebSphere Administrative Console) produces some warning
and error messages.
The most common messages are:
getAttributeFailure
Failed to roll back ... Connection is closed
In such circumstances, it is recommended that you close the
console and start a new one.