6.6.14.10: Establishing database failover support with HACMP

This document describes a high availability database failover scenario for IBM WebSphere Application Server Advanced Edition using two AIX systems and a shared external disk that stores the databases used by WebSphere applications. The scenario uses an IBM product called High Availability Cluster Multi-Processing (HACMP).

For some steps in the setup procedure, you are referred to related documents for more detailed instructions. If you need these documents, see the Related information at the bottom of this article for links.

Configuration overview

A cluster of two AIX systems can be used to build a high availability database environment. One AIX system acts as the primary DB2 server. The second system acts as the backup DB2 server, providing standby failover support if the primary system fails.

The cluster described here is a two node hot standby configuration, a common cascading scenario: resources move to the second, "hot standby" node if the primary node fails. Both RS/6000 machines run the DB2 server software and the HACMP software.

The HACMP software is configured to control the DB2 software that accesses the databases located on the shared disk. By using a shared disk, either system can access the same databases, including the WebSphere administrative database (named "WAS" by default) and any other databases supporting applications managed by WebSphere Application Server.

This prevents the primary AIX system and its DB2 server software from being a single point of failure.

IBM WebSphere Application Server is installed on one or more other systems that access the DB2 databases remotely over the network. Initially, it connects through the first DB2 server system. If that primary DB2 system fails, the backup system takes over as the primary system and assumes the TCP/IP address of the first system.

When a failure occurs, HACMP will:

  1. Detect the failure, such as a system, network, or application failure
  2. Stop the DB2 server on the first system
  3. Release the shared resources from the first system (disk volume groups)
  4. Assume the service IP address on the standby adapter of the second system
  5. Assign the shared resources to the second system
  6. Start the DB2 server on the second system

Hardware and software

In addition to IBM WebSphere Application Server, this scenario requires:

Software:
  • Supported AIX version
  • HACMP Version 4.3.1
  • Supported DB2 version

Hardware:
  • 2 RS/6000 workstations
  • 4 Token Ring 16/4 adapters
  • 1 shared external disk configuration using Serial Storage Architecture (SSA)
  • 1 IBM serial null modem cable

See the product prerequisites for information about supported software.

Building a two node hot standby HACMP cluster

The following procedure demonstrates how to set up a two node hot standby HACMP cluster. For more detail, consult the HACMP for AIX - Installation Guide.

  1. Install network adapters in the cluster nodes

    Install two adapters on each node:

    • a service/boot adapter
    • a standby adapter
  2. Configure the network settings

    Configure the four adapters on the cluster nodes. First, configure the two standby adapters with the standby TCP/IP addresses.

    Second, configure the two service adapters with their boot addresses. The service addresses for these adapters are configured later by HACMP.

    Note that the service adapter and the standby adapter on each node must be on the same physical network but on different subnets (see the configuration overview).
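
    As an illustration only, the address layout might resemble the following /etc/hosts entries. The host names and addresses are hypothetical; the key point is that the standby addresses are on a different subnet from the boot and service addresses:

      192.168.1.10   node1_boot    # boot address, replaced by the service address when HACMP starts
      192.168.1.11   node1_svc     # service address, acquired and moved by HACMP
      192.168.1.20   node2_boot
      192.168.1.21   node2_svc
      192.168.2.10   node1_stby    # standby addresses on a separate subnet
      192.168.2.20   node2_stby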

  3. Interconnect the workstations for HACMP

    Use the null modem cable to connect the two nodes through their serial ports. This serial connection will act as a private network between the two HACMP cluster nodes and will carry the "keep alive" packets between them without using the public TCP/IP network.

    Test the RS232 network by issuing the command:

    stty < /dev/ttyx

    on each system, where ttyx is the tty device defined for the serial connection (for example, /dev/tty1). The stty attributes should be displayed on both systems.

    As an alternative to using the RS232 connection, if you use a SCSI device or SSA shared disk system, you can set the Target Mode SCSI/SSA connections to provide an alternative serial network. For details, see the HACMP for AIX - Installation Guide.

  4. Install shared disk devices

    The data for applications managed by IBM WebSphere Application Server must be on a shared device that both nodes can access. The device can be a shared disk or a network file system, and it should be mirrored or have some other form of data protection.

    The configuration described in the overview uses an IBM SSA disk subsystem for this purpose.

  5. Define shared volume groups and file systems

    Creating the volume groups, logical volumes, and file systems shared by the nodes in an HACMP cluster requires steps to be performed on all nodes in the cluster.

    In general, define the components on one node and then import the volume group on the other nodes in the cluster. This ensures that the ODM definitions of the shared components are the same on all nodes in the cluster.

    Whether to define a non-concurrent access or a concurrent access volume group depends on how you set up the cluster. The hot standby configuration described here uses a shared non-concurrent access volume group with a journaled file system (JFS), so only one node can access the volume group and the file system at a time; HACMP switches the resource from one node to the other.
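
    The exact commands depend on your disks and naming conventions. A minimal sketch, assuming the shared SSA disk is seen as hdisk2 on both nodes and the shared volume group is named db2vg (both names are illustrative):

      # On node 1: create the shared volume group and a journaled file system on it
      mkvg -n -y db2vg hdisk2
      crfs -v jfs -g db2vg -m /db2shared -A no -a size=2097152   # size in 512-byte blocks
      varyoffvg db2vg          # release the group so node 2 can import it

      # On node 2: import the definition so the ODM matches on both nodes
      importvg -y db2vg hdisk2
      chvg -a n db2vg          # do not activate the group automatically at restart
      varyoffvg db2vg          # leave it offline; HACMP activates it where needed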

    To learn more about defining shared volume groups and file systems, see the HACMP for AIX - Installation Guide and your AIX documentation.

  6. Install HACMP software

    Use "smit install_latest" to install:

    • cluster.adt
    • cluster.base
    • cluster.clvm
    • cluster.cspoc
    and related filesets on both nodes. HAView, a monitoring tool, is not needed in this configuration.
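
    If you prefer the command line to SMIT, the same filesets can be installed with installp. This sketch assumes the HACMP installation images are on a CD mounted at /dev/cd0:

      installp -agXd /dev/cd0 cluster.base cluster.adt cluster.clvm cluster.cspoc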

  7. Install DB2 server

    Install the DB2 server on both nodes. The DB2 installation path can be either a directory shared by both nodes or a non-shared file system. When using a non-shared file system, the installation level must be identical on both nodes.

  8. Create DB2 instance

    Create a DB2 instance for the database. The DB2 instance path, like the installation path, can be on either a shared file system or a manually mirrored file system. In the configuration discussed in the overview, the DB2 instance is created on the shared SSA disk system.
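
    For example, on DB2 UDB for AIX the instance can be created as root with the db2icrt command. The instance name db2inst1, the fenced user db2fenc1, and the installation path are illustrative and depend on your DB2 version; to place the instance on the shared disk, the instance owner's home directory should reside on the shared file system:

      /usr/lpp/db2_<version>/instance/db2icrt -u db2fenc1 db2inst1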

  9. Confirm that the WAS database exists for the IBM WebSphere administrative server to use

    Make sure the WAS database exists. If not, create one. In either case, ensure that the application heap size (APPLHEAPSZ) of the database is set to 256.

    To manually create the WAS database and set the application heap size, execute these commands:

    db2 CREATE DATABASE was
    db2 UPDATE DB CFG FOR was USING APPLHEAPSZ 256
    

    If you later need to repeat the installation procedure, be sure to drop the WAS database before you install again. Use IBM DB2 Control Center or the following command to drop the database:

    db2 DROP DATABASE WAS
  10. Define the cluster topology

    Use "smit hacmp" to define clusters, cluster nodes, network adapters and network modules. In the configuration above, "cascade" was used so node 1 always has higher priority than node 2.

    For details, see the HACMP for AIX - Installation Guide.

  11. Configure cluster resources

    In HACMP terms, an "application server" is a cluster resource made highly available by the HACMP software. In the configuration shown in the overview section, the DB2 instance on the shared disk is the "application server."

    An application server has a start script and a stop script. The start script starts the application server. The stop script stops the application server so that the application resource can be released, allowing the second node to take it over and restart the application.

    Sample start script (start.sh):

      db2start

    Sample stop script (stop.sh):

      db2 force application all; db2stop

    Sample usage, where db2inst1 is the DB2 instance owner:

      su - db2inst1 -c start.sh
      su - db2inst1 -c stop.sh

    Create the start and stop scripts at the same path on both cluster nodes, and configure the HACMP application server with that path.
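
    Because HACMP runs the start and stop scripts as root, each script typically switches to the DB2 instance owner before issuing the DB2 commands. A minimal sketch, assuming the instance owner is db2inst1:

      #!/bin/ksh
      # start.sh - HACMP application server start script (run as root)
      su - db2inst1 -c "db2start"
      exit 0

      #!/bin/ksh
      # stop.sh - HACMP application server stop script (run as root)
      # Force off remaining connections, then stop DB2 so the shared
      # volume group can be released for takeover.
      su - db2inst1 -c "db2 force application all; db2stop"
      exit 0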

  12. Start the HACMP cluster

    Start HACMP on the first node. Start HACMP on the second node after the start is complete on the first node. Use the /tmp/cm.log file to monitor the cluster events.
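
    For example, cluster services can be started from SMIT on each node and the events watched from another window (smit clstart is the usual fastpath on HACMP 4.3.1; you can also navigate there from "smit hacmp" under Cluster Services):

      smit clstart             # start cluster services on this node
      tail -f /tmp/cm.log      # watch the cluster events as they run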

WebSphere and Web server configuration options

Now you can install and configure WebSphere and a Web server on other systems. You can set up the front-end systems in a variety of ways. The simplest approach, appropriate for a test environment, is to run both the Web server and WebSphere Version 3.02 on a single system.

  1. Install the DB2 client software.
  2. Configure the DB2 client to connect to the remote WAS database (an example follows this list).
  3. Install the Web server and WebSphere Application Server.

    When the Application Server installation prompts you for information about the administrative database, specify the previously configured remote database. (Alternatively, you can change the database setting by modifying the "com.ibm.ejs.sm.adminServer.dbUrl" directive in the admin.config file).
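
For example, the remote WAS database can be cataloged on the WebSphere system with commands like the following. The node name wasnode, the host name dbserver, and the port 50000 are illustrative; the host name should resolve to the HACMP service address so that the client connection follows a failover:

    db2 catalog tcpip node wasnode remote dbserver server 50000
    db2 catalog database was at node wasnode
    db2 terminate

With the client configured this way, the dbUrl directive in admin.config typically takes a form such as:

    com.ibm.ejs.sm.adminServer.dbUrl=jdbc:db2:was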

Performing a controlled failover to verify the configuration

  1. Start HACMP on the first node. Monitor the /tmp/cm.log file for completion messages.
  2. Start HACMP on the second node.
  3. On the WebSphere system, start the WebSphere administrative server. Monitor the /logs/tracefile for the message:
    WebSphere Administration server open for e-business
  4. Start the WebSphere administrative console and any application servers.
  5. Initiate a controlled failover on the first node:
    1. Issue the command:
      smit hacmp
    2. From the menus, select Cluster Services -> Stop Cluster Services -> Stop now with "Shutdown mode - graceful with takeover." You can monitor /tmp/cm.log on both systems to watch the progress.
  6. When the failover is complete, refresh or start a new administrative console. Check the topology to see that the servers are functional.

Current limitations and considerations

While there is no active database connection (for example, while the remote database server is stopped), and for a short time after the connection is reestablished, the Java administrative console (WebSphere Administrative Console) produces warning and error messages. The most common messages are:

getAttributeFailure
Failed to roll back ... Connection is closed

In such circumstances, it is recommended that you close the console and start a new one.
