Administration Guide

HACMP ES Event Monitoring and User-defined Events

Shutting down DB2 database partitions on an AIX physical node when paging space reaches a certain percentage of fullness, and restarting a DB2 database partition, or initiating a failover operation if a process dies on a given node, are two examples of user-defined events. Examples that illustrate user-defined events, such as shutting down a database partition and forcing a transaction abort to free paging space, can be found in the samples subdirectory.

A rules file, /user/sbin/cluster/events/rules.hacmprd, contains HACMP events. Each event description in this file has the following nine components:

Event name, which must be unique.
State, or qualifier for the event. The event name and state are the rule triggers. HACMP ES Cluster Manager initiates recovery only if it finds a rule with a trigger corresponding to the event name and state.
Resource program path, a full-path specification of the xxx.rp file containing the recovery program.
Recovery type. This is reserved for future use.
Recovery level. This is reserved for future use.
Resource variable name, which is used for Event Manager events.
Instance vector, which is used for Event Manager events. This is a set of elements of the form "name=value". The values uniquely identify the copy of the resource in the system and, by extension, the copy of the resource variable.
Predicate, which is used for Event Manager events. This is a relational expression between a resource variable and other elements. When this expression is true, the Event Management subsystem generates an event to notify the Cluster Manager and the appropriate application.
Rearm predicate, which is used for Event Manager events. This is a predicate used to generate an event that alters the status of the primary predicate. This predicate is typically the inverse of the primary predicate. It can also be used with the event predicate to establish an upper and a lower boundary for a condition of interest.

Each object requires one line in the event definition, even if the line is not used. If these lines are removed, HACMP ES Cluster Manager cannot parse the event definition properly, and this may cause the system to hang. Any line beginning with "#" is treated as a comment line.
Note: The rules file requires exactly nine lines for each event definition, not counting any comment lines. When adding a user-defined event at the bottom of the rules file, it is important to remove the unnecessary empty line at the end of the file, or the node will hang.

Following is an example of an event definition for node_up:

   ##### Beginning of the Event Definition: node_up
   #
   TE_JOIN_NODE
   0
   /usr/sbin/cluster/events/node_up.rp
   2
   0
   # 6) Resource variable - only used for event management events
 
   # 7) Instance vector - only used for event management events
 
   # 8) Predicate  - only used for event management events
 
   # 9) Rearm predicate - only used for event management events
 
   ###### End of the Event Definition: node_up

This example is just one of the event definitions that can be found in the rules.hacmprd file. In this example, the recovery program /usr/sbin/cluster/events/node_up.rp is invoked when the node_up event occurs. Values are specified for the state, recovery type, and recovery level. There are four empty lines for resource variable, instance variable, predicate, and rearm predicate.

You can define other events to react to non-standard HACMP ES events. For example, to define the event that the /tmp file system is over 90 per cent full, the rules.hacmprd file must be modified.

Many events are predefined in the IBM Parallel System Support Program (PSSP). These events can be exploited (when used within user-defined events) as follows:

Stop the cluster.
Edit the rules.hacmprd file. Back up the file before modifying it. Add the predefined PSSP event manually. If you need synchronizing points across all nodes in the cluster, use the barrier command in the recovery program. (Read more about the barrier command, and synchronization of recovery programs in the HACMP Concepts, Installation, and Administration Guides.)
Restart the cluster. The rules.hacmprd file is stored in memory when Cluster Manager is started. To accurately implement the changes, restart all the clusters. There should not be any inconsistent rules in a cluster.
Cluster Manager uses all events in the rules.hacmprd file.

HACMP ES uses PSSP event detection to treat user-defined events. The PSSP Event Management subsystem provides comprehensive event detection by monitoring various hardware and software resources.

Resource states are represented by resource variables. Resource conditions are represented as expressions called predicates.

Event Management receives resource variables from the Resource Monitor, which observes the state of specific system resources and transforms this state into several resource variables. These variables are periodically passed to Event Management. Event Management applies predicates that are specified by the HACMP ES Cluster Manager in rules.hacmprd for each resource variable. When the predicate is evaluated as being true, an event is generated and sent to the Cluster Manager. Cluster Manager initiates the voting protocol, and the recovery program file (xxx.rp) is run (according to event priority) on a set of nodes specified by "node sets" in the recovery program.

The recovery program file (xxx.rp) is made up of one or more recovery program lines. Each line is declared in the following format:

   relationship     command_to_run     expected_status     NULL

There must be at least one space between each value in the line. "Relationship" is a value used to decide which program should run on which kind of node. Three types of relationship are supported:

All. The specified command or program is run on all nodes of the current HACMP cluster.
Event. The specified command or program is run only on the nodes on which the event occurred.
Other. The specified command or program is run on all nodes on which the event did not occur.

"Command_to_run" is a quotation mark-delimited string, with or without a full-path specification to an executable program. Only HACMP-delivered event scripts can use a relative-path definition. Other scripts or programs must use the full-path specification, even if they are located in the same directory as the HACMP event scripts.

"Expected_states" is the return code of the specified command or program. It is either an integer value, or an "x". If "x" is used, Cluster Manager does not care about the return code. All other codes must be equal to the expected return code, otherwise Cluster Manager detects the event failure. The handling of this event "hangs" the process until recovery (through manual intervention) occurs. Without manual intervention, the node does not synchronize with the other nodes. Synchronization across all nodes is required for the Cluster Manager to control all the nodes.

"NULL" is a field reserved for future use. The word "NULL" must appear at the end of each line except the barrier line. If you specify multiple recovery commands between two barrier commands, or before the first one, the recovery commands are run in parallel on the node itself, and between the nodes.

The barrier command is used to synchronize all the commands across all the cluster nodes. When a node hits the barrier statement in the recovery program, Cluster Manager initiates the barrier protocol on this node. Since the barrier protocol is a two-phase protocol, all nodes are notified that both phases have completed when all of the nodes have met the barrier in the recovery program, and "voted" to approve the protocol.

The process can be summarized as follows:

Either Group Services/ES (for predefined events) or Event Management (for user-defined events) notifies HACMP ES Cluster Manager of the event.
Cluster Manager reads the rules.hacmprd file, and determines the recovery program that is mapped to the event.
Cluster Manager runs the recovery program, which consists of a sequence of recovery commands.
The recovery program executes the recovery commands, which may be shell scripts or binary commands. (In HACMP for AIX, the recovery commands are the same as the HACMP event scripts.)
Cluster Manager receives the return status from the recovery commands. An unexpected status "hangs" the cluster until manual intervention (using smit cm_rec_aids or the /usr/sbin/cluster/utilities/clruncmd command) is carried out.

HACMP ES Script Files

The following sample scripts for failover recovery and user-defined events are included with DB2 UDB EEE. The script files are located in the $INSTNAME/sqllib/samples/hacmp/es directory. The scripts will work "as is", or you can customize the recovery action.

DB2 database partition recovery script rc.db2pe. This is the script file used to start and stop the HACMP configuration on a database partition. It also works as an HACMP start and stop script for an NFS server of the DB2 instance owner.
DB2-specific user-defined events for HACMP ES. Six default events are included: one for process recovery, two for paging space, and three for NFS and automounter recovery.
DB2 instance NFS file server failover. This script provides failover recovery of the file system server for a DB2 instance to a backup.
Network failover. The scripts network_up_complete, network_back, network_down_complete, and network_down allow SP DB2 database partitions to failover if their SP switch adapter fails.
Scripts to define monitoring events for the SP GUI Perspectives. Monitoring of failover and user-defined recovery is possible through the Event and Hardware Perspectives. Read the documentation for PSSP Administration to find out more about Perspectives.
Installation scripts to install and remove core scripts and events on the HACMP ES nodes.
Script files to create and remove the SP Perspectives problem management (pman) resources for monitoring the HACMP and DB2 configuration.

The recovery scripts must be installed on each node that will run recovery operations. The script files can be centrally installed from the SP control workstation or other designated SP node:

Copy the scripts from the $INSTNAME/sqllib/samples/hacmp/es directory to one of either the SP control workstation or another SP node that can run the pcp and pexec commands. These commands are required for the install operation.
Customize the reg.parms.SAMPLE and failover.parms.SAMPLE files for your environment by setting key parameters (such as BUFFPAGE) for failover configurations. Typically, for mutual takeover configurations, your failure settings will be adjusted lower to one-half the size of your regular settings or less. Also, you will use a copy of these files renamed with your own name (instead of "SAMPLE").
Customize (as necessary) the five parameters NFS_RETRIES, START_RETRIES, MOUNT_NFS, STOP_RETRIES, and FAILOVER in the rc.db2pe file. The retry and failover settings should be adequate for most implementations. The MOUNT_NFS setting should be configured, depending on whether you will be using the package for NFS server availability. You should specify this setting if you want rc.db2pe to mount and verify the NFS home directory of the DB2 instance owner for you. Setting the FAILOVER parameter to "YES" will invoke db2_proc_restart and launch an attempt to restart a DB2 database partition. If the restart operation is unsuccessful, HACMP will shut down with a failover.
Customize db2_paging_action, db2_proc_recovery, and nfs_auto_recovery in the event file. Edit pwq to change this to the DB2 instance owner. Customize db2_paging_action to specify which action is to be taken if paging space becomes more that ninety percent full. (If this does occur, the DB2 database partition is stopped.) Modify the script if additional recovery actions are required.

Use db2_inst_ha to install the scripts and events on the nodes you specify. (HACMP ES must be pre-installed on these nodes before you begin.) The syntax of db2_inst_ha is:

   db2_inst_ha $INSTNAME/sqllib/samples/hacmp/es <nodelist> <DATABASENAME>
 
   where
 
      $INSTNAME/sqllib/samples/hacmp/es is the directory in which the
         scripts and the event are located
      <nodelist> is the pcp or pexec style of the nodes; for example, 
         1-16 or 1,2,3,4
      <DATABASENAME> is the name of the database for regular and 
         failover parameter files.

The reg.parms.SAMPLE and failover.parms.SAMPLE files will be copied to each node and renamed reg.parms.DATABASENAME. db2_inst_ha copies files to each node in /usr/bin, and updates the HACMP event files:

   /usr/sbin/cluster/events/rules.hacmprd
   /usr/sbin/cluster/events/network_up_complete
   /usr/sbin/cluster/events/network_down_complete

Configure your system and scripts with HACMP.
Use the create_db2_events command to install the monitoring events for problem management resources (pman) and the SP GUI Perspectives. Additional configuration and customization in Perspectives is needed. For more information about Perspectives, read the PSSP Administration Guide.

Use the ha_db2stop command to shutdown the database partitions without HACMP ES failover recovery taking place. To use this command, copy the file to the database user's home directory and make sure permissions and ownership are set for that user. To stop the database without failover recovery, then as that user, type:

   ha_db2stop

Note: You must wait for the command to return. Exiting by using a ctrl-C interrupt, or by killing the process, may re-enable failover recovery prematurely, and some database partitions may not be stopped.

DB2 Recovery Script Operations with HACMP ES

HACMP ES invokes the DB2 recovery scripts in the following way:

node_up_local (starting a node)
HACMP runs the node_up sequence, acquiring volume groups, logical volumes, file systems, and IP addresses specified in resource groups that are owned (through cascading) or assigned (through rotating) to this node.
When node_up_local_complete is run, the application server definition that contains rc.db2pe is initiated to start the database partition specified in the application server definitions on this physical node.
Note: rc.db2pe, when running in start mode, adjusts the DB2 parameters specified in reg.parms.DATABASE for each DATABASE in the database directory that matches a parameter (parms) file.

Each node follows this sequence when starting. If you have multiple HACMP clusters and start them in parallel, multiple nodes are brought up at once.

node_down_remote (failover)

HACMP acquires volume groups, logical volumes, file systems, and IP addresses that are specified in the resource group on the designated takeover node.

When node_down_remote_complete is run, HACMP will run rc.db2pe as the application server start script specified in the resource group for this database partition.
Note: rc.db2pe, when running in mutual takeover mode, stops the DB2 database partition running on it, adjusts the DB2 parameters specified in failover.parms.DATABASE for each DATABASE in the database directory that matches a parameter (parms) file, and then starts both database partitions on the physical takeover node.

node_up_remote (reintegration of a failed node - cascading mutual takeover resource group)

When node_up_remote is run on the old takeover node, the application server definition causes rc.db2pe to be run in stop mode.
Note: rc.db2pe, when running in a reintegration mode (mutual takeover), stops both of the database partitions running on it, adjust the DB2 parameters specified in reg.parms.DATABASE for each DATABASE in the database directory that matches a parameter (parms) file, and then starts just the database partition to be kept on this physical takeover node.

The old takeover node releases volume groups, logical volumes, file systems, and IP addresses specified in resource groups that are to be owned by the reintegrating node.

HACMP re-acquires volume groups, logical volumes, file systems, and IP addresses specified in the resource group that is now owned by the reintegrating node.

When node_up_local_complete is run, the application server definition that contains rc.db2pe is initiated to start the DB2 database partition specified in the application server definition on this reintegrating physical node.
Note: rc.db2pe, when running in start mode, adjusts the DB2 parameters specified in reg.parms.DATABASE for each DATABASE in the database directory that matches a parameter (parms) file.

node_down_local (node stop or stop with takeover)

When node_down_local is run on the stopping node, the application server definition causes rc.db2pe to be run in stop mode.
Note: rc.db2pe, when running in stop mode, adjusts the DB2 parameters specified in failover.parms.DATABASE for each DATABASE in the database directory that matches a parameter (parms) file, and then stops the DB2 database partition (this is for takeover).

HACMP releases volume groups, logical volumes, file systems, and IP addresses specified in resource groups that are now owned by the node.

db2_proc_recovery (db2 process death)
All nodes run the db2_proc_restart script. The node on which the failure occurred restarts the correct DB2 database partition.
db2_paging_recovery (paging space recovery)
All nodes run the db2_paging_action script. If a node has more than 70 percent of paging space filled, a wall command is issued. If a node has more than 90 percent of paging space filled, DB2 database partitions on this physical node are stopped and then restarted.
nfs_auto_recovery (nfs or automount process failure)
All nodes run the rc.db2pe script in NFS mode. If an NFS process stops running, it is restarted. Similarly, if the automount process stops running, it is restarted.
network_down_complete (network failure - SP switch)
The net_down script is called. This verifies the network as the SP switch network, and verifies that it is down. If that is the case, it waits a user-defined time interval. The default time interval is 100 seconds.
If the SP switch network comes back, as indicated by an network_up_complete event, no recovery is effected.
If the time limit is reached, HACMP is stopped with failover.

Note:

All events can be monitored through SP problem management and the SP Perspectives GUI.

Other Script Utilities

Other script utilities are available for your use, including:

ha_cmd, a command provided to start HACMP on SP nodes from the control workstation. The syntax is:

   ha_cmd <noderange> <START|STOP|TAKE|FORCE>
 
   where
 
      <noderange> is a pcp or pexec style of SP noderange.
         For example, "ha_cmd 3-6 START" would start HACMP on nodes 3,4,5,6.
                      "ha_cmd 5 TAKE" would shut down HACMP on node 5
                         for mutual takeover.

ha_mon, a command for monitoring HACMP hacmp_out files from the SP control workstation. The syntax is:

   ha_mon <node>
 
   where
 
      <node> is the SP node to be monitored.
 
      ha_mon will "tail -f" the /tmp/hacmp.out file on the node you specify.

db2_turnoff_recov, a command for temporarily disabling all HACMP (non-failover) recovery, and designed for extremely rare situations. No DB2 process, paging, NFS, or automounter recovery is initiated. This function removes the event stanzas for that recovery from the HACMP rules file. HACMP must be stopped and restarted. The syntax is:
```
   db2_turnoff_recov <nodelist>
```
db2_turnon_recov, a command to re-enable HACMP (non-failover) recovery. This command is used after db2_turnoff_recov to restore HACMP rules files, so that user-defined event recovery can occur. HACMP must be stopped and restarted. The syntax is:
```
   db2_turnon_recov <nodelist>
```

[ Top of Page | Previous Page | Next Page ]