Enhanced scalability (ES) is a feature of High Availability Cluster Multi-Processing (HACMP) for AIX Version 4.2.2. The feature currently runs only on RS/6000 SP nodes.
This feature provides the same failover recovery as HACMP, and has the same event structure as previous HACMP versions (see HACMP for AIX, V4.2.2, Enhanced Scalability Installation and Administration Guide). Enhanced scalability also provides:
A rules file (/usr/sbin/cluster/events/rules.hacmprd) contains the definitions of the HACMP events. User-defined events are added to this file. Each event definition includes the script files to be run when the event occurs.
For more information about user-defined events and the rules file, see HACMP ES Event Monitoring and User-defined Events.
The nodes in an HACMP ES cluster exchange messages called heartbeats, or keepalive packets, by which each node informs the other nodes of its availability. When a node stops responding, the remaining nodes in the cluster invoke recovery. This recovery process is called a node_down event and may also be referred to as failover. After recovery completes, the node is re-integrated into the cluster. This re-integration is called a node_up event.
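The heartbeat mechanism described above can be illustrated with a minimal sketch. This is not the HACMP ES implementation: the class name HeartbeatMonitor, its methods, and the timeout value are all hypothetical and chosen only for illustration. As a simplification, it treats a resumed heartbeat from a failed node as the trigger for node_up, whereas the real product performs a full re-integration.

```python
from dataclasses import dataclass, field


@dataclass
class HeartbeatMonitor:
    """Illustrative tracker of peer heartbeats (hypothetical, not an HACMP API)."""

    timeout: float  # seconds of silence before a node is treated as down (assumed)
    last_seen: dict = field(default_factory=dict)  # node name -> last heartbeat time
    down: set = field(default_factory=set)         # nodes currently considered down

    def heartbeat(self, node, now):
        """Record a heartbeat from `node`; return any events it produces."""
        events = []
        if node in self.down:
            # A node that was down is sending heartbeats again.
            self.down.discard(node)
            events.append(("node_up", node))
        self.last_seen[node] = now
        return events

    def check(self, now):
        """Return a node_down event for each node silent longer than the timeout."""
        events = []
        for node, seen in sorted(self.last_seen.items()):
            if node not in self.down and now - seen > self.timeout:
                self.down.add(node)
                events.append(("node_down", node))
        return events
```

In this sketch, a caller would invoke heartbeat() whenever a keepalive packet arrives and check() periodically. Each node_down event is reported once, and a later heartbeat from the same node yields a node_up event.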
There are two types of events: standard events that are anticipated within the operations of HACMP ES, and user-defined events that are associated with the monitoring of parameters in hardware and software components.
One of the standard events is the node_down event. When you plan what should be done as part of the recovery process, HACMP offers two failover options: hot (or idle) standby, and mutual takeover.
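The difference between the two failover options can be sketched as follows. The sketch assumes the usual meaning of the terms: in hot standby, one node sits idle until it takes over the work of a failed node, while in mutual takeover each node runs its own workload and also backs up the other. The function takeover_plan, the node names, and the resource group names are hypothetical and are not part of HACMP.

```python
def takeover_plan(owners, backups, failed):
    """Return resource-group ownership after the node `failed` goes down.

    owners:  resource group -> node currently running it
    backups: resource group -> node designated to take it over
    """
    plan = {}
    for group, node in owners.items():
        # Groups on the failed node move to their backup; others stay put.
        plan[group] = backups[group] if node == failed else node
    return plan


# Hot (idle) standby: nodeB runs nothing until nodeA fails.
hot_owners = {"rg_app": "nodeA"}
hot_backups = {"rg_app": "nodeB"}

# Mutual takeover: both nodes do useful work and back each other up.
mutual_owners = {"rg_app1": "nodeA", "rg_app2": "nodeB"}
mutual_backups = {"rg_app1": "nodeB", "rg_app2": "nodeA"}
```

Calling takeover_plan(mutual_owners, mutual_backups, "nodeA") leaves both resource groups on nodeB, which shows the planning trade-off: mutual takeover uses all nodes in normal operation, but the surviving node must have capacity for both workloads.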