About High Availability Pairs

A High Availability (HA) pair is an Integration Appliance configuration that connects two physical Integration Appliances together, allowing them to automatically synchronize data and perform fail-over operations. The machines that make up the HA pair share the same MAC and IP addresses, thus creating a single network identity.

An HA pair contains one Active and one Standby Integration Appliance. The Integration Appliance that actively processes orchestrations is the Active machine. The Integration Appliance that automatically synchronizes data and performs fail-over operations is the Standby machine. When the machines that make up an HA pair initially connect, they immediately synchronize with each other and determine which machine assumes the active role and which assumes the standby role. During the initial synchronization operation, the Active machine can process orchestrations; however, fail-over cannot occur until the HA pair is completely synchronized. Once the HA pair is synchronized, it persists data and automatically synchronizes this data between the two machines.

When an Active machine failure occurs, the Standby machine initiates a take over procedure, becomes the Active machine, and resumes processing orchestrations exactly where the other machine stopped. The take over process typically takes less than a minute to complete, but DHCP response times can slow the take over process. During the take over procedure, the Standby machine power cycles the Active machine to ensure the Active machine is not still processing orchestrations. Only when the Standby machine can successfully power cycle the failed Active machine does it become the Active machine. If the Standby machine is unable to power cycle the Active machine, the Standby machine goes into an IDLE state.
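The takeover rule described above can be illustrated with a short sketch. This is not the appliance's implementation; the names `Role`, `take_over`, and the `power_cycle_peer` callback are hypothetical and exist only to show the decision: the Standby machine becomes Active only if it can power cycle the failed machine, and otherwise goes IDLE.

```python
from enum import Enum
from typing import Callable


class Role(Enum):
    """Roles a machine in the pair can hold (names are illustrative)."""
    ACTIVE = "active"
    STANDBY = "standby"
    IDLE = "idle"


def take_over(power_cycle_peer: Callable[[], bool]) -> Role:
    """Return the role a Standby machine ends up in after a takeover attempt.

    power_cycle_peer is a hypothetical callback that returns True only when
    the failed Active machine was successfully power cycled.
    """
    if power_cycle_peer():
        # The peer is confirmed stopped, so assuming the active role is safe.
        return Role.ACTIVE
    # The peer may still be processing orchestrations, so do not take over.
    return Role.IDLE
```

For example, `take_over(lambda: False)` models a Standby machine that cannot reach the power controller of its peer and therefore ends up in the IDLE state rather than risking two Active machines.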

Examples of failures that can cause an HA fail-over include the following:
  • Hardware failures in CPUs, hard drives, RAM, motherboards, network interfaces, power supplies, and RAID controllers.
  • Integration Appliance runtime failures, such as fatal errors during processing.
  • Communication failures between the Integration Appliances caused by replication port network interface issues or replication cable issues.

When the Active machine loses contact with the Standby machine, the Active machine stops running orchestrations to prevent an asynchronous data commit and waits to see if the Standby machine initiates the take over procedure. If the Standby machine does not power cycle the Active machine, the Active machine resumes processing orchestrations. When the Standby machine is able to reconnect to the Active machine, the Standby machine synchronizes with the Active machine.
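The behavior of the Active machine after losing contact can be sketched in the same way. The sketch below is an assumption-laden illustration, not the product's logic: `ActiveState` and `on_contact_lost` are hypothetical names, and the single boolean stands in for whatever mechanism the appliance uses to detect that it was power cycled.

```python
from enum import Enum
from typing import List


class ActiveState(Enum):
    """States of the Active machine in this illustration (hypothetical)."""
    PROCESSING = "processing"
    PAUSED = "paused"
    POWER_CYCLED = "power_cycled"


def on_contact_lost(standby_power_cycles_active: bool) -> List[ActiveState]:
    """Return the states the Active machine passes through after losing contact."""
    # Orchestrations stop first, so the two machines cannot commit diverging data.
    states = [ActiveState.PAUSED]
    if standby_power_cycles_active:
        # The Standby machine initiated the takeover procedure.
        states.append(ActiveState.POWER_CYCLED)
    else:
        # No takeover happened, so the Active machine resumes its work.
        states.append(ActiveState.PROCESSING)
    return states
```

In both branches the machine pauses before anything else happens, which mirrors the text: processing stops as soon as contact is lost and resumes only if the Standby machine does not take over.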

From the Web Management Console (WMC), you can monitor the status of an HA pair and change the roles of the Integration Appliances in the pair.






Last updated: Wednesday, December 16, 2015


http://pic.dhe.ibm.com/infocenter/wci/v7r0m0/topic/com.ibm.wci.HAOverview.doc/HA_about_HA.html