This topic applies only on the z/OS operating system.

Setting up peer restart and recovery

To allow WebSphere Application Server for z/OS to restart on an alternate system, you must install the prerequisites described in this topic on every system (your original system as well as any systems intended for recovery) before you reconfigure the ARM policies to enable peer restart and recovery.

Before you begin

You must also make sure that all of the systems on which you might need to perform a restart are part of the same RRS log group.

Installing the prerequisite service updates on all of these systems does not affect your running environment if you want to continue to restart in place only. However, if this service is not installed, the controller might not be able to move back: OTS attempts to restart on the alternate system and fails. If any units of recovery (URs) are still unresolved with RRS when this happens, the controller is not allowed to restart on the home system until RRS is cancelled on the alternate system. For more information on OTS and RRS, see z/OS MVS Programming: Resource Recovery.

If you do not plan to use peer restart and recovery, you do not need to abide by these functional prerequisites. Your system will instead use the restart-in-place function.

The following products all support RRS. Individually, they also support peer restart and recovery, provided that the preceding prerequisites are all properly installed:
  • DB2 Version 7 or higher
  • IMS Version 8 or higher
  • CICS Version 1.3 or higher
  • MQSeries Version 5.2 or higher

In addition to the preceding products, many JTA XAResource Managers can be used to assist in a WebSphere Application Server for z/OS peer restart and recovery. Consult your JTA XAResource Manager's documentation to determine if it supports restarting on an alternate system.

Important: When setting up the ARM policy for a sysplex, make sure that both systems have the same level of the Application Server installed. For example, you cannot use an application server that is running WebSphere Application Server for z/OS Version 5.1 to perform peer restart and recovery for an application server that is running WebSphere Application Server for z/OS Version 6.0.1.
Prior to using peer restart and recovery:
  • You must ensure that the location service Daemon and node agent are already running on all systems that might be used for recovery. Otherwise, the recovering system might attempt to recover on a system that is not running the location service Daemon and node agent. If this happens, the server will fail to start, and recovery will fail.
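The prerequisites above (same RRS log group, same Application Server level, and a running location service daemon and node agent) can be expressed as a simple pre-flight check. The following is a minimal sketch only: the SystemStatus record and the recovery_problems function are hypothetical names introduced for illustration, not part of any WebSphere or z/OS API, and how you collect the status values for each system is left to your own operational tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SystemStatus:
    """Hypothetical snapshot of one system in the sysplex (not a WebSphere API)."""
    name: str
    was_version: str          # Application Server level, for example "6.0.1"
    rrs_log_group: str        # RRS log group the system belongs to
    daemon_running: bool      # location service daemon is up
    node_agent_running: bool  # node agent is up


def recovery_problems(home, candidates):
    """Return one message per reason a candidate is unsuitable for peer restart."""
    problems = []
    for system in candidates:
        if system.was_version != home.was_version:
            problems.append(
                f"{system.name}: version {system.was_version} "
                f"differs from home version {home.was_version}"
            )
        if system.rrs_log_group != home.rrs_log_group:
            problems.append(
                f"{system.name}: RRS log group {system.rrs_log_group} "
                f"differs from home log group {home.rrs_log_group}"
            )
        if not system.daemon_running:
            problems.append(f"{system.name}: location service daemon is not running")
        if not system.node_agent_running:
            problems.append(f"{system.name}: node agent is not running")
    return problems


if __name__ == "__main__":
    # Illustrative values only; system names and log group are made up.
    home = SystemStatus("SYS1", "6.0.1", "PLEX1", True, True)
    alternate = SystemStatus("SYS2", "5.1", "PLEX1", True, False)
    for message in recovery_problems(home, [alternate]):
        print(message)
```

An empty result means that none of the listed conditions was violated for the systems supplied; it does not confirm that the prerequisite service updates themselves are installed.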

Clients will see a performance impact if the systems are running at capacity. In an attempt to minimize the memory and CPU impact on the alternate system, the enterprise bean and Web containers are not restarted for servers running in peer-restart mode. This means that application servers that are in the state of being recovered will not be able to accept any inbound work.

About this task

Important: WebSphere for z/OS uses the z/OS Resource Recovery Services (RRS) system function to provide the same transactional recovery functionality as is provided by the high availability peer recovery support on other platforms. Therefore, high availability peer recovery support is not available on a z/OS platform.

After the prerequisites are installed, starting a server on a system for which it was not configured implicitly places the server into peer restart and recovery mode. If you configured your XA partner log to write to a non-shared HFS, or if you are using a JTA XAResource Manager, you need to perform the following steps before starting a server:

Procedure

  1. (Required only if you are using a non-shared HFS.) Enable non-shared HFS support. When using a non-shared HFS, the configuration settings must be replicated across the different systems in the sysplex. This is done automatically by the deployment manager and node agent. To enable this support, each node agent in your configuration must be set as a recovery node. This change is made in the administrative console:
    1. In the administrative console navigation, select System Administration > Node Agents.
    2. Select a node agent from the list.
    3. Under Additional Properties, select File Synchronization Service.
    4. Under Additional Properties, select Custom Properties.
    5. Select New.
    6. Enter recoveryNode for Name, and true for Value. The Description field can be left blank.
    7. Repeat steps 2-6 for each node agent in your configuration.
    8. Save your configuration.
  2. (Required only if you are using JTA XAResource Managers.) Make sure that the appropriate logs and classes are available on the alternate system. If you plan to use WebSphere Application Server for z/OS peer restart and recovery, and your applications access JTA XAResource Managers, you must ensure that the appropriate logs and classes are available on the alternate system.
    1. Point the WebSphere Application Server for z/OS variable TRANLOG_ROOT to a shared HFS to which all systems in the WebSphere Application Server for z/OS cell can write. The XA partner log is stored here, and the alternate system must be able to read and update this log.

      In the administrative console, click Environment > Manage WebSphere Variables. Then click the TRANLOG_ROOT variable to open a new window in which you can specify the directory of the shared HFS.

    2. Store the driver (for example, a JDBC driver, JMS provider, or JCA resource adapter) for each JTA XAResource Manager in an HFS that is readable by all systems in the WebSphere Application Server for z/OS cell. For example, if your connector is a JDBC driver for a database, the driver would likely be stored in a read-only HFS that is accessible by all systems in the sysplex. This allows the alternate system to read the saved classpath for the resource and reconstruct it during a restart.

      If the connector used to access a JTA XAResource Manager is not stored in an HFS that is readable by all systems that might be used for recovery, then when an application server restarts on an alternate system, either it will appear that there is no XA recovery work to do, or it will be impossible to load the classes necessary to communicate with the JTA XAResource Manager.

  3. Resolve InDoubt units.

    During a recovery, there will be instances when manual intervention is required to resolve InDoubt units. You will need to use RRS panels for this manual intervention.
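The file system requirements in step 2 of the procedure can be spot-checked on each candidate system before a failure occurs. The following is a minimal sketch under stated assumptions: check_shared_paths is a hypothetical helper, not a WebSphere-supplied tool, and the paths in the example are made up. It only reports whether the directory and files are accessible from the system where it runs, so it needs to be run on every system that might be used for recovery, and a clean result does not by itself prove that the HFS is shared.

```python
import os


def check_shared_paths(tranlog_root, driver_paths):
    """Return a list of access problems seen from the local system.

    tranlog_root: directory that the TRANLOG_ROOT variable points to.
    driver_paths: driver files or directories (JDBC driver, JMS provider,
                  JCA resource adapter) that the saved classpath refers to.
    """
    problems = []

    # The XA partner log must be readable and updatable on the alternate system.
    if not os.path.isdir(tranlog_root):
        problems.append(f"TRANLOG_ROOT is not a directory: {tranlog_root}")
    elif not os.access(tranlog_root, os.R_OK | os.W_OK | os.X_OK):
        problems.append(f"TRANLOG_ROOT is not readable and writable: {tranlog_root}")

    # Driver classes only need to be readable; a read-only HFS is sufficient.
    for path in driver_paths:
        if not os.path.exists(path):
            problems.append(f"driver path is missing: {path}")
        elif not os.access(path, os.R_OK):
            problems.append(f"driver path is not readable: {path}")

    return problems


if __name__ == "__main__":
    # Hypothetical locations, for illustration only.
    for message in check_shared_paths("/shared/tranlog", ["/shared/drivers/driver.jar"]):
        print(message)
```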





Last updated: Sep 20, 2010 11:08:29 PM CDT
http://www14.software.ibm.com/webapp/wsbroker/redirect?version=vela&product=was-nd-mp&topic=tprruse
File name: tprr_use.html