DB2 Server for VSE & VM: Operation


Resolving DRDA 2 In-Doubt Logical Units of Work

A DRDA 2 in-doubt logical unit of work, also known as a distributed unit of work (DUOW), occurs when phase one of commit processing completed successfully but phase two did not. A prepare to commit log record for the unit of work was written, but the commit log record was not. This situation can arise when a system failure occurs after the prepare to commit has occurred and before the second phase of the commit has occurred.

The system failure could be a DB2 Server for VSE & VM failure, a network failure, a failure on CRR (for VM) or on CICS (for VSE), or a failure on the application requester.

When a DUOW is in-doubt, it holds a real agent on the application server. The SHOW CONNECT command would show that a real agent is being held and that it prepared for commit or rollback.

Figure 87. SHOW CONNECT Example of DUOW

+--------------------------------------------------------------------------------+
|show connect                                                                    |
|Status of Connected Users                     1998-04-07  13:42:03              |
|   Checkpoint agent is not active.                                              |
|   User Agent:   1   User-ID: SYSA     SQL-ID: VSEMCH12                         |
|   is prepared for COMMIT or ROLLBACK.                                          |
|   VM ID = VSEMCH12  Coordinator = DBDCCICS     Resource Adapter = 0            |
|   Transaction= ABC  Sigon ID= SYSA         Terminal= D080                      |
|  0  Users are active.                                                          |
|  0  Users are waiting.                                                         |
|  0  Users are inactive.                                                        |
|  2  Agents are available.                                                      |
|  45  User connections are available.                                           |
|ARI0065I Operator command processing is complete.                               |
+--------------------------------------------------------------------------------+

To distinguish a DUOW in-doubt agent from a CICS in-doubt agent, look at the Resource Adapter value. If the resource adapter has a value of 254, it is a DUOW in-doubt on VM. If the resource adapter has a value of 255, it is a DUOW in-doubt on VSE. If the resource adapter has any other value, it is a CICS in-doubt. The value in our example is 254, so this is a VM DUOW. For more information, we issue the SHOW INDOUBT command.
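
The Resource Adapter rule above can be sketched as a small lookup. This is only an illustration of the rule as stated in the text. The function name and the returned strings are hypothetical and are not part of DB2 Server for VSE & VM.

```python
def classify_in_doubt_agent(resource_adapter: int) -> str:
    """Classify an in-doubt agent from its Resource Adapter value.

    Hypothetical helper: the values 254 and 255 come from the text,
    everything else about this function is an assumption.
    """
    if resource_adapter == 254:
        return "DUOW in-doubt on VM"
    if resource_adapter == 255:
        return "DUOW in-doubt on VSE"
    # Any other value identifies a CICS in-doubt agent.
    return "CICS in-doubt"


if __name__ == "__main__":
    for value in (254, 255, 0):
        print(value, "->", classify_in_doubt_agent(value))
```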

Figure 88. SHOW INDOUBT Example of DUOW

+--------------------------------------------------------------------------------+
|show indoubt                                                                    |
|Status of in-doubt units of work              1997-03-05  13:46:54              |
|TRANID: 26CD User ID: GMERKE                                                    |
|    is prepared for COMMIT or ROLLBACK.                                         |
|     LUWID: CAIBMOML.OMXNV108.D6BD1D33FA93.0001                                 |
|     EXTNAM: ADHOC.EXE           01CC0001                                       |
|     Requester: DDCS/2    V2.1.1   at BEDROCK                                   |
|     Package: GMERKE.ADHOC        Section: 1                                    |
|     PTC state started: 1997-03-05  13:40:32                                    |
|     Heuristic state started: N/A                                               |
|     Damage: No                                                                 |
|ARI0065I Operator command processing is complete.                               |
+--------------------------------------------------------------------------------+

The SHOW INDOUBT command gives more information for the DUOW. CICS in-doubt units of work do not appear in the SHOW INDOUBT display. This example shows that this DUOW has not been resolved yet: the state is prepared for COMMIT or ROLLBACK, the heuristic state started time is N/A, and the damage is No.
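
If the SHOW INDOUBT display is captured as text, the three conditions above can be checked mechanically. The following is a minimal sketch that assumes one entry is passed in as a string laid out like Figure 88. The function name is hypothetical, and the field labels are taken from the figure rather than from any documented output format.

```python
def is_unresolved(entry_text: str) -> bool:
    """Return True if a SHOW INDOUBT entry looks unresolved.

    Assumption: entry_text holds one entry in the layout of Figure 88.
    Box-drawing bars and padding around each line are stripped first.
    """
    lines = [line.strip("| \t") for line in entry_text.splitlines()]

    prepared = "is prepared for COMMIT or ROLLBACK." in lines

    fields = {}
    for line in lines:
        for label in ("Heuristic state started", "Damage"):
            prefix = label + ":"
            if line.startswith(prefix):
                fields[label] = line[len(prefix):].strip()

    return (
        prepared
        and fields.get("Heuristic state started") == "N/A"
        and fields.get("Damage") == "No"
    )


if __name__ == "__main__":
    sample = """\
|TRANID: 26CD User ID: GMERKE                    |
|    is prepared for COMMIT or ROLLBACK.         |
|     Heuristic state started: N/A               |
|     Damage: No                                 |
"""
    print(is_unresolved(sample))
```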

Normally, in-doubt units of work are resolved automatically by resynchronization recovery. In VM, resynchronization recovery occurs shortly after the application server has been initialized. It is driven by the VM CRR recovery server. In VSE, resynchronization recovery occurs when the DRDA 2 TRUE is enabled for the application server. This happens when an application requester tries to establish a DRDA 2 connection to the application server. If resynchronization recovery is successful, the SHOW CONNECT command displays no in-doubt agent and the SHOW INDOUBT command displays no in-doubt units of work. If it is unsuccessful, the DUOW must be resolved manually.

The logs on the application requester must be examined to determine what action should be taken. The application requester is the coordinator of the DUOW. Most application requesters also have commands to show which units of work are in-doubt and what their status is. DB2 Common Server has the "DB2 LIST INDOUBT TRANSACTIONS" command. DB2 for MVS has the "DISPLAY THREAD(*) TYPE(INDOUBT) LOCATION(*)" command. The information from the application requester should be used to determine whether the DUOW should be committed or rolled back.

The FORCE command is used to resolve the DUOW. The SHOW CONNECT command tells us the in-doubt agent is number 1. If the decision is to commit the transaction, then the FORCE 1 COMMIT command would be issued. Since this is a DUOW in-doubt, you will be prompted to confirm this action. If you are sure, reply 1 for yes and the in-doubt agent will be committed. A subsequent SHOW CONNECT command will show that the agent is free. A subsequent SHOW INDOUBT command will show that the in-doubt was heuristically committed.

show indoubt
Status of in-doubt units of work              1997-03-05  14:06:54
TRANID: 26CD User ID: GMERKE
    is COMMITTED-H.
     LUWID: CAIBMOML.OMXNV108.D6BD1D33FA93.0001
     EXTNAM: ADHOC.EXE           01CC0001
     Requester: DDCS/2    V2.1.1   at BEDROCK
     Package: GMERKE.ADHOC        Section: 1
     PTC state started: 1997-03-05  13:40:32
     Heuristic state started: 1997-03-05  14:05:07
     Damage: Unknown
ARI0065I Operator command processing is complete.

The heuristic state started time is updated with the time that the FORCE command was performed. The damage is set to unknown. If resynchronization recovery is performed after the DUOW in-doubt was forced, damage may result. If the in-doubt was committed and recovery asked for commit, then damage is updated to no. Similarly, damage is no if the in-doubt was rolled back and recovery asked for rollback. However, damage is yes if the in-doubt was committed and recovery wanted rollback, or if the in-doubt was rolled back and recovery wanted commit.
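
The damage rules above amount to comparing the forced (heuristic) decision with the decision that resynchronization recovery later asks for. A minimal sketch of that comparison follows. The function name and the use of None to mean "recovery has not run yet" are assumptions made for illustration.

```python
from typing import Optional

VALID_DECISIONS = {"COMMIT", "ROLLBACK"}


def damage_status(forced: str, recovery: Optional[str]) -> str:
    """Return the Damage value for a manually forced DUOW.

    forced   -- decision applied with the FORCE command
    recovery -- decision requested by resynchronization recovery,
                or None if recovery has not been performed (assumption)
    """
    if forced not in VALID_DECISIONS:
        raise ValueError(f"unexpected forced decision: {forced!r}")
    if recovery is None:
        # Damage stays Unknown until resynchronization recovery runs.
        return "Unknown"
    if recovery not in VALID_DECISIONS:
        raise ValueError(f"unexpected recovery decision: {recovery!r}")
    # Matching decisions mean no damage; a mismatch means damage.
    return "No" if forced == recovery else "Yes"


if __name__ == "__main__":
    print(damage_status("COMMIT", None))
    print(damage_status("COMMIT", "COMMIT"))
    print(damage_status("COMMIT", "ROLLBACK"))
```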

If a DUOW in-doubt was resolved manually, the entry in the SHOW INDOUBT display will remain there until it is removed with the RESET INDOUBT command. The RESET INDOUBT command should not be used until the database administrator is sure that the in-doubt has been correctly resolved at all of the participating sites of the DUOW.

Useful CRR Commands (Valid for VM only)

DB2 for VM uses CRR (Coordinated Resource Recovery) to manage distributed unit of work activity. When an in-doubt unit of work is created in DB2 for VM, the CRR recovery server will also have information regarding it. This information can be seen using the "CRR QUERY LUWID" CRR command. If the CRR QUERY LUWID command were to be issued for the in-doubt agent described above, the following information would be displayed on the CRR console:

    crr query luwid 178
    Time: 15:20:45                     CRR QUERY LUWID - VMSYSR
    Date: 03/06/97                     LUNAME - CAIBMOML.OECGW001
 
    LUWID                                                Token
 (1)CAIBMOML.OMXNV108.D6BD1D33FA93.0001                  00000178
    Name                                                 Process
 (2)SQLMACGM                                             RESYNCHRONIZATION PENDING
    Syncpoint Role            Syncpoint State            Status
    INITIATOR CASCADE         COMMITTED
    Transaction Tag
 (3)DB2 FOR VM  5.1.0  PACKAGE: GMERKE.ADHOC
 
    Initiator Name
    CAIBMOML.OMXNV108 SQLMACGM
       Recovery TPN
       .2
      '06F2'X
       Recovery Token                                    Index
       CAIBMOML.OMXNV108.20B5A76838B5AA31                 1  (4)
       Resync Role            Resync State               Access Userid
       RESYNC NEEDED          RESYNC NEEDED              SQLMACGM
 
    Resources
    *LOCAL SQLMACGM
       Recovery TPN
       SQLMACGM
       Recovery Token                                    Index
 (6)   000026CD                                           2  (5)
       Resync Role            Resync State               Access Userid
 (8)   RESYNC NEEDED          RESYNC NEEDED (7)            SQLMACGM
 
    DMS5BC3065I Operator command processing complete

Here are some notes on how this information relates to the information displayed by SHOW INDOUBT at the DB2 for VM application server:

(1)
This is the LUWID of the in-doubt transaction. It should match the LUWID displayed on the SHOW INDOUBT command.

(2)
This is the RESID for the DB2 for VM Application Server. It indicates at which DB2 for VM server the SHOW INDOUBT command should be issued.

(3)
This is a transaction tag set up by DB2 for VM. It consists of the following information:

(4)
This index number indicates the first of two protected resources displayed by the CRR command. This refers to the protected conversation with the remote requester (and its sync point manager).
Note: The index number can be used with the "CRR RESYNC" CRR command to heuristically force this resource from CRR. See VM/ESA CMS File Pool Planning, Administration, and Operation (SC24-5751) for more details.

(5)
This index number indicates the second of two protected resources displayed by the CRR command. This refers to the resource at the DB2 for VM application server itself.
Note: The index number can be used with the "CRR RESYNC" CRR command to heuristically force this resource from CRR. See VM/ESA CMS File Pool Planning, Administration, and Operation (SC24-5751) for more details.

(6)
The recovery token is DB2 for VM's internal logical unit of work identifier. This should match the TRANID value displayed by SHOW INDOUBT. If this value is zero, then the unit of work at DB2 for VM was read-only.

(7)
This is the state of the unit of work according to CRR. A value of "RESYNC NEEDED" indicates that resynchronization recovery must still be done. When resynchronization recovery has completed successfully, this value changes to "COMMITTED" or "BACKOUT" depending on what was required.

(8)
This indicates the role that CRR is taking on for this logical unit of work. A value of "RESYNC NEEDED" indicates that resynchronization has started. When resynchronization recovery has completed successfully, this value changes to "FORGET".
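
Notes (1) and (6) describe how to tie a CRR QUERY LUWID entry to a SHOW INDOUBT entry. The sketch below shows one way to do that comparison. It assumes the TRANID and the recovery token are hexadecimal strings that differ only in zero padding, which is inferred from the example values 26CD and 000026CD and is not stated in the text. The function names are hypothetical.

```python
def same_unit_of_work(
    indoubt_luwid: str,
    indoubt_tranid: str,
    crr_luwid: str,
    crr_recovery_token: str,
) -> bool:
    """Check whether SHOW INDOUBT and CRR QUERY LUWID describe one unit of work.

    Assumption: TRANID and recovery token are hex strings that may
    differ in leading zeros, as in 26CD and 000026CD.
    """
    luwid_matches = indoubt_luwid.strip() == crr_luwid.strip()
    token_matches = int(indoubt_tranid, 16) == int(crr_recovery_token, 16)
    return luwid_matches and token_matches


def is_read_only(crr_recovery_token: str) -> bool:
    """A zero recovery token means the unit of work was read-only."""
    return int(crr_recovery_token, 16) == 0


if __name__ == "__main__":
    luwid = "CAIBMOML.OMXNV108.D6BD1D33FA93.0001"
    print(same_unit_of_work(luwid, "26CD", luwid, "000026CD"))
    print(is_read_only("000026CD"))
```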

After resynchronization recovery has completed successfully at DB2 for VM, the CRR QUERY LUWID command continues to show information about this unit of work until resynchronization has completed with the requester's sync point manager. That is, once the resource for "index 1" of the unit of work (as displayed by the CRR QUERY LUWID command) has been resynchronized, CRR forgets about this LUWID.

The following CRR operator commands may be used to manage activity at the CRR operator console:

CRR ERASE LU
Erases specified LU name and TPN entries from the CRR log name table

CRR ERASE LUWID
Erases CRR log records for a specified LUWID instance, which prevents any further CRR recovery server activity on this LUWID instance

CRR QUERY LOG
Displays the status of the CRR log minidisks

CRR QUERY LOGTABLE
Displays LU names and TPNs in the CRR log name table

CRR QUERY LU
Displays status of logical units of work known to this CRR recovery server and associated with the specified LU name

CRR QUERY LUWID
Displays status of sync point processing and resynchronization processing for an LUWID instance known to this CRR recovery server

CRR RESUME
Restarts the automatic periodic retry of resynchronization for a specified LUWID that was suspended by the CRR SUSPEND command and also bypasses the timed wait interval

CRR RESYNC
Provides a heuristic response for an unavailable protected resource or protected conversation so resynchronization can continue

CRR SUSPEND
Stops the automatic periodic retry of resynchronization for a specified LUWID until the CRR operator enters the CRR RESUME command

These CRR commands are discussed in the chapter "CRR Administration" of the VM/ESA: CMS File Pool Planning, Administration, and Operation manual.

