Resolving collaboration deadlocks

A deadlock is a situation in which two or more processes cannot proceed because each is waiting for another to proceed. Deadlocks are an undesirable side effect of the concurrency control provided by event isolation within a collaboration. See the Collaboration Development Guide for more information on event isolation.

Figure 75 illustrates a deadlock between two active collaboration groups resulting from the following sequence of events:

  1. At time T1, Collaboration A1 receives an event, E1, then makes a service call to Collaboration B2 and sends a child business object for E1. Collaboration A1 waits for the service call to complete.
  2. At time T2, Collaboration B1 receives an event, E2, then makes a service call to Collaboration A2 and sends a child business object for E2. Collaboration B1 waits for the service call to complete.
  3. At time T3, Collaboration B2 is waiting for Collaboration B1, because B2 and B1 have the same port binding and the event from B1 was delivered before the event for B2 arrived.
  4. At time T4, Collaboration A2 is waiting for Collaboration A1, because A2 and A1 have the same port binding and the event from A1 was delivered before the event for A2 arrived.

At this point, none of the collaborations can move forward.
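The four steps above form a circular wait, which can be sketched as a wait-for graph. The following is a minimal illustration only, not the detection algorithm that InterChange Server Express uses; the `WAITS_FOR` table and the `find_cycle` function are hypothetical names introduced for this example.

```python
# Hypothetical wait-for graph for the four-step sequence above.
# An edge X -> Y means "X cannot proceed until Y finishes".
WAITS_FOR = {
    "A1": ["B2"],  # step 1: A1 waits for its service call to B2
    "B1": ["A2"],  # step 2: B1 waits for its service call to A2
    "B2": ["B1"],  # step 3: B2 is queued behind B1 (same port binding)
    "A2": ["A1"],  # step 4: A2 is queued behind A1 (same port binding)
}


def find_cycle(graph):
    """Return one cycle as a list of nodes, or None if the graph is acyclic."""
    visiting, done = [], set()

    def visit(node):
        if node in done:
            return None
        if node in visiting:
            # Reached a node already on the current path: a cycle exists.
            return visiting[visiting.index(node):] + [node]
        visiting.append(node)
        for nxt in graph.get(node, []):
            cycle = visit(nxt)
            if cycle:
                return cycle
        visiting.pop()
        done.add(node)
        return None

    for start in graph:
        cycle = visit(start)
        if cycle:
            return cycle
    return None


print(find_cycle(WAITS_FOR))  # ['A1', 'B2', 'B1', 'A2', 'A1']
```

Each collaboration appears in the cycle, which is why none of them can move forward until the cycle is broken.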

Note:
A port binding consists of the business object type and connector name. See the Collaboration Development Guide for more information on port bindings.

Figure 75. Deadlock between collaboration groups

This section covers the following topics:

Steps for detecting a collaboration deadlock

Steps for detecting group collaboration deadlocks

Steps for fixing a collaboration deadlock

Steps for detecting a collaboration deadlock

By default, the InterChange Server Express system performs deadlock detection automatically when you start InterChange Server Express. However, when deadlock detection is performed, startup of InterChange Server Express can be delayed if a collaboration group contains many collaboration objects, because InterChange Server Express must traverse all the collaboration objects in the group to determine whether there is a deadlock in the group. This can cause slow startup even when no deadlock exists.

You can configure the InterChange Server Express system to not perform deadlock detection. If you do so, the system starts collaboration groups without first checking for deadlocks. This can allow InterChange Server Express to start more quickly. However, if deadlock detection is not performed and a deadlock exists, an event that is later sent to a collaboration may fail.

System Manager does not provide the ability to set the DEADLOCK_DETECTOR_CHECK configuration parameter. Instead, to set this configuration parameter, you must edit the InterchangeSystem.cfg file and change the parameter's value in this file.

Perform the following steps to configure the InterChange Server Express system for deadlock detection:

  1. Open the InterchangeSystem.cfg file. The following lines define the DEADLOCK_DETECTOR_CHECK parameter in the file:

    <tns:name>DEADLOCK_DETECTOR_CHECK</tns:name>

    <tns:value xml:space="preserve">false</tns:value>

  2. If you want deadlock detection performed, change the false value to true.
  3. If you do not want deadlock detection performed, change the value to false.
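If you prefer to script the change rather than edit the file by hand, the steps above might be sketched as follows. This is an illustration under assumptions: it assumes that the parameter appears as a `<tns:name>` element followed by a well-formed `<tns:value>` element, as in the fragment in step 1, and it operates on text that you have already read from the file. The `set_deadlock_check` function is a hypothetical helper, not part of the product.

```python
import re

# Assumption: the parameter is a <tns:name> element followed by a
# <tns:value> element, as in the fragment shown in step 1. The rest of
# the file layout is not known here, so only that pair is matched.
PATTERN = re.compile(
    r"(<tns:name>DEADLOCK_DETECTOR_CHECK</tns:name>\s*"
    r"<tns:value[^>]*>)\s*(?:true|false)\s*(</tns:value>)"
)


def set_deadlock_check(cfg_text, enabled):
    """Return cfg_text with DEADLOCK_DETECTOR_CHECK set to true or false."""
    value = "true" if enabled else "false"
    new_text, count = PATTERN.subn(
        lambda m: m.group(1) + value + m.group(2), cfg_text
    )
    if count != 1:
        # Refuse to guess if the entry is missing or appears more than once.
        raise ValueError(
            f"expected exactly one DEADLOCK_DETECTOR_CHECK entry, found {count}"
        )
    return new_text


# Example input: the two-line fragment from step 1.
fragment = (
    "<tns:name>DEADLOCK_DETECTOR_CHECK</tns:name>\n"
    '<tns:value xml:space="preserve">false</tns:value>\n'
)
print(set_deadlock_check(fragment, True))
```

Keep a backup copy of InterchangeSystem.cfg before writing any modified text back to it. Because deadlock detection runs when InterChange Server Express starts, expect the new value to be picked up at the next startup.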

Steps for detecting group collaboration deadlocks

Perform one of the following steps to check for a group collaboration deadlock:

Steps for fixing a collaboration deadlock

Perform the following steps to fix a collaboration deadlock:

  1. Gracefully shut down all other collaborations.
  2. Shut down the server immediately. See Shutting down InterChange Server Express for more information on shutting down the InterChange Server Express system.
  3. Restart the system. Upon system restart, a stopped collaboration that caused the deadlock automatically starts and resubscribes to all of the business objects it supports. The business objects that caused the collaborations to enter into a deadlock are redelivered. Because deadlocks are timing-dependent, the collaborations are unlikely to enter into another deadlock immediately: the server load and isolation sequencing are unlikely to be the same as when your system encountered the deadlock.
  4. After restarting the system, shut down the collaborations involved and rebind the ports so that the deadlock does not recur.

Steps for preventing collaboration deadlocks

You can prevent collaboration deadlocks by configuring the deadlock retry settings in the Database tab of the server configuration screen in System Manager.

Perform the following steps to configure the deadlock retry mechanism:

  1. In System Manager, right-click the server in the InterChange Server Component Management view and click Edit Configuration. The editing tool opens, in which you can edit the InterchangeSystem.cfg file.
  2. Click the Database tab. A dialog box appears in which you can enter the parameters for configuring the database at the system level (see Figure 76).
    Figure 76. Edit Configuration screen, Database tab
  3. In the Max database retry field, type a number that represents the maximum number of retries you want the server to perform if a deadlock occurs.
  4. In the Deadlock retry interval field, type a number that represents the number of seconds you want the system to wait before retrying.
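The way these two settings interact can be pictured with a generic retry loop. This is a sketch of the general technique only, not the server's implementation; `DeadlockError`, `run_with_deadlock_retry`, and `flaky` are hypothetical names, with `max_retries` standing in for the Max database retry field and `retry_interval_s` for the Deadlock retry interval field.

```python
import time


class DeadlockError(Exception):
    """Hypothetical stand-in for a deadlock reported to the caller."""


def run_with_deadlock_retry(operation, max_retries, retry_interval_s,
                            sleep=time.sleep):
    """Run operation; on DeadlockError, retry up to max_retries times."""
    attempt = 0
    while True:
        try:
            return operation()
        except DeadlockError:
            if attempt >= max_retries:
                raise  # retries exhausted: surface the failure
            attempt += 1
            sleep(retry_interval_s)  # wait before the next retry


# Usage: an operation that deadlocks twice, then succeeds.
calls = {"n": 0}


def flaky():
    calls["n"] += 1
    if calls["n"] <= 2:
        raise DeadlockError("simulated deadlock")
    return "committed"


print(run_with_deadlock_retry(flaky, max_retries=3, retry_interval_s=0))
```

In this sketch, a larger retry count tolerates more consecutive deadlocks, while a longer interval gives competing work more time to finish before the next attempt.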

Copyright IBM Corp. 2004, 2005