Resolving collaboration deadlocks

A deadlock is a situation in which two or more processes cannot proceed because each is waiting for another to proceed. Deadlocks are an undesirable side effect of the concurrency control provided by event isolation within a collaboration. See the Collaboration Development Guide for more information on event isolation.

Figure 16 illustrates a deadlock between two active collaboration groups resulting from the following sequence of events:

  1. At time T1, Collaboration A1 receives an event, E1, then makes a service call to Collaboration B2 and sends a child business object for E1. Collaboration A1 waits for the service call to complete.
  2. At time T2, Collaboration B1 receives an event, E2, then makes a service call to Collaboration A2 and sends a child business object for E2. Collaboration B1 waits for the service call to complete.
  3. At time T3, Collaboration B2 is waiting for Collaboration B1, since B2 and B1 have the same port binding and the event from B1 was delivered before the event for B2 arrived.
  4. At time T4, Collaboration A2 is waiting for Collaboration A1, since A2 and A1 have the same port binding and the event from A1 was delivered before the event for A2 arrived.

At this point, all collaborations are unable to move forward.
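
The circular wait in the sequence above can be modeled as a wait-for graph, where an edge from X to Y means "X is waiting for Y". The following Python sketch is illustrative only and is not part of InterChange Server; the function name find_cycle and the dictionary layout are assumptions made for this example. It searches the graph depth-first and reports the first cycle it finds.

```python
from typing import Dict, List, Optional


def find_cycle(wait_for: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one cycle in a wait-for graph as a list of nodes, or None."""
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    color = {node: WHITE for node in wait_for}
    path: List[str] = []

    def visit(node: str) -> Optional[List[str]]:
        color[node] = GRAY
        path.append(node)
        for nxt in wait_for.get(node, []):
            state = color.get(nxt, WHITE)
            if state == GRAY:
                # nxt is already on the current path, so the path closes a cycle.
                return path[path.index(nxt):] + [nxt]
            if state == WHITE:
                found = visit(nxt)
                if found:
                    return found
        path.pop()
        color[node] = BLACK
        return None

    for node in list(wait_for):
        if color[node] == WHITE:
            cycle = visit(node)
            if cycle:
                return cycle
    return None


# Edges taken from steps 1 through 4 of the sequence above.
WAIT_FOR = {
    "A1": ["B2"],  # T1: A1 waits for its service call to B2 to complete
    "B1": ["A2"],  # T2: B1 waits for its service call to A2 to complete
    "B2": ["B1"],  # T3: B2 waits for B1 (same port binding)
    "A2": ["A1"],  # T4: A2 waits for A1 (same port binding)
}

cycle = find_cycle(WAIT_FOR)
print(" -> ".join(cycle))  # prints "A1 -> B2 -> B1 -> A2 -> A1"
```

Any cycle in such a graph means that no collaboration on it can move forward, which is the state the four collaborations reach at time T4.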

Note:
A port binding consists of the business object type and connector name. See the Collaboration Development Guide for more information on port bindings.
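
As a rough illustration of the note above, a port binding can be thought of as a pair of values that compares equal when both parts match. The Python sketch below is an assumption-laden model, not the product's data structure; the class name PortBinding and the values "Customer", "Order", and "SAPConnector" are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PortBinding:
    """Hypothetical model: a business object type plus a connector name."""
    business_object_type: str
    connector_name: str


# Hypothetical bindings for three collaborations.
bindings = {
    "A1": PortBinding("Customer", "SAPConnector"),
    "A2": PortBinding("Customer", "SAPConnector"),
    "B1": PortBinding("Order", "SAPConnector"),
}

# Two collaborations share a port binding only when both fields match.
a1_a2_share = bindings["A1"] == bindings["A2"]
a1_b1_share = bindings["A1"] == bindings["B1"]
```

In this model A1 and A2 share a binding while A1 and B1 do not, which mirrors the pairing of A1 with A2 and B1 with B2 in the deadlock sequence.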

Figure 16. Deadlock between collaboration groups


This section covers the following topics:

"Detecting a collaboration deadlock"

"Detecting group collaboration deadlocks"

"Fixing a collaboration deadlock"

Detecting a collaboration deadlock

You can configure the IBM WebSphere ICS system to either perform or not perform deadlock detection:

Detecting group collaboration deadlocks

You can check for a group collaboration deadlock in one of the following ways:

Fixing a collaboration deadlock

If the WebSphere ICS system encounters a deadlock, you must shut down and restart InterChange Server. First, gracefully shut down all other collaborations, then shut down the server immediately.

Upon system restart, a hung collaboration that caused the deadlock automatically starts and resubscribes to all of the business objects it supports. The business objects that caused the collaborations to enter into a deadlock are redelivered. At this time, the collaborations should not enter into another deadlock because deadlocks are timing dependent. It is unlikely that you will have the exact same server load and isolation sequencing that you had when your system encountered the deadlock.

After restarting the system, shut down the collaborations that were involved in the deadlock and rebind their ports so that the deadlock does not recur.

Preventing collaboration deadlocks

You can prevent collaboration deadlocks by configuring the deadlock retry settings on the Database tab of the server configuration screen in System Manager. To configure the deadlock retry mechanism, do the following:

  1. From System Manager, right-click the server under Server Instances, then select Edit Configuration. The upper-right section of the System Manager window becomes a tool in which you can edit the InterchangeSystem.cfg file.
  2. Click the Database tab. A dialog box appears in the upper-right section of the System Manager window in which you can enter the system-level database configuration parameters (see Figure 17).
  3. In the "Max database retry" field, enter a number that represents the maximum number of retries you want the server to perform if a deadlock occurs.
  4. In the "Deadlock retry interval" field, enter a number that represents the number of seconds you want the system to wait before retrying.
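
The two fields in steps 3 and 4 describe a bounded retry with a fixed wait between attempts. The Python sketch below shows that general pattern only; it is not the server's implementation, and the names DeadlockError and run_with_deadlock_retry are hypothetical. The parameter max_retries stands for the "Max database retry" value and retry_interval_seconds for the "Deadlock retry interval" value.

```python
import time
from typing import Callable, List, TypeVar

T = TypeVar("T")


class DeadlockError(Exception):
    """Hypothetical stand-in for whatever error signals a deadlock."""


def run_with_deadlock_retry(
    operation: Callable[[], T],
    max_retries: int,
    retry_interval_seconds: float,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run operation, retrying up to max_retries times after a deadlock."""
    attempt = 0
    while True:
        try:
            return operation()
        except DeadlockError:
            if attempt >= max_retries:
                raise  # retries exhausted: surface the deadlock to the caller
            attempt += 1
            sleep(retry_interval_seconds)  # wait before the next attempt


# Usage: an operation that deadlocks twice and then succeeds.
calls = {"count": 0}


def flaky() -> str:
    calls["count"] += 1
    if calls["count"] < 3:
        raise DeadlockError("simulated deadlock")
    return "committed"


waits: List[float] = []  # records each wait instead of actually sleeping
result = run_with_deadlock_retry(
    flaky, max_retries=3, retry_interval_seconds=5, sleep=waits.append
)
```

With a maximum of 3 retries and an interval of 5 seconds, the simulated operation succeeds on its third attempt after two 5-second waits. A larger interval gives the competing work more time to finish before the retry, at the cost of longer delays.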

Figure 17. Edit Configuration screen, Database tab


Copyright IBM Corp. 1997, 2004