Resolving collaboration deadlocks
A deadlock is a situation in which two or more processes
cannot proceed because each is waiting for another of the processes to
proceed. Deadlocks are an undesirable side effect of the concurrency
control provided by event isolation within a collaboration. See
the Collaboration Development Guide for
more information on event isolation.
Figure 75 illustrates a deadlock
between two active collaboration groups resulting from the following sequence
of events:
- At time T1, Collaboration A1 receives an event, E1, then makes
a service call to Collaboration B2 and sends a child business object
for E1. Collaboration A1 waits for the service call to complete.
- At time T2, Collaboration B1 receives an event, E2, then makes
a service call to Collaboration A2 and sends a child business object
for E2. Collaboration B1 waits for the service call to complete.
- At time T3, Collaboration B2 is waiting for Collaboration B1,
because B2 and B1 have the same port binding and the event from
B1 was delivered before the event for B2 arrived.
- At time T4, Collaboration A2 is waiting for Collaboration A1,
because A2 and A1 have the same port binding and the event from
A1 was delivered before the event for A2 arrived.
At this point, none of the collaborations can move forward.
Note:
A port binding consists of the business object type
and connector name. See the Collaboration Development Guide for
more information on port bindings.
Figure 75. Deadlock between collaboration groups
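The circular wait in the T1 to T4 sequence can be modeled as a cycle in a wait-for graph. The following Python sketch is illustrative only: the find_cycle function and the waits_for mapping are hypothetical names introduced here, and the sketch does not represent how InterChange Server Express detects deadlocks internally. It assumes that each collaboration waits for at most one other collaboration.

```python
from typing import Dict, List, Optional


def find_cycle(waits_for: Dict[str, str]) -> Optional[List[str]]:
    """Return one cycle in a wait-for graph, or None if no cycle exists.

    Each key is a waiting party; its value is the party it waits for.
    """
    for start in waits_for:
        path: List[str] = []
        seen = set()
        node = start
        # Follow the chain of waits until it ends or revisits a node.
        while node in waits_for and node not in seen:
            seen.add(node)
            path.append(node)
            node = waits_for[node]
        if node in seen:
            # The chain came back to a node already on the path: a cycle.
            return path[path.index(node):]
    return None


# Wait-for relationships taken from the T1 to T4 sequence.
waits_for = {
    "A1": "B2",  # T1: A1 waits for its service call to B2 to complete
    "B1": "A2",  # T2: B1 waits for its service call to A2 to complete
    "B2": "B1",  # T3: B2 is held behind B1 (same port binding)
    "A2": "A1",  # T4: A2 is held behind A1 (same port binding)
}

cycle = find_cycle(waits_for)
print(cycle)
```

Every collaboration in the returned cycle is waiting, directly or indirectly, for itself, which is why none of them can move forward.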
This section covers the following topics:
- Steps for detecting a collaboration deadlock
- Steps for detecting group collaboration deadlocks
- Steps for fixing a collaboration deadlock
- Steps for preventing collaboration deadlocks
Steps for detecting a collaboration deadlock
By default, the InterChange Server Express system performs
deadlock detection automatically when you start InterChange Server
Express. However, when deadlock detection is performed, startup
of InterChange Server Express can be delayed if a collaboration
group contains many collaboration objects, because InterChange Server
Express must traverse all the collaboration objects in the group
to determine whether there is a deadlock in the group. This can
cause slow startup even when no deadlock exists.
You can configure the InterChange Server Express system to not
perform deadlock detection. If you do so, the system starts collaboration
groups without first checking for deadlocks. This can make it possible
for InterChange Server Express to boot more quickly. However, if
deadlock detection is not performed and a deadlock exists, an event
that is later sent to a collaboration may fail.
System Manager does not provide the ability to set the DEADLOCK_DETECTOR_CHECK configuration parameter. Instead, to set this configuration
parameter, you must edit the InterchangeSystem.cfg file and change the parameter's value in this file.
Perform the following steps to configure the InterChange Server
Express system for deadlock detection:
- Open the InterchangeSystem.cfg file. The following lines define the DEADLOCK_DETECTOR_CHECK parameter in the file:
<tns:name>DEADLOCK_DETECTOR_CHECK</tns:name>
<tns:value xml:space="preserve">false</tns:value>
- If you want deadlock detection performed, change the false value to true.
- If you do not want deadlock detection performed, change the
value to false.
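If you prefer to script the edit rather than change the file by hand, the following Python sketch shows one way to toggle the value. It is a sketch under stated assumptions, not a supported tool: the set_deadlock_detection function is a hypothetical helper, and it assumes that the tns:value element immediately follows the tns:name element and is closed with a tns:value end tag. It operates on the file contents as a string, because the location of InterchangeSystem.cfg depends on your installation.

```python
import re

PARAM = "DEADLOCK_DETECTOR_CHECK"

# Matches the name element followed directly by its value element.
# Assumption: the two elements are adjacent, as in the excerpt above.
_PATTERN = re.compile(
    r"(<tns:name>" + re.escape(PARAM) + r"</tns:name>\s*"
    r"<tns:value[^>]*>)[^<]*(</tns:value>)"
)


def set_deadlock_detection(cfg_text: str, enabled: bool) -> str:
    """Return cfg_text with DEADLOCK_DETECTOR_CHECK set to true or false."""
    new_value = "true" if enabled else "false"
    updated, count = _PATTERN.subn(
        lambda m: m.group(1) + new_value + m.group(2), cfg_text
    )
    if count != 1:
        # Refuse to guess if the parameter is missing or appears twice.
        raise ValueError(f"expected exactly one {PARAM} entry, found {count}")
    return updated


# Sample fragment mirroring the two lines shown above.
sample = (
    "<tns:name>DEADLOCK_DETECTOR_CHECK</tns:name>\n"
    '<tns:value xml:space="preserve">false</tns:value>\n'
)
updated = set_deadlock_detection(sample, True)
print(updated)
```

Back up the configuration file before writing modified contents to it. A change to the file is assumed to take effect the next time InterChange Server Express starts, because deadlock detection is performed at startup.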
Steps for detecting group collaboration deadlocks
Perform one of the following steps to check for a group
collaboration deadlock:
- In System Manager, right-click the running group collaboration
and click Diagnostics.
A window appears with the following message:
The following diagnostic tests were run on this collaboration:
This message is followed by one of the following results:
- Check the InterchangeSystem.log file for the following error at the time the hanging collaborations
were started:
Error 11135: Activation of collaboration collaboration_name group could cause a potential deadlock with one or more existing collaboration groups, and is therefore disallowed.
This error warns only of a potential deadlock situation. The
informational messages that precede error 11135 identify the active
collaboration groups that could enter into a deadlock.
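The log check can also be scripted. The following Python sketch searches a sequence of log lines for error 11135 and returns each match with the lines that precede it, because those lines name the collaboration groups involved. The find_deadlock_warnings function is a hypothetical helper, and the sample lines are placeholders: the sketch assumes only that the log contains the text "Error 11135" and makes no assumption about the rest of the log format.

```python
from collections import deque
from typing import Iterable, List, Tuple

ERROR_MARKER = "Error 11135"


def find_deadlock_warnings(
    lines: Iterable[str], context: int = 3
) -> List[Tuple[int, List[str]]]:
    """Return (line_number, lines) for each occurrence of error 11135.

    The returned lines are up to `context` preceding lines followed by
    the error line itself. Line numbers are 1-based.
    """
    history: deque = deque(maxlen=context)  # most recent preceding lines
    hits: List[Tuple[int, List[str]]] = []
    for number, line in enumerate(lines, start=1):
        text = line.rstrip("\n")
        if ERROR_MARKER in text:
            hits.append((number, list(history) + [text]))
        history.append(text)
    return hits


# Placeholder lines; a real log will look different.
sample_log = [
    "(placeholder) informational message about group 1",
    "(placeholder) informational message about group 2",
    "Error 11135: Activation of collaboration collaboration_name group "
    "could cause a potential deadlock with one or more existing "
    "collaboration groups, and is therefore disallowed.",
]

hits = find_deadlock_warnings(sample_log, context=2)
for number, block in hits:
    print(number, block)
```

To run the function against a real log, pass it an open file object for InterchangeSystem.log, which yields the lines one at a time.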
Steps for fixing a collaboration deadlock
Perform the following steps to fix a collaboration deadlock:
- Gracefully shut down all other collaborations.
- Shut down the server immediately. See Shutting down InterChange Server
Express for more information on shutting down the InterChange
Server Express system.
- Restart the system. Upon system restart, a stopped collaboration
that caused the deadlock automatically starts and resubscribes to
all of the business objects it supports. The business objects that
caused the collaborations to enter into a deadlock are redelivered.
Because deadlocks are timing-dependent, the collaborations are unlikely
to enter into another deadlock: the server load and isolation sequencing
are rarely the same as they were when your system encountered the
deadlock.
- After restarting the system, shut down the collaborations involved
and rebind their ports so that the deadlock does not recur.
Steps for preventing collaboration deadlocks
You can prevent collaboration deadlocks by configuring
the deadlock retry settings on the Database tab of the server
configuration screen in System Manager.
Perform the following steps to configure the deadlock retry mechanism:
- In System Manager, right-click the server in the InterChange
Server Component Management view and click Edit Configuration.
The editing tool opens, in which you can edit the InterchangeSystem.cfg file.
- Click the Database tab. A dialog box appears
in which you can enter the parameters for configuring
the database at the system level (see Figure 76).
Figure 76. Edit Configuration screen, Database tab
- In the Max database retry field, type
a number that represents the maximum number of retries you want the
server to perform if a deadlock occurs.
- In the Deadlock retry interval field,
type a number that represents the number of seconds you want the system
to wait before retrying.
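To show how the two settings interact, the following Python sketch implements a generic retry loop with a maximum retry count and a fixed wait between attempts. It illustrates the general technique only and is not the server's implementation: DeadlockError and run_with_deadlock_retry are hypothetical names, and the exact behavior of InterChange Server Express when the retries are exhausted is not described here.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class DeadlockError(Exception):
    """Hypothetical error raised when an operation hits a deadlock."""


def run_with_deadlock_retry(
    operation: Callable[[], T],
    max_retries: int,
    retry_interval_seconds: float,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run operation, retrying after a deadlock up to max_retries times.

    max_retries plays the role of the Max database retry field, and
    retry_interval_seconds the role of the Deadlock retry interval field.
    """
    retries = 0
    while True:
        try:
            return operation()
        except DeadlockError:
            if retries >= max_retries:
                raise  # retries exhausted: report the deadlock
            retries += 1
            sleep(retry_interval_seconds)  # wait before the next attempt
```

With a maximum of 3 retries and an interval of 5 seconds, an operation that keeps deadlocking is attempted 4 times in total, with about 15 seconds of waiting, before the failure is reported. A larger retry count tolerates longer contention at the cost of a longer delay before a persistent problem surfaces.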
