PQ69737: A DEADLOCK IN CM CODE CAUSES A LARGE NUMBER OF THREADS TO WAIT IN THE CREATEORWAITFORCONNECTION() METHOD AND CAUSES THE APP SERVER TO HANG.

 Fixes are available

4.0.6: WebSphere Application Server Version 4.0 Fix Pack 6
WebSphere Application Server Connection Manager Cumulative Fix
4.0.2-4.0.7: Component cumulative Connection Manager fix



APAR status
Closed as program error.

Error description
The problem is caused by a deadlock in the CM (Connection
Manager) code.  A transaction timeout thread holds the lock on
the ConnectO object and waits for the lock on the
ConnectionPool object.  At the same time, an execution thread
holds the lock on the ConnectionPool object and waits for the
lock on the ConnectO object.  Because the createOrWait method
also needs the lock on the ConnectionPool object, a large
number of threads accumulate in the createOrWait method.
Eventually this causes the application server to hang.
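
The lock pattern described above can be illustrated with a minimal
Java sketch.  It is not the CM code: the class, the two lock fields,
and the method names are hypothetical stand-ins, and it assumes a
current JDK.  A timed tryLock() replaces a blocking acquire so the
demonstration ends instead of hanging; with plain synchronized
blocks both threads would wait forever.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

public class LockOrderDemo {
    // Hypothetical stand-ins for the two monitors named in the description.
    static final ReentrantLock connectO = new ReentrantLock();
    static final ReentrantLock connectionPool = new ReentrantLock();

    /** Returns how many of the two threads could not get their second lock. */
    static int run() throws InterruptedException {
        CountDownLatch bothHoldFirst = new CountDownLatch(2);
        CountDownLatch bothTried = new CountDownLatch(2);
        AtomicInteger blocked = new AtomicInteger();

        // Timeout thread: ConnectO first, then ConnectionPool.
        Thread timeoutThread = new Thread(() ->
            acquire(connectO, connectionPool, bothHoldFirst, bothTried, blocked));
        // Execution thread: ConnectionPool first, then ConnectO (opposite order).
        Thread executionThread = new Thread(() ->
            acquire(connectionPool, connectO, bothHoldFirst, bothTried, blocked));

        timeoutThread.start();
        executionThread.start();
        timeoutThread.join();
        executionThread.join();
        return blocked.get();
    }

    static void acquire(ReentrantLock first, ReentrantLock second,
                        CountDownLatch bothHoldFirst, CountDownLatch bothTried,
                        AtomicInteger blocked) {
        first.lock();
        try {
            // Wait until each thread holds its first lock.
            bothHoldFirst.countDown();
            bothHoldFirst.await();
            // A blocking acquire here would never return; the timed
            // attempt gives up after 200 ms so the sketch can finish.
            if (second.tryLock(200, TimeUnit.MILLISECONDS)) {
                second.unlock();
            } else {
                blocked.incrementAndGet();
            }
            // Keep the first lock until both attempts have finished.
            bothTried.countDown();
            bothTried.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            first.unlock();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("threads blocked on second lock: " + run());
    }
}
```

Both threads fail to obtain their second lock, which is the cycle
the description attributes to the timeout thread and the execution
thread.  Any further thread that needs the ConnectionPool lock, as
createOrWait does, would queue behind that cycle.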
Local fix
Increase the transaction timeout to reduce the chance of the
deadlock.
Problem summary
****************************************************************
* USERS AFFECTED: WebSphere Application Server users running   *
*                 on a large SMP system, and seeing            *
*                 transaction timeouts in their log files.     *
****************************************************************
* PROBLEM DESCRIPTION: The application server hangs, and a     *
*                      Java thread dump shows a deadlock       *
*                      between the servlet thread and the      *
*                      transaction timeout thread.             *
****************************************************************
* RECOMMENDATION:                                              *
****************************************************************
Due to incorrect synchronization, a deadlock can occur between
a servlet processing thread and the transaction timeout
thread.  This causes the application server to hang, and no
new connections can be requested or returned.  Users can work
around the problem by increasing their transaction timeout
value to avoid transaction timeouts.
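
A monitor deadlock of this kind can also be checked for from inside
a JVM rather than by reading a thread dump.  The sketch below
assumes Java 5 or later, where the java.lang.management API exists;
it is shown only to illustrate the idea on a current JVM and is not
a procedure for the release named in this APAR.  The class and
method names are hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

public class DeadlockCheck {
    /** Returns the IDs of threads deadlocked on object monitors (empty if none). */
    static long[] deadlockedThreadIds() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        // findMonitorDeadlockedThreads() returns null when no cycle exists.
        long[] ids = mx.findMonitorDeadlockedThreads();
        return ids == null ? new long[0] : ids;
    }

    public static void main(String[] args) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        long[] ids = deadlockedThreadIds();
        if (ids.length == 0) {
            System.out.println("no monitor deadlock detected");
            return;
        }
        // Report which lock each deadlocked thread is waiting for.
        for (ThreadInfo info : mx.getThreadInfo(ids)) {
            if (info != null) {
                System.out.println(info.getThreadName()
                    + " waits for " + info.getLockName()
                    + " held by " + info.getLockOwnerName());
            }
        }
    }
}
```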
Problem conclusion
Modified the synchronization in the CM code to avoid the
possible deadlock.  While this resolves the deadlock scenario,
the transaction timeouts that triggered the problem will still
occur.  Users should investigate their transaction timeout
setting and increase it if necessary, or improve the speed
of their transactions (for example, by tuning the database
server) to eliminate these timeout messages.
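
This APAR does not describe the synchronization change itself.  One
common way to remove a deadlock of this shape, shown here only as a
sketch under that assumption, is to make every code path take the
two locks in the same order.  The class, fields, and methods below
are hypothetical and do not reflect the actual CM code.

```java
public class OrderedLocking {
    // Hypothetical monitors standing in for ConnectionPool and ConnectO.
    private final Object poolLock = new Object();
    private final Object connLock = new Object();
    private int completed = 0;

    /** Path taken by the transaction timeout thread. */
    void onTransactionTimeout() {
        // Pool lock first, connection lock second.
        synchronized (poolLock) {
            synchronized (connLock) {
                completed++;
            }
        }
    }

    /** Path taken by a thread requesting or returning a connection. */
    void onConnectionRequest() {
        // Same order as above, so no lock cycle can form.
        synchronized (poolLock) {
            synchronized (connLock) {
                completed++;
            }
        }
    }

    /** Runs both paths concurrently and returns the number of completed calls. */
    static int run(int iterations) throws InterruptedException {
        OrderedLocking cm = new OrderedLocking();
        Thread timeoutThread = new Thread(() -> {
            for (int i = 0; i < iterations; i++) cm.onTransactionTimeout();
        });
        Thread requestThread = new Thread(() -> {
            for (int i = 0; i < iterations; i++) cm.onConnectionRequest();
        });
        timeoutThread.start();
        requestThread.start();
        timeoutThread.join();
        requestThread.join();
        return cm.completed;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("completed calls: " + run(10000));
    }
}
```

Because both paths acquire the pool lock before the connection
lock, neither thread can hold one lock while waiting for a thread
that holds the other.  As the conclusion notes, a change of this
kind removes the hang but does nothing about the timeouts
themselves.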
Temporary fix
Comments
APAR information
APAR number PQ69737
Reported component name WEBSPHERE AE AI
Reported component ID 5630A2200
Reported release 400
Status CLOSED PER
PE NoPE
HIPER NoHIPER
Submitted date 2003-01-10
Closed date 2003-01-24
Last modified date 2003-05-21

APAR is sysrouted FROM one or more of the following:

APAR is sysrouted TO one or more of the following:
PQ74401

Modules/Macros
JDBC          

SRLS

Fix information
Fixed component name WEBSPHERE AE AI
Fixed component ID 5630A2200

Applicable component levels
R400 PSY    UP


Document Information


Product categories: Software > Application Servers > Distributed Application & Web Servers > WebSphere Application Server > General
Operating system(s):
Software version: 400
Software edition:
Reference #: PQ69737
IBM Group: Software Group
Modified date: May 21, 2003