PQ74401: A DEADLOCK IN CM CODE CAUSES A LARGE NUMBER OF THREADS TO WAIT IN THE CREATEORWAITFORCONNECTION() METHOD AND CAUSES THE APP SERVER TO HANG.

 A fix is available

5.0.2: WebSphere Application Server Version 5.0 Fix Pack 2 (Version 5.0.2)



APAR status
Closed as program error.

Error description
The problem is caused by a deadlock in the CM code.  A transaction
timeout thread holds the lock on the ConnectO object and waits for
the lock on the ConnectionPool object.  At the same time, an
execution thread holds the lock on the ConnectionPool object and
waits for the lock on the ConnectO object.  Because the createOrWait
method also needs the lock on the ConnectionPool object, a large
number of threads pile up in the createOrWait method.  Eventually
the application server hangs.
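The lock inversion described above can be illustrated with a small, self-contained Java sketch. It is not the CM code: the two ReentrantLock fields are hypothetical stand-ins for the ConnectO and ConnectionPool monitors, and tryLock with a timeout is used (an assumption made only so the demo terminates) where the real threads would block indefinitely.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-ins for the two monitors named in this APAR; not the real CM classes.
class LockInversionDemo {
    static final ReentrantLock connectOLock = new ReentrantLock();
    static final ReentrantLock poolLock = new ReentrantLock();

    // Holds `first`, then tries `second` with a timeout instead of blocking forever.
    static boolean holdThenTry(ReentrantLock first, ReentrantLock second,
                               CountDownLatch bothHoldFirst, CountDownLatch bothTried)
            throws InterruptedException {
        first.lock();
        try {
            bothHoldFirst.countDown();
            bothHoldFirst.await();   // wait until the other thread holds its first lock
            boolean got = second.tryLock(200, TimeUnit.MILLISECONDS);
            if (got) {
                second.unlock();
            }
            bothTried.countDown();
            bothTried.await();       // keep `first` held until both attempts have finished
            return got;
        } finally {
            first.unlock();
        }
    }

    // Returns {timeoutThreadGotPoolLock, executionThreadGotConnectOLock}.
    static boolean[] run() throws InterruptedException {
        CountDownLatch bothHoldFirst = new CountDownLatch(2);
        CountDownLatch bothTried = new CountDownLatch(2);
        boolean[] result = new boolean[2];
        // Transaction timeout thread: ConnectO first, then ConnectionPool.
        Thread timeout = new Thread(() -> {
            try {
                result[0] = holdThenTry(connectOLock, poolLock, bothHoldFirst, bothTried);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "timeout");
        // Execution thread: ConnectionPool first, then ConnectO (the opposite order).
        Thread execution = new Thread(() -> {
            try {
                result[1] = holdThenTry(poolLock, connectOLock, bothHoldFirst, bothTried);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "execution");
        timeout.start();
        execution.start();
        timeout.join();
        execution.join();
        return result;
    }

    public static void main(String[] args) throws InterruptedException {
        boolean[] r = run();
        System.out.println("timeout thread got pool lock: " + r[0]);
        System.out.println("execution thread got ConnectO lock: " + r[1]);
    }
}
```

Neither thread obtains its second lock, because each one needs the lock the other is holding. With plain blocking locks in place of tryLock, both threads would wait forever, and every other thread that needs the pool lock would queue up behind them.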
Local fix
Increase the transaction timeout to reduce the chance of the
deadlock.
Problem summary
****************************************************************
* USERS AFFECTED: WebSphere Application Server users running   *
*                 on a large SMP system, and seeing            *
*                 transaction timeouts in their log files.     *
****************************************************************
* PROBLEM DESCRIPTION: The application server hangs, and a     *
*                      java thread dump shows a deadlock       *
*                      between the servlet thread and the      *
*                      transaction timeout thread.             *
****************************************************************
* RECOMMENDATION:                                              *
****************************************************************
Due to incorrect synchronization, a deadlock can occur between a
servlet processing thread and the transaction timeout thread.
This causes the application server to hang, and no connections
can be requested or returned.  Users can work around the problem
by increasing their transaction timeout value to avoid
transaction timeouts.
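A deadlock of the kind seen in the thread dump can also be checked for programmatically. The sketch below assumes a Java 6 or later JVM, since it relies on the java.lang.management API, which may not be available on the JVM levels this APAR applies to. The class and method names are hypothetical and are not part of WebSphere.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical helper; assumes Java 6+ (ThreadMXBean.findDeadlockedThreads).
class DeadlockProbe {
    // Returns the names of threads the JVM reports as deadlocked (empty if none).
    static String[] deadlockedThreadNames() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        long[] ids = mx.findDeadlockedThreads();   // null when no deadlock is found
        if (ids == null) {
            return new String[0];
        }
        ThreadInfo[] infos = mx.getThreadInfo(ids);
        String[] names = new String[infos.length];
        for (int i = 0; i < infos.length; i++) {
            // getThreadInfo returns null entries for threads that have since died
            names[i] = (infos[i] == null) ? "?" : infos[i].getThreadName();
        }
        return names;
    }

    public static void main(String[] args) {
        String[] names = deadlockedThreadNames();
        System.out.println("deadlocked threads: " + names.length);
        for (String name : names) {
            System.out.println("  " + name);
        }
    }
}
```

In the scenario described here, such a probe would be expected to list the servlet thread and the transaction timeout thread. A thread dump remains the primary diagnostic.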
Problem conclusion
Modified the synchronization in the CM code to avoid the
possible deadlock.  While this resolves the deadlock scenario,
the transaction timeouts which caused the problem will still
occur.  Users should investigate their transaction timeout
setting and increase it if necessary, or improve the speed
of their transactions by tuning the database server, etc. to
eliminate the timeout messages.
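This APAR does not say how the synchronization was changed. One common way to remove a lock inversion, shown here only as an assumed illustration and not as the actual fix, is to make every code path acquire the two monitors in the same order. The objects and the counter below are hypothetical stand-ins.

```java
// Illustrative only: a fixed lock order (pool first, then connection).
// The APAR does not document the real change made to the CM code.
class OrderedLockingDemo {
    static final Object pool = new Object();       // stand-in for ConnectionPool
    static final Object connectO = new Object();   // stand-in for ConnectO
    static int released = 0;                       // guarded by both monitors

    // Every caller takes the pool monitor first, then the connection monitor.
    static void withBothLocks(Runnable action) {
        synchronized (pool) {
            synchronized (connectO) {
                action.run();
            }
        }
    }

    // Runs a "timeout" thread and an "execution" thread that both need both locks.
    static int run(int iterations) throws InterruptedException {
        released = 0;
        Runnable work = () -> {
            for (int i = 0; i < iterations; i++) {
                withBothLocks(() -> released++);
            }
        };
        Thread timeout = new Thread(work, "timeout");
        Thread execution = new Thread(work, "execution");
        timeout.start();
        execution.start();
        timeout.join();
        execution.join();
        return released;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("completed operations: " + run(1000));
    }
}
```

Because no thread ever holds the connection monitor while waiting for the pool monitor, the circular wait cannot form and both threads run to completion.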
Temporary fix
Comments
APAR information
APAR number PQ74401
Reported component name WAS BASE 5.0
Reported component ID 5630A3600
Reported release 00W
Status CLOSED PER
PE NoPE
HIPER NoHIPER
Special Attention NoSpecatt
Submitted date 2003-05-21
Closed date 2003-05-21
Last modified date 2003-05-21

APAR is sysrouted FROM one or more of the following:
PQ69737

APAR is sysrouted TO one or more of the following:

Modules/Macros
JDBC          

Publications Referenced

Fix information
Fixed component name WAS BASE 5.0
Fixed component ID 5630A3600

Applicable component levels
R00W PSY    UP
R00S PSY    UP
R00A PSY    UP
R00I PSY    UP
R00H PSY    UP
R003 PSY    UP


Document Information


Product categories: Software > Application Servers > Distributed Application & Web Servers > WebSphere Application Server > General
Operating system(s):
Software version: 00W
Software edition:
Reference #: PQ74401
IBM Group: Software Group
Modified date: May 21, 2003