PK20881: APPLICATION SERVER HANGS DUE TO DEADLOCK WHEN MULTIPLE APPLICATION SERVERS ARE INVOLVED IN A LONG-RUNNING TRANSACTION
APAR status

Closed as program error.

Error description

In WebSphere Application Server V5.1.x, an application server may hang due to a Java deadlock. The threads involved in the deadlock can be seen in a thread dump:

    "ORB.thread.pool : 3":
      waiting to lock monitor 0x000f1720 (object 0xe7b7cda0, a com.ibm.ws.Transaction.JTA.TransactionImpl),
      which is held by "Alarm : 2"
    "Alarm : 2":
      waiting to lock monitor 0x000f1838 (object 0xe7b7d168, a com.ibm.ws.Transaction.JTS.TransactionWrapper),
      which is held by "ORB.thread.pool : 3"

    "ORB.thread.pool : 3":
      at com.ibm.ws.Transaction.JTA.TransactionImpl.addAssociation(TransactionImpl.java:2673)
      - waiting to lock <0xe7b7cda0> (a com.ibm.ws.Transaction.JTA.TransactionImpl)
      at com.ibm.ws.Transaction.JTS.TransactionWrapper.rollback(TransactionWrapper.java:548)
      - locked <0xe7b7d168> (a com.ibm.ws.Transaction.JTS.TransactionWrapper)
      at com.ibm.ws.Transaction.JTS.WSCoordinatorImpl.rollback(WSCoordinatorImpl.java:163)
      ...

    "Alarm : 2":
      at com.ibm.ws.Transaction.JTS.TransactionWrapper.destroy(TransactionWrapper.java:841)
      - waiting to lock <0xe7b7d168> (a com.ibm.ws.Transaction.JTS.TransactionWrapper)
      at com.ibm.ws.Transaction.JTA.TransactionImpl.forgetTransaction(TransactionImpl.java:2528)
      at com.ibm.ws.Transaction.JTA.TransactionImpl.notifyCompletion(TransactionImpl.java:2507)
      - locked <0xe7b7cda0> (a com.ibm.ws.Transaction.JTA.TransactionImpl)
      at com.ibm.ws.Transaction.JTA.TransactionImpl.rollback(TransactionImpl.java:1176)
      ...

This may occur when one application server (referred to as the superior server) starts a transaction and then asks another application server (referred to as the subordinate server) to perform some work on the transaction. When the subordinate server finishes its work, it informs the superior server and waits for a response. It waits for the number of seconds set by the "Client inactivity timeout" (60 seconds by default).

When the Client inactivity timeout is reached without a response from the superior server, the subordinate server attempts to time out and roll back the transaction. The problem occurs when, shortly after this, the superior server sends a request to the subordinate server to roll back the transaction. Two threads on the subordinate server then try to initiate a rollback of the same transaction at the same time, which results in the deadlock.

Local fix

The problem can be avoided if the "Client inactivity timeout" on the subordinate server is set to a value higher than the "Total transaction lifetime timeout" on the superior server. The superior server will then time out transactions before the subordinate server does, so the deadlock condition cannot occur on the subordinate server.

Problem summary

****************************************************************
* USERS AFFECTED: This problem affects users of the Java       *
*                 Transaction Service provided with            *
*                 WebSphere Application Server Version         *
*                 5.1.1.x.                                     *
****************************************************************
* PROBLEM DESCRIPTION: Transactions can span multiple          *
*                      application servers. The application    *
*                      server which initiates the transaction  *
*                      is known as the Superior server,        *
*                      while any application servers that are  *
*                      asked to participate in the transaction *
*                      are known as Subordinate servers.       *
*                                                              *
*                      After a Subordinate has completed its   *
*                      work in a transaction, it waits for a   *
*                      response from the Superior. The amount  *
*                      of time the Subordinate server waits    *
*                      for this response is specified by the   *
*                      "Client inactivity timeout" property.   *
*                      If this timeout occurs, the             *
*                      Subordinate server will notify the      *
*                      Superior server, and roll back the      *
*                      work it has performed.                  *
*                                                              *
*                      If the "Client inactivity timeout"      *
*                      occurs at the same time as the          *
*                      Superior server issues a rollback       *
*                      request, a deadlock occurs on the       *
*                      Subordinate server, resulting in        *
*                      hung ORB threads.                       *
****************************************************************
* RECOMMENDATION:                                              *
****************************************************************

When the problem occurred, there were two threads trying to roll back the same transaction on the Subordinate server. The first thread was using a TransactionImpl object, but was blocked waiting for a TransactionWrapper object to become available so that it could call TransactionWrapper.destroy(). However, the second thread was using this TransactionWrapper, and was waiting for the TransactionImpl object used by the first thread to become free.

Problem conclusion

The TransactionWrapper.destroy() method has been changed so that it is no longer synchronized. This allows the first thread to continue its work. When it has finished processing the transaction rollback, the second thread is free to continue.

The fix for this APAR is currently targeted for inclusion in Cumulative Fix 11 for WebSphere Application Server Version 5.1.1. Please refer to the Recommended Updates page for delivery dates:
http://www-1.ibm.com/support/docview.wss?rs=180&context=SSEQTP&uid=swg27004980

Temporary fix

Comments
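The lock ordering described in the problem summary and conclusion can be illustrated with a small, self-contained Java sketch. Nothing here is WebSphere code: the classes Impl and Wrapper, the latch that forces the two threads to interleave, and the completes() helper are all hypothetical stand-ins for TransactionImpl, TransactionWrapper, and the two rollback paths. It only shows, under those assumptions, why a synchronized destroy() can deadlock and why removing the synchronization lets both threads finish.

```java
import java.util.concurrent.CountDownLatch;

// Hypothetical sketch of the lock-order problem; not WebSphere code.
class RollbackDeadlockSketch {

    interface Task { void run() throws InterruptedException; }

    // Stand-in for TransactionImpl: rollback() holds this object's monitor
    // and then calls destroy() on the wrapper (the timeout/alarm path).
    static class Impl {
        private final CountDownLatch bothLocked;
        Wrapper wrapper;

        Impl(CountDownLatch bothLocked) { this.bothLocked = bothLocked; }

        synchronized void rollback() throws InterruptedException {
            meetWhileHoldingLock(bothLocked);
            wrapper.destroy();
        }

        synchronized void addAssociation() { /* no-op in this sketch */ }
    }

    // Stand-in for TransactionWrapper: rollback() holds this object's monitor
    // and then calls into Impl (the incoming ORB request path).
    static class Wrapper {
        private final Impl impl;
        private final boolean synchronizedDestroy;
        private final CountDownLatch bothLocked;

        Wrapper(Impl impl, boolean synchronizedDestroy, CountDownLatch bothLocked) {
            this.impl = impl;
            this.synchronizedDestroy = synchronizedDestroy;
            this.bothLocked = bothLocked;
        }

        synchronized void rollback() throws InterruptedException {
            meetWhileHoldingLock(bothLocked);
            impl.addAssociation();
        }

        // true models the behaviour before the fix, false the behaviour after it.
        void destroy() {
            if (synchronizedDestroy) {
                synchronized (this) { cleanUp(); }
            } else {
                cleanUp();
            }
        }

        private void cleanUp() { /* no-op in this sketch */ }
    }

    // Makes each thread wait until the other also holds its first lock.
    static void meetWhileHoldingLock(CountDownLatch latch) throws InterruptedException {
        latch.countDown();
        latch.await();
    }

    // Daemon threads so that deadlocked threads do not keep the JVM alive.
    static Thread daemon(String name, Task task) {
        Thread t = new Thread(() -> {
            try {
                task.run();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, name);
        t.setDaemon(true);
        return t;
    }

    // Returns true if both rollback paths finish within the timeout.
    static boolean completes(boolean synchronizedDestroy, long timeoutMs) {
        CountDownLatch bothLocked = new CountDownLatch(2);
        Impl impl = new Impl(bothLocked);
        Wrapper wrapper = new Wrapper(impl, synchronizedDestroy, bothLocked);
        impl.wrapper = wrapper;

        Thread orb = daemon("orb-thread", wrapper::rollback);
        Thread alarm = daemon("alarm-thread", impl::rollback);
        orb.start();
        alarm.start();
        try {
            orb.join(timeoutMs);
            alarm.join(timeoutMs);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !orb.isAlive() && !alarm.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("synchronized destroy completes: " + completes(true, 500));
        System.out.println("unsynchronized destroy completes: " + completes(false, 500));
    }
}
```

With synchronizedDestroy set to true, each thread holds one monitor and waits for the other, so completes() returns false. With it set to false, the alarm thread runs destroy() without taking the wrapper's monitor, releases the Impl monitor when it returns, and the ORB thread can then proceed, which mirrors the behaviour described in the problem conclusion.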
APAR is sysrouted FROM one or more of the following:

APAR is sysrouted TO one or more of the following:
PK22540

Modules/Macros
Publications Referenced
Product categories: Software > Application Servers >
Distributed Application & Web Servers > WebSphere Application
Server > General
Operating system(s):
Software version: 10A
Software edition:
Reference #: PK20881
IBM Group: Software Group
Modified date: Apr 3, 2006
(C) Copyright IBM Corporation 2000, 2008. All Rights Reserved.