PK05786: TIMEOUT IN DRS CODE

 A fix is available

APAR status
Closed as program error.

Error description
Timeout (EC3 abend with RC=04130007) as a result of a hung
thread.

Timeout occurred for the following thread:
condWait
sysMonitorWait
lkMonitorEnter
com/ibm/ws/drs/DRSPools.getUpdJMS
com/ibm/ws/drs/DRSCacheApp.getUpdJMS
com/ibm/ws/drs/DRSJMS.jmsPubUpd
com/ibm/ws/drs/DRSCacheApp.jmsPubUpd
com/ibm/ws/drs/DRSAPI.createEntry
com/ibm/ws/drs/DRSCacheApp.createEntry
com/ibm/ws/webcontainer/httpsession/DRSHttpSessCache.createEntry
com/ibm/ws/webcontainer/httpsession/DRSBackedHashtable.ejbCreate

Jformat shows this thread was waiting for object
java/util/LinkedList held by the following thread:

condWait
sysMonitorWait
lkMonitorWait
JVM_MonitorWait
java/lang/Object.wait
java/lang/Object.wait
com/ibm/disthub/impl/jms/SessionDispatcher.stop
com/ibm/disthub/impl/jms/SessionImpl.stop
com/ibm/disthub/impl/jms/SessionImpl.close
com/ibm/disthub/impl/jms/TopicSessionImpl.close
com/ibm/disthub/impl/jms/TopicSessionImpl.close
com/ibm/ws/drs/DRSCloseJMS.closeAccJMS
com/ibm/ws/drs/DRSCacheApp.closeAccJMS
com/ibm/ws/drs/DRSCacheApp.closeJMSClient
com/ibm/ws/drs/DRSResetJMS.resetJMS
com/ibm/ws/drs/DRSCacheApp.resetJMS
com/ibm/ws/drs/JMSSessPoolWrapper.onException
...

and this thread is in turn waiting for the following object,
shown as <unowned>: com/ibm/disthub/impl/jms/SessionDispatcher
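
The two stacks above show a common hang pattern: one thread holds the
monitor of a shared list while it is blocked in Object.wait(), and a
second thread blocks trying to enter the same monitor. The Java sketch
below is illustrative only and is not the DRS code or the fix shipped
for this APAR. The names SessionPool, PooledSession, take and closeAll
are hypothetical. It shows one general way to keep a potentially
blocking close() call outside the pool's monitor, so that other threads
can still enter the monitor while a close is in progress.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

// Hypothetical pool, not part of DRS: illustrates closing outside the lock.
class SessionPool {
    /** Hypothetical stand-in for a JMS session whose close() may block. */
    interface PooledSession {
        void close();
    }

    private final LinkedList<PooledSession> idle = new LinkedList<>();

    void add(PooledSession s) {
        synchronized (idle) {
            idle.add(s);
        }
    }

    /** Returns an idle session, or null if the pool is empty. */
    PooledSession take() {
        synchronized (idle) {
            return idle.poll();
        }
    }

    /** Detaches all sessions under the lock, then closes them without it. */
    int closeAll() {
        List<PooledSession> toClose;
        synchronized (idle) {
            toClose = new ArrayList<>(idle);
            idle.clear();
        }
        // The monitor is released here, so a blocking close() cannot
        // stall threads that call take() or add().
        for (PooledSession s : toClose) {
            s.close();
        }
        return toClose.size();
    }
}
```

With this arrangement a thread calling take() during a slow close sees
an empty pool instead of waiting on the list's monitor.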

Local fix

Problem summary
****************************************************************
* USERS AFFECTED: All users of WebSphere Application Server    *
*                 V5.0 for z/OS                                *
****************************************************************
* PROBLEM DESCRIPTION: Fixes a problem with 'get' operations   *
*                      in the DRS (Data Replication Service)   *
*                      component.                              *
****************************************************************
* RECOMMENDATION:                                              *
****************************************************************
This fixes an intermittent problem with 'get' operations
in the DRS (Data Replication Service) component. An
internal object with communication resources was not being
properly synchronized. As a result, it could be incorrectly
shared by multiple threads, and messages would not be properly
received.  This problem could be experienced by all WebSphere
Application Server components which use the services of DRS,
including HttpSession memory-to-memory replication.
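
As a general illustration of the kind of synchronization described
above, the following Java sketch hands each resource to only one caller
at a time. It is a minimal example under stated assumptions, not the
DRS implementation: the class ExclusivePool and its acquire and release
methods are hypothetical names introduced here.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical pool, not part of DRS: each resource has one owner at a time.
class ExclusivePool<T> {
    private final Deque<T> free = new ArrayDeque<>();

    ExclusivePool(Iterable<T> resources) {
        for (T r : resources) {
            free.add(r);
        }
    }

    /** Returns a resource owned solely by the caller, or null if none is free. */
    synchronized T acquire() {
        return free.poll();
    }

    /** Returns a resource to the pool so another thread may acquire it. */
    synchronized void release(T resource) {
        free.add(resource);
    }
}
```

Because acquire() removes the resource from the free queue while
holding the pool's monitor, two threads cannot receive the same object
unless one of them releases it first.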

Problem conclusion
The problem for which this APAR was opened corresponds to
distributed APAR PK00049. In order to port the code for PK00049
to z/OS, it was necessary to roll up the z/OS DRS component code
to the latest DRS code available for distributed systems (level
cf40516.05). Recent severity 1 APARs PK02412 and PK02464 are
also included.

The following list shows the defects included in the rollup:

Defect       Abstract


PQ79493      Data replication between peer Session Manager
             caches or Dynamic caches breaks if the connected
             replicator is brought down and is unable to
             recover.


PQ90721      When remote replicators are configured, customer
             is seeing java.lang.IllegalArgumentException.


PQ91090      DRSw0001e and DRSw0005i are repeatedly seen with
             java.io.IOException: broken pipe.


PQ91134      Correct the order of replicators chosen by the
             reset function.


PQ91650      Message larger than max_message_size should be
             discarded instead of replicated, due to connection
             reset and exceptions.


PQ97151      Deadlock in DRSResetJMS.


PQ97924      Session replication function is not working.


PK00049      Fixes a problem in the DRS (Data Replication
             Service) component with 'get' operations.

PK02412      SESSIONID is reused even though JVM parameter
             "useinvalidatedid=false" is specified for
             WebSphere Session Manager


PK02464      Data not replicated due to replicator being marked
             down by pingPeer function

PK05786      Timeout (EC3 abend with RC=04130007) as a result
             of a hung thread.

APAR PK05786 is associated with SERVICE LEVEL W502032 of
WebSphere Application Server V5.0 for z/OS.

Temporary fix

Comments

APAR information
APAR number PK05786
Reported component name WEBSPHERE FOR Z
Reported component ID 5655I3500
Reported release 500
Status CLOSED PER
PE NoPE
HIPER NoHIPER
Special Attention NoSpecatt
Submitted date 2005-05-13
Closed date 2005-07-25
Last modified date 2005-08-02

APAR is sysrouted FROM one or more of the following:

APAR is sysrouted TO one or more of the following:

Modules/Macros
BBOUBINF          

Publications Referenced

Fix information
Fixed component name WEBSPHERE FOR Z
Fixed component ID 5655I3500

Applicable component levels
R500 PSY UK05697    UP05/07/29 P F507

Document Information


Current web document: swg1PK05786.html
Product categories: Software > Application Servers > Distributed Application & Web Servers > WebSphere Application Server for z/OS
Software version: 500
Reference #: PK05786
IBM Group: Software Group
Modified date: Aug 2, 2005