PQ95276: Abend 0C4 can occur because ORB requests on the pending queue are no longer ORB requests.

 A fix is available


APAR status
Closed as program error.

Error description
The customer experienced a series of abend 0C4s in their AppServer
and NodeAgent.  The abends occurred because the ORB requests on
the pending queue were no longer ORB requests.  The ORB requests
had been incorrectly freed by other threads using the same session
that was being freed by the abending thread.

Symptoms of the problem can be found from the traceback:

 -------- Traceback of TCB with ABEND 0C4 --------
 Function
 --------
 PendingORBRequestBucket::~PendingORBRequestBucket()
 PendingORBRequestBucket::__dftdt()
 __vec__delete2
 BBO_Hash<PendingORBRequestBucket>::~BBO_Hash()
 ClientLocalSession::uninitializeClientLocalSession()
 SessionManagerProtocolLocal::putOnSessionStack(Session*)
 SessionManager::putOnSessionStack(Session*)
 LocalSessionBucket::removeFromSessionCollisionQueue(const SessionHandle*)
 SessionManagerProtocolLocal::freeSession(SessionHandle*)
 SessionManager::freeSession(SessionHandle*)
 ORB_Request::comm_cr_sclt_locate_request(ORB_Request::Outbound_Locate_Status*)
 ORB_Request::comm_outbound_locate()
 CORBA::Request::Request(CORBA::Object_ORBProxy*,char*,unsigned long)
 ORBEJSBridge::create_request(JNIEnv_*,bboojorb*,char*,unsigned char,CORBA::Request*&,long)
 Java_com_ibm_ws390_orb_ClientDelegate_jorbCreateRequest
 com/ibm/ws390/orb/ClientDelegate.jorbCreateRequest(I[BZZLcom/ibm/ws390/orb/RequestEncap;I)V
...
 xeRunJniMethod
 jni_CallStaticObjectMethodA
 ORBEJSBridge::CORBAinvoke(void*)
 threadDispatch(BOSS_Object_Key*,Internal_CORBA_Request&,ORB_Request*)
 ACR_ExecutionThread::ProcessInboundRequest(acrwObj*,ThreadCleanUp*,BOSS_Object_Key&,Interna...
 ACR_ExecutionThread::RemoveAndProcessWork(ThreadCleanUp*)
 ACR_ExecutionRoutine
 CEEPGTFN
 CEEOPCMM
 ---------------------------------------------------

An additional abend 0C4 can occur during recovery from the
previous abend.

 This happens when the routine 'RestartThreadRtn' gets control
 to clean up after a previous abend and deletes the same
 'ORB_Request' object twice.

 RestartThreadRtn:
  - calls cleanupForThreadCatchOrTermination, which issues
    return_ORB_request
  - calls ORB_Request::endOfThreadCleanup, which issues
    return_ORB_request

 As a result, the ORB_Request is deleted twice.
Local fix

Problem summary
****************************************************************
* USERS AFFECTED: All users of WebSphere Application Server    *
*                 V5.0 for z/OS                                *
****************************************************************
* PROBLEM DESCRIPTION: ABEND0C4 in the Controller processing   *
*                      ORB_Requests.                           *
****************************************************************
* RECOMMENDATION:                                              *
****************************************************************
ABEND0C4 in PendingORBRequestBucket::~PendingORBRequestBucket()
in the Controller with the following stack trace:

PendingORBRequestBucket::~PendingORBRequestBucket()
PendingORBRequestBucket::__dftdt()
__vec__delete2
BBO_Hash<PendingORBRequestBucket>::~BBO_Hash()
ClientLocalSession::uninitializeClientLocalSession()
SessionManagerProtocolLocal::putOnSessionStack(Session*)
SessionManager::putOnSessionStack(Session*)
LocalSessionBucket::removeFromSessionCollisionQueue(
  const SessionHandle*)
SessionManagerProtocolLocal::freeSession(SessionHandle*)
SessionManager::freeSession(SessionHandle*)
ORB_Request::comm_cr_sclt_locate_request(
  ORB_Request::Outbound_Locate_Status*)
ORB_Request::comm_outbound_locate()
CORBA::Request::Request(
  CORBA::Object_ORBProxy*,char*,unsigned long)
ORBEJSBridge::create_request(
  JNIEnv_*,bboojorb*,char*,unsigned char,CORBA::Request*&,long)
Java_com_ibm_ws390_orb_ClientDelegate_jorbCreateRequest
...

The problem above was caused by another thread freeing
the ORB_Request while it was still on the outbound pending queue.


Also, ABENDS0C4 in ORB_Request::getSharedDataMemberCDRSeq with
the following stack trace:

RMCDP_Class::takeSVC_Dump(unsigned long)
ORBEyeCatcher::setUnallocated(char*,char*,unsigned int)
ORB_Request::~ORB_Request()
BBO_BOA::return_ORB_request(ORB_Request*)
ORB_Request::endOfThreadCleanup()
RestartThreadRtn
CleanUpList::call_cleanup_routines(btcb*)
CallThisThreadCleanUpRoutines(btcb*)
RasAtThreadExit
boss_thread_destructor
CEEPGTFN
CEEUCALL
CEEOXKTD
CEEOPE
pthread_exit
RasSignalHandler2
panicSignalHandler
sysSignalCatchHandler
userSignalHandler
intrDispatch
@@GETFN
__zerros
CEEHDSP
PendingORBRequestBucket::~PendingORBRequestBucket()
PendingORBRequestBucket::__dftdt()
__vec__delete2
BBO_Hash<PendingORBRequestBucket>::~BBO_Hash()
ClientLocalSession::uninitializeClientLocalSession()
SessionManagerProtocolLocal::putOnSessionStack(Session*)
SessionManager::putOnSessionStack(Session*)
LocalSessionBucket::removeFromSessionCollisionQueue(const S...)
SessionManagerProtocolLocal::freeSession(SessionHandle*)
SessionManager::freeSession(SessionHandle*)
ORB_Request::comm_cr_sclt_locate_request(ORB_Request::Outbo...)
ORB_Request::comm_outbound_locate()
CORBA::Request::Request(CORBA::Object_ORBProxy*,char*,unsig...)
ORBEJSBridge::create_request(JNIEnv_*,bboojorb*,char*,unsig...)
Java_com_ibm_ws390_orb_ClientDelegate_jorbCreateRequest
com/ibm/ws390/orb/ClientDelegate.jorbCreateRequest(I.BZZLco...)
mmipSelectInvokeJavaMethod
...
mmipSelectInvokeJavaMethod
INVOKDMY
EXECJAVA
mmipExecuteJava
xeRunJniMethod
jni_CallStaticObjectMethodA
ORBEJSBridge::CORBAinvoke(void*)
threadDispatch(BOSS_Object_Key*,Internal_CORBA_Request&,ORB...)
ACR_ExecutionThread::ProcessInboundRequest(acrwObj*,ThreadC...)
ACR_ExecutionThread::RemoveAndProcessWork(ThreadCleanUp*)
ACR_ExecutionRoutine

The problem immediately above was caused by a double
delete of the ORB_Request by thread recovery code.
Problem conclusion
Code has been modified in the exploiters of the Comm outbound
pending queue to not free the ORB_Request while it is still
queued.
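
The rule described above (do not free a request while it is still queued) can be illustrated with a minimal C++ sketch. The names Request, PendingQueue, and releaseIfNotQueued are hypothetical and are not taken from the product source; the sketch only shows one way such a guard might look, assuming a single lock protects both the queue and the queued flag.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <deque>
#include <mutex>

// Hypothetical stand-in for an ORB request; not the product's ORB_Request.
struct Request {
    explicit Request(int i) : id(i), queued(false) {}
    int id;
    bool queued;  // true while the request sits on the pending queue
};

// Hypothetical pending queue. One mutex guards the queue and the queued flag.
class PendingQueue {
public:
    void enqueue(Request* r) {
        assert(r != nullptr);
        std::lock_guard<std::mutex> lock(mu_);
        r->queued = true;
        q_.push_back(r);
    }

    // Take the request off the queue. Returns true if it was queued.
    bool remove(Request* r) {
        std::lock_guard<std::mutex> lock(mu_);
        std::deque<Request*>::iterator it = std::find(q_.begin(), q_.end(), r);
        if (it == q_.end()) return false;
        q_.erase(it);
        r->queued = false;
        return true;
    }

    // Free the request only if it is no longer queued. Returns true if freed.
    bool releaseIfNotQueued(Request* r) {
        std::lock_guard<std::mutex> lock(mu_);
        if (r->queued) return false;  // still queued: another path owns it
        delete r;
        return true;
    }

    std::size_t size() const {
        std::lock_guard<std::mutex> lock(mu_);
        return q_.size();
    }

private:
    mutable std::mutex mu_;
    std::deque<Request*> q_;
};
```

In this sketch a caller that wants to free a request must first remove it from the queue; a release attempted while the request is still queued is refused rather than leaving a dangling pointer on the queue.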

Code has been modified in the thread recovery code to not
free the ORB_Request twice.
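
One common way to make two cleanup paths safe against a double free is to clear the owning pointer the first time the object is returned. The following C++ sketch assumes that approach; ThreadState, returnRequestOnce, and restartThread are hypothetical names used for illustration and do not describe the actual change made in the product.

```cpp
#include <cassert>

// Hypothetical stand-in for an ORB request.
struct Request { int id; };

// Hypothetical per-thread state holding the request the thread was processing.
struct ThreadState {
    Request* current = nullptr;
    int returns = 0;  // number of times a request was actually returned
};

// Return the thread's request at most once. A later call finds nullptr
// and does nothing, so two cleanup paths cannot free the same object.
bool returnRequestOnce(ThreadState& ts) {
    if (ts.current == nullptr) return false;
    Request* r = ts.current;
    ts.current = nullptr;  // clear ownership before freeing
    delete r;
    ++ts.returns;
    return true;
}

// Two cleanup paths that both try to return the request, mirroring the
// two return_ORB_request calls in the error description.
void cleanupForCatchOrTermination(ThreadState& ts) { returnRequestOnce(ts); }
void endOfThreadCleanup(ThreadState& ts) { returnRequestOnce(ts); }

// Runs both cleanup paths and reports how many frees actually happened.
int restartThread(ThreadState& ts) {
    cleanupForCatchOrTermination(ts);
    endOfThreadCleanup(ts);
    return ts.returns;
}
```

With this guard, running both cleanup paths in sequence frees the request exactly once.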

APAR PQ95276 requires a change to documentation.
________________________________________________________________
WebSphere Application Server V5 for z/OS
Messages and Codes
GA22-7915-01
_______________________________________________________________

NOTE: Periodically, we refresh the documentation on our
Web site, so the changes might have been made before you
read this text. To access the latest on-line
documentation, go to the product library page at:

www.ibm.com/software/webservers/appserv/zos_os390/library.html

________________________________________________________________
Chapter 3, pg. 125 (new message)
Message identifier - C9C21337
Explanation: A communication failure was detected while attempting
             to drive a locate request to the daemon.
Suggested Action: Check the error log for a communication failure
             error message.
________________________________________________________________
Chapter 3, pg. 125 (new message)
Message identifier - C9C21338
Explanation: A communication failure was detected while attempting
             to drive a locate request to the daemon.
Suggested Action: Check the error log for a communication failure
             error message.
________________________________________________________________
Chapter 3, pg. 125 (new message)
Message identifier - C9C21339
Explanation: A communication failure was detected while attempting
             to drive a locate request to the daemon.
Suggested Action: Check the error log for a communication failure
             error message.
________________________________________________________________
Chapter 3, pg. 125 (new message)
Message identifier - C9C2133A
Explanation: A communication failure was detected while attempting
             to drive an outbound request.
Suggested Action: Check the error log for a communication failure
             error message.
________________________________________________________________
Chapter 3, pg. 125 (new message)
Message identifier - C9C2133B
Explanation: A communication failure was detected while attempting
             to drive an outbound request.
Suggested Action: Check the error log for a communication failure
             error message.

APAR PQ95276 is associated with SERVICE LEVEL W502018 of
WebSphere Application Server V5.0 for z/OS.
Temporary fix

Comments
APAR information
APAR number PQ95276
Reported component name WEBSPHERE FOR Z
Reported component ID 5655I3500
Reported release 500
Status CLOSED PER
PE NoPE
HIPER NoHIPER
Special Attention NoSpecatt
Submitted date 2004-10-04
Closed date 2004-11-12
Last modified date 2004-12-02

APAR is sysrouted FROM one or more of the following:

APAR is sysrouted TO one or more of the following:
PQ95737

Modules/Macros
BBOUBINF NONE        

Publications Referenced

Fix information
Fixed component name WEBSPHERE FOR Z
Fixed component ID 5655I3500

Applicable component levels
R500 PSY UQ95030    UP04/11/18 P F411



Document Information


Current web document: swg1PQ95276.html
Software version: 500
Reference #: PQ95276
IBM Group: Software Group
Modified date: Dec 2, 2004