PQ90324: Throttle for the XA recovery work

APAR status
Closed as program error.

Error description
The servant region becomes unresponsive (hangs).

The dump shows threads in the controller regions (CRs) suspended in
ORB_Request::comm_outbound_ctl_local_request(), and threads in the
servant regions (SRs) suspended in BBOOSOUT (running in the CR
address space), which was called from
ORB_Request::comm_outbound_request().

Typical tracebacks are as follows:

- Controller Region -
  ORB_Request::comm_outbound_ctl_local_request()
  ORB_Request::comm_outbound_request()
  CORBA::Request::invoke()
  xarecovery::XARecoveryAgent_ORBProxy::commit(const xarecovery::XID&,const xarecovery::_IDL
  BBOT_Syncpoint_UWResolver::resolveXaResources(char,char,SEQUENCE_octet*,BBOT_Syncpoint_UWRe
  BBOT_Syncpoint_UWResolver::resolve()
  BBOT_Syncpoint_UWResolver::transactionEventExit()
  BBOT_TransactionAlarm::TimedEventExit()
  Snoozer::invokeSnoozeAlarm()
  ACR_ExecutionThread::ProcessTimeoutEvent(acrwObj*)
  ACR_ExecutionThread::RemoveAndProcessWork(ThreadCleanUp*)
  ACR_ExecutionRoutine

- Servant Region -
  BBOOSOUT
  ORB_Request::comm_outbound_request()
  CORBA::Request::invoke()
  ORBEJSBridge::invoke_request(JNIEnv_*,bboojorb*,char*,unsigned char,CORBA::Request*&,void*)
  ORBEJSBridge::build_and_invoke_request(JNIEnv_*,bboojorb*,char*,int,CORBA::Request*&)
  Java_com_ibm_ws390_orb_ClientDelegate_jorbInvokeRequest
  com/ibm/ws390/orb/ClientDelegate.jorbInvokeRequest(I.BIZI).B
  mmipSelectInvokeJavaMethod jorbInvokeRequest
  mmipSelectInvokeJavaMethod invoke

Other symptoms that may be seen are OTS timeouts.

Local fix

Problem summary
****************************************************************
* USERS AFFECTED: All users of WebSphere Application Server    *
*                 V5.0 for z/OS                                *
****************************************************************
* PROBLEM DESCRIPTION: When a workload uses XA resources, the  *
*                      WebSphere server may be unable to       *
*                      process any work after a controller or  *
*                      servant region is recycled.             *
****************************************************************
* RECOMMENDATION:                                              *
****************************************************************
When a WebSphere controller or servant region is recycled, the
transaction service must perform recovery with any XA resources
that were used in the recycled address space.  XA recovery can run
on one or more threads, depending on how much recovery work
there is to do.  In the reported case, every thread in the
controller was processing XA recovery work, and each needed to
make a call to the servant region.  At the same time, the
servant region was taking on new work and needed to contact
the controller to process naming requests.  With no controller
thread free to service those requests, a deadlock resulted and
the server could not process any of the work that was in
progress.

Problem conclusion
Code was added to the transaction service to ensure that the
XA recovery processing does not use every thread in the
controller region.  This will allow some threads to be free to
process other work within the controller, or work which needs
to be processed on behalf of its servant regions.  If there are
more XA recovery requests currently pending than the number of
threads available, the requests will be retried later when the
current XA recovery requests have been completed.
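
The throttling idea described above can be illustrated with a minimal
sketch.  This is not the code shipped in the fix; the class name
RecoveryThrottle, its methods, and the pool_size and reserved
parameters are hypothetical, and the APAR does not state how many
threads are held back.  The sketch assumes a fixed-size thread pool in
which at least one thread is always kept free of XA recovery work, and
in which recovery requests that cannot run immediately are deferred
and retried when a running request completes.

```python
import threading
from collections import deque


class RecoveryThrottle:
    """Caps how many pool threads may run XA recovery at once.

    Hypothetical illustration only; all names and the reservation
    policy are assumptions, not taken from the product.
    """

    def __init__(self, pool_size, reserved):
        if not 0 < reserved < pool_size:
            raise ValueError("reserved must be between 1 and pool_size - 1")
        # Only (pool_size - reserved) threads may do recovery work.
        self._slots = threading.BoundedSemaphore(pool_size - reserved)
        self._pending = deque()
        self._lock = threading.Lock()

    def submit(self, request):
        """Return True if the request may run now, else defer it."""
        if self._slots.acquire(blocking=False):
            return True
        with self._lock:
            self._pending.append(request)
        return False

    def complete(self):
        """Free a slot; return one deferred request to retry, or None."""
        self._slots.release()
        with self._lock:
            return self._pending.popleft() if self._pending else None
```

A caller would invoke submit() before starting each recovery request
and complete() when one finishes, resubmitting whatever complete()
returns.  Because the non-blocking acquire never parks a pool thread,
the reserved threads stay available for other controller work, such as
requests arriving from the servant regions.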

APAR PQ90324 is associated with SERVICE LEVEL W502013 of
WebSphere Application Server V5.0 for z/OS.

Temporary fix

Comments

APAR information
APAR number: PQ90324
Reported component name: WEBSPHERE FOR Z
Reported component ID: 5655I3500
Reported release: 500
Status: CLOSED PER
PE: NoPE
HIPER: NoHIPER
Special Attention: NoSpecatt
Submitted date: 2004-06-17
Closed date: 2004-07-20
Last modified date: 2004-08-04

APAR is sysrouted FROM one or more of the following:

APAR is sysrouted TO one or more of the following:
PQ90325

Modules/Macros
BBOUBINF          

Publications Referenced

Fix information
Fixed component name: WEBSPHERE FOR Z
Fixed component ID: 5655I3500

Applicable component levels
R500 PSY UQ90831    UP04/07/27 P F407
