PK29781: WLM ROUTING FAILS AFTER DMGR RESTART

 Fixes are available

5.1.1.17: WebSphere Application Server V5.1.1 Cumulative Fix 17 for AIX
5.1.1.17: WebSphere Application Server V5.1.1 Cumulative Fix 17 for HP-UX
5.1.1.19: WebSphere Application Server V5.1.1 Cumulative Fix 19 for Linux
5.1.1.16: WebSphere Application Server V5.1.1 Cumulative Fix 16 for AIX
5.1.1.18: WebSphere Application Server V5.1.1 Cumulative Fix 18 for AIX
5.1.1.18: WebSphere Application Server V5.1.1 Cumulative Fix 18 for HP-UX
5.1.1.18: WebSphere Application Server V5.1.1 Cumulative Fix 18 for Solaris
5.1.1.18: WebSphere Application Server V5.1.1 Cumulative Fix 18 for Windows
5.1.1.18: WebSphere Application Server V5.1.1 Cumulative Fix 18 for Linux
5.1.1.17: WebSphere Application Server V5.1.1 Cumulative Fix 17 for Linux
5.1.1.17: WebSphere Application Server V5.1.1 Cumulative Fix 17 for Solaris
5.1.1.17: WebSphere Application Server V5.1.1 Cumulative Fix 17 for Windows
5.1.1.19: WebSphere Application Server V5.1.1 Cumulative Fix 19 for AIX
5.1.1.19: WebSphere Application Server V5.1.1 Cumulative Fix 19 for Windows
5.1.1.12: WebSphere Application Server V5.1.1 Cumulative Fix 12 for Windows
5.1.1.16: WebSphere Application Server V5.1.1 Cumulative Fix 16 for Solaris
5.1.1.16: WebSphere Application Server V5.1.1 Cumulative Fix 16 for Windows
5.1.1.14: WebSphere Application Server V5.1.1 Cumulative Fix 14 for Solaris
5.1.1.12: WebSphere Application Server V5.1.1 Cumulative Fix 12 for AIX
5.1.1.12: WebSphere Application Server V5.1.1 Cumulative Fix 12 for Linux
5.1.1.12: WebSphere Application Server V5.1.1 Cumulative Fix 12 for HP-UX
5.1.1.12: WebSphere Application Server V5.1.1 Cumulative Fix 12 for Solaris
5.1.1.13: WebSphere Application Server V5.1.1 Cumulative Fix 13 for AIX
5.1.1.13: WebSphere Application Server V5.1.1 Cumulative Fix 13 for Windows
5.1.1.13: WebSphere Application Server V5.1.1 Cumulative Fix 13 for HP-UX
5.1.1.15: WebSphere Application Server V5.1.1 Cumulative Fix 15 for Solaris
5.1.1.13: WebSphere Application Server V5.1.1 Cumulative Fix 13 for Solaris
5.1.1.13: WebSphere Application Server V5.1.1 Cumulative Fix 13 for Linux
5.1.1.14: WebSphere Application Server V5.1.1 Cumulative Fix 14 for AIX
5.1.1.14: WebSphere Application Server V5.1.1 Cumulative Fix 14 for Linux
5.1.1.14: WebSphere Application Server V5.1.1 Cumulative Fix 14 for Windows
5.1.1.15: WebSphere Application Server V5.1.1 Cumulative Fix 15 for Windows
5.1.1.14: WebSphere Application Server V5.1.1 Cumulative Fix 14 for HP-UX
5.1.1.15: WebSphere Application Server V5.1.1 Cumulative Fix 15 for AIX
5.1.1.15: WebSphere Application Server V5.1.1 Cumulative Fix 15 for HP-UX
5.1.1.16: WebSphere Application Server V5.1.1 Cumulative Fix 16 for HP-UX
5.1.1.16: WebSphere Application Server V5.1.1 Cumulative Fix 16 for Linux
5.1.1.15: WebSphere Application Server V5.1.1 Cumulative Fix 15 for Linux
5.1.1.19: WebSphere Application Server V5.1.1 Cumulative Fix 19 for HP-UX



APAR status
Closed as program error.

Error description
The deployment manager has reconnect logic that runs when the
deployment manager is restarted while the rest of the cell
remains up. PK16480 rewrote this logic to be multithreaded,
because the single-threaded version blocked WLM routing until it
completed. The multithreaded version has a timing window in
which reconnections do not happen when a node hosts cluster
members from more than one cluster.
Local fix

Problem summary
****************************************************************
* USERS AFFECTED: WebSphere Application Server Network         *
*                 Deployment users restarting their deployment *
*                 manager on versions 5.0.2.16+ and            *
*                 5.1.1.9-5.1.1.11                             *
****************************************************************
* PROBLEM DESCRIPTION: After a Deployment Manager restart      *
*                      with the rest of the cell running,      *
*                      Workload Management (WLM) routing may   *
*                      not function.                           *
****************************************************************
* RECOMMENDATION:                                              *
****************************************************************
In a clustered topology in WebSphere Application Server
Version 5, when the deployment manager is restarted, JNDI
lookups of EJBs that are deployed to the cluster may fail with
the following error:
.
org.omg.CORBA.NO_IMPLEMENT: No Useable Targets  minor code: 1229066304  completed: No
at com.ibm.ws.wlm.client.selection.WLMLSDRouter.getNextTarget(WLMLSDRouter.java:214)
...
.
This occurs because, when the deployment manager starts, there is
logic that tries to reconnect to all of the nodes and cluster
members within the cell. PK16480 made this logic multithreaded,
but internal and customer testing at the time did not reveal an
additional bug on this path, which was found only after PK16480
was applied. If a single node hosts cluster members from more
than one cluster, only one of those clusters ends up reconnected,
which can cause the NO_IMPLEMENT error above on some lookups but
not others.
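
The affected topology described above (a node hosting members of
more than one cluster) can be checked for with a small script. The
following is a minimal sketch only: the class name, the method name,
and the (node, cluster) input format are hypothetical and are not
part of any WebSphere API. It assumes the membership list has already
been gathered from the cell configuration by some other means.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical helper; none of these names come from WebSphere.
class MultiClusterNodeCheck {

    /**
     * Takes (node, cluster) pairs, one per cluster member, and returns
     * only the nodes that host members of two or more distinct clusters.
     */
    static Map<String, Set<String>> nodesWithMultipleClusters(List<String[]> members) {
        Map<String, Set<String>> clustersByNode = new TreeMap<>();
        for (String[] m : members) {
            // m[0] is the node name, m[1] is the cluster name (assumed format).
            clustersByNode.computeIfAbsent(m[0], k -> new TreeSet<>()).add(m[1]);
        }
        // Drop nodes whose members all belong to a single cluster.
        clustersByNode.values().removeIf(clusters -> clusters.size() < 2);
        return clustersByNode;
    }

    public static void main(String[] args) {
        List<String[]> members = Arrays.asList(
            new String[] {"nodeA", "ClusterX"},
            new String[] {"nodeA", "ClusterY"},
            new String[] {"nodeB", "ClusterX"});
        // Prints {nodeA=[ClusterX, ClusterY]}
        System.out.println(nodesWithMultipleClusters(members));
    }
}
```

Any node reported by such a check matches the condition under which,
per this APAR, only one of its clusters may be reconnected after a
deployment manager restart.
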
Problem conclusion
The code was rewritten to ensure that all clusters will be
reconnected, regardless of the number of different clusters
there may be associated with one node.
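
The APAR does not describe the internals of the fix. Purely as an
illustration of the stated goal (every cluster is reconnected no
matter how many clusters share a node), the sketch below schedules
one task per (node, cluster) pair and waits for all of them. All
names are hypothetical, the "reconnect" is a placeholder that only
records the pair, and this is not IBM's implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical illustration; not the WebSphere WLM code.
class ReconnectSketch {

    /** Runs one reconnect task per (node, cluster) pair and returns the pairs handled. */
    static Set<String> reconnectAll(Map<String, List<String>> clustersByNode) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        Set<String> reconnected = ConcurrentHashMap.newKeySet(); // thread-safe set
        try {
            List<Callable<Void>> tasks = new ArrayList<>();
            for (Map.Entry<String, List<String>> e : clustersByNode.entrySet()) {
                for (String cluster : e.getValue()) {
                    // Work is keyed by node AND cluster, so two clusters on
                    // the same node are tracked as separate units of work.
                    String key = e.getKey() + "/" + cluster;
                    tasks.add(() -> {
                        reconnected.add(key); // placeholder for a real reconnect
                        return null;
                    });
                }
            }
            pool.invokeAll(tasks); // blocks until every task has completed
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while reconnecting", ex);
        } finally {
            pool.shutdown();
        }
        return reconnected;
    }

    public static void main(String[] args) {
        Map<String, List<String>> topology = new LinkedHashMap<>();
        topology.put("nodeA", Arrays.asList("ClusterX", "ClusterY"));
        topology.put("nodeB", Arrays.asList("ClusterX"));
        // Prints [nodeA/ClusterX, nodeA/ClusterY, nodeB/ClusterX]
        System.out.println(new TreeSet<>(reconnectAll(topology)));
    }
}
```
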

The fix for this APAR is currently targeted for inclusion into
the 5.1.1.12 fixpack.
Please refer to the recommended updates page for delivery
information:

http://www.ibm.com/support/docview.wss?rs=180&uid=swg27004980

No more fixpacks are currently scheduled for version 5.0.2, so
customers who need this fix on that version must obtain an iFix
through Level 2 support.
Temporary fix

Comments

APAR information
APAR number PK29781
Reported component name WAS NETWRK DEPL
Reported component ID 5630A3601
Reported release 10A
Status CLOSED PER
PE NoPE
HIPER NoHIPER
Special Attention NoSpecatt
Submitted date 2006-08-11
Closed date 2006-08-28
Last modified date 2006-08-28

APAR is sysrouted FROM one or more of the following:

APAR is sysrouted TO one or more of the following:

Modules/Macros
WLM          

Publications Referenced

Fix information
Fixed component name WAS NETWRK DEPL
Fixed component ID 5630A3601

Applicable component levels
R003 PSY    UP
R00A PSY    UP
R00H PSY    UP
R00I PSY    UP
R00P PSY    UP
R00S PSY    UP
R00W PSY    UP
R103 PSY    UP
R10A PSY    UP
R10H PSY    UP
R10I PSY    UP
R10P PSY    UP
R10S PSY    UP
R10W PSY    UP


Document Information


Product categories: Software > Application Servers > Distributed Application & Web Servers > WebSphere Application Server > General
Operating system(s):
Software version: 10A
Software edition:
Reference #: PK29781
IBM Group: Software Group
Modified date: Aug 28, 2006