PK04318: KILLING NODEAGENT & SERVERS SIMULTANEOUSLY: WLM DOES NOT FAILBACK PROPERLY.
APAR status
Closed as program error.

Error description
The following steps describe the error scenario:

Machine 1: J2EE client application
Machine 2: Appserver 1
Machine 3: Appserver 2

Appservers 1 and 2 are clustered.

1) Bring up all systems in the cluster with a clean start.
2) Start the client in a view that requires RMI calls to server EJBs.
3) Allow the client to quiesce.
4) Kill the Java processes on machine 1.
5) Perform an action on the client requiring RMI calls to server EJBs. This should succeed.
6) Reboot machine 1 and restart the nodeagent and servers on it.
7) Kill the Java processes on machine 2.
8) Repeat the action in step 5. This fails, and a restart of the client is required to continue.

The exception which might result is the following:

org.omg.CORBA.TRANSIENT: java.net.SocketException: Operation timed out: connect:could be due to invalid address:host=wctnd006.notesdev.ibm.com,port=9900 vmcid: IBM minor code: E02 completed: No

The problem occurs because the epoch on the cluster descriptions is not being updated when a node agent and its server are restarted after being killed.

Local fix

Problem summary
****************************************************************
* USERS AFFECTED: WebSphere Application Server users of WLM,   *
*                 WorkloadManagement, or Clustering            *
****************************************************************
* PROBLEM DESCRIPTION: When killing nodeagents and servers     *
*                      SIMULTANEOUSLY, WLM does not recover    *
*                      and fail back to the servers when       *
*                      they are restarted                      *
****************************************************************
* RECOMMENDATION:                                              *
****************************************************************
When killing nodeagents and servers simultaneously, WLM does not recover and fail back to the servers when they are restarted.
The problem occurs because the epoch on the cluster descriptions is not being updated when a node agent and its server are restarted after being killed.

Problem conclusion
Fixed this by ensuring the epoch is changed. The fix for this APAR is currently targeted for inclusion in fixpack WBI 5.1.1.2, and is an iFix only for 5.0.2.X PME.

Temporary fix

Comments
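As a rough illustration of why an unchanged epoch can leave a client stuck, the sketch below models a client that replaces its cached cluster description only when the incoming epoch is newer. This is an assumption about the general mechanism, not the WLM implementation: the class names, the `simulate` method, the server names, and the rule that a newer description clears the client's failure marks are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model for illustration only; none of these names are WebSphere APIs.
final class EpochSketch {

    /** Cluster membership snapshot tagged with an epoch (version number). */
    static final class ClusterDescription {
        final long epoch;
        final List<String> members;

        ClusterDescription(long epoch, List<String> members) {
            this.epoch = epoch;
            this.members = Collections.unmodifiableList(new ArrayList<>(members));
        }
    }

    /** Client-side view: accepts a new description only when its epoch is newer. */
    static final class ClientView {
        private ClusterDescription current;
        private final Set<String> markedDown = new HashSet<>();

        ClientView(ClusterDescription initial) {
            this.current = initial;
        }

        /** Called when a request to a member fails; the client stops routing to it. */
        void reportFailure(String member) {
            markedDown.add(member);
        }

        /** Returns true if the cached description was replaced. */
        boolean offer(ClusterDescription incoming) {
            if (incoming.epoch <= current.epoch) {
                return false; // not newer: keep the existing (possibly stale) view
            }
            current = incoming;
            markedDown.clear(); // assumed rule: a newer description resets failure marks
            return true;
        }

        /** Members in the cached description that are not marked down. */
        List<String> usableMembers() {
            List<String> usable = new ArrayList<>(current.members);
            usable.removeAll(markedDown);
            return usable;
        }
    }

    /** Replays a kill/restart/kill sequence and returns the members the client can still use. */
    static List<String> simulate(boolean bumpEpochOnRestart) {
        List<String> all = Arrays.asList("server1", "server2");
        ClientView client = new ClientView(new ClusterDescription(1, all));

        client.reportFailure("server1");                  // first server killed; client fails over
        long epochAfterRestart = bumpEpochOnRestart ? 2 : 1;
        client.offer(new ClusterDescription(epochAfterRestart, all)); // first server restarted
        client.reportFailure("server2");                  // second server killed

        return client.usableMembers();
    }

    public static void main(String[] args) {
        System.out.println("epoch unchanged: " + simulate(false));
        System.out.println("epoch changed:   " + simulate(true));
    }
}
```

In this model, `simulate(false)` leaves the client with no usable members, which resembles the failure in step 8, while `simulate(true)` leaves `server1` usable because the newer epoch caused the client to refresh its view.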
APAR is sysrouted FROM one or more of the following:

APAR is sysrouted TO one or more of the following:

Modules/Macros
Publications Referenced
Product categories: Software > Application Servers > Distributed Application & Web Servers > WebSphere Application Server > Enterprise Edition (EE)
Operating system(s):
Software version: 00A
Software edition:
Reference #: PK04318
IBM Group: Software Group
Modified date: Apr 29, 2005
(C) Copyright IBM Corporation 2000, 2008. All Rights Reserved.