PK16066: RANDOMLY, APPLICATION SERVERS ARE NOT BEING RESTARTED DURING NODEAGENT START, AS PER THE MONITORING POLICY SETTING.

 Fixes are available

5.1.1.17: WebSphere Application Server V5.1.1 Cumulative Fix 17 for AIX
5.1.1.17: WebSphere Application Server V5.1.1 Cumulative Fix 17 for HP-UX
5.1.1.19: WebSphere Application Server V5.1.1 Cumulative Fix 19 for Linux
5.1.1.16: WebSphere Application Server V5.1.1 Cumulative Fix 16 for AIX
5.1.1.18: WebSphere Application Server V5.1.1 Cumulative Fix 18 for AIX
5.1.1.18: WebSphere Application Server V5.1.1 Cumulative Fix 18 for HP-UX
5.1.1.18: WebSphere Application Server V5.1.1 Cumulative Fix 18 for Solaris
5.1.1.18: WebSphere Application Server V5.1.1 Cumulative Fix 18 for Windows
5.1.1.18: WebSphere Application Server V5.1.1 Cumulative Fix 18 for Linux
5.1.1.17: WebSphere Application Server V5.1.1 Cumulative Fix 17 for Linux
5.1.1.17: WebSphere Application Server V5.1.1 Cumulative Fix 17 for Solaris
5.1.1.17: WebSphere Application Server V5.1.1 Cumulative Fix 17 for Windows
5.0.2.17: WebSphere Application Server 5.0.2 Cumulative Fix 17 for Solaris
5.0.2.17: WebSphere Application Server 5.0.2 Cumulative Fix 17 for Windows
5.1.1.10: WebSphere Application Server V5.1.1 Cumulative Fix 10 for Windows
5.1.1.10: WebSphere Application Server V5.1.1 Cumulative Fix 10 for AIX
5.1.1.19: WebSphere Application Server V5.1.1 Cumulative Fix 19 for AIX
5.1.1.19: WebSphere Application Server V5.1.1 Cumulative Fix 19 for Windows
5.1.1.9: WebSphere Application Server V5.1.1 Cumulative Fix 9 for Solaris
5.1.1.9: WebSphere Application Server V5.1.1 Cumulative Fix 9 for AIX
5.1.1.9: WebSphere Application Server V5.1.1 Cumulative Fix 9 for Windows
5.0.2.17: WebSphere Application Server 5.0.2 Cumulative Fix 17 for HP-UX
5.0.2.17: WebSphere Application Server 5.0.2 Cumulative Fix 17 for AIX
5.1.1.11: WebSphere Application Server V5.1.1 Cumulative Fix 11 for AIX
5.0.2.17: WebSphere Application Server 5.0.2 Cumulative Fix 17 for Linux
5.1.1.10: WebSphere Application Server V5.1.1 Cumulative Fix 10 for HP-UX
5.1.1.10: WebSphere Application Server V5.1.1 Cumulative Fix 10 for Linux
5.1.1.9: WebSphere Application Server V5.1.1 Cumulative Fix 9 for HP-UX
5.1.1.9: WebSphere Application Server V5.1.1 Cumulative Fix 9 for Linux
5.0.2.16: WebSphere Application Server 5.0.2 Cumulative Fix 16 for HP-UX
5.1.1.12: WebSphere Application Server V5.1.1 Cumulative Fix 12 for Windows
5.0.2.16: WebSphere Application Server 5.0.2 Cumulative Fix 16 for Solaris
5.0.2.16: WebSphere Application Server 5.0.2 Cumulative Fix 16 for Windows
5.0.2.16: WebSphere Application Server 5.0.2 Cumulative Fix 16 for AIX
5.1.1.11: WebSphere Application Server V5.1.1 Cumulative Fix 11 for Windows
5.1.1.16: WebSphere Application Server V5.1.1 Cumulative Fix 16 for Solaris
5.0.2.18: WebSphere Application Server 5.0.2 Cumulative Fix 18 for Solaris
5.1.1.11: WebSphere Application Server V5.1.1 Cumulative Fix 11 for Linux
5.0.2.18: WebSphere Application Server 5.0.2 Cumulative Fix 18 for Windows
5.0.2.18: WebSphere Application Server 5.0.2 Cumulative Fix 18 for HP-UX
5.0.2.18: WebSphere Application Server 5.0.2 Cumulative Fix 18 for AIX
5.1.1.16: WebSphere Application Server V5.1.1 Cumulative Fix 16 for Windows
5.1.1.14: WebSphere Application Server V5.1.1 Cumulative Fix 14 for Solaris
5.1.1.12: WebSphere Application Server V5.1.1 Cumulative Fix 12 for AIX
5.1.1.12: WebSphere Application Server V5.1.1 Cumulative Fix 12 for Linux
5.1.1.12: WebSphere Application Server V5.1.1 Cumulative Fix 12 for HP-UX
5.1.1.12: WebSphere Application Server V5.1.1 Cumulative Fix 12 for Solaris
5.1.1.11: WebSphere Application Server V5.1.1 Cumulative Fix 11 for Solaris
5.1.1.13: WebSphere Application Server V5.1.1 Cumulative Fix 13 for AIX
5.1.1.13: WebSphere Application Server V5.1.1 Cumulative Fix 13 for Windows
5.1.1.13: WebSphere Application Server V5.1.1 Cumulative Fix 13 for HP-UX
5.1.1.15: WebSphere Application Server V5.1.1 Cumulative Fix 15 for Solaris
5.1.1.13: WebSphere Application Server V5.1.1 Cumulative Fix 13 for Solaris
5.1.1.13: WebSphere Application Server V5.1.1 Cumulative Fix 13 for Linux
5.1.1.14: WebSphere Application Server V5.1.1 Cumulative Fix 14 for AIX
5.1.1.14: WebSphere Application Server V5.1.1 Cumulative Fix 14 for Linux
5.1.1.14: WebSphere Application Server V5.1.1 Cumulative Fix 14 for Windows
5.1.1.15: WebSphere Application Server V5.1.1 Cumulative Fix 15 for Windows
5.0.2.18: WebSphere Application Server 5.0.2 Cumulative Fix 18 for Linux
5.1.1.11: WebSphere Application Server V5.1.1 Cumulative Fix 11 for HP-UX
5.1.1.14: WebSphere Application Server V5.1.1 Cumulative Fix 14 for HP-UX
5.0.2.16: WebSphere Application Server 5.0.2 Cumulative Fix 16 for Linux
5.1.1.10: WebSphere Application Server V5.1.1 Cumulative Fix 10 for Solaris
5.1.1.15: WebSphere Application Server V5.1.1 Cumulative Fix 15 for AIX
5.1.1.15: WebSphere Application Server V5.1.1 Cumulative Fix 15 for HP-UX
5.1.1.16: WebSphere Application Server V5.1.1 Cumulative Fix 16 for HP-UX
5.1.1.16: WebSphere Application Server V5.1.1 Cumulative Fix 16 for Linux
5.1.1.15: WebSphere Application Server V5.1.1 Cumulative Fix 15 for Linux
5.1.1.19: WebSphere Application Server V5.1.1 Cumulative Fix 19 for HP-UX



APAR status
Closed as program error.

Error description
When the nodeagent starts, all servers with a monitor state
of RUNNING or PREVIOUS are verified if they should be started.
If they should be started, the nodeagent will check to see if
it can find a valid pid.  If the pid exist the nodeagent will
not start the server.  This has problems when the client
shutdown the whole WebSphere system (stopNode -stopservers).
Now if the nodeagent finds the server pid, we know this is not
the server so it should should still start the process.  This
fix addresses the problems left from 
PK02127 to make sure
this work 100%.
Local fix Problem summary
****************************************************************
* USERS AFFECTED: Websphere Application Server users with a    *
*                 Network Deployment setup                     *
****************************************************************
* PROBLEM DESCRIPTION: Randomly, the application servers are   *
*                      not being restarted during nodeagent    *
*                      start, as per the monitoring policy     *
*                      setting.                                *
****************************************************************
* RECOMMENDATION:                                              *
****************************************************************
The failure occurs when the pids are resused across system(OS)
startups. Basically, the pids for each server are stored in the
monitor.state file as a part of process monitoring.
During nodeagent restart, it reads the pid for each server
from monitor.state and checks if any process exists with
the given pid. If it finds the one, it adopts the
corresponding process and starts monitoring it. This could
lead to a problem if the Operating System allocates the old
pid to a new process after a machine reboot. The nodeagent now
has no idea whether the process associated with the given pid
is indeed an application server process and therefore does not
make any attempt to start the application server.
Problem conclusion
1) It is recommended to completely shutdown the node before
   rebooting the machine using the following command
   stopNode.sh -stopservers.
2) Also, the code has been modified to not to persist the pid
   values in monitor.state,in case the stopNode command is
   called.

The fix for this APAR is currently targeted for inclusion
in cumulative fix 5.02.16 & 5.1.1.9.
Please refer to the recommended updates page for delivery
information:

http://www.ibm.com/support/docview.wss?rs=180&uid=swg27004980
Temporary fix Comments
APAR information
APAR number PK16066
Reported component name WAS NETWRK DEPL
Reported component ID 5630A3601
Reported release 10W
Status CLOSED PER
PE NoPE
HIPER NoHIPER
Special Attention NoSpecatt
Submitted date 2005-12-01
Closed date 2006-01-06
Last modified date 2006-02-05

APAR is sysrouted FROM one or more of the following:

APAR is sysrouted TO one or more of the following:

Modules/Macros
ADMIN          

Publications Referenced

Fix information
Fixed component name WAS NETWRK DEPL
Fixed component ID 5630A3601

Applicable component levels
R003 PSY    UP
R00A PSY    UP
R00H PSY    UP
R00I PSY    UP
R00P PSY    UP
R00S PSY    UP
R00W PSY    UP
R103 PSY    UP
R10A PSY    UP
R10H PSY    UP
R10I PSY    UP
R10P PSY    UP
R10S PSY    UP
R10W PSY    UP


Document Information


Product categories: Software > Application Servers > Distributed Application & Web Servers > WebSphere Application Server > General
Operating system(s):
Software version: 10W
Software edition:
Reference #: PK16066
IBM Group: Software Group
Modified date: Feb 5, 2006