APAR status
Closed as program error.
Error description
When a WebSphere process is started from /etc/inittab and its entry
is later removed, either with rmitab or by editing /etc/inittab and
running "init q", the termination signal reaches only the generated
launch script (for example, nxeStartNode) and NOT the Java process.
The customer states that the scripts generated by startNode.sh,
startServer.sh and startManager.sh should use exec for the last
command. Otherwise the scripts cannot be used under /etc/inittab,
because the server process does not receive the termination signal
when its entry is disabled in inittab.
Example Script generated:
[root@waslinux bin]# ./startManager.sh -script nxeStartManager
ADMU0116I: Tool information is being logged in file
/opt/WebSphere/DeploymentManager/logs/dmgr/startServer.log
ADMU3100I: Reading configuration for server: dmgr
ADMU3300I: Launch script for server created: nxeStartManager
In the meantime, the solution you provided does not work: if we
remove an entry from /etc/inittab, either (a) with rmitab or (b) by
editing /etc/inittab, and then run "init q", the WebSphere process
does NOT receive the termination signal.
(1) We use
startNode.sh -script nxeStartNode
to generate the shell script named nxeStartNode.
(2) Then we add an entry like the following to /etc/inittab:
wsio01sv01:23:respawn: /websphere/AppServer/bin/nxeStartNode > /dev/null 2>&1
(3) This starts the process, and restarts it if the process gets
killed.
(4) Problem:
When we remove the entry from /etc/inittab using rmitab, or edit
/etc/inittab and then run "init q", the termination signal goes
only to nxeStartNode and NOT to the java process representing the
nodeagent. (The same is true if the whole thing is done
correspondingly for dmgr, appservers, etc.)
(4a) In similar fashion, the solution you provided suffers from the
same problem, i.e. the termination signal generated by init never
reaches the nodeagent (in this specific case, and other processes
such as dmgr and appServers in general).
(5) Solution:
In the generated script, e.g. nxeStartNode, the last command, which
launches the final process (i.e. the nodeagent), needs to be invoked
using "exec". This is the only established way known to me of
putting a shell script in /etc/inittab. So please change the script
generator code as described above in the startNode.sh,
startManager.sh and startServer.sh tools.
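The effect of "exec" described in (5) can be reproduced with a small, self-contained sketch. Nothing here is WebSphere code: "sleep 30" stands in for the java process, and server.sh, noexec.sh and withexec.sh are hypothetical stand-ins for the server and for the generated launch script. The sketch signals the launch script's PID, as init would, and then checks whether the stand-in server survived.

```shell
#!/bin/sh
# Demonstration only. "sleep 30" stands in for the WebSphere JVM; every
# script name below is a hypothetical stand-in, not generated WebSphere code.
tmp=$(mktemp -d)

# Stand-in server: records its own PID, then becomes the long-running process.
cat > "$tmp/server.sh" <<'EOF'
#!/bin/sh
echo $$ > "$1"
exec sleep 30
EOF

# Launch script WITHOUT exec: the shell stays alive as the server's parent.
cat > "$tmp/noexec.sh" <<'EOF'
#!/bin/sh
"$(dirname "$0")/server.sh" "$1"
exit $?
EOF

# Launch script WITH exec: the shell is replaced by the server process.
cat > "$tmp/withexec.sh" <<'EOF'
#!/bin/sh
exec "$(dirname "$0")/server.sh" "$1"
EOF
chmod +x "$tmp"/*.sh

# Start one launch script, send SIGTERM to the PID that init would know
# about, and record whether the stand-in server is still running afterwards.
run_case() {
    pidfile="$tmp/$1.pid"
    "$tmp/$1" "$pidfile" >/dev/null 2>&1 &
    wrapper_pid=$!
    while [ ! -s "$pidfile" ]; do sleep 1; done
    server_pid=$(cat "$pidfile")
    kill -TERM "$wrapper_pid"
    wait "$wrapper_pid" 2>/dev/null || true
    if kill -0 "$server_pid" 2>/dev/null; then
        result="server still running"
        kill -KILL "$server_pid"      # clean up the orphaned stand-in
    else
        result="server terminated"
    fi
}

run_case noexec.sh
noexec_result=$result
run_case withexec.sh
exec_result=$result

echo "without exec: $noexec_result"
echo "with exec:    $exec_result"
rm -rf "$tmp"
```

With exec, the launch script and the server share one PID, so the signal sent by init lands on the server itself. Without exec, the signal ends the shell and leaves the server orphaned.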
Our sample script rc.was also does not terminate the java process.
The termination signal reaches the script but not the java process.
#!/bin/sh
#
# All Rights Reserved * Licensed Materials - Property of IBM
# US Government Users Restricted Rights - Use, duplication or disclosure
# restricted by GSA ADP Schedule Contract with IBM Corp.
# 5639-D57, 5630-A36, 5630-A37, 5724-D18 (C) COPYRIGHT International Business
# Machines Corp., 1997,2002
# IBM Confidential OCO Source Material
# The source code for this program is not published or otherwise divested
# of its trade secrets, irrespective of what has been deposited with the
# U.S. Copyright Office.
# This script is intended to be used in the inittab file as a monitor
# for a WebSphere process. Typically, the Node Agent server process will
# be monitored and the Deployment Manager process will be monitored.
# Individual application server processes can be monitored using a similar
# script and entry in inittab. An example inittab entry using this
# script would look like the following:
#
# was:2:once:/usr/WebSphere/AppServer/bin/rc.was >/dev/console 2>&1
#
# *** IMPORTANT NOTE:
# IT IS ASSUMED THAT THE startServer.sh <server name> -script
# COMMAND HAS BEEN RUN FOR THE SERVER PROCESS TO BE MONITORED AND THAT
# THE RESULTING start_<server name> SCRIPT IS CONFIGURED IN THIS SCRIPT AS THE
# launchScript VARIABLE.
#
# The server launching script generated by startServer -script is used by this
# monitoring script to launch the process. The user is expected to edit
# this example and modify the value of the launchScript variable below,
# in order to provide the exact WebSphere server process launch
# script file name for the process to be monitored.
#
# This script monitors the server process and checks the exit code when the
# process terminates. If the exit code is not zero (considered normal exit),
# this script will relaunch the server, up to a maximum number of attempts
# (numRetries) at which point this monitor script terminates. The user may
# modify the number of times a relaunch is attempted by editing the value
# of the numRetries variable below.
#
launchScript=start_server1.sh
numRetries=3
binDir=`dirname $0`
# Set the ulimit
LIMIT=`ulimit -n`
if [ "${LIMIT}" != "unlimited" ]
then
    if [ $LIMIT -lt 1024 ]
    then
        ulimit -n 1024
    fi
fi
RETRY=0
while [ $RETRY -lt $numRetries ]
do
    echo launching server using $launchScript
    $binDir/$launchScript
    rc=$?
    echo exit code: $rc
    # Increment retry count on anything other than a normal exit code
    if [ $rc -gt 0 ]
    then
        RETRY=`expr $RETRY + 1`
    fi
    case $rc in
        0) break ;;
    esac
done
exit 0
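Because rc.was has to stay alive to count retries, it cannot simply exec the launch script. One possible alternative, shown here only as a sketch and not as the shipped fix, is for the monitor to run the launch script in the background and forward SIGTERM to it with a trap. The names monitor, forward_term and the PID file are hypothetical, "sleep 30" stands in for $binDir/$launchScript, and the sketch assumes the launch script itself uses exec so that the forwarded signal reaches the java process.

```shell
#!/bin/sh
# Sketch only, not the shipped rc.was. Assumption: the launched command
# replaces itself with the server process (for example via exec), so that
# signalling child_pid reaches the server itself.
child_pid=""
stop_requested=0

# Trap handler: pass the termination request on to the running child.
forward_term() {
    stop_requested=1
    if [ -n "$child_pid" ]; then
        kill -TERM "$child_pid" 2>/dev/null
    fi
}

# monitor <numRetries> <pidfile> <command> [args...]
# Relaunches the command after a non-zero exit, up to numRetries times,
# and stops once SIGTERM has been received and forwarded.
monitor() {
    numRetries=$1
    pidfile=$2
    shift 2
    trap forward_term TERM
    RETRY=0
    while [ "$RETRY" -lt "$numRetries" ]; do
        "$@" &
        child_pid=$!
        echo "$child_pid" > "$pidfile"
        rc=0
        wait "$child_pid" || rc=$?
        if [ "$stop_requested" -eq 1 ]; then
            wait "$child_pid" 2>/dev/null || true   # reap the signalled child
            break
        fi
        if [ "$rc" -eq 0 ]; then
            break
        fi
        RETRY=$((RETRY + 1))
    done
}

# Demonstration: signal the monitor, as init would, and check the child.
demo_dir=$(mktemp -d)
monitor 3 "$demo_dir/child.pid" sleep 30 >/dev/null 2>&1 &
monitor_pid=$!
while [ ! -s "$demo_dir/child.pid" ]; do sleep 1; done
server_pid=$(cat "$demo_dir/child.pid")
kill -TERM "$monitor_pid"
wait "$monitor_pid" 2>/dev/null || true
if kill -0 "$server_pid" 2>/dev/null; then
    child_alive=yes
else
    child_alive=no
fi
echo "child still running after SIGTERM to monitor: $child_alive"
rm -rf "$demo_dir"
```

The second wait matters: a trapped signal makes the first wait return early, so the child has to be collected again before the monitor exits.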
Local fix
In the generated script, e.g. nxeStartNode, the last command, which
launches the final process (i.e. the nodeagent), needs to be invoked
using "exec". This is the only established way known to me of
putting a shell script in /etc/inittab. So please change the script
generator code as described above in the startNode.sh,
startManager.sh and startServer.sh tools.
Problem summary
****************************************************************
* USERS AFFECTED: WebSphere Application Server users employing *
* inittab entries to control server *
* processes. *
****************************************************************
* PROBLEM DESCRIPTION: When inittab entries are removed *
* the corresponding server process *
* is not signalled to die by the *
* operating system. As a result *
* users had to manually kill it after *
* removing the inittab entries. *
* *
****************************************************************
* RECOMMENDATION: Manually kill the server process subsequent *
* to inittab entry removal. *
****************************************************************
Inittab entry removal did not signal the corresponding
server process.
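The manual kill recommended above could be scripted along the following lines. This is a minimal sketch under stated assumptions: stop_pid is a hypothetical helper, the administrator has already identified the server's PID (for example from ps output), and "sleep 300" stands in for the server in the demonstration. It asks the process to terminate first and falls back to SIGKILL only after a timeout.

```shell
#!/bin/sh
# stop_pid <pid> [timeout_seconds]   (hypothetical helper)
# Sends SIGTERM, waits up to the timeout for the process to disappear,
# then falls back to SIGKILL. Returns 0 if SIGTERM was enough, 1 otherwise.
stop_pid() {
    pid=$1
    timeout=${2:-10}
    kill -TERM "$pid" 2>/dev/null || return 0    # process already gone
    waited=0
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$waited" -ge "$timeout" ]; then
            kill -KILL "$pid" 2>/dev/null
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}

# Demonstration with a stand-in process instead of a real server.
sleep 300 &
demo_pid=$!
if stop_pid "$demo_pid" 5; then
    stop_result="terminated"
else
    stop_result="killed after timeout"
fi
wait "$demo_pid" 2>/dev/null || true
echo "stand-in process: $stop_result"
```

Trying SIGTERM before SIGKILL gives the server a chance to shut down cleanly; the timeout value is arbitrary and would need tuning for a real server.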
Problem conclusion
The server process was not being signalled because our server
startup shell scripts did not invoke it through an exec call.
Temporary fix
The testfix has been made available to the customer through our
pq99999 site, and has also been mailed to L2 to forward to the
customer.
Comments
A testfix has been made available to the customer through our
pq99999 site, and has also been mailed to L2 to forward to the
customer.
APAR information
APAR number | PQ80483
Reported component name | WAS NETWRK DEPL
Reported component ID | 5630A3601
Reported release | 00W
Status | CLOSED PER
PE | NoPE
HIPER | NoHIPER
Special Attention | NoSpecatt
Submitted date | 2003-11-05
Closed date | 2004-02-18
Last modified date | 2006-04-03
APAR is sysrouted FROM one or more of the following:
APAR is sysrouted TO one or more of the following:
Modules/Macros
Publications Referenced
Applicable component levels
R00A PSY | UP
R00H PSY | UP
R00P PSY | UP
R00S PSY | UP
R10A PSY | UP
R10H PSY | UP
R10P PSY | UP
R10S PSY | UP