PQ70417: ADMINSERVER WILL NOT RESTART APPSERVER AFTER ISSUING A SIGTERM


APAR

APAR status
Closed as program error.

Error description
The customer is seeing various problems associated with the
WebSphere admin server, including appserver failures that
require a complete restart of WebSphere when app servers receive
a SIGTERM. The problems are intermittent, and there appears to
be an interaction between a WebSphere native code library
(libWsProcessManagement.so) and a native code library in use by
the customer's app servers. This may be related to
multithreading problems and/or timing windows in WebSphere
native code.
.
Symptom:
The adminserver does not restart the appserver when both of the
following are true:
1) The appserver's JVM receives a SIGTERM (kill -15), and
2) a native library loaded into the JVM has a signal handler for
   SIGTERM that causes an exit of 0 as part of its signal
   handling.
If the native library is not loaded and the JVM receives a
SIGTERM, the adminserver recycles the appserver successfully.
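The difference between the two situations in the symptom can be observed from the parent's side. The following is a minimal Python sketch, not WebSphere code: the child scripts are hypothetical stand-ins for the app server JVM with and without the native library's SIGTERM handler. The APAR does not state how the admin server inspected the app server's exit, so the sketch only shows that the handler changes what a parent process sees.

```python
import signal
import subprocess
import sys

# Hypothetical stand-in for a JVM with the native library loaded:
# its SIGTERM handler exits with status 0.
CHILD_WITH_HANDLER = """
import os, signal, time
signal.signal(signal.SIGTERM, lambda signum, frame: os._exit(0))
print("ready", flush=True)
time.sleep(30)
"""

# Hypothetical stand-in for a JVM without the native library:
# SIGTERM keeps its default action (terminate the process).
CHILD_DEFAULT = """
import signal, time
signal.signal(signal.SIGTERM, signal.SIG_DFL)
print("ready", flush=True)
time.sleep(30)
"""

def terminate_and_report(child_source):
    """Start the child, send it SIGTERM, and return its return code."""
    proc = subprocess.Popen(
        [sys.executable, "-c", child_source], stdout=subprocess.PIPE
    )
    proc.stdout.readline()  # wait until the child has set up its handler
    proc.send_signal(signal.SIGTERM)
    rc = proc.wait(timeout=10)
    proc.stdout.close()
    return rc

if __name__ == "__main__":
    print("with handler:", terminate_and_report(CHILD_WITH_HANDLER))
    print("default     :", terminate_and_report(CHILD_DEFAULT))
```

With the handler installed, the parent sees a normal exit with status 0. With the default action, Python's subprocess module reports the negative signal number, which indicates death by signal.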
Local fix
Provided the customer with a modified version of the
libWsProcessManagement.so file that issues a SIGKILL instead of
a SIGTERM.
Problem summary
****************************************************************
* USERS AFFECTED: All users of WebSphere Application Server.   *
****************************************************************
* PROBLEM DESCRIPTION: WebSphere admin server fails to         *
*                      terminate and restart a failing app     *
*                      server.  App server process remains.    *
****************************************************************
* RECOMMENDATION:                                              *
****************************************************************
When an app server is failing (i.e., fails to send a ping to
the admin server), the admin server responds by sending a
SIGTERM to the app server process.  In the problem reported
here, a native code library in use by the app server was
intercepting the SIGTERM and not passing it on, so the signal
never led to termination of the app server.
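The behavior described above, where the process remains after a SIGTERM, can be reproduced with a small Python sketch. The child script is a hypothetical stand-in for an app server whose native library intercepts SIGTERM and does not pass it on; the function name and grace period are illustrative assumptions, not part of WebSphere.

```python
import signal
import subprocess
import sys

# Hypothetical stand-in for an app server whose native library
# intercepts SIGTERM and does nothing further with it.
CHILD_SWALLOWS_SIGTERM = """
import signal, time
signal.signal(signal.SIGTERM, lambda signum, frame: None)
print("ready", flush=True)
deadline = time.monotonic() + 30
while time.monotonic() < deadline:
    time.sleep(0.1)
"""

def survives_sigterm(grace_seconds=1.0):
    """Return True if the child is still running after a SIGTERM."""
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD_SWALLOWS_SIGTERM], stdout=subprocess.PIPE
    )
    try:
        proc.stdout.readline()  # wait until the handler is installed
        proc.send_signal(signal.SIGTERM)
        try:
            proc.wait(timeout=grace_seconds)
            return False  # the child exited within the grace period
        except subprocess.TimeoutExpired:
            return True  # the SIGTERM was swallowed; the process remains
    finally:
        proc.kill()  # cleanup for the demo only
        proc.wait()
        proc.stdout.close()

if __name__ == "__main__":
    print("process remains after SIGTERM:", survives_sigterm())
```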
Problem conclusion
Changed the admin server code to send SIGKILL (which cannot be
caught or ignored) rather than SIGTERM.
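Why SIGKILL avoids the problem can be illustrated with a short Python sketch. The child is again a hypothetical stand-in for the app server, not WebSphere code: it ignores SIGTERM and also tries to install a SIGKILL handler, which the operating system refuses. The parent then sends SIGKILL and the child terminates regardless.

```python
import signal
import subprocess
import sys

# Hypothetical stand-in for an app server that tries to shield itself
# from both SIGTERM and SIGKILL.
CHILD = """
import signal, time
signal.signal(signal.SIGTERM, lambda signum, frame: None)
try:
    signal.signal(signal.SIGKILL, lambda signum, frame: None)
    print("installed", flush=True)
except OSError:
    # The operating system does not allow a handler for SIGKILL.
    print("refused", flush=True)
time.sleep(30)
"""

def sigkill_demo():
    """Return (handler install outcome, child return code after SIGKILL)."""
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD], stdout=subprocess.PIPE, text=True
    )
    status = proc.stdout.readline().strip()
    proc.send_signal(signal.SIGKILL)
    rc = proc.wait(timeout=10)
    proc.stdout.close()
    return status, rc

if __name__ == "__main__":
    print(sigkill_demo())
```

The trade-off of this approach is that the target process gets no chance to run any cleanup code of its own before it is terminated.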
Temporary fix
PQ70417 has been submitted on pq99999.raleigh.ibm.com
Comments
APAR information
APAR number: PQ70417
Reported component name: WEBSPHERE AE AI
Reported component ID: 5648C8402
Reported release: 350
Status: CLOSED PER
PE: NoPE
HIPER: NoHIPER
Submitted date: 2003-01-29
Closed date: 2003-04-08
Last modified date: 2003-05-21

APAR is sysrouted FROM one or more of the following:

APAR is sysrouted TO one or more of the following:

PQ79091

Modules/Macros
AdminSVR

Fix information

Applicable component levels
R350 PSYUP

Document Information

Product categories: Software, Application Servers, Distributed Application & Web Servers, WebSphere Application Server, General
Software version: 350
Reference #: PQ70417
IBM Group: Software Group
Modified date: 2003-05-21