APAR status
Closed as program error.
Error description
Customer is seeing various problems associated with WebSphere
admin server including appserver failures requiring complete
restart of WebSphere when app servers receive a SIGTERM.
Problems are intermittent and there appears to be an interaction
between WebSphere native code library
(libWsProcessManagement.so) and a
native code library in use by customer's app servers. This may
be related to multi-threaded problems and/or timing windows in
WebSphere native code.
.
Symptom:
Problem with the adminserver not restarting the appserver in
two cases:
.
1) If the jvm from the appserver received a SIGTERM (kill -15),
and
2) if a native library is loaded into the jvm and that native
library has a signal handler for SIGTERM and causes an exit of 0
as part of its signal handling, the adminserver was not able
to restart the appserver.
But if the native library is not loaded and the jvm receives a
SIGTERM, the adminserver recycles the appserver successfully.
Local fix
Provided customer with a modified version of the
libWsProcessManagement.so file that issues a SIGKILL instead of
a SIGTERM.
Problem summary
****************************************************************
* USERS AFFECTED: All users of WebSphere Application Server. *
****************************************************************
* PROBLEM DESCRIPTION: WebSphere admin server fails to *
* terminate and restart a failing app *
* server. App server process remains. *
****************************************************************
* RECOMMENDATION: *
****************************************************************
When an app server is failing (i.e., fails to send a ping to
the admin server), the admin server responds by sending a
SIGTERM to the app server process. In the problem reported
here, a native code library in use by the app server was
intercepting the SIGTERM and not passing it on, so the signal
did not lead to termination of the app server.
Problem conclusion
Changed the admin server code to send SIGKILL (which cannot be
caught or ignored) rather than SIGTERM.
Temporary fix
PQ70417 has been submitted on pq99999.raleigh.ibm.com
Comments
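The behavior described in the problem summary and conclusion can be illustrated with a minimal sketch. It is not WebSphere code: the child process below is a hypothetical stand-in for an app server whose native library traps SIGTERM, and the name stop_child and the one-second wait are assumptions made for illustration. As a simplification, the child simply ignores SIGTERM rather than exiting with 0 as in the reported case. A POSIX system is assumed.

```python
import signal
import subprocess
import sys

# Hypothetical stand-in for an app server whose native library traps SIGTERM.
CHILD_SOURCE = """
import signal, time
signal.signal(signal.SIGTERM, lambda signum, frame: None)  # swallow SIGTERM
print("ready", flush=True)
while True:
    time.sleep(0.1)
"""


def stop_child(sig, wait_seconds=1.0):
    """Start the child, send `sig`, and return its exit code.

    Returns None if the child is still running after `wait_seconds`.
    """
    child = subprocess.Popen(
        [sys.executable, "-c", CHILD_SOURCE], stdout=subprocess.PIPE
    )
    try:
        child.stdout.readline()  # block until the child has installed its handler
        child.send_signal(sig)
        try:
            return child.wait(timeout=wait_seconds)
        except subprocess.TimeoutExpired:
            return None
    finally:
        if child.poll() is None:
            child.kill()  # clean up a child that survived the signal
            child.wait()
        child.stdout.close()


if __name__ == "__main__":
    print("SIGTERM ->", stop_child(signal.SIGTERM))
    print("SIGKILL ->", stop_child(signal.SIGKILL))
```

With SIGTERM the child keeps running, so stop_child returns None. With SIGKILL the kernel terminates the process without running any handler, and the exit code is reported as the negative signal number. This matches the reasoning behind the change from SIGTERM to SIGKILL.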
APAR information
APAR number | PQ70417
Reported component name | WEBSPHERE AE AI
Reported component ID | 5648C8402
Reported release | 350
Status | CLOSED PER
PE | NoPE
HIPER | NoHIPER
Submitted date | 2003-01-29
Closed date | 2003-04-08
Last modified date | 2003-05-21
APAR is sysrouted FROM one or more of the following:
APAR is sysrouted TO one or more of the following:
PQ79091
Modules/Macros
Applicable component levels | R350 PSY | UP