APAR status
Closed as program error.
Error description
Problem:
The nodeagent (a Java process) terminated unexpectedly.
Recreation procedure (same as for related PMR 33890/43890):
1. Start the cluster members with wsadmin.sh.
2. Confirm that startup has completed, then wait 60 seconds.
3. Stop the cluster members with wsadmin.sh.
4. Confirm that the stop has completed, then wait 30 seconds.
5. Go back to step 1.
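The loop in steps 1 to 5 could be driven by a small harness along the following lines. This is only a sketch under stated assumptions: the script names startCluster.jacl and stopCluster.jacl are hypothetical placeholders, each script is assumed to return only after the start or stop has completed (which stands in for the "confirm" steps), and the class and method names are illustrative. Only the wsadmin.sh -f option is taken as given.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.List;

final class RecreationLoop {
    /** Runs one external command; replaced by a fake for dry runs. */
    interface Runner { void run(List<String> command); }

    /** Builds a wsadmin.sh invocation. The script name is a placeholder. */
    static List<String> wsadmin(String script) {
        return Arrays.asList("wsadmin.sh", "-f", script);
    }

    /** Steps 1 to 4. Assumes each script blocks until the operation completes. */
    static void cycle(Runner runner, long startWaitMs, long stopWaitMs) {
        runner.run(wsadmin("startCluster.jacl")); // steps 1 and 2 (hypothetical script)
        sleep(startWaitMs);
        runner.run(wsadmin("stopCluster.jacl"));  // steps 3 and 4 (hypothetical script)
        sleep(stopWaitMs);
    }

    static void sleep(long ms) {
        try {
            Thread.sleep(ms);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
    }

    /** Real runner: executes the command and fails on a non-zero exit code. */
    static void execute(List<String> command) {
        try {
            int exit = new ProcessBuilder(command).inheritIO().start().waitFor();
            if (exit != 0) {
                throw new IllegalStateException("command failed: " + command);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while running " + command, e);
        }
    }

    public static void main(String[] args) {
        // Dry run: print the commands instead of executing them.
        Runner dryRun = command -> System.out.println(String.join(" ", command));
        for (int i = 0; i < 2; i++) { // step 5: repeat
            cycle(dryRun, 0, 0);
        }
    }
}
```

To run against a real cell, the dry-run runner would be replaced with RecreationLoop::execute and the waits set to 60000 and 30000 milliseconds, matching steps 2 and 4.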
The nodeagent terminated unexpectedly during this procedure,
and a core file was generated at that time.
The problem occurred with the following settings:
MALLOCTYPE=debug
MALLOCDEBUG=allow_overreading
The jcore showed the following:
==========================================================
SIGSEGV raised in libWs50ProcessManagement.so
Local fix
no
Problem summary
****************************************************************
* USERS AFFECTED: WebSphere Application Server users using 5.0 *
****************************************************************
* PROBLEM DESCRIPTION: The nodeagent went down while cluster *
* members were repeatedly started and stopped. *
****************************************************************
* RECOMMENDATION: *
****************************************************************
The crash is caused by a timing window (race condition) in the
admin code that handles process management and monitoring. If
a server is shut down within that window, memory can be freed
while it is still in use, which crashes the nodeagent. The
window exists only for the first twenty minutes after a server
is started. After that time the window closes and the server
can be stopped safely.
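The hazard described above can be illustrated with a small simulation. This is not the WebSphere code: every class and method name below is hypothetical, the native structure is replaced by a Java object with a "freed" flag, and the two threads are replaced by a fixed call order (release, then poll) so that the bad interleaving is reproduced deterministically. In native code the late read would touch freed memory and could raise SIGSEGV; the simulation throws an exception instead.

```java
import java.util.concurrent.atomic.AtomicBoolean;

final class RaceDemo {
    /** Returns true if the poll touched the block after it was released. */
    static boolean pollAfterRelease() {
        UnsafeProcessMonitor monitor = new UnsafeProcessMonitor();
        monitor.release();      // the server-stop path frees the block first
        try {
            monitor.poll();     // the monitoring path then reads it
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("invalid access detected: " + pollAfterRelease());
    }
}

/** Hypothetical stand-in for a native process-monitoring structure. */
final class SimulatedNativeBlock {
    private final AtomicBoolean freed = new AtomicBoolean(false);

    void free() {
        freed.set(true);
    }

    /** Native code would read freed memory here; the simulation throws. */
    int readStatus() {
        if (freed.get()) {
            throw new IllegalStateException("use after free");
        }
        return 0;
    }
}

/** Pattern with an explicit release API that can run while a poll is pending. */
final class UnsafeProcessMonitor {
    private final SimulatedNativeBlock block = new SimulatedNativeBlock();

    /** Would be called from the monitoring thread. */
    int poll() {
        return block.readStatus();
    }

    /** Would be called from the thread that handles a server stop. */
    void release() {
        block.free();
    }
}
```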
Problem conclusion
The code has been changed to protect against inadvertently
freeing memory while another thread is still using it. All
native-level deallocations in the process management code have
been moved to object finalization, instead of being initiated
by the programmer through one of the APIs. Finalization relies
on the JVM guarantee that no running thread has access to, or
is acting on, the process monitoring object, which protects
against invalid accesses.
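A minimal sketch of the finalization approach described above, assuming a simulated native block rather than real JNI memory. All names are hypothetical and this is not the shipped fix. The point it shows is that the stop path no longer frees anything, so a poll that arrives after a stop still reads valid memory, and the release happens only in finalize(). On current JVMs finalize() is deprecated and java.lang.ref.Cleaner is the usual substitute; finalize() is used here only because it matches the mechanism named in this APAR.

```java
import java.util.concurrent.atomic.AtomicBoolean;

final class FinalizationDemo {
    public static void main(String[] args) {
        SafeProcessMonitor monitor = new SafeProcessMonitor();
        monitor.stop();
        // The block is still valid because stop() did not free it.
        System.out.println("status after stop: " + monitor.poll());
    }
}

/** Hypothetical stand-in for a native process-monitoring structure. */
final class NativeBlock {
    private final AtomicBoolean freed = new AtomicBoolean(false);

    void free() {
        freed.set(true);
    }

    int readStatus() {
        if (freed.get()) {
            throw new IllegalStateException("use after free");
        }
        return 0;
    }
}

/** Monitor whose native block is released only when the object is finalized. */
class SafeProcessMonitor {
    private final NativeBlock block = new NativeBlock();
    private volatile boolean stopped;

    /** Would be called from the monitoring thread. */
    int poll() {
        return block.readStatus();
    }

    /** Records the stop only; no native deallocation happens here. */
    void stop() {
        stopped = true;
    }

    boolean isStopped() {
        return stopped;
    }

    /** Runs only after the object is unreachable from every live thread. */
    @Override
    @SuppressWarnings({"deprecation", "removal"})
    protected void finalize() throws Throwable {
        try {
            block.free();
        } finally {
            super.finalize();
        }
    }
}
```

The trade-off in this design is that the native memory is held until the garbage collector finalizes the object, which can be well after the server has stopped.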
Temporary fix
Comments
APAR information
APAR number | PQ81567
Reported component name | WAS NETWRK DEPL
Reported component ID | 5630A3601
Reported release | 00A
Status | CLOSED PER
PE | NoPE
HIPER | NoHIPER
Special Attention | NoSpecatt
Submitted date | 2003-12-02
Closed date | 2004-01-07
Last modified date | 2004-01-07
APAR is sysrouted FROM one or more of the following:
APAR is sysrouted TO one or more of the following:
Modules/Macros
Publications Referenced
Applicable component levels
R003 PSY | UP
R00A PSY | UP
R00H PSY | UP
R00I PSY | UP
R00P PSY | UP
R00S PSY | UP
R00W PSY | UP
R103 PSY | UP
R10A PSY | UP
R10H PSY | UP
R10I PSY | UP
R10P PSY | UP
R10S PSY | UP
R10W PSY | UP