This MustGather covers the scenario where you have issued
the stop command or tried to stop a server via the Administrative Console
but the server you are requesting to stop is not coming down or hangs.
There is a separate MustGather for the scenario when an Application Server
is successfully processing requests and all of a sudden hangs. That
MustGather is titled "MustGather: A hang occurs when running WebSphere
Application Server for z/OS® which was previously processing requests".
MustGather information for the specific problem when a hang occurs
trying to stop an Application Server or Deployment Manager:
If you have not already contacted support, first see "MustGather:
Read first for WebSphere Application Server for z/OS".
During stop processing, the WebSphere Application Server for z/OS runtime
will wait for all active requests to complete processing and then attempt
to stop each thread. In almost all cases, when the stop command is not
working, it is because
a) there is still at least one request that is actively processing
or
b) all requests have finished processing but some other product or
application has created a non-daemon type thread in the Java™ space, in
scenarios when trying to stop an Application Server.
or
c) notification communication with the Deployment Manager is stuck, in
scenarios where you are trying to stop the Deployment Manager.
This MustGather information addresses situations "b)" and "c)" above.
However, you can collect the same trace and console dump to diagnose
situation "a)" above to see what request is still dispatched.
For scenario "b)" above, there are two types of Java threads, daemon and
non-daemon. Daemon threads do not keep the JVM™ alive, so they can be
taken down when the WebSphere Application Server for z/OS runtime issues
an interrupt while stopping the server. A non-daemon thread will not come
down if the code running on it catches the interruption, does nothing
with it, and continues to wait. Such non-daemon threads prevent the
WebSphere Application Server for z/OS runtime from cleanly shutting the
JVM down, because the JVM will not end until all non-daemon threads in
the Java space have ended. The server therefore waits forever for these
non-daemon threads to come down, and stop processing hangs.
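The behavior described above can be reproduced in a few lines of plain Java. The following is a minimal sketch, not WebSphere code: the class name StubbornThreadDemo and its methods are hypothetical, and Thread.sleep stands in for whatever blocking wait the offending product or application performs. The thread catches the interruption, ignores it, and goes back to waiting, so it is still alive after being interrupted.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class; not part of WebSphere Application Server.
class StubbornThreadDemo {

    /** Starts a thread that catches every interrupt, ignores it, and keeps waiting. */
    static Thread startStubborn(boolean daemon, CountDownLatch sawInterrupt) {
        Thread t = new Thread(() -> {
            while (true) {
                try {
                    Thread.sleep(60_000);      // stands in for any blocking wait
                } catch (InterruptedException e) {
                    sawInterrupt.countDown();  // the interrupt was delivered...
                    // ...but it is swallowed: loop around and wait again
                }
            }
        }, daemon ? "stubborn-daemon" : "stubborn-non-daemon");
        t.setDaemon(daemon);
        t.start();
        return t;
    }

    /** Interrupts the thread and reports whether it is still alive afterwards. */
    static boolean survivesInterrupt(Thread t, CountDownLatch sawInterrupt) {
        t.interrupt();
        try {
            sawInterrupt.await(5, TimeUnit.SECONDS);
            t.join(200);                       // give it a moment to end, if it is going to
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return t.isAlive();
    }

    public static void main(String[] args) {
        CountDownLatch latch = new CountDownLatch(1);
        // Created as a daemon here only so that this demo itself can exit.
        Thread t = startStubborn(true, latch);
        System.out.println("alive after interrupt: " + survivesInterrupt(t, latch));
    }
}
```

If startStubborn(false, latch) is used instead, the thread survives the interrupt in the same way, but the JVM no longer exits when main returns, because it waits for the non-daemon thread to end. That is the same condition that keeps a server from coming down.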
For scenario "b)" above, with version 5 service levels W502020 (and above)
and W510204 (and above) and all version 6 Fix Packs, some improved
diagnostics were added into WebSphere Application Server for z/OS runtime
method threadTerm in CommonBridge.java for identifying any non-daemon
threads during stop processing. This MustGather explains how to enable
this trace. If you are running with service levels lower than W502020
or W510204, you will not have the additional trace diagnostics, but the
same trace and console dump still need to be collected.
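For application owners who want to check their own code for such threads before a stop is attempted, a similar check can be made with standard Java APIs. The following is a minimal sketch under that assumption. It is not the CommonBridge.threadTerm source, and the class name NonDaemonReport, its method, and its output wording are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; this is not the CommonBridge.threadTerm implementation.
class NonDaemonReport {

    /** Returns the names of live non-daemon threads, excluding the calling thread. */
    static List<String> liveNonDaemonThreads() {
        List<String> names = new ArrayList<>();
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            if (t.isAlive() && !t.isDaemon() && t != Thread.currentThread()) {
                names.add(t.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        for (String name : liveNonDaemonThreads()) {
            System.out.println(name + " is not a daemon thread");
        }
    }
}
```

Any thread reported this way is a candidate for keeping the JVM alive if it ignores interrupts. Threads that an application creates for background work can usually be marked with setDaemon(true) before they are started.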
- Obtain the trace then attempt to stop the server
Ensure that the Trace output is being written to sysprint using this
MVS modify command:
f controller_region_name,tracetosysprint=yes
Note: When setting the trace, case is important in the tracejava
keyword. If using SDSF to enter commands, use the SDSF extended command
line (type a slash, /, on the command line and press Enter);
otherwise SDSF converts the string to uppercase.
If an Application Server is not responding to the stop command, issue MVS
modify commands to enable this trace:
f controller_region_name,tracejava='com.ibm.ws390.orb.*=all=enabled'
f controller_region_name,tracedetail=(3,4)
If the Deployment Manager is not responding to the stop command, issue MVS
modify commands to enable this trace:
f deployment_manager_controller_region_name,tracejava='com.ibm.*=all=enabled'
f node_agent_server_name,tracejava='com.ibm.*=all=enabled'
Once the trace is enabled, issue the stop to the Controller Region via the
MVS modify command, the Administrative Console, or a scripting command.
Use the same method of stop you were using when you first observed the
hang.
Make sure you see the stop command issued for the Controller Region and
the server acknowledged it. You'll see the following message in the job
output of the Controller Region:
"BBOO0133I WEBSPHERE FOR Z/OS STOP COMMAND ISSUED FOR SERVER..."
Let the trace run for 30 seconds.
Issue the following MVS modify command to turn off the trace:
For Application Server:
f controller_region_name,traceinit
For Deployment Manager and Node Agent:
f deployment_manager_controller_region_name,traceinit
f node_agent_server_name,traceinit
- Dump hanging address spaces
If an Application Server is not responding to the stop command, take
a console dump of the WebSphere Application Server Controller and
Servant(s) that will not come down from the stop command:
DUMP COMM=(Descriptive name for this WebSphere dump)
R rn,SDATA=(ALLNUC,CSA,GRSQ,LPA,LSQA,PSA,RGN,SQA,SUM,SWA,TRT),CONT
R rn,JOBNAME=(controller_region_name,servant_region_name),END
If it is the Deployment Manager that will not come down from the stop
command, ensure that the console dump includes the Deployment Manager's
Controller and Servant Regions, the Node Agent, and OMVS:
DUMP COMM=(Descriptive name for this WebSphere dump)
R rn,SDATA=(ALLNUC,CSA,GRSQ,LPA,LSQA,PSA,RGN,SQA,SUM,SWA,TRT),CONT
R rn,JOBNAME=(OMVS,dmgr_controller_region_name,dmgr_servant_region_name,node_agent_server_name),CONT
R rn,DSPNAME=('OMVS'.*),END
Ensure that any console dumps are not partial dumps. Verify that you see
message IEA611I in the SYSLOG. This message indicates that a Complete dump
was taken.
- Send documentation
FTP the Controller and Servant Region joblogs.
FTP the tersed console dump(s) in binary format.
Review the information in the following link before sending documentation
to IBM:
Submitting Diagnostic Information to IBM.
In the scenario where an Application Server is not responding to a stop
command because of a non-daemon type thread, the trace will indicate which
thread is the non-daemon one. The console dump is needed to get the
callback stack of the thread to understand which product or application
owns this thread. The following information explains how to find the
thread and match it to a callback stack. You can perform these diagnostic
steps yourself, or send the documentation to IBM so that support can
assist you with the steps or perform them for you.
In the Servant Region trace output, for each thread, when the thread is
being stopped by the WebSphere Application Server for z/OS runtime, you
will see a trace that contains:
"CommonBridge.threadTerm, interrupting thread: ... is a Daemon thread"
or
"CommonBridge.threadTerm, interrupting thread: ... is not a Daemon thread"
Note that a single TCB issues all of these trace entries; the individual
thread identifier is contained within each trace entry.
The ones to be concerned with are the entries that have:
"... is not a Daemon thread"
The trace shows only the Java thread identifier ("Thread-11" in the
example below), not a TCB. Use the console dump and jformat DIS LS, and
look under the "Thread Identifiers" section to map the Java thread
identifier to a TCB. The jformat tool is part of svcdump.jar, which can
be found at
http://www-1.ibm.com/servers/eserver/zseries/zos/unix/bpxa1ty2.html
and is called "SVC Analyzer".
The following trace is an example showing a non-daemon type thread. The
trace identifies "Thread-11":
Trace: 2005/06/02 17:04:39.273 01 t=9E2E88 c=UNK key=P8 (13007002)
FunctionName: com.ibm.ws390.orb.CommonBridge
SourceId: com.ibm.ws390.orb.CommonBridge
Category: DEBUG
ExtendedMessage: CommonBridge.threadTerm, interrupting thread: Thread-11,
thread is not a Daemon thread
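When the Servant Region trace output is large, the non-daemon entries can be picked out with a small program run against a saved copy of the output. The following is a minimal sketch, assuming the message text matches the example above ("interrupting thread: name, thread is not a Daemon thread") and that a message may wrap across lines. The class name TraceScan and its method are hypothetical, and this is not an IBM-supplied tool.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for scanning saved trace output; not an IBM tool.
class TraceScan {

    // Matches both the daemon and non-daemon variants so that a daemon entry
    // cannot run on into a later non-daemon entry.
    private static final Pattern ENTRY = Pattern.compile(
        "interrupting thread:\\s*(.+?),\\s*thread is (not )?a Daemon thread");

    /** Returns the names of threads that the trace reports as non-daemon. */
    static List<String> nonDaemonThreads(String traceText) {
        // Collapse line wraps so a message split across lines still matches.
        String flat = traceText.replaceAll("\\s+", " ");
        List<String> names = new ArrayList<>();
        Matcher m = ENTRY.matcher(flat);
        while (m.find()) {
            if (m.group(2) != null) {   // "not " is present: a non-daemon thread
                names.add(m.group(1));
            }
        }
        return names;
    }

    public static void main(String[] args) {
        String sample =
            "ExtendedMessage: CommonBridge.threadTerm, interrupting thread: Thread-11,\n"
            + "thread is not a Daemon thread\n";
        System.out.println(nonDaemonThreads(sample));   // prints [Thread-11]
    }
}
```

The names returned are Java thread identifiers only. They still have to be mapped to a TCB using the console dump.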
Next, run jformat on the console dump to get the TCB associated with the
thread identifier or thread name, in this case "Thread-11". The jformat
command is: jformat DIS LS. In the output, look under the "Thread
Identifiers" section to see the thread name mapped to a TCB. Match the TCB
to the callback stack in the console dump either by looking at the thread
callback stack output of svcdump.jar or, in IPCS, by issuing IP VERBX
LEDATA 'asid(aaaa) NTHREADS(*)', where aaaa is the ASID in hex of the hung
Servant Region.
For a listing of all technotes, search the WebSphere Application Server for z/OS support
site.