Troubleshooting Guide
If you are supporting an organization using DB2, you will receive calls
from users to resolve a variety of problems. Your response depends
on:
- The severity of the problem
- The specific nature of the problem
- Any related information that you can gather
- Your experience in resolving similar problems.
To solve a problem, start by obtaining a comprehensive description of the
problem. This way, you can begin to determine its origin. For
example, a problem may exist in any of the following:
- Hardware
- Operating system
- Networking system or other subsystem
- DB2 server
- DB2 client
- DB2 Connect gateway to host systems.
Most applications run in a client/server environment. You must
determine if a problem is on the client, the server, or somewhere in between
(that is, in the LAN or communication protocol stack).
Investigating where the problem is detected or reported is the best way to
start. For example, if you receive an unexpected SQL code on a client,
then investigate the SQL code on that client. (See Responding to Unexpected Messages or SQL Codes for information.)
Often the SQL code alone is enough information to determine the source and
cause of the problem. If the SQL code does not give enough information
to determine the source of a problem, examine the db2diag.log file at
the machine where the problem was reported. For example, if the problem
was reported on a client, first look at the db2diag.log file on that
particular client.
The db2diag.log file is an ASCII file written by DB2 that contains
diagnostic information for DB2. If you know the date and time when the
problem occurred, you can go directly to the corresponding db2diag.log
entries. For information on this important file, see First Failure Data Capture. When viewing the file, keep in mind that the most
recent conditions are always at the end.
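As a minimal sketch of the two habits described above (reading the newest entries first, and jumping to a known date and time), the following Python helpers may be useful. The function names are hypothetical, and the code makes no assumption about the layout of a db2diag.log entry: it only returns the last lines of a file and the lines that contain whatever timestamp text you supply.

```python
from collections import deque


def tail_lines(path, count=50):
    """Return the last `count` lines of the file at `path` (newest entries)."""
    with open(path, "r", errors="replace") as handle:
        # deque with maxlen keeps only the final `count` lines while reading.
        return [line.rstrip("\n") for line in deque(handle, maxlen=count)]


def lines_containing(lines, stamp):
    """Return (line_number, line) pairs for every line that contains `stamp`."""
    return [
        (number, line)
        for number, line in enumerate(lines, start=1)
        if stamp in line
    ]
```

For example, calling tail_lines on the db2diag.log path and passing the result to lines_containing, with the date text as it appears in your own file, narrows the output to the entries around the time the problem was reported.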
When you receive an unexpected message or SQL code, follow these steps
until you can determine the problem:
- When you receive a message, take note of all available information,
including the following:
  - The code, an 8-character alphanumeric message identification
  number. This code may begin with the prefix SQL, DBA, or CCA.
  Also note all reason codes, return codes, and other information associated
  with the message returned.
  - Any SQL state received. SQL states are useful for diagnosing problems,
  because they are consistent across all platforms. For a list of SQL
  states, refer to the Message Reference.
  - The text of the message (especially if the message does not include an
  identification number or a code).
  - The SQLCA, if available.
  - Any action suggested in the message.
  - Diagnostic files, such as the db2diag.log file. In addition,
  note any operating system diagnostic files such as core files (for UNIX-based
  systems), event logs (for Windows NT), or SYSLOG files (for OS/2). For
  information, see Part 2, Advanced DB2 Troubleshooting.
  - The environment in which the message occurred. For example, what
  the user was doing at the time, the steps that led up to the problem, the type
  of operating system, applications that were running, and the communication
  protocol.
  - The SQL statement that encountered the error, and any preceding statements
  in the unit of work.
- Check the online message help by typing db2 "? message" from the
command prompt, where message is the complete SQL code, SQL state, or
message number. Read and follow the suggested actions.
- Use the SQL code or message number to search available DB2 documentation
for additional information.
- If the problem persists, ensure that you have as much of the following
information as possible before contacting DB2 Customer Service:
- If you determine that the problem is not with DB2 but with a
vendor-supplied application, contact the vendor.
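The online message help step above can be scripted. The following Python sketch is illustrative only: the function names are hypothetical, and the pattern used to recognize a message identifier (one of the prefixes SQL, DBA, or CCA, followed by digits and an optional trailing letter) is an assumption rather than a documented format. The command itself is the db2 "? message" request described in the steps.

```python
import re
import subprocess

# Prefixes come from the text above; the "digits plus optional letter"
# shape is an assumption made for this sketch.
MESSAGE_ID = re.compile(r"^(SQL|DBA|CCA)(\d+)([A-Z]?)$")


def parse_message_id(text):
    """Split a message identifier into its parts, or return None."""
    match = MESSAGE_ID.match(text.strip().upper())
    if match is None:
        return None
    prefix, number, suffix = match.groups()
    return {"prefix": prefix, "number": number, "suffix": suffix}


def help_command(message):
    """Build the argument list equivalent to typing: db2 "? message"."""
    return ["db2", "? " + message.strip()]


def show_help(message):
    """Run the help request; requires the db2 command to be on the PATH."""
    result = subprocess.run(help_command(message), capture_output=True, text=True)
    return result.stdout
```

Because help_command only builds the argument list, it can be checked on a machine without DB2 installed; show_help is the only function that actually invokes the command line processor. An SQL state can be passed to help_command as well, although parse_message_id will return None for it.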
In this book the term abend includes:
- Segmentation violations and general protection faults (GPFs) on Windows
systems
- Traps on OS/2
- Exceptions on UNIX-based systems
When an abend occurs, work through the following steps until you can
determine the problem:
- Confirm that all DB2 components are at the same service level, especially
if a Fix Pak has recently been installed. See Updating DB2 Products.
- Note the executable module that reported the abend.
- If the problem persists, try to collect the following additional
information before contacting DB2 Customer Service:
- Any logged information, in particular:
- If the problem can be reproduced, a trace on the client and server may be
helpful. Follow the steps in Example of Tracing to a File.
See Part 2, Advanced DB2 Troubleshooting for information.
- If you determine that the problem is not with DB2 but with a
vendor-supplied application, contact the vendor.
When the system appears to be suspended or looping, try to identify the
problem by working through the following steps:
- Recover the system:
- If the operating system is suspended (with no sign of disk activity),
reboot the machine and check the db2diag.log file for problems.
- If you can access the operating system but not the application:
- Check the status of applications with the Control Center or the LIST
APPLICATIONS FOR DATABASE database-alias command.
The status information shows whether applications are waiting for a lock or
for user input ("UOW Waiting"), rather than being suspended inside the
database manager.
- Use a CPU monitor to check for applications that are using large amounts
of CPU time, and then use your judgement to determine whether or not the
applications are suspended or behaving as expected.
- Check the db2diag.log file for DB2 problems.
- On UNIX-based environments, work through the following steps until
you can stop your DB2 instance:
- Stop the DB2 instance normally with db2stop
- Stop the DB2 instance and force any remaining applications with
db2stop force
Work through the following steps as a last resort, only if the
above steps did not stop the DB2 instance:
- Abruptly kill the DB2 instance with db2stop -kill
- Use the kill command to terminate any DB2 agents that cannot be
stopped
- Use the kill command to terminate DB2 itself (db2sysc)
- As a very last resort, reboot your entire system
If you must use the kill command, ensure that all DB2
interprocess communications (IPC) resources are removed. Either:
- Using messages, the db2diag.log file, and other information,
attempt to determine why the suspension or loop occurred.
Some common problems that cause suspensions or loops can include the
following:
- If the problem persists, try to collect the following additional
information before contacting DB2 Customer Service:
- Any information logged by DB2. See First Failure Data Capture.
- If the system is suspended:
- Set up a DB2 trace to dump output to a file. Follow the
instructions in Example of Tracing to a File.
- For UNIX-based systems, get a stack traceback for the
application. Stack tracebacks provide information on the system calls
for each process ID up to the point of the suspension.
- For AIX, issue kill -36 against the db2sysc
process.
- For HP-UX, issue kill -29 against the db2sysc
process.
- For Solaris Operating Environment, issue kill -21
against the db2sysc process.
For information, see Gathering Stack Traceback Information on UNIX-Based Systems.
- If you determine that the problem is not with DB2 but with a
vendor-supplied application, contact the vendor.
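The escalating stop sequence for UNIX-based environments, and the per-platform signals used to request a stack traceback, can be sketched in Python as follows. This is an illustration under stated assumptions, not a supported procedure: the function names are hypothetical, and treating an exit status of 0 as "the instance stopped" is an assumption. The commands and signal numbers are the ones listed in the steps above.

```python
import subprocess

# Escalation order taken from the steps above, gentlest first.
STOP_LADDER = [
    ["db2stop"],
    ["db2stop", "force"],
    ["db2stop", "-kill"],  # last resort only
]

# Signal numbers quoted above for requesting a stack traceback from db2sysc.
TRACEBACK_SIGNAL = {"AIX": 36, "HP-UX": 29, "Solaris": 21}


def run_command(argv):
    """Return True when the command exits with status 0 (assumed success)."""
    return subprocess.run(argv).returncode == 0


def stop_instance(run=run_command, ladder=STOP_LADDER):
    """Try each stop command in order; return the one that worked, or None."""
    for argv in ladder:
        if run(argv):
            return argv
    return None


def traceback_command(platform, pid):
    """Build the kill command that requests a stack traceback for `pid`."""
    return ["kill", "-" + str(TRACEBACK_SIGNAL[platform]), str(pid)]
```

The run parameter exists so that the ladder can be exercised with a stand-in function before it is pointed at a real instance. If stop_instance returns None, the remaining manual steps (the kill command, removal of IPC resources, and finally a reboot) still apply.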