Troubleshooting Guide
This section describes how to deal with some frequently encountered
problems faced by users working with a DB2 server. The following topics
are presented in this section:
For additional troubleshooting information on this topic, see:
Important: This section represents a small sampling of the
information available from DB2 Customer Service. For a complete and
up-to-date source of DB2 information, use the DB2 Product and Service
Technical Library at
http://www.software.ibm.com/data/db2/library/.
If you experience problems migrating to Version 6, ensure that:
- [ ]
- You followed the instructions in the Quick
Beginnings guide.
- [ ]
- You checked the list of incompatibilities from previous versions.
(See the Administration Guide.)
- [ ]
- The database is consistent.
To verify that databases are consistent so that they can be migrated, use the
db2ckmig command. (Refer to the Quick
Beginnings book for more details.)
You must correct all errors that are reported for databases that are in any
of the following states:
- Backup pending: Perform a backup of the database.
- Rollforward pending: Recover the database as required by performing
or resuming the db2 ROLLFORWARD DATABASE command.
- Database inconsistent: Restart the database to make it
consistent.
Additionally, this tool identifies databases that contain objects that use
SYSCAT, SYSSTAT, or SYSFUN as schema names. These objects must be
dropped and re-created using a different schema name.
- [ ]
- The disk space on the machine is adequate. (The amount of disk
space required varies with each database.)
- [ ]
- You use appropriate values for the number of primary and secondary log
files (logprimary and logsecond in database
configuration), and for the size of the log files (logfilsiz in
database configuration). See Server Failures for details.
- Symptom
- The DB2 server does not respond to requests or provides unexpected
responses.
- Action
- Ensure that:
- [ ]
- You check the db2diag.log file on the server to diagnose a
cause. For information, see Interpreting the db2diag.log file.
- [ ]
- DB2COMM is set to the protocols required by connecting
clients. (If not, the message SQL5043N is usually received.)
Consult the db2diag.log file to see which communication protocols
were not started successfully, and why. For information, see Using the db2diag.log file to diagnose server communication problems.
- [ ]
- There is enough log space. (If not, the SQL0984C message is
typically received.)
To increase log space:
- Increase the number of primary and secondary log files
(logprimary and logsecond in database configuration)
-
Increase the size of the log files (logfilsiz in database
configuration)
To initiate your changes, you must disconnect all applications from the
database and/or deactivate the database.
You must ensure that the file system or disk space that contains the logs
is big enough to hold all of the logs. To calculate the minimum
required space for initializing the logs, use the following formula:
(logprimary + logsecondary) * logfilsiz * 4096
- [ ]
- You are using the latest version of DB2, with appropriate fix packs
installed. For information, see Updating DB2 Products.
- [ ]
- The index files are valid.
The db2diag.log file indicates if indexes need to be
re-created. Use db2dart /MI to mark the indexes as
inconsistent. See Miscellaneous Troubleshooting Tools for details on db2dart /MI.
To delay the rebuilding of indexes until after the database is started, use
the INDEXREC configuration parameter as documented in the Administration Guide, Performance manual.
- [ ]
- The data is consistent.
Look for the following symptoms:
- There is a DIA3700C error in the db2diag.log file.
- The server goes down when particular data is accessed. This data
resource may be damaged.
- The server fails repeatedly.
Run db2dart to verify that the data is consistent. If
data is not consistent, contact DB2 Customer Service.
For DB2 Universal Database Enterprise - Extended Edition for the Solaris
Operating Environment and DB2 Universal Database Enterprise - Extended Edition
for Windows NT, see also Database Manager Will Not Start or Hangs.
- Symptom
- You receive the error message SQL1403N when attempting to get
authorization for usernames or groups on a Windows NT Master Domain running
DB2.
- Possible Cause
- The authorization restrictions on the Windows NT Master Domain.
- Action
- Ensure the service for the instance is being started under a local
administrator who is not a user from the master domain; and who has
operating system rights on the local machine.
- Symptom
- You have problems installing or administering a database server.
- Action
- Ensure that:
- [ ]
- You are using a valid user name and password that have administrator
authority.
- For OS/2 Warp Connect and later versions of OS/2: The default DB2
administrator user name and password are the same as those specified when the
operating system was installed.
- For versions of OS/2 before OS/2 Warp Connect: The default
administrator user name and password depend on the national language installed
on a machine, and on whether UPM was installed with a particular user
name. In most cases, the defaults are USERID and PASSWORD in English
countries. For more information, refer to the Quick
Beginnings for your platform.
- For Windows NT: Your membership in a group controls which actions
you are permitted to perform. For DB2 for Windows NT, you must belong
to the "Administrators" or "Domain Admins." groups to perform
installation and some administration tasks.
You can find out what your group is by using the User Manager tool to
display user names and group memberships. To invoke this tool, select
Start, then Programs, then Administrative
Tools, and then User Manager.
To install DB2, you must have administrator authority on the local machine
with a user name that is eight characters or less (note that you cannot use
the user name "Administrator"). To be added to the administrator group,
you must be added by an existing member of that group.
- For UNIX-based environments: You must belong to a SYSADM
group. The name of the group must be eight characters or less in
length.
-
For DB2 for Solaris: If a DB2 instance is started by a user who does not
belong to the same group as the DB2 instance owner, the SQL1042C message is
received for many commands. In this situation, the executing db2sysc
process inherits the user's group name and does not have the correct read
permission to open a file in the /proc directory.
You must start the instance as the instance owner, and the id/group of the
db2sysc process attempting to read a file must be the same as the id/group of
the DB2 instance.
- [ ]
- For UNIX-based environments other than AIX, you update and rebuild the
kernel configuration parameters. (If not, the SQL1016N or SQL1018N
messages will typically occur when you try to create an instance or use the
command line processor.) Refer to the Quick
Beginnings guide for your platform for details.
It is important to have a good backup and recovery strategy, and a plan to
implement this strategy. This plan should address questions such as the
following:
- How crucial is the data?
- How long can the data be unavailable without impacting users?
- How much time and resource can be spent on restoring the database?
- Is the data read-only, or is it updated?
- Can the data be rebuilt easily from another source?
- How much resource is allocated to a backup and recovery system?
Backup plans are heavily influenced by the importance of having the data in
the database available for use. If your business depends heavily on the
data, then you must develop a backup and recovery plan that will minimize
downtime and ensure that the data is available when needed at a cost that you
are willing to accept.
For help on backup and recovery refer to the Data Movement Utilities Guide and Reference and use the Backup Database SmartGuide from the Control Center.
- Symptom
- The SQL0902C message is received when a database is backed up.
- Possible Cause
- The database is not architecturally consistent.
- Action
- Use the db2dart command to verify the architectural consistency
of the database. See Miscellaneous Troubleshooting Tools for information on this command.
Usually you can perform a restore with a roll forward to the end of the
logs to correct the problem. If the problem persists, contact DB2
Customer Service.
- Symptom
- The error message SQL4908N is received when running a ROLLFORWARD
TABLESPACE.
- Possible Cause
- If starting the first attempt to roll forward a table space, the message
may be received when one or more of the table spaces specified to be rolled
forward is not in a ROLLFORWARD PENDING state on the specified nodes.
If continuing to roll forward a table space that is already in progress, the
message may be received when one or more of the table spaces specified to be
rolled forward is not in a "roll forward in progress" state on the
specified nodes.
- Action
-
- Use the LIST TABLESPACES SHOW DETAIL command on the nodes
specified to find out which table spaces are not ready to be rolled
forward.
- Use the QUERY STATUS option of the ROLLFORWARD command to
determine the status of the table space roll forward.
-
- If starting a new table space roll forward, put the table spaces into
ROLLFORWARD PENDING state by restoring them.
- If continuing a table space rollforward and one or more of the table
spaces involved has been restored and put into ROLLFORWARD PENDING state, the
table space roll forward in progress must be cancelled.
- Put the table spaces into the RESTORE PENDING state.
- Restore the table spaces.
- Submit the original ROLLFORWARD command again.
- Symptom
- The error message SQL2419N is received when taking an online backup to
disk.
- Possible Cause
- Although the disk was not full, the backup file was greater than the 2 GB
operating system file size limit.
- Action
- You can do a backup and specify multiple target directories. The
backup directories divide the file into sizes smaller than the 2 GB operating
system file size limit.
- Symptom
- NetBIOS connections from DB2 clients are lost when backing up using ADSTAR
Distributed Storage Manager (ADSM).
- Possible Cause
- As part of ADSM's initialization, a NCB.RESET is issued, which
resets the NetBIOS resources of the calling application. In the case of
DB2 for OS/2 and Windows NT, the calling application is DB2
itself. This results in all connections to the database made through
NetBIOS to be dropped.
- Action
- There is no way to prevent this from happening when using NetBIOS as the
communication protocol. Use another protocol like TCP/IP when
using ADSM to backup the database.
- Symptom
- Data cannot be restored.
- Action
- Ensure that:
- [ ]
- You have enough disk space to restore the data.
Use redirected restore to specify a modified list of containers for the
table spaces being restored. Refer to the Administration Guide.
- [ ]
- You specify the correct path to the backup image and logs. (It is
possible that they were moved.)
- [ ]
- If an online backup was taken, all logs from the beginning to the end of
the backup are required to restore and roll forward the database.
(The rollforward must proceed until at least the point where the online
backup ended. Otherwise, the database is not accessible.)
- Symptom
- On AIX, the error message SQL0902C with return code 22 is received when
restoring a database using an API.
- Possible Cause
- Abnormal termination of applications has left the message queue connected
to an existing database file.
- Action
- Use the ipcrm AIX command to terminate all back-end processes
and then drop the existing database. Retry the RESTORE again.
Contact DB2 Support if the problem persists.
- Symptom
- After issuing a RESTORE command specifying that no roll forward is
required on a Windows NT machine, the table spaces are left in a ROLLFOWARD
PENDING state.
- Possible Cause
- Windows NT is having a resource conflict with DB2 where the hard file
cache limit of 192 MB is reached. This hard file cache is used whenever
DB2 opens or closes many files, or opens or closes large files.
- Action
- Ensure the DB2 product is updated to the latest FixPak, and that the
DB2NTNOCACHE registry variable is set to one. The
DB2NTNOCACHE setting can be checked using the db2set
-all command.
- Symptom
- The error message SQL1277N is received during a redirected restore to a
new database.
- Possible Cause
- One or more containers used in the redirected restore may already be in
use. Check the db2diag.log for additional
information.
- Action
- Remove untag the containers in use and then perform the redirected restore
using the Control Center. If you prefer to use the command line
processor, you must use the SET TABLESPACE CONTAINERS API and then
re-issue the RESTORE command with the "CONTINUE" parameter.
To find out more about the SET TABLESPACE CONTAINERS API, type
db2 ? set tablespace at the command line.
- Symptom
- The error message SQL3508N is received during a LOAD operation.
- Possible Causes
-
- Issuing the LOAD command without explicitly specifying the "REMOTE
FILE" parameter before attempting a LOAD RESTART operation causes the
utility to overwrite the default remote file.
- The specified file names used are similar even though the specified paths
are different. For example: "MESSAGES
C:\table.MSG" and "REMOTE FILE
D:\table".
- Action
-
- Restore the database.
- Specify different file names for each of the "MESSAGES" and "REMOTE
FILE" parameters. For example: "MESSAGES
C:\table.MSG" and "REMOTE FILE
D:\table.RMT". The "RMT" extension solves
the problem.
- Symptom
- Data will not load on a database server.
- Action
- Ensure that:
- [ ]
- You used the RESTART or REPLACE parameters with the LOAD command.
(If not, the SQL3805N message is typically received.)
For more information on the LOAD command, refer to the Data Movement Utilities Guide and Reference.
- [ ]
-
The size of a database object in an SMS table space has not reached an
operating system limit.
For example, OS/2, Windows NT, and Version 3.2 of AIX have a
2 GB file size limit.
(Refer to Problems with LOAD and with Splitting and Loading Data for more information on loading data in DB2 Universal
Database Enterprise - Extended Edition.)
- Symptom
- You are having problems importing data.
- Action
-
- [ ]
-
Ensure that there is enough log space available. (See Server Failures.)
- [ ]
- Use the COMMITCOUNT n option to commit data after every
n records are imported. This option protects the committed
data from loss in case of a failure. It also reduces the log
requirements for an import operation, allowing a number of smaller
transactions rather than one long transaction.
- Symptom
- You cannot use commands, utilities, or the Command Line Processor.
- Action
- Ensure that:
- [ ]
- You bound the utilities and your applications to the database after you
installed a fix or newer executable.
(The SQL0818N or SQL0805N messages indicate that you need to rebind.)
When you precompile an SQL application, a compilable file is produced, and
optionally a bind file. Both of these files have a new
timestamp. The default behavior of the precompile operation is to
automatically create a package so that you do not need to bind
anything. However, if you create a bind file but not a package, then
you must bind the new bind file to the database.
For information on binding, see your Quick Beginnings guide.
- [ ]
-
You used the correct syntax (especially for UNIX-based systems).
It is recommended that you either:
- Use double quotation marks (" ") around a command line processor
request.
- Precede any special characters such as asterisks (*), brackets, or
question marks (?) with a back slash (\) character to ensure that the command
line processor interprets them correctly.
Example
For:
db2 SELECT * FROM SYSCAT.TABLES
Use either:
db2 "SELECT * FROM SYSCAT.TABLES"
db2 SELECT \* FROM SYSCAT.TABLES
The database manager provides concurrency control and prevents uncontrolled
access by means of locks. Because of the basic principles of DB2
locking, you do not need to take action to control locks in most cases.
For information on how locking works, refer to the Administration Guide.
Use the following methods to investigate whether or not
applications are creating deadlocks or holding locks:
- Deadlocks occur when two or more applications connected to the
same database wait indefinitely for a resource. The waiting is never
resolved because each application is holding a resource that the other needs
to continue.
Use an event monitor for deadlocks on a database to keep a trace of each
deadlock as it occurs.
- To determine if an application is waiting for a lock, use the
snapshot monitor and see if lock_wait_time is high. A high value could
indicate that an application is waiting for a lock that another application is
holding. It indicates that applications may not be committing their
transactions frequently enough.
To see which locks are causing waits, and which applications are holding
them, set the LOCK monitor switch ON and then get an application
snapshot.
For information, see The Database System Monitor and the System Monitor Guide and
Reference.
- Symptom
- The SQL0911N message is received.
- Possible Cause
- There are deadlocks or time-outs.
- Action
- Check the reason code in the SQLCA to determine if deadlocks or time-outs
are causing the problem (see Interpreting an SQLCA Structure).
In the case of deadlocks, create an event monitor for deadlocks to gather
more information. Possibly increase the size of the lock list to avoid
lock escalation which can often result in a deadlock scenario.
Note that time-outs are not captured using the deadlock event
monitor.
- Symptom
- Applications accessing a DB2 database are slow or appear to hang.
- Possible Cause
- There is lock contention or lock escalation.
- Action
- Use the Control Center or the LIST APPLICATIONS FOR DATABASE
database-alias SHOW DETAIL command to find out if applications
are waiting on a lock and who is holding the lock that the applications are
waiting for. Then use the database monitor to determine if lock
escalation is occurring.
If escalation is occurring, ensure that:
- [ ]
- Applications use an appropriate frequency of commits.
Take a monitor snapshot for locks to determine which applications are
causing other applications to wait for locks. Also, monitor the
db2.lock_waits performance variable.
- [ ]
- The values of the database configuration parameters maxlocks
and locklist, which determine the number of locks allowed, are
appropriate.
- [ ]
- The locktimeout database configuration parameter is set
appropriately. When it is turned on, this parameter can help avoid
hangs due to lock contention, and can tell you if lock contention is your
problem.
- [ ]
- The dlchktime database configuration parameter is set
appropriately. This parameter defines the frequency at which the
database manager checks for deadlocks among all the applications connected to
a database. If it is set high, CPU time is saved but deadlocks may not
be detected soon enough.
- [ ]
- Other users are not preparing and binding applications.
The procedure of preparing and binding applications requires that locks be
obtained on system catalog tables. This procedure should be performed
in off-peak hours.
- [ ]
- Lock waits or deadlocks are not caused by next key locking.
Next key locking guarantees a cursor stability (CS) isolation
level by automatically locking the next key for all INSERT and DELETE
statements and the next higher key value above the result set for SELECT
statements. It is required to guarantee ANSII and SQL 92 standard CS,
and is the DB2 default.
Examine explain information for the application. If the problem
appears to be with next key locking, you can change the isolation level to
Read Stability (RS) by setting the DB2_RR_TO_RS flag on.
The DB2_RR_TO_RS flag stops all next key locking on user tables
(catalog tables are not affected). Any packages bound with CS are
automatically downgraded to RS, because DB2 can no longer guarantee CS.
Do not use this flag if you require ANSII and SQL standard CS,
For information on locking, configuration parameters, and explain
information, refer to the Administration Guide,
Performance.
Database performance is a complex subject and its details extend beyond the
scope of this guide. As a starting point, be aware of the
following:
- The Performance Configuration SmartGuide provides a first attempt at
optimizing a database. It asks you qualitative questions about the
database, its data, and its purpose, and then suggests optimal configuration
parameters. To invoke this SmartGuide, click with the right mouse
button on the database's icon in the Control Center, and then select
Configure performance from the pop-up menu.
- The reorgchk command returns information about the physical
characteristics of a table, and whether or not it would be beneficial to
reorganize that table.
-
If you feel your applications are running slow, you should ensure that the
database statistics are up to date. The runstats command
updates catalog statistics, thereby improving the potential for query
performance. It is accessible from the command line or from the Control
Center. Rebind all static applications after running it.
- The Index SmartGuide (or SQL Advise Facility) can be used with a specific
query to create an index that may improve performance. Refer to the
Administration Guide, Performance manual for details on how to use and how to access the SQL Advise
Facility.
- To understand the access plan chosen for a particular SQL statement, use
Visual Explain from the Control Center. This tool accesses the
integrated Explain facility, a component of the SQL compiler that captures
information about the environment in which static and dynamic SQL statements
are compiled.
This information can help you:
- Understand the way in which a query executes
- Design application programs and databases
- Determine when an application should be rebound
Visual Explain provides an accurate, comprehensive, and easy-to-understand
profile of access plans.
- db2expln describes the access plan, selected for static SQL
statements in the packages that are stored in the system catalog
tables. It can render a quick explanation of an access plan for
packages in which Explain data had not been captured at bind time.
- dynexpln converts a dynamic statement into a static statement
and is then interpreted through db2expln. In other words, it
creates a static package for the statements and in turn employs the
db2expln tool to provide a description of those statements.
Although this method is quick to interpret basic access plans for dynamic
statements, the results may be misleading.
- db2exfmt provides a textual description of the explain tables
and the operators chosen for access plans of static SQL.
You can use these Explain tools to understand the access plan chosen for a
particular SQL statement. Or, you could use the integrated Explain
Facility in conjunction with Visual Explain to understand the access plan
chosen for a particular SQL statement. Both dynamic and static SQL
statements can be explained using the Explain Facility. A difference
among Explain tools is that the Explain information in Visual Explainis
provided in a graphical format. Besides this, however, the level of
detail provided in either is identical.
To fully use the output of db2expln and dynexpln you
must understand:
- The different SQL statements supported and the terminology related to
those statements (such as predicates contained in a SELECT statement).
For information on the Explain facility and on db2expln, refer
to the Administration Guide, Performance.
- Configuration parameters affect performance. For example,
sheapthres, the database manager configuration parameter for sort
heap threshold, can affect performance if it is set too low (see Improving Database Performance). It should also be at least twice as big as
sortheap, the database configuration parameter for sort
heap. For information, refer to the Administration
Guide, Performance.
- Locking and how it is handled may affect performance of
applications. See Applications slow or appear to hang.
- The various monitoring tools available for your operating system will help
you identify performance problems with your DB2 server and with your network
connections:
- For OS/2, use SPM/2
- For Windows NT, use the Windows NT performance monitor
- For UNIX-based environments, use the various monitoring commands outlined
in Performance Monitoring Tools.
For more information on performance:
- Search for the keyword "performance" in the DB2 Product and Service
Technical Library at
http://www.software.ibm.com/data/db2/library/.
- Refer to the Administration Guide, Performance, and the System Monitor Guide and Reference.
[ Top of Page | Previous Page | Next Page | Table of Contents | Index ]
[ DB2 List of Books |
Search the DB2 Books ]