Steps to help resolve Connection Pooling problems
- Are you having a problem using a data source to establish a JDBC
connection to a database? Most exceptions that occur when a data source is
used to connect to a database will have the com.ibm.ws.rsadapter
package in the stack trace of the exception. This includes
java.sql.SQLExceptions and WebSphere Application Server messages
that begin with DSRA.
- Are you having a problem using a JMS connection factory to establish a
JMS connection to WebSphere MQ or another messaging system? Most
exceptions that occur when a connection factory is used to connect to a
messaging system will have the com.ibm.ejs.jms package in the
stack trace of the exception. This includes
javax.jms.JMSExceptions and WebSphere Application Server messages
that begin with WMSG.
- Are you seeing ConnectionWaitTimeoutExceptions or slow
performance in getting a connection?
- Are you seeing StaleConnectionExceptions or having problems
recovering from invalid, or stale, connections in the pool?
- Go to the Miscellaneous
connection pooling problems section. If your problem is not covered in
that section or addressed elsewhere in this document, review the WebSphere
Application Server V5.0,
V5.1,
or V6.0
Information Center, or the WebSphere Application Server Support
site for additional information that might help you to resolve the
problem, or continue to the MustGather
for
database connection and connection pooling problems.
Troubleshooting JDBC connection
problems
- This section will help you to troubleshoot problems that occur when
using a data source defined in WebSphere Application Server to establish a
JDBC connection to a database. JDBC connection problems usually occur at
runtime when an application attempts to connect to the database and
perform SQL queries. This section does not include
ConnectionWaitTimeoutExceptions or
StaleConnectionExceptions, which are covered in subsequent
sections.
To troubleshoot this type of problem, you should have access to the
administrative console and the SystemOut.log for your application
server. When diagnosing a database connection problem, the first step is
to use the Test Connection button for the data source in the
administrative console to test the connection to the database. You can
find the Test Connection button in the data source configuration
panel.
- Was the attempt to connect to the database using the Test
Connection button successful?
- Yes, continue to question 3.
- No, continue to question 4.
- If you can use the Test Connection button to successfully
connect to the database, but a failure occurs when your application tries
to get a connection, the problem is likely caused by a failed JNDI lookup.
Check the SystemOut.log to see if a
NameNotFoundException occurs when the application tries to use
the data source. Does a NameNotFoundException occur?
- Yes, the root cause of the problem is that the JNDI
lookup of the data source by the application fails. Check the data source
JNDI name and ensure that it is bound to the JNDI namespace when the
application server starts. If the JNDI name is jdbc/ds, you
should see this entry in the SystemOut.log:
WSVR0049I: Binding ds as jdbc/ds
Also, check the application code and ensure that it is looking up the
correct JNDI name. If you are doing an indirect JNDI lookup (i.e.
java:comp/env/jdbc/ds), ensure that the binding is correct in the
data source resource reference. The resource reference is configured in
the web, EJB, or application client deployment descriptor. Checking for
these things can help you to resolve many JNDI problems. However, if you
can't determine the cause of the problem, continue to the JNDI TroubleShooting
document.
- No, continue to question 8.
- The error message shown in the administrative console when the Test
Connection button fails is important to understanding the cause of the
problem. You can also check the SystemOut.log to get more
information about the error and see the stack trace.
Does the error message and the accompanying SQLException indicate
a problem with the user ID and password that are used to connect to the
database? The actual error and exception will differ depending on the JDBC
driver and database that you're using.
- Yes, the root cause of the problem is either that
no userid and password are passed to the database, or the userid or
password is not correct. Ensure that a J2C
authentication alias, containing the correct userid and password, is
specified on the data source. Also check with your DBA to make sure that
the userid and password that you are using are correct for connecting to
the database.
- No, continue to question 5.
- If you are using a Type 2 driver or Type 3 JDBC driver, does a
java.lang.UnsatisfiedLinkError occur when you use the Test
Connection button? A java.lang.UnsatisfiedLinkError occurs when
the JVM is unable to load a native library that is needed by the JDBC
driver.
- Yes, the root cause of the problem is that the
application server JVM is not properly configured to load the native
libraries. Follow these instructions
if you are trying to connect to a DB2® database. If you are trying to
connect to an Oracle database, ensure that the ORACLE_HOME and
the LIBPATH (on AIX®), LD_LIBRARY_PATH (on Sun Solaris
or Linux®), or SHLIB_PATH (on HP-UX) JVM environment entries are
set. The required JVM environment entries might be different for other
databases.
You should also make sure that the user that is being used to run the
application server has the proper permissions to access the native
libraries. Incorrect permissions are another common cause of the
java.lang.UnsatisfiedLinkError.
Finally, remember that 32-bit native libraries cannot be used with a
64-bit application server JVM and vice versa. Mismatches between the two
can also cause the java.lang.UnsatisfiedLinkError.
- No, continue to question 6.
- If you are testing a connection to a Microsoft® SQL Server database,
does the following error occur?
java.sql.SQLException: [IBM][SQLServer JDBC Driver][SQLServer]Could
not find stored procedure 'master..xp_jdbc_open'.
- Yes, the root cause of the problem is that the
stored procedures for JTA are not installed on the SQL Server
database. These stored procedures are required for XA connections and for
using the Test Connection button. Review this technote
to resolve the problem.
- No, continue to question 7.
- If the error that you are getting is not one of the common errors
described in the previous steps, review the error message and
SQLException that occur when using the Test Connection
button. Check the properties that you have specified on the data source to
ensure that they are correct. Refer to your database documentation or your
DBA for more information about why a particular error or exception might
have occurred. An example of a typical error message that might occur is:
DSRA8040I: Failed to connect to the
DataSource. Encountered : java.lang.Exception:
java.sql.SQLException: Io exception: The Network Adapter
could not establish the connectionDSRA0010E: SQL State = null, Error Code
= 17,002 |
|
If you still cannot resolve the problem, continue to
question 8. |
|
- Finally, if none of the previous troubleshooting steps helped to
resolve the problem, use the JDBC
test tool to determine if you are able to successfully connect to the
database using the same properties that you have set in your data source
outside of WebSphere Application Server. Then you can determine whether
the problem is specific to WebSphere Application Server or if the problem
occurs with the JDBC driver independently of WebSphere Application
Server..
- If the connection from the JDBC test tool fails with the
same error that occurs within WebSphere Application Server, the problem is
not specific to your WebSphere Application Server environment. Work with
your DBA to troubleshoot the problem.
- If the connection from the JDBC test tool is successful,
continue to the MustGather
for
database connection and connection pooling problems.
Troubleshooting JMS connection
problems
- This section will help you to troubleshoot problems that occur when
using a JMS connection factory to establish a connection to a messaging
system such as WebSphere MQ. Only JMS connection problems are covered
here. Other JMS problems are discussed in the JMS TroubleShooting
document.
To troubleshoot this type of problem, you should have access to the
SystemOut.log for your application server. When troubleshooting a
JMS connection problem to MQ, it is useful to determine the MQ reason code
associated with the JMSException. In the SystemOut.log,
find the JMSException and review the stack trace. You should see
one or more linked exceptions. Find the last linked exception. You should
see a line like this:
---- Begin backtrace for Nested Throwables
com.ibm.mq.MQException: MQJE001: Completion Code 2, Reason 2009
The four-digit number is the reason code. You can review this document
to find out more about the reason code that you see. The reason code helps
you to understand the root cause of the problem.
- Are you receiving a JMSException with MQ reason code 2009?
- Yes, the root cause of the problem is that the
connection to the MQ queue manager is broken or invalid. Review this technote
to resolve the problem.
- No, continue to question 3.
- Are you receiving a JMSException with MQ reason code 2019?
- Yes, the root cause of the problem could be that
the connection pool Purge Policy is not set correctly. Review this technote
to resolve the problem.
- No, continue to question 4.
- Are you receiving the following exception?
J2CA0020E: The Connection Pool Manager
could not allocate a Managed Connection: java.lang.IllegalStateException:
Internal Error: cannot find the PoolManager Reference |
|
- Yes, the root cause of the problem is that the
application caches or fails to close JMS session objects. Review this technote
to resolve the problem.
- No, continue to question 5.
|
|
- Are either of the following authentication errors occurring when
trying to establish a JMS connection?
MSGS0508E: The JMS Server security
service was unable to authenticate userid: myuser
javax.jms.JMSSecurityException: MQJMS2013: invalid security
authentication supplied for MQQueueManager |
|
- Yes, the root cause of the problem is either that
no userid and password are passed to the queue manager, or the userid or
password is not correct. Ensure that a J2C authentication alias is
specified on your JMS connection factory. If you are using the WebSphere
JMS Provider in V5 (Embedded Messaging), ensure that the userid specified
in the J2C authentication alias is also in the user registry. See this technote
and this technote
for more information.
- No, continue to question 6.
|
|
- Is the problem that you observe (using a tool such as netstat) more
TCP/IP connections than you expect between your application server process
and an MQ queue manager?
- Yes, this is not necessarily a problem. This
condition is explained in great detail in this technote.
You should also review the technote if you want to gain an understanding
of how JMS connection pooling in WebSphere Application Server works.
- No, continue to question 7.
- Finally, if none of the previous troubleshooting steps helped to
resolve the problem, continue to the TroubleShooting
document for JMS or the MustGather
for
connection pooling problems.
Troubleshooting connection wait issues,
connection leaks, and performance problems
- This section will help you to troubleshoot connection wait issues,
including those that are caused by application connection leaks, and
general performance issues that are related to connection pooling. These
types of problems could occur for any type of backend system, including
databases, messaging systems, and enterprise information systems.
To troubleshoot this type of problem, you should have access to the
administrative console and the SystemOut.log for your application
server. It is also preferable for you to have access to the source code
for your application(s). Additionally, obtaining javacores (also known as
thread dumps) may help you to resolve the problem.
- Are you experiencing a connection wait problem? Symptoms might include
ConnectionWaitTimeoutExceptions in the SystemOut.log
file, slower performance or delays when the application tries to get a
connection, or threads hanging while waiting for a free connection (this
can be seen in a javacore). An example stack trace for a thread that is
waiting for a free connection is:
"Servlet.Engine.Transports : 2833" (TID:0x31680590,
sys_thread_t:0x78C96AA0, state:CW, native ID:0x86C7) prio=5
at java.lang.Object.wait(Native Method)
at
com.ibm.ejs.j2c.poolmanager.FreePool.queueRequest(FreePool.java(Compiled
Code))
at
com.ibm.ejs.j2c.poolmanager.FreePool.createOrWaitForConnection(FreePool.java(Compiled
Code))
at
com.ibm.ejs.j2c.poolmanager.PoolManager.reserve(PoolManager.java(Compiled
Code))
...
- Yes, continue to question 3.
- No, continue to question 8.
- This situation would occur when the connection pool is at its maximum
size (defined by the Maximum Connections property on the
connection pool), all of the connections are in use, and the application
requests more connections. The application must wait for a free connection
to become available.
Does the problem eventually clear up or does it persist until you restart
the application server?
- If the problem eventually clears up, the root cause of the
problem is likely that Maximum Connections is not set high enough
for the amount of load on the application. You should increase Maximum
Connections for the connection pool. You should conduct thorough load
testing to find the optimal value for Maximum Connections. You
can enable PMI
and monitor the connection
pool counters in the Tivoli Performance Viewer to help you tune this.
- If the problem does not clear up until you restart the
application server, continue to question 4.
- Is the application calling close() on every connection object
that it obtains from a WebSphere Application Server connection pool? When
an application gets a connection from the pool, it is considered "in use"
until the application calls close() on the connection, which then
returns the connection to the free pool. If the application does not call
close(), the connection is leaked and never returns to the free
pool. Eventually, the pool might become filled with leaked connections,
causing connection wait problems.
To determine if an application is leaking connections, you should review
the application source code. Note that the application should close
connections in a finally block to ensure that the connections are
closed even in error conditions. In addition, you can use WebSphere
Application Server tracing to identify connection leaks. This document
demonstrates how you can this.
The trace output can show you that a connection leak exists, and it can
also show exactly what application code is leaking connections. Finally,
the PMI connection
pool counters might also be useful in determining whether or not there
is a connection leak.
- If the application is failing to call close() on
all connection objects, that is the root cause of the problem. You should
fix the application code to resolve the problem.
- If you determine that the application server is not
leaking connections, continue to question 5.
- Connection wait problems and slow application performance when using
connection pooling can occur when the web container is not configured
properly. If the web container thread
pool size is set too high relative to the Maximum Connections
setting for the connection pool, resource contention for the available
connections could occur. It is also strongly recommended not to check the
isGrowable checkbox for the web container thread pool.
Is the web container thread pool size set too high or is the
isGrowable checkbox checked in your configuration?
- Yes, then this configuration is likely the root
cause of the problem. Correct the issue and then test to see if the
problem is resolved.
- No, continue to question 6.
- Other issues might cause connection wait problems. Applications should
follow the "get/use/close" pattern and close all connections immediately
after finishing using them. If an application takes too long to close
connections, or caches connections, the connections will be in use longer,
which could lead to connection wait issues. Even if the "get/use/close"
pattern is followed, connections will not return to the free pool until
the transaction in which the connection is obtained is committed. If the
transaction remains active for a long time after a connection is closed,
connection wait issues could occur. This frequently occurs in servlets
when a shareable connection is obtained in a local transaction containment
(LTC). More details about this scenario and solutions for it are
documented in this article.
Is your application following the "get/use/close" pattern and not caching
connections, without long periods of time elapsing before the transaction
is committed?
- Yes, continue to question 7.
- No, then the root cause of the problem is that
connections are "in use" for too long before returning to the free pool.
Correct the issue and then test to see if the problem is resolved.
- Are there any network or database issues at the time that the problem
occurs? Connection wait problems and performance problems might occur when
network traffic is slow or when the database server is busy or performing
slowly.
- Yes, then the root cause of the problem is that
connections are "in use" for long periods of time due to the network or
database slowdown. Correct the problem with the network or the database
server.
- No, continue to question 8.
- If you are seeing slow performance related to connection pooling
without a connection wait issue, check the SystemOut.log. Are
WSVR0605W warnings appearing in the SystemOut.log file?
- Yes, this means that there is a thread that has
been running as long as the threshold defined in the hang
detection policy. Generate javacores to analyze why the thread is
hung. This should reveal the root cause of the problem.
- No, continue to question 9.
- If none of the conditions described in this section are occurring, but
the application is performing poorly or the application server appears to
be unresponsive or hung, then use the appropriate MustGather
document for an application server hang to analyze the threads of the
application server to determine the cause of the problem.
- Finally, if none of the previous troubleshooting steps helped to
resolve the problem, continue to the MustGather
for
database connection and connection pooling problems.
Troubleshooting stale connection
problems
- This section will help you to troubleshoot problems with stale or
invalid connections in the connection pool. WebSphere Application Server
can determine that a connection is stale based on the exception that is
returned by the backend when trying to establish a connection. When there
is a stale database connection, WebSphere Application Server issues a
StaleConnectionException. For other backends, a fatal connection
error occurs. As a result, WebSphere Application Server purges the
connection pool based on the Purge Policy setting and
applications are able to recover from the problem. More information about
this type of problem can be found here.
To troubleshoot this type of problem, you should have access to the
administrative console and the SystemOut.log for your application
server. It is also preferable for you to have access to the source code
for your application(s).
- Is the backend system that you are connecting to a database, a
messaging system, or an enterprise information system?
- If it is a database, continue to question 3.
- If it is a messaging system or an enterprise information
system, continue to question 6.
- If you are seeing StaleConnectionExceptions in the
SystemOut.log, do they occur for every connection attempt or do
they occur intermittently or only after a certain time period has elapsed?
- If it occurs for every connection attempt, the root cause
of the problem is likely that the data source properties are
misconfigured. Follow the steps in the Troubleshooting JDBC
connections section.
- If it occurs intermittently or only after a certain time
period has elapsed, continue to question 4.
- Is the application handling the StaleConnectionException
properly by catching the exception and then retrying the connection?
- Yes, continue to question 5.
- No, then the root cause of the problem is that the
application is not catching the exception and retrying, so it cannot
recover from the problem. Review the documentation
on how to handle the StaleConnectionException and implement this
in your application code.
- A data source can be configured to pretest connections to ensure that
they are valid before they are allocated to the application. This is
documented here.
Have you configured your data source to do this?
- Yes, continue to question 6.
- No, the problem can probably be resolved by
configuring the data source to pretest connections before they are
allocated to the application.
- Is the Purge Policy set to EntirePool for the
connection pool? This means that when a StaleConnectionException
occurs, every free connection in the pool will be purged, which makes it
easier for the application to recover. Documentation about the Purge
Policy can be found here.
- Yes, continue to question 7.
- No, the problem symptoms can be alleviated by
changing the Purge Policy to EntirePool.
- Is the Minimum Connections property in the connection pool
set to 0? Setting it to 0 enables the pool maintenance thread to clean up
all of the connections in the pool after they are unused for more than the
Unused Timeout number of seconds. If Minimum Connections
is set to a value greater than 0, WebSphere Application Server must keep
at least that number of connections in the pool indefinitely. The longer
that a connection remains in the pool, the more susceptible it is to
becoming stale.
- Yes, continue to question 8.
- No, the problem symptoms can be alleviated by
changing the value of Minimum Connections to 0.
- Does a firewall exist between the application server and the backend
system that it is connecting to?
- Yes, the root cause of the problem is likely that
the firewall is timing out and dropping connections between the
application server and the backend. To avoid this possibility, set the
Unused Timeout to half the value of the timeout setting on the
firewall. This way, WebSphere Application Server can clean up its unused
connections before the firewall drops them.
- No, continue to question 9.
- Finally, if none of the previous troubleshooting steps helped to
resolve the problem, continue to the MustGather
for
database connection and connection pooling problems.
Miscellaneous connection
pooling problems
- Are you seeing J2CA0075W warnings in the
SystemOut.log?
J2CA0075W: An active transaction should be present while processing
method allocateMCWrapper.
- Yes, the root cause of the problem is that the
application is spawning its own threads and obtaining connections on those
threads. Review this technote
to resolve the problem.
- No, continue to question 2.
- Are you seeing J2CA0106E errors in the
SystemOut.log?
J2CA0106E: A 5.0 DataSource was attempted to be used in a WebModule
that was not level 2.3
- Yes, the root cause of the problem is that a
servlet at the 2.2 specification level is trying to use a standard data
source, which is not allowed. Review this technote
to resolve the problem.
- No, continue to question 3.
- Are you running WebSphere Application Server V5 and seeing
J2CA0122I messages in the SystemOut.log?
J2CA0122I: Resource reference jdbc/ds could not be located, so
default values of the following are used: [Resource-ref settings]
res-auth: 1
(APPLICATION)
res-isolation-level: 0 (TRANSACTION_NONE)
res-sharing-scope: true (SHAREABLE)
res-resolution-control: 999 (undefined)
- Yes, this is not necessarily a problem, but you can
prevent the J2CA0122I messages by creating a resource reference
and doing an indirect JNDI lookup of your data source or connection
factory. Review this technote
to resolve the problem.
- No, continue to question 4.
- Are you running WebSphere Application Server V6 and seeing
J2CA0294W warnings in the SystemOut.log?
J2CA0294W: Deprecated usage of direct JNDI lookup of resource
jdbc/ds. The following default values are used: [Resource-ref settings]
res-auth: 1 (APPLICATION)
res-isolation-level: 0 (TRANSACTION_NONE)
res-sharing-scope: true (SHAREABLE)
loginConfigurationName: null
loginConfigProperties: null
[Other attributes]
res-resolution-control: 999 (undefined)
isCMP1_x: false (not CMP1.x)
isJMS: false (not JMS)
- Yes, this is the V6 equivalent of the
J2CA0122I message. Again, it is not necessarily a problem and it
can be prevented by creating a resource reference and doing an indirect
JNDI lookup of your data source or connection factory. Review this technote
to resolve the problem.
Also, starting in WebSphere Application Server V6.0.2.19, the
J2CA0294W warning will only occur when tracing (with the
WAS.j2c=all trace specification) is enabled. This is addressed by
APAR PK32169.
- No, continue to question 5.
- Are you seeing the following error in the SystemOut.log?
J2CA0079E: Method getManagedConnection has detected an internal
illegal state and is throwing an IllegalStateException. The exception is:
java.lang.IllegalStateException: setManagedConnection: illegal state
exception. State = STATE_INACTIVE
- Yes, the root cause of the problem is a WebSphere
Application Server defect that only occurs when tracing is enabled. Review
this technote
for further details and to resolve the problem.
- No, continue to question 6.
- Are you seeing J2CA0086W warnings in the SystemOut.log?
J2CA0086W: Shareable connection MCWrapper id 686bbdf9 Managed
connection WSRdbManagedConnectionImpl@6156bdf9
State:STATE_TRAN_WRAPPER_INUSE from resource jdbc/ds was used within a
local transaction containment boundary.
- Yes, this means that two or more connections were
obtained within one local transaction containment (LTC). This doesn't
necessarily indicate a problem, but you should review this technote
for more details.
- No, continue to question 7.
- Are you seeing high memory usage or
java.lang.OutOfMemoryErrors due to a high statement cache size?
- Yes, the problem can probably be resolved by tuning
the statement cache size. Review this technote
for the details.
- No, continue to question 8.
- Are you seeing java.lang.ClassCastExceptions or other errors
when using vendor specific, non-standard JDBC methods in your application?
- Yes, the problem can probably be resolved by using
the WSCallHelper class, provided by WebSphere Application Server.
Refer to the WebSphere Application Server Information Center for
information on the use of this class.
- No, continue to question 9.
- Finally, if none of the previous troubleshooting steps helped to
resolve the problem, continue to the MustGather
for
database connection and connection pooling problems.
What to do next
If the preceding troubleshooting steps did not solve your problem, see
the MustGather
for
Connection Pooling problems to continue investigation. |