Slow and hung request detection

The requestTiming-1.0 feature provides diagnostic information when the duration of any request exceeds the configured threshold.

The request timing feature can track the duration of every request that is coming into the system. You can configure the feature to watch for slow and hung requests.

Slow request detection

When a request has been running for longer than configured, a warning message is written in the messages log file. Details about the request and events that made up the request are captured.
The following sample shows the log message for a request that crossed the slow request threshold (default is 10 seconds):
[12/1/14 11:58:09:629 IST] 0000001d com.ibm.ws.request.timing.SlowRequestTimer                   W TRAS0112W: Request AABjnS+lIn0_AAAAAAAAAAb has been running on thread 00000021 for at least 10003.571ms. The following stack trace shows what this thread is currently running.

 	 at java.util.HashMap.getEntry(HashMap.java:516)
	 at java.util.HashMap.get(HashMap.java:504)
	 at org.apache.derby.iapi.store.access.BackingStoreHashtable.get(Unknown Source)
	 at org.apache.derby.impl.sql.execute.HashScanResultSet.getNextRowCore(Unknown Source)
	 at org.apache.derby.impl.sql.execute.NestedLoopJoinResultSet.getNextRowCore(Unknown Source)
	 at org.apache.derby.impl.sql.execute.ProjectRestrictResultSet.getNextRowCore(Unknown Source)
	 at org.apache.derby.impl.sql.execute.DMLWriteResultSet.getNextRowCore(Unknown Source)
	 at org.apache.derby.impl.sql.execute.DeleteResultSet.setup(Unknown Source)
	 at org.apache.derby.impl.sql.execute.DeleteResultSet.open(Unknown Source)
	 at org.apache.derby.impl.sql.GenericPreparedStatement.executeStmt(Unknown Source)
	 at org.apache.derby.impl.sql.GenericPreparedStatement.execute(Unknown Source)
	 at org.apache.derby.impl.jdbc.EmbedStatement.executeStatement(Unknown Source)
	 at org.apache.derby.impl.jdbc.EmbedPreparedStatement.executeStatement(Unknown Source)
	 at org.apache.derby.impl.jdbc.EmbedPreparedStatement.executeUpdate(Unknown Source)
	 at com.ibm.ws.rsadapter.jdbc.WSJdbcPreparedStatement.executeUpdate(WSJdbcPreparedStatement.java:626)
	 at com.ibm.websphere.samples.trade.direct.TradeDirect.resetTrade(TradeDirect.java:1832)
	 at com.ibm.websphere.samples.trade.web.TradeConfigServlet.doResetTrade(TradeConfigServlet.java:65)
	 at com.ibm.websphere.samples.trade.web.TradeConfigServlet.service(TradeConfigServlet.java:348)
	 at javax.servlet.http.HttpServlet.service(HttpServlet.java:668)
	 at com.ibm.ws.webcontainer.servlet.ServletWrapper.service(ServletWrapper.java:1275)
	 ....
	 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1121)
	 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:614)
	 at java.lang.Thread.run(Thread.java:769)
 
Duration    Operation
10007.571ms + websphere.servlet.service | TradeWeb | TradeConfigServlet?action=resetTrade
    3.923ms       websphere.datasource.psExecuteUpdate | jdbc/TradeDataSource | delete from holdingejb where holdingejb.account_accountid is null
    0.853ms       websphere.datasource.psExecuteUpdate | jdbc/TradeDataSource | delete from accountprofileejb where userid like 'ru:%'
 5271.341ms +     websphere.datasource.psExecuteUpdate | jdbc/TradeDataSource | delete from orderejb where account_accountid in (select accountid from accountejb a where a.profile_useri like 'ru:%')
The request continues to be monitored, and a further warning is logged if the request is running beyond another 10 seconds. The log messages are in the following format:
TRAS0112W: Request <(Request ID)> has been running on thread <THREADID> for at least <DURATION>. The following stack trace shows what this thread is currently running.
<STACK TRACE>
<DURATION AND OPERATIONS Table>
REQUEST ID
This same ID can be used to search for log and trace messages corresponding to the request. In particular, if you use binary logging, you can search for log and trace entries with the same requestID extension by using the binary log command.
STACK TRACE
Indicates the method that is running. In the previous sample, you can see a stack trace of the current request after TRAS0112W line.
DURATION AND OPERATIONS Table
After the stack trace, you can find the tabular format of the request that shows the duration and operation (also referred to as the event). The Duration column indicates the time that is taken by the corresponding operation of the request. The plus sign (+) indicates events within the request that are still running. The next line shows the duration without +, which indicates that the corresponding operation has completed in the specified duration. Operation shows the EVENT TYPE and CONTEXT INFO (which is optional) for that operation. For more information about event types and context information, see Event Logging.

By analyzing the messages, you can figure out why the request is slow. However, it might be difficult to determine whether the request is stuck at that point or is still running slowly. Hence you can see three messages that are logged for any slow request at the interval of specified <slowRequestThreshold>. Using the three different stack trace and request data, you get a better insight into the issue. After the third warning, no further warnings are logged about the request unless the duration of the request crosses the hung request detection threshold.

Hung request detection

If the request exceeds the default hungRequestThreshold or the configured threshold value, a warning message is written in the messages log file along with the details about the request. Details about the request and events that made up the request are captured. In case of hung request detection, a series of three thread dumps (javacores) is taken, with 1 minute delay between them. The following log message sample shows the log messages for a request that crossed the hung request detection threshold. The default duration value is 10 min. The value that is configured in the following example is 4 min.
[WARNING ] TRAS0114W: Request AAA7WlpP7l7_AAAAAAAAAAA has been running on thread 00000021 for at least 240001.015ms. The following table shows the events that have run during this request.
Duration       Operation
240001.754ms + websphere.servlet.service | TestWebApp | TestServlet?sleepTime=480000
     0.095ms       websphere.session.setAttribute | mCzBMyzMvAEnjMJJx9zQYIw | userID
     0.007ms       websphere.session.setAttribute | mCzBMyzMvAEnjMJJx9zQYIw | visitCount
If a request gets completed later, which was detected to be hanging initially, a message similar to the following example is logged: TRAS0115W: Request AAA7WlpP7l7_AAAAAAAAAAA, which was previously detected to be hung, has completed after 479999.681 s.
Note: When a request is detected to be hanging, a series of three thread dumps is initiated. After the completion of the three thread dumps, further thread dumps are created only if the new requests are detected to be hanging.

Icon that indicates the type of topic Reference topic

File name: rwlp_requesttiming.html