Determining which Java thread is consuming CPU cycles on Solaris systems
 Technote (FAQ)
 
Problem
How to determine which thread(s) is consuming the CPU cycles when a Java™ process has high CPU usage for an extended period of time on a Solaris™ system.

This technote uses the data gathered from technote # 1115625, titled: "MustGather: 100% CPU Usage on Solaris". See Related links below for details.
 
Cause
If the Application Server causes a spike in CPU usage for several minutes, this technote can help you identify the Java™ code that is causing the problem.
 
Solution
Using the files attached to this technote, the following 3 steps demonstrate how to find the problem thread and the corresponding Java code.
  1. Analyze the prstat information to determine the lwp(light weight process) consuming CPU.

    The following prstat was generated with the following command:
    "prstat -mvL  1  1" . Use of different prstat parameters will give different output but can be used in a similar fashion as described below.

    PID USRNAME USR SYS TRP TFL DFL LCK SLP LAT VCX ICX SCL SIG PROC/LWPID
    16365 root 36 53 0.0 0.0 0.0 0.0 18 0.0 0 178 13K 0 prstat/1
    16310 root 46 0.0 0.1 0.0 0.0 4.3 45 5.0 0 58 8 0 java/30
    16310 root 5.8 0.2 0.0 0.0 0.0 0.0 94 0.0 0 11 56 0 java/11
    16310 root 2.6 0.1 0.0 0.0 0.0 0.0 97 0.0 0 6 18 0 java/5
    5158 root 0.5 0.1 0.0 0.0 0.0 0.0 99 0.0 49 30 1K 19 .netscape.bi/1
    16310 root 0.3 0.0 0.0 0.0 0.0 0.0 100 0.0 2 1 11 0 java/23
    16310 root 0.3 0.0 0.0 0.0 0.0 0.0 100 0.0 5 1 7 0 java/13
    16310 root 0.3 0.0 0.0 0.0 0.0 0.0 100 0.0 1 0 2 0 java/3
    16310 root 0.3 0.0 0.0 0.0 0.0 0.0 100 0.0 0 0 2 0 java/17
    16310 root 0.2 0.0 0.0 0.0 0.0 0.0 100 0.0 4 2 6 0 java/16
    16310 root 0.2 0.0 0.0 0.0 0.0 0.0 100 0.0 0 1 1 0 java/18
    16310 root 0.1 0.1 0.0 0.0 0.0 0.0 99 0.5 99 0 87 0 java/1
    16310 root 0.2 0.0 0.0 0.0 0.0 0.0 100 0.0 3 0 12 0 java/28
    16310 root 0.2 0.0 0.0 0.0 0.0 0.0 100 0.0 0 0 1 0 java/19
    290 root 0.1 0.1 0.0 0.0 0.0 0.0 100 0.0 21 0 132 0 Xsun/1
    Total: 90 processes, 233 lwps, load averages: 1.00, 0.84, 0.66

    In the above prstat output, the third and fourth columns provide the amount of time the
    process has spent in user and system mode. Ignoring the prstat command the most CPU time is being consumed by LWPID=30.
  2. Use the LWPID of 30 to find the thread in the pstack output. The following is a snippet from the pstack output:

    ----------------- lwp# 30 / thread# 50 --------------------
    fe753aa8 ???????? (0, 5265c00, fa, 43a85d79, fe74e170, e04803fc)
    fb1025b4 ???????? (f60748, e04819b8, fe74e170, f60748, fe763ca0, 16)
    .....

    fe505288 _start (fe74e170, e1ef5d10, 0, 5, 1, fe401000) + 20
    ff36b6e0 _thread_start (f25078, 0, 0, 0, 0, 0) + 40
    ---------------------------------------------------------

    Look for "_thread_start" at the bottom of the thread stack. The first number inside the ( ) is
    the "tid". In this case the tid = f25078.
  3. Finally, search for f25078in the thread dump output. For WebSphere Application Server V5.0 and WebSphere Application Server V5.1 this is the file native_stdout.log. In the following snippet of the thread dump you will see the tid of f25078.

    "Servlet.Engine.Transports : 0" daemon prio=5 tid=0xf25078 nid=0x32 runnable [0xe0480000..0xe04819d8]
    at java.lang.System.currentTimeMillis(Native Method)
    at java.util.Date.<init>(Date.java:161)
    at org.apache.jsp._wtime._jspService(_wtime.java:96)
    at com.ibm.ws.webcontainer.jsp.runtime.HttpJspBase.service(HttpJspBase.java:89)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:853)
    at com.ibm.ws.webcontainer.jsp.servlet.JspServlet$JspServletWrapper.service
    ...



    Alternatively, you may use the thread# 50 from the pstack output to correlate the prstat and the thread dump. This is the decimal representation of the thread's "nid" in the thread dump. Using the example above, since decimal 50 is 0x32, thread# 50 can be used to find the thread by locating "nid=0x32" in the thread dumps. This is the only method that works if the application is using the alternate threading libraries. To determine which library is being used, please refer to section "Determining which library is currently in use" at link: http://www.ibm.com/support/docview.wss?rs=180&uid=swg21107291


Analyzing the thread stack for this tid will help lead you to the code causing the high CPU usage. In this case the issue most likely comes from the JSP:
_wtime._jspService(_wtime.java:96)
 
Related information
MustGather: 100% CPU Usage on Solaris
Changing to the alternate one-to-one threading library
 
 


Document Information


Product categories: Software > Application Servers > Distributed Application & Web Servers > WebSphere Application Server > 100% CPU Usage
Operating system(s): Solaris
Software version: 6.0
Software edition:
Reference #: 1162381
IBM Group: Software Group
Modified date: May 20, 2005