FindRoots - a tool to analyze Java heaps

To be completed!

Overview

FindRoots is a tool for analyzing out of memory conditions in Java. It examines the Java heap and outputs the object graphs for the roots which occupy most of the space. It is useful when you have a memory leak (insert link to explanation of how you can get memory leaks in Java here!) but you have no way of knowing why objects are being kept alive.

Here is an example of some output:

*** root 0 (running total 2144159) ***

2144159 0x11c17ae8 java/lang/Class
    265277 0x11c51830 sun/misc/Launcher$AppClassLoader
        265169 0x11c51898 sun/misc/Launcher$ExtClassLoader
            358 0x11c878d8 java/util/Hashtable
                314 0x11c87898 array of references
                    146 0x11da2688 java/util/Hashtable$Entry
                        110 0x11da26b8 java/lang/String
                    116 0x140f4a18 java/util/Hashtable$Entry
            100 0x11c7a8b0 java/lang/String
            304 0x11c876c0 java/util/HashMap
                252 0x11c87620 array of references
                    100 0x11da35c8 java/util/HashMap$Entry
                    100 0x140f5868 java/util/HashMap$Entry
            242 0x11c87600 java/util/Vector
                214 0x11c875c8 array of references
                    166 0x11da1d18 java/lang/ClassLoader$NativeLibrary
                        138 0x11da1d38 java/lang/String
                            110 0x11da1d58 array of char: /T/usr/lpp/java/IBM/J1.3/bin/libSecurity
                            etc...

This output indicates that the Class object at address 0x11c17ae8 together with its children occupy 2144159 bytes of heap memory. The roots are printed with the largest one first. On the following line is a "child". Children are indented. (The word child is used loosely because an object graph is not a true tree owing to backward branches, but for the purposes of finding memory leaks it is useful to present it that way.)

To find where most of the memory is going, you simply follow the tree down. In most cases you will find a Hashtable and its children occupying most of the space. That is because Hashtables are commonly used as caches but because each Hashtable entry can be the root of quite a large tree it is easy to use a lot more space than you might expect. (Check out WeakHashMap for an alternative).


Usage

java -classpath svcdump.jar com.ibm.jvm.findroots.FindRoots [-options] <filename>

where options can be one of:

-depth=<int>
Depth of the tree for each root (default 10)
-roots=<int>|all
The number of roots to print (default 7)
-prune=<int>
Prune nodes with fewer children (default 100)
-printall
Print objects even if they have been done once
-xml
Print output in XML
-nofinal
Exclude java/lang/ref/Finalizer
-version
Print version
-sizematters
Use object sizes as basis for tree order
-verbose
Print extra info for debugging purposes
-simple
Use simple non-exact algorithm that is much quicker
-printstrings
Print the first 40 chars of arrays of char (SVC dumps only)
-help
Print extended help info and exit

these options are described in more detail below. The file may be either a binary hprof file (produced by specifying -Xrunhprof:format=b,dooom=y) or an SVC dump.

-depth=<int>

Specify the depth of the tree for each root. Children that are indented more than this number of times will be excluded from the output. This is simply to cut down on the output printed.

-roots=<int>|all

Specify the number of roots to print. Each root is printed as a separate tree. You can use all instead of a number to print all the roots.

-prune=<int>

If a child has fewer children (or occupies less space for the -sizematters case) than this it is excluded from the output. Again this is simply to reduce the amount of output generated.

-printall

Print objects even if they have already been printed. Because this is not a true tree, an object may have more than one "parent". By default it will only be printed the first time it is encountered when walking the graph. Specifying this option forces the objects to be printed every time they are encountered. Warning - haven't tested this in a while - unlikely to work!

-xml

Print in simple xml format. This is useful because Internet Explorer understands xml and can collapse branches thus making it easier to navigate the tree.

-nofinal

Exclude java/lang/ref/Finalizer from the analysis. The Finalizer class can produce misleading results. Any object with a finalize method is pointed to by one of these objects. The Finalizer objects in turn are linked in a (sometimes) very long chain. But because the link from Finalizer to target is a weak reference it should not count as something that keeps the target alive.

If you see a long chain of Finalizers as the main root, run again with this option set. This should probably be turned on by default.

-sizematters

By default the number to the left and the basis for figuring out which tree root is biggest is based on the count of all descendents. This option bases it on the space in bytes occupied by the objects and its descendents.

-simple

The default algorithm tries to determine the largest root based on the reachability from that root (ie how many "children" it has or how much space they consume for -sizematters). Calculating reachability is a very expensive process because the time taken is proportional to the square of the number of objects. For each object you have to start over and calculate all the other objects that are reachable from it. Although the FindRoots program makes use of some fancy tricks to alleviate the pain (see Graph Algorithms by Robert Sedgewick for more details) in some cases it can take many hours if not days to complete the analysis.

However in most cases it is not necessary to calculate the exact largest root. That's partly because the root is not really the most important thing anyway, the culprit is usually someway down the tree.

When this option is specified, just one depth first search of the graph is done and the biggest root output is that with the biggest difference between discovery and finishing times. In other words that with the most children (or most space) below it.

Surprisingly this is usually sufficient and is much much quicker.

-printstrings

When an array of char is encountered, print the first 40 characters.

Feedback

Please send comments and suggestions to me at dgriff@hursley.ibm.com.

Thanks!