


4

Using Rational Quantify

Rational Quantify: What it does

Your application's runtime performance--its speed--is one of its most visible and critical characteristics. Developing high-performance software that meets the expectations of customers is not an easy task. Complex interactions between your code, third-party libraries, the operating system, hardware, networks, and other processes make identifying the causes of slow performance difficult.

Rational® Quantify® is a powerful tool that identifies the portions of your C/C++ or Java application that dominate its execution time. Quantify gives you the insight to quickly eliminate performance problems so that your software runs faster.

This chapter introduces the basic concepts involved in using Quantify. For complete information, see the Quantify online help system.

How Quantify works: C/C++

Unlike sampling-based profilers, Quantify reports performance data for your program without any profiler overhead. The numbers you see represent the time your program would take without Quantify. Quantify instruments and reports performance data for all the code in your program, including system and third-party libraries, shared libraries, and statically linked modules.

Quantify counts machine cycles: For C/C++ code, Quantify uses Object Code Insertion (OCI) technology to count the instructions your program executes and to compute how many cycles they require to execute. Counting cycles means that the time Quantify records in your code is independent of accidental local conditions and, assuming that the input does not change, identical from run to run. The fact that performance data is repeatable enables you to see precisely the effects of algorithm and data-structure changes.

Since Quantify counts cycles, it gives you accurate data at any scale. You do not need to create long runs or make numerous short runs to get meaningful data, as you must with sampling-based profilers. One short run and you have the data. As soon as you can run a test program, you can collect meaningful performance data and establish a baseline for future comparison.

Quantify times system calls: Quantify measures the elapsed (wall clock) time of each system call made by your program and reports how long your program waited for those calls to complete. You can immediately see the effects of improved file access or reduced network delay on your program. You can optionally choose to measure system calls by the amount of time the kernel records for the process, which is the same as the time the UNIX /bin/time utility records.

Quantify distributes time accurately: Quantify distributes each function's time to its callers so you can tell at a glance which function calls were responsible for the majority of your program's time. Unlike gprof, Quantify does not make assumptions about the average cost per function. Quantify measures it directly.

How Quantify works: Java

Quantify times performance: Quantify times each method as it executes, and collects accurate data about the actual execution of your Java code. You can choose either to record elapsed wall-clock time or to measure the amount of time the kernel records for the process, like the UNIX /bin/time utility. Because data for Java code is based on timing rather than on counting cycles, as it is for C and C++, performance data for Java code is reliable for a given run but is not repeatable from run to run.

Quantify distributes time accurately: Quantify distributes each method's time to its callers. This helps you detect the methods that are ultimately responsible for bottlenecks in your code.

Collecting performance data: C/C++

To collect performance data for a C/C++ program:
  1. Add quantify to the front of the link command line. For example:

    % quantify cc -g hello_world.c -o hello_world
    
  2. Run the instrumented program normally:

    % hello_world
    

    Note: On Tru64 UNIX, you can add quantify in front of the compile/link command line, or you can instrument the executable. Use the -taso option with quantify if you linked with the -taso option:

    % quantify <-taso> a.out

    You then run the instrumented program by typing:

    % a.out.pure

    Note also that on Tru64 UNIX, Quantify caches Dynamic Shared Objects (DSOs), not object files. References to linkers and link-line options in this chapter do not apply to Quantify on Tru64 UNIX.

When the program starts, Quantify prints license and support information, followed by the expected output from your program.

When the program finishes execution, Quantify transmits the performance data it collected to qv, Quantify's data-analysis program.

Interpreting the program summary: C/C++

After each dataset is transmitted, Quantify prints a program summary showing at a glance how the original, non-instrumented, program is expected to perform.

Collecting performance data: Java

To collect Java performance data, run Quantify with the -java option, as follows:

% quantify [<Quantify options>] -java <applet viewer> [<applet viewer options>] <html file>

% quantify [<Quantify options>] -java <Java executable> [<Java options>] <class>

% quantify [<Quantify options>] -java <Java executable> [<Java options>] -jar <JAR file>

% quantify [<Quantify options>] -java <exename> [<arguments to exename>]

Note: Quantify can collect line-by-line performance data or method-level data. By default, Quantify uses the line level when debug data, which is stored in class files, is available.

When Quantify starts, it prints license and support information, followed by the expected output from your program.

When the program finishes execution, Quantify transmits the performance data it collected to qv, Quantify's data-analysis program.

Interpreting the program summary: Java

After each dataset is transmitted, Quantify prints a program summary showing at a glance how the original, non-instrumented, program is expected to perform.

Using Quantify's data analysis windows

After transmitting the last dataset, Quantify displays the Control Panel. From here, you can display Quantify's data analysis windows and begin analyzing your program's performance.

The Function List window

The Function List window shows the functions that your program executed. By default, it displays all the functions in your program, sorted by their function time. Function time is the amount of time a function spent performing computations (compute-bound) or waiting for system calls to complete.

Sorting the function list

To sort the function list based on the various data Quantify collects, select View > Display data.

Restricting functions

To focus attention on specific types of functions, or to speed up the preparation of the function list report in large programs, you can restrict the functions shown in the report. Select View > Restrict functions.

You can restrict the list to the top 20 or top 100 functions in the list, to the functions that have annotated source, to functions that are compute-bound (make no system calls), or to functions that contribute non-zero time for a recorded data type.

The Call Graph window

The Call Graph window presents a graph of the functions called during the run. It uses lines of varying thickness to graphically depict where your program spends its time. Thicker lines correspond directly to larger amounts of time spent along a path.

The call graph helps you understand the calling structure of your program and the major call paths that contributed to the total time of the run. Using the call graph, you can quickly discover the sources of bottlenecks.

By default, Quantify expands the call paths to the top 20 functions contributing to the overall time of the program.

Using the pop-up menu

To display the pop-up menu, right-click any function in the call graph.

Expanding and collapsing descendants

Use the pop-up menu to expand or collapse the subtrees of descendants for individual functions.

After expanding or collapsing subtrees, you can select View > Redo layout to remove any gaps that your changes create in the call graph.

The Function Detail window

The Function Detail window presents detailed performance data for a single function, showing its contribution to the overall execution of the program.

For each function, Quantify reports both the time spent in the function's own code (its function time) and the time spent in all the functions that it called (its descendants time). Quantify distributes this accumulated function+descendants time to the function's immediate caller.

Double-click a caller or descendant function to display the detail for that function.

The function time and the function+descendants time are shown as a percentage of the total accumulated time for the entire run. These percentages help you understand how this function's computation contributed to the overall time of the run. These times correspond to the thickness of the lines in the call graph.

Changing the scale and precision of data

Quantify can display the recorded data in cycles (the number of machine cycles) and in microseconds, milliseconds, or seconds.

To change the scale of data, select View > Scale factors.

To change the precision of data, select View > Precision.

Saving function detail data

To save the current function detail display to a file, select File > Save current function detail as.

To append additional function detail displays to the same file, select File > Append to current detail file.

The Annotated Source window

Quantify's Annotated Source window presents line-by-line performance data using the function's source code.

Note: The Annotated Source window is available only for files that you compile using the -g debugging option.

The numeric annotations in the margin reflect the time recorded for that line or basic block over all calls to the function. By default, Quantify shows the function time for each line, scaled as a percentage of the total function time accumulated by the function.

Changing annotations

To change annotations, use the View menu. You can select both function and function+descendants data, either in cycles or seconds and as a percentage of the function+descendants time.

Saving performance data on exit

To exit Quantify, select File > Exit Quantify. If you analyze a dataset interactively, Quantify does not automatically save the last dataset it receives. When you exit, you can save the dataset for future analysis.

By default, Quantify names dataset files to reflect the program name and its runtime process identifier. You can analyze a saved dataset at a later time by running qv, Quantify's data analysis program.

You can also save Quantify data in export format. This is a clear-text version of the data suitable for processing by scripts.

Comparing program runs with qxdiff

The qxdiff script compares two export data files and reports any changes in performance. For C or C++ programs, the results show exactly how much your program’s performance has improved. For Java code, the results indicate general performance trends. This is because C and C++ performance data, based on counting cycles, is repeatable, while Java data, based on the timing of methods, is not repeatable.

To use the qxdiff script:

  1. Save baseline performance data to an export file. Select File > Export Data As in any data analysis window.
  2. Change the program and run Quantify on it again.
  3. Select File > Export Data As to export the performance data for the new run.
  4. Use the qxdiff script to compare the two export data files. For example:
    % qxdiff -i testHash.pure.20790.0.qx improved_testHash.pure.20854.0.qx
    
    You can use the -i option to ignore functions that make system calls.
Below is the output from this example.

Build-time options

Specify build-time options on the link line when you instrument a program with Quantify. For example:

% quantify -cache-dir=$HOME/cache -always-use-cache-dir \
cc ...
 
Commonly used build-time options

-always-use-cache-dir
    Specifies whether instrumented files are written to the global cache directory.
    Default: no

-cache-dir
    Specifies the global cache directory.
    Default: <quantifyhome>/cache

-collection-granularity
    Specifies the level of collection granularity.
    Default: line

-collector
    Specifies the collect program to handle static constructors in C++ code. Does not apply to Java.
    Default: none

-ignore-runtime-environment
    Prevents the runtime Quantify environment from overriding option values used in building the program. Does not apply to Java.
    Default: no

-linker
    Specifies an alternative linker to use instead of the system linker. Does not apply to Java.
    Default: system-dependent

-use-machine
    Specifies the build-time analysis of instruction times according to a particular machine. Does not apply to Java.
    Default: system-dependent

qv runtime options

To run qv, specify the option and the saved .qv file. For example:
% qv -write-summary-file a.out.23.qv

qv options

-add-annotation
    Specifies a string to add to the binary file.
    Default: none

-print-annotations
    Writes the annotations to stdout.
    Default: no

-windows
    Controls whether Quantify runs with the graphical interface.
    Default: yes

-write-export-file
    Writes the recorded data in the dataset to a file in export format.
    Default: none

-write-summary-file
    Writes the program summary for the dataset to a file.
    Default: none

Runtime options

Specify runtime options on the link line or by using the QUANTIFYOPTIONS environment variable. For example:

% setenv QUANTIFYOPTIONS "-windows=no"; a.out

Commonly used runtime options

-avoid-recording-system-calls
    Avoids recording specified system calls. Does not apply to Java.
    Default: system-dependent

-measure-timed-calls
    Specifies measurement for timing system calls.
    Default: elapsed-time

-record-child-process-data
    Records data for child processes created by fork and vfork. Does not apply to Java.
    Default: no

-record-system-calls
    Records system calls. Does not apply to Java.
    Default: yes

-report-excluded-time
    Reports time that was excluded from the dataset. Does not apply to Java.
    Default: 0.5

-run-at-exit
    Specifies a shell script to run when the program exits.
    Default: none

-run-at-save
    Specifies a shell script to run each time the program saves counts.
    Default: none

-save-data-on-signals
    Saves data on fatal signals.
    Default: yes

-save-thread-data
    Saves composite or per-stack thread data.
    Default: composite

-write-export-file
    Writes the dataset to an export file as ASCII text.
    Default: none

-write-summary-file
    Writes the program summary for the dataset to a file.
    Default: /dev/tty

-windows
    Specifies whether Quantify runs with the graphical interface.
    Default: yes

 

API functions: C/C++

To use Quantify API functions, include <quantifyhome>/quantify.h in your code and link with <quantifyhome>/quantify_stubs.a.
 
Commonly used functions

quantify_help (void)
    Prints description of Quantify API functions

quantify_is_running (void)
    Returns true if the executable is instrumented

quantify_print_recording_state (void)
    Prints the recording state of the process

quantify_save_data (void)
    Saves data from the start of the program or since last call to quantify_clear_data

quantify_save_data_to_file (char * filename)
    Saves data to a file you specify

quantify_add_annotation (char * annotation)
    Adds the specified string to the next saved dataset

quantify_clear_data (void)
    Clears the performance data recorded to this point

quantify_<action>_recording_data (void) [1]
    Starts and stops recording of all data

quantify_<action>_recording_dynamic_library_data (void) [1]
    Starts and stops recording dynamic library data

quantify_<action>_recording_register_window_traps (void) [1]
    Starts and stops recording register-window-trap data

quantify_<action>_recording_system_call (char *system_call_string) [1]
    Starts and stops recording specific system-call data

quantify_<action>_recording_system_calls (void) [1]
    Starts and stops recording of all system-call data

[1] <action> is one of: start, stop, is. For example: quantify_stop_recording_system_call

 

API methods: Java

You can call an API method from your Java code or from a debugger. Use the following syntax:

Rational.PureAPI.IsRunning()

or

import Rational.PureAPI;
   . . .
   PureAPI.IsRunning()

PureAPI is a Java class that includes all the Quantify API methods that can be used with Java code. The PureAPI class is part of a Java package called Rational.jar, which is located in <quantifyhome>.

You can run class files that include calls to PureAPI methods with or without Quantify: When you run these class files with Quantify, Quantify automatically sets CLASSPATH and LD_LIBRARY_PATH to access Rational.jar and libQProfJ.so. When you run the class files without Quantify, you must add <quantifyhome>/lib32 to your LD_LIBRARY_PATH. In addition, if you do not have a Rational.jar file in your <javahome>/jre/lib/ext directory, you must add <quantifyhome> to your CLASSPATH.

The Java API methods are as follows:

 

Java API functions: class Quantify

public static int IsRunning();
    Returns true if the executable is instrumented

public static int DisableRecordingData();
    Disables collection of all data by Quantify

public static int StartRecordingData();
    Tells Quantify to start recording all program performance data

public static int StopRecordingData();
    Tells Quantify to stop recording all program performance data

public static int IsRecordingData();
    Checks if Quantify is currently recording all program performance data

public static int ClearData();
    Tells Quantify to clear all the data it has recorded about your program's performance to this point

public static int SaveData();
    Saves all the data recorded since program start (or the last call to ClearData()) into a dataset (a .qv file)

public static int AddAnnotation(String annotation);
    Tells Quantify to save the argument string in the next output datafile written by SaveData()

Copyright © 1999, 2002 Rational Software Corporation. All rights reserved.

