Using Rational Quantify


Rational Quantify: What it does
Your application's run-time performance (its speed) is one of its most visible and critical characteristics. Developing high-performance software that meets the expectations of customers is not an easy task. Complex interactions between your code, third-party libraries, the operating system, hardware, networks, and other processes make identifying the causes of slow performance difficult.
Rational® Quantify® is a powerful tool that identifies the portions of your application that dominate its execution time. Quantify gives you the insight to quickly eliminate performance problems so that your software runs faster. With Quantify, you can:

Get accurate, repeatable performance data

Control how data is collected, collecting data for a small portion of your application's execution or the entire run

Compare before and after runs to see the impact of your changes on performance

Easily locate and fix only the problems with the highest potential for improving performance

Unlike sampling-based profilers, Quantify's reports do not include any overhead. The numbers you see represent the time your program would take without Quantify. Quantify instruments all the code in your program, including system and third-party libraries, shared libraries, and statically linked modules.
This chapter introduces the basic concepts involved in using Quantify. For complete information, see the Quantify online help system.
How Quantify works
Quantify counts machine cycles: Quantify uses Object Code Insertion (OCI) technology to count the instructions your program executes and to compute how many cycles they require to execute. Counting cycles means that the time Quantify records in your code is identical from run to run, assuming that the input does not change. This complete repeatability enables you to see precisely the effects of algorithm and data-structure changes.
Since Quantify counts cycles, it gives you accurate data at any scale. You do not need to create long runs or make numerous short runs to get meaningful data, as you must with sampling-based profilers: one short run and you have the data. As soon as you can run a test program, you can collect meaningful performance data and establish a baseline for future comparison.
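To make the idea concrete, here is a minimal Python sketch of why a cycle count is repeatable. It is not Quantify's implementation: the block names, per-block cycle costs, and execution counts are all invented for illustration. The point is only that a total built from execution counts and fixed per-block costs depends on the input, not on how busy the machine is.

```python
# Hypothetical per-basic-block cycle costs, as an instrumenter might
# compute them once at build time (values invented for illustration).
CYCLES_PER_BLOCK = {"loop_head": 3, "loop_body": 12, "exit": 5}

def total_cycles(execution_counts):
    """Sum cycles as (times a block ran) x (fixed cost of that block)."""
    return sum(CYCLES_PER_BLOCK[block] * count
               for block, count in execution_counts.items())

# Two runs with the same input execute each block the same number of
# times, so the computed totals match exactly.
run_1 = {"loop_head": 101, "loop_body": 100, "exit": 1}
run_2 = dict(run_1)

cycles_1 = total_cycles(run_1)
cycles_2 = total_cycles(run_2)
print(cycles_1, cycles_2)
```

Because no clock is read, even a very short run yields a usable number under this model.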
Quantify times system calls: Quantify measures the elapsed (wall clock) time of each system call made by your program and reports how long your program waited for those calls to complete. You can immediately see the effects of improved file access or reduced network delay on your program. You can optionally choose to measure system calls by the amount of time the kernel recorded for the process, much like the /bin/time UNIX utility records.
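The difference between the two measurements can be observed outside Quantify with a small Python sketch. Here time.sleep stands in for a blocking system call (an assumption made for illustration, as is the helper name timed). The wall-clock reading includes the wait, while the CPU time charged to the process barely moves.

```python
import time

def timed(blocking_call):
    """Return (elapsed wall-clock seconds, CPU seconds) for one call."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    blocking_call()
    cpu_used = time.process_time() - cpu_start
    elapsed = time.perf_counter() - wall_start
    return elapsed, cpu_used

# time.sleep stands in for a system call that waits on a file or network.
elapsed, cpu_used = timed(lambda: time.sleep(0.05))
print(f"elapsed: {elapsed:.3f}s  cpu: {cpu_used:.3f}s")
```

Elapsed time is the figure that shows the benefit of faster file access or a shorter network delay, which is presumably why it is the default.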
Quantify distributes time accurately: Quantify distributes each function's time to its callers so you can tell at a glance which function calls were responsible for the majority of your program's time. Unlike gprof, Quantify does not make assumptions about the average cost per function. Quantify measures it directly.
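The following Python sketch contrasts the two attribution schemes. The function names and numbers are invented, and the average-cost model is a simplified reading of how a call-count-based profiler apportions a callee's time, not a reproduction of gprof itself. One caller makes a single expensive call and another makes nine cheap ones.

```python
# Each record: (caller, callee, number of calls, time measured on that path).
# Numbers are invented for illustration.
calls_to_parse = [
    ("load_config", "parse", 1, 90.0),   # one expensive call
    ("read_flags",  "parse", 9, 10.0),   # nine cheap calls
]

def measured_share(records):
    """Charge each caller the time actually recorded on its call path."""
    return {caller: path_time for caller, _callee, _calls, path_time in records}

def average_cost_share(records):
    """Charge each caller by call count, assuming every call costs the same."""
    total_time = sum(record[3] for record in records)
    total_calls = sum(record[2] for record in records)
    return {caller: total_time * calls / total_calls
            for caller, _callee, calls, _path_time in records}

measured = measured_share(calls_to_parse)
averaged = average_cost_share(calls_to_parse)
print("measured:", measured)
print("averaged:", averaged)
```

Under the average-cost assumption, read_flags is charged 90 units for work that load_config caused. Direct measurement assigns the cost to the right caller.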
Building and running an instrumented program
Note: If Quantify has been installed on your system as a component of Rational PurifyPlus, the Rational directory contains the files purifyplus_setup.csh and purifyplus_setup.sh. Source the file that is appropriate to your shell to license Quantify for your use.
To instrument your program, add quantify to the front of the link command line. For example:
% quantify cc -g hello_world.c -o hello_world

Quantify 4.4 Solaris 2, Copyright 1993-1999 Rational Software Corp.
Instrumenting: hello_world.o Linking
Run the instrumented program normally:
% hello_world
When the program starts, Quantify prints license and support information, followed by the expected output from your program.

When the program finishes execution, Quantify transmits the performance data it collected to qv, Quantify's data-analysis program.
Interpreting the program summary
After each dataset is transmitted, Quantify prints a program summary showing at a glance how the original, non-instrumented, program is expected to perform.

Using Quantify's data analysis windows
After transmitting the last dataset, Quantify displays the Control Panel. From here, you can display Quantify's data analysis windows and begin analyzing your program's performance.

The Function List window
The Function List window shows the functions that your program executed. By default, it displays the top 20 most expensive functions in your program, sorted by their function time. Function time is the amount of time a function spent performing computations (compute-bound) or waiting for system calls to complete.

Sorting the function list
To sort the function list based on the various data Quantify collects, select View > Display data.

Restricting functions
To focus attention on specific types of functions, or to speed up the preparation of the function list report in large programs, you can restrict the functions shown in the report. Select View > Restrict functions.

You can restrict the list to the top 20 or top 100 functions in the list, to the functions that have annotated source, to functions that are compute-bound (make no system calls), or to functions that contribute non-zero time for a recorded data type.
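The sorting and restriction described above amount to a sort followed by a filter. The Python sketch below illustrates this with hypothetical records. The FunctionRecord type, its fields, and the sample values are assumptions, not Quantify's data model.

```python
from dataclasses import dataclass

@dataclass
class FunctionRecord:
    name: str
    function_time: float   # time spent in the function's own code
    system_calls: int      # number of system calls the function made

def function_list(records, top=20, compute_bound_only=False):
    """Sort by function time; optionally keep only compute-bound functions."""
    if compute_bound_only:
        records = [r for r in records if r.system_calls == 0]
    ranked = sorted(records, key=lambda r: r.function_time, reverse=True)
    return ranked[:top]

records = [
    FunctionRecord("read_input", 40.0, 12),
    FunctionRecord("hash_key", 35.0, 0),
    FunctionRecord("main", 5.0, 0),
    FunctionRecord("write_output", 20.0, 3),
]
top_two = [r.name for r in function_list(records, top=2)]
compute_bound = [r.name for r in function_list(records, compute_bound_only=True)]
print(top_two, compute_bound)
```

Filtering before sorting keeps the sort small, which matches the stated motivation of speeding up report preparation in large programs.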
The Call Graph window
The Call Graph window presents a graph of the functions called during the run. It uses lines of varying thickness to graphically depict where your program spends its time. Thicker lines correspond directly to larger amounts of time spent along a path.
The call graph helps you understand the calling structure of your program and the major call paths that contributed to the total time of the run. Using the call graph, you can quickly discover the sources of bottlenecks.

By default, Quantify expands the call paths to the top 20 functions contributing to the overall time of the program.
Using the pop-up menu
To display the pop-up menu, right-click any function in the call graph.

You can use the pop-up menu to:

Expand and collapse the function's subtree

Locate individual caller and descendant functions

Change the focus of the call graph to the selected function

Display the annotated source code or the function detail for the selected function

Expanding and collapsing descendants
Use the pop-up menu to expand or collapse the subtrees of descendants for individual functions.

After expanding or collapsing subtrees, you can select View > Redo layout to remove any gaps that your changes create in the call graph.
The Function Detail window
The Function Detail window presents detailed performance data for a single function, showing its contribution to the overall execution of the program.
For each function, Quantify reports both the time spent in the function's own code (its function time) and the time spent in all the functions that it called (its descendants time). Quantify distributes this accumulated function+descendants time to the function's immediate caller.

Double-click a caller or descendant function to display the detail for that function.
The function time and the function+descendants time are shown as a percentage of the total accumulated time for the entire run. These percentages help you understand how this function's computation contributed to the overall time of the run. These times correspond to the thickness of the lines in the call graph.
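As a worked illustration of these quantities, the Python sketch below computes function time, function+descendants time, and their percentages of the run for a small call tree. The tree and its timings are hypothetical, and for simplicity each function has a single caller. A real call graph can have several callers per function.

```python
# Hypothetical call tree: function -> (own function time, list of callees).
CALL_TREE = {
    "main":        (5.0,  ["read_input", "build_table"]),
    "read_input":  (25.0, []),
    "build_table": (10.0, ["hash_key"]),
    "hash_key":    (60.0, []),
}

def function_plus_descendants(name):
    """Own time plus the accumulated time of everything the function calls."""
    own_time, callees = CALL_TREE[name]
    return own_time + sum(function_plus_descendants(c) for c in callees)

# The root's function+descendants time is the time for the entire run.
total = function_plus_descendants("main")

def percent_of_run(time_value):
    """Express a time as a percentage of the whole run."""
    return 100.0 * time_value / total

for name, (own_time, _callees) in CALL_TREE.items():
    accumulated = function_plus_descendants(name)
    print(f"{name:12s} function {percent_of_run(own_time):5.1f}%  "
          f"function+descendants {percent_of_run(accumulated):5.1f}%")
```

In this example build_table has a small function time (10%) but a large function+descendants time (70%), which indicates that the cost lies in hash_key, the function it calls.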
Changing the scale and precision of data
Quantify can display the recorded data in cycles (the number of machine cycles) and in microseconds, milliseconds, or seconds.
To change the scale of data, select View > Scale factors.

To change the precision of data, select View > Precision.
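The relationship between the scale and precision settings can be sketched in Python as follows. The 300 MHz clock rate, the function name format_time, and the output layout are assumptions for illustration. Quantify's own formatting may differ.

```python
# Size of each display unit, expressed in seconds.
UNITS = {"seconds": 1.0, "milliseconds": 1e-3, "microseconds": 1e-6}

def format_time(cycles, clock_hz, unit="milliseconds", precision=2):
    """Render a cycle count in cycles or in a time unit at a given precision."""
    if unit == "cycles":
        return f"{cycles} cycles"
    seconds = cycles / clock_hz
    return f"{seconds / UNITS[unit]:.{precision}f} {unit}"

# 1,500,000 cycles on an assumed 300 MHz machine is 5 milliseconds.
print(format_time(1_500_000, 300e6, "cycles"))
print(format_time(1_500_000, 300e6, "milliseconds", precision=2))
print(format_time(1_500_000, 300e6, "microseconds", precision=0))
```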

Saving function detail data
To save the current function detail display to a file, select File > Save current function detail as.
To append additional function detail displays to the same file, select File > Append to current detail file.
The Annotated Source window
Quantify's Annotated Source window presents line-by-line performance data using the function's source code.
Note: The Annotated Source window is available only for files that you compile using the -g debugging option.

The numeric annotations in the margin reflect the time recorded for that line or basic block over all calls to the function. By default, Quantify shows the function time for each line, scaled as a percentage of the total function time accumulated by the function.
Changing annotations
To change annotations, use the View menu. You can display function data, function+descendants data, or both, in cycles or in seconds, and as a percentage of the function+descendants time.

Saving performance data on exit
To exit Quantify, select File > Exit Quantify. If you analyze a dataset interactively, Quantify does not automatically save the last dataset it receives. When you exit, you can save the dataset for future analysis.

By default, Quantify names dataset files to reflect the program name and its run-time process identifier. You can analyze a saved dataset at a later time by running qv, Quantify's data analysis program.
You can also save Quantify data in export format. This is a clear-text version of the data suitable for processing by scripts.
Comparing program runs with qxdiff
The qxdiff script compares two export data files from runs of an instrumented program and reports any changes in performance.
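The kind of comparison involved can be sketched in Python. This is not qxdiff and does not read export files, whose format is not described in this chapter. It compares two in-memory tables of per-function cycle counts, and the function names, counts, and the ignore parameter are all hypothetical.

```python
def compare_runs(baseline, improved, ignore=()):
    """Return {function: (old, new, change)} for functions whose time changed."""
    report = {}
    for name in sorted(set(baseline) | set(improved)):
        if name in ignore:
            continue
        old = baseline.get(name, 0)
        new = improved.get(name, 0)
        if new != old:
            report[name] = (old, new, new - old)
    return report

# Hypothetical per-function cycle counts from two runs.
baseline = {"hash_key": 60_000, "read_input": 25_000, "main": 5_000}
improved = {"hash_key": 20_000, "read_input": 25_500, "main": 5_000}

report = compare_runs(baseline, improved, ignore={"read_input"})
for name, (old, new, change) in report.items():
    print(f"{name}: {old} -> {new} ({change:+d} cycles)")
```

Leaving out functions whose time depends on system calls keeps run-to-run noise from obscuring the changes caused by the code itself.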
To use the qxdiff script:
Save baseline performance data to an export file. Select File > Export Data As in any data analysis window.

Change the program and run Quantify on it again.

Select File > Export Data As to export the performance data for the new run.
Use the qxdiff script to compare the two export data files. For example:
% qxdiff -i testHash.pure.20790.0.qx improved_testHash.pure.20854.0.qx
You can use the -i option to ignore functions that make system calls.
Below is the output from this example.

Build-time options
Specify build-time options on the link line when you instrument a program with Quantify. For example:
% quantify -cache-dir=$HOME/cache -always-use-cache-dir \
cc ...
Commonly used build-time options	Description	Default
-always-use-cache-dir	Specifies whether instrumented files are written to the global cache directory	no
-cache-dir	Specifies the global cache directory	<quantifyhome>/cache
-collection-granularity	Specifies the level of collection granularity	line
-collector	Specifies the collect program to handle static constructors in C++ code	none
-ignore-runtime-environment	Prevents the run-time Quantify environment from overriding option values used in building the program	no
-linker	Specifies an alternative linker to use instead of the system linker	system-dependent
-use-machine	Specifies the build-time analysis of instruction times according to a particular machine	system-dependent

qv run-time options
To run qv, specify the option and the saved .qv file. For example:
% qv -write-summary-file a.out.23.qv
qv options	Description	Default
-add-annotation	Specifies a string to add to the binary file	none
-print-annotations	Writes the annotations to stdout	no
-windows	Controls whether Quantify runs with the graphical interface	yes
-write-export-file	Writes the recorded data in the dataset to a file in export format	none
-write-summary-file	Writes the program summary for the dataset to a file	none

Run-time options
Specify run-time options on the link line or by using the QUANTIFYOPTIONS environment variable. For example:
% setenv QUANTIFYOPTIONS "-windows=no"; a.out  
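When runs are driven from a script rather than an interactive shell, the same environment variable can be set per run. The Python sketch below builds the option string and passes it to a child process. The helper names are hypothetical, and joining several options with spaces is an assumption, since the example above shows only a single option.

```python
import os
import subprocess

def quantify_options(options):
    """Join {name: value} pairs into a '-name=value ...' option string."""
    return " ".join(f"-{name}={value}" for name, value in options.items())

def run_instrumented(program, options):
    """Run an instrumented program with QUANTIFYOPTIONS set for that run only."""
    env = dict(os.environ, QUANTIFYOPTIONS=quantify_options(options))
    return subprocess.run([program], env=env, check=False)

option_string = quantify_options({"windows": "no", "record-system-calls": "yes"})
print(option_string)
# run_instrumented("./a.out", {"windows": "no"})  # needs an instrumented a.out
```

Setting the variable in the child's environment only, rather than in the calling shell, keeps one run's options from leaking into the next.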
Commonly used run-time options	Description	Default
-avoid-recording-system-calls	Avoids recording specified system calls	system-dependent
-measure-timed-calls	Specifies measurement for timing system calls	elapsed-time
-record-child-process-data	Records data for child processes created by fork and vfork	no
-record-system-calls	Records system calls	yes
-report-excluded-time	Reports time that was excluded from the dataset	0.5
-run-at-exit	Specifies a shell script to run when the program exits	none
-run-at-save	Specifies a shell script to run each time the program saves counts	none
-save-data-on-signals	Saves data on fatal signals	yes
-save-thread-data	Saves composite or per-stack thread data	composite
-write-export-file	Writes the dataset to an export file as ASCII text	none
-write-summary-file	Writes the program summary for the dataset to a file	/dev/tty
-windows	Specifies whether Quantify runs with the graphical interface	yes

API functions
To use Quantify API functions, include <quantifyhome>/quantify.h in your code and link with <quantifyhome>/quantify_stubs.a.

Commonly used functions	Description
quantify_help (void)	Prints description of Quantify API functions
quantify_is_running (void)	Returns true if the executable is instrumented
quantify_print_recording_state (void)	Prints the recording state of the process
quantify_save_data (void)	Saves data from the start of the program or since last call to quantify_clear_data
quantify_save_data_to_file (char * filename)	Saves data to a file you specify
quantify_add_annotation (char * annotation)	Adds the specified string to the next saved dataset
quantify_clear_data (void)	Clears the performance data recorded to this point
quantify_<action>_recording_data (void)¹	Starts and stops recording of all data
quantify_<action>_recording_dynamic_library_data (void)¹	Starts and stops recording dynamic library data
quantify_<action>_recording_register_window_traps (void)¹	Starts and stops recording register-window-trap data
quantify_<action>_recording_system_call (char *system_call_string)¹	Starts and stops recording specific system-call data
quantify_<action>_recording_system_calls (void)¹	Starts and stops recording of all system-call data

¹ <action> is one of: start, stop, is. For example:
quantify_stop_recording_system_call

Copyright© 1999, 2001 Rational Software Corporation. All rights reserved.
