Statistical Profiler
This utility is available with all Rational Apex Embedded products.
The following topics are covered in this chapter:
- Introduction
- How Cross-target Profiling is Done
- Profiling Considerations
- Configuring Profiling for a PowerPC Tornado Target
- Configuration Options
- Example 1
- Example 2
Introduction
The Apex Statistical Profiler provides an estimate of the CPU usage by all parts of a program, including time spent in the runtime system. The profiler examines the program counter at regular intervals and keeps track of where the program is executing. Profiling shares the time-of-day timer interrupt already in use by the kernel. This makes profiling effective yet nonintrusive.
Profile a Program Using the GUI
Start in the directory viewer of the Main procedure.
- 1. Set the profiling switch:
Control > Maintenance > Set Switch
Select PROFILING
Set its value to prof and click OK.
This switch selects linking with Profiling as the default for the view.
- 2. Relink your program:
- 3. Execute the program:
Select the Profiling button before pressing the OK button.
The file mon.out is created in your view.
- 4. Display the profiling data:
Profile a Program Using the Command-Line
The command-line invocation is apex_profile and is discussed in the Rational Apex Embedded Command Reference.
Description
The Statistical Profiler interprets the monitor file produced by the execution of a program. The symbol table of the named executable is read and correlated with the monitor file (mon.out by default). For every external symbol, the profiler prints the percentage of time spent executing between that symbol and the next, along with the total time.
Call counts are not supported by this profiler.
To take advantage of the Apex runtime facilities, configure your on-board timers into the Rational runtime system. Use these regular timer interrupts to perform statistical profiling with minimal overhead.
You can control profiling behavior by changing the duration of program runs, the amount of memory allocated to profiling accounting, and the frequency of the interrupts for embedded targets.
You can run multiple programs and accumulate statistics for all runs into a single monitor file. The Statistical Profiler enables you to specify multiple executables as the source of external symbols.
How Cross-target Profiling is Done
After downloading, the invocation of File > Run with the Profiling button selected, or apex_execute with the -profile option, sets the Profile_Enabled constant in target memory to True.
V_Start_Program calls Start_Profiling at its conclusion. Start_Profiling reads the True value of Profile_Enabled and allocates and zeroes an array to contain the PC counts and installs the profile interrupt handler for incrementing PC counts.
This array of counts is constrained by the following parameters:
unsigned short counts[(Pc_End - Pc_Start)/Bytes_Per_Count + 1];
By default, the user program text sections are profiled with a Bytes_Per_Count of 2 on i386 and 4 on other targets.
When a profile timer interrupt occurs, a count is incremented as follows:
if PC (at time of interrupt) is within Pc_Start .. Pc_End then
    increment counts[(PC - Pc_Start)/Bytes_Per_Count]
else
    don't increment any count; it's not within the profiling memory range
end if
This is the extent of the profile-related processing done on the target. The remainder is carried out by apex_execute and apex_profile on the host.
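The counting rule above can be modeled on a host in a few lines. The following Python sketch is only an illustration of the arithmetic, not the runtime's implementation: the function names are hypothetical, and the address range in the usage lines is made up.

```python
def allocate_counts(pc_start, pc_end, bytes_per_count):
    """Return a zeroed count array covering pc_start .. pc_end (inclusive)."""
    # Same size expression the manual gives for Profile_Counts_Size.
    return [0] * ((pc_end - pc_start) // bytes_per_count + 1)


def on_profile_interrupt(counts, pc, pc_start, pc_end, bytes_per_count):
    """Model one profile timer interrupt; return True if a count was incremented."""
    if pc_start <= pc <= pc_end:
        counts[(pc - pc_start) // bytes_per_count] += 1
        return True
    # PC is outside the profiled memory range, so no count changes.
    return False


# Hypothetical 256-byte text section profiled with 4 bytes per count.
counts = allocate_counts(0x1000, 0x10FF, 4)
for sampled_pc in (0x1000, 0x1002, 0x1004, 0x2000):
    on_profile_interrupt(counts, sampled_pc, 0x1000, 0x10FF, 4)
```

In this run, the samples at 0x1000 and 0x1002 fall into the same slot because they lie within one 4-byte bucket, and the sample at 0x2000 is discarded.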
When the profiled program completes (either normally or with a Time_Out or Control-C interrupt), apex_execute reads the following profile configuration parameters from target memory:
Profile_Pc_Start
Profile_Counts_Base
Profile_Counts_Size ((Pc_End - Pc_Start)/Bytes_Per_Count + 1)
Profile_Bytes_Per_Count
Next, it reads the array of PC counts.
At the conclusion of reading memory, it writes the above information to the mon.out file in the current directory.
Profiling Considerations
Note: The Rational Apex Embedded Statistical Profiler does not support the Agilent Emulation Probe.
Profiling the Runtime Kernel
By default, profiling ignores time spent in the runtime system for Rational Apex Embedded for Rational Exec and Tornado. In this release of Apex, support for kernel profiling is not available. With Rational Apex Embedded for LynxOS, the runtime system is linked with the user program, and both are profiled together.
The Profiling Timer for Rational Exec
The default implementation of the profiler uses the kernel periodic timer interrupt as the trigger for incrementing the count associated with the current PC. When profiling is started, the Profile_Intr_Number configuration parameter in your Board Specific Processor view in v_usr_conf.2.ada is used as an index into the interrupt vector table for obtaining the address of the kernel timer interrupt handler. This entry is replaced with the address of the profile handler, Profiling_Isr(). At the conclusion of handling the profile interrupt, the profile handler restores the stack and registers as they were upon entry and transfers control to the kernel timer handler.
The default implementation has the following limitations:
- Task logic immediately following a delay is not profiled.
The profiling timer is the same timer that the kernel uses to resume tasks that are suspended at a delay or that resume because of a time-out.
- Profiling the kernel sometimes produces anomalous results.
- Profiling an application that repeats at specific time intervals may produce anomalous results.
To produce useful results in these situations, configure a second timer (if one is available on your hardware) to be your profiling timer. Set this profiling timer to interrupt at a different and faster rate than the kernel periodic interrupt timer. Be sure that the profiling timer interrupts the application at a rate that is not a multiple of the kernel periodic interrupt rate and that is more than twice the highest frequency in the application being profiled.
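The three rate constraints in the preceding paragraph are easy to check before programming a timer. This Python sketch is a hypothetical helper, not part of Apex; it assumes all rates are whole numbers of interrupts per second, and the rates in the usage lines are invented.

```python
def profiling_rate_problems(profile_hz, kernel_hz, highest_app_hz):
    """Return a list of reasons why profile_hz is a poor profiling rate.

    All arguments are positive integers in Hz. An empty list means the
    candidate rate satisfies every guideline stated in the text.
    """
    problems = []
    if profile_hz <= kernel_hz:
        problems.append("not faster than the kernel periodic interrupt")
    if profile_hz % kernel_hz == 0:
        problems.append("a multiple of the kernel periodic interrupt rate")
    if profile_hz <= 2 * highest_app_hz:
        problems.append("not more than twice the highest application frequency")
    return problems


# Hypothetical setup: 100 Hz kernel tick, application activity up to 250 Hz.
good = profiling_rate_problems(1013, 100, 250)
bad = profiling_rate_problems(1000, 100, 250)
```

Here 1013 Hz passes all three checks, while 1000 Hz is rejected because it is exactly ten times the kernel rate.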
Write an ISR for a different timer interrupt. If necessary, place logic to reset the timer interrupt and rearm the timer in the Process_Interrupt subprogram. Then, modify Start_Profiling so that it places the address of this ISR in the Interrupt Vector Table using isr_attach(). If not done elsewhere, add logic for initializing the timer to Start_Profiling.
Note: If the Count_Down timer model is used, the kernel timer cannot be used for profiling.
Configuring Profiling for a PowerPC Tornado Target
The tricky part about implementing profiling is figuring out where the program was (i.e., what the PC register was) when the timer interrupt occurred. Unfortunately, this information is not passed into the timer interrupt handler, so we have to search around on the stack for it. The good news is that the PC is there on the stack. The bad news is that its position can vary from BSP to BSP and from one Tornado release to another.
For profiling to work, you need to make sure that procedure Profiling_Isr in v_usr_conf.2.ada (in usr_conf.ss) is looking at the correct location for the PC. This location is denoted by the constant Pc_Offset defined just above procedure Profiling_Isr in v_usr_conf.2.ada. To determine the right Pc_Offset value for your setup, see item 3 below (but you should read 1 and 2 first).
We provide some tools and information in ada_examples.ss/prof to help you get profiling up and running on your board. Following is a list of what we provide.
- The auxclk_test program.
The three ada files auxclk.1.ada, auxclk.2.ada and auxclk_test.2.ada comprise a test which will attach a handler to the sysAuxClk timer (just as the profiler does). The attached handler records a snapshot of what the stack looks like. After the timer handling completes, auxclk_test dumps out the stack snapshot with some additional information to help you interpret it.
- Sample dumps from the auxclk_test program.
We include several "raw" dump files from auxclk_test and several which have been further annotated by hand to aid in understanding. The dump files have been given names like:
dump.mv2603.t2.raw
dump.mv1603.t1.ann
After "dump", the second item indicates the board. The third item indicates the release of Tornado, and in the final item "ann" means it has been hand annotated, whereas "raw" means it is the raw output from auxclk_test.
- The magic equation for Pc_Offset.
The goal is to run auxclk_test on your board and understand the output well enough to arrive at the correct Pc_Offset to plug into v_usr_conf.2.ada. Study a few of the annotated dump files to get a feel for it, and then keep the following in mind:
- Find the PC value by finding the (large) exception context frame on the stack. The layout of this frame is the same as that of the type Esf_T in package Os_Signal (in rts_interface.ss). There are an initial 6 miscellaneous registers, then the 32 GPRs, then the MSR, the LR, the CTR and finally the PC (followed by some others). So the PC will be at an offset of 164 bytes (16#A4#) within the context stack frame.
- Make sure not to count the first frame. This frame is owned by Auxclk.Handler.
You will need to account for stack used by the real handler, called Profiling_Isr in v_usr_conf.2.ada. This handler uses 80 bytes (20 32-bit words).
- In the end, you will arrive at something like this:
frame_0     Profiling_Isr        80 bytes (substitute for Auxclk.Handler)
frame_1     ???                  ??? bytes
...
frame_n     ???                  ??? bytes
frame_n+1   exception context    164 bytes to PC
------------------------------------------------
Pc_Offset:                       ___ total bytes
Below is the raw output from auxclk_test on a mvme2603 board running Tornado2.
16:50:39 ::: [ apex_execute auxclk_test ]
16:50:43 ::: [ /ned/build/build.ss/cdaly_33.wrk/power_cross/sol/bin/a.run_vw_etdm_t2 -is_apex auxclk_test ]
got to handler!
I think I found the frame pointer.
* indicates frame boundary.
16#290008# [ 16#0# 16#0#] : 16#290028# *
16#29000C# [ 16#4# 16#4#] : 16#26F968#
16#290010# [ 16#8# 16#8#] : 16#290040#
16#290014# [ 16#C# 16#C#] : 16#0#
16#290018# [ 16#10# 16#10#] : 16#8#
16#29001C# [ 16#14# 16#14#] : 16#FFEF98#
16#290020# [ 16#18# 16#18#] : 16#B85220#
16#290024# [ 16#1C# 16#1C#] : 16#FE000847#
16#290028# [ 16#20# 16#0#] : 16#290030# *
16#29002C# [ 16#24# 16#4#] : 16#104B3C#
16#290030# [ 16#28# 16#0#] : 16#290048# *
16#290034# [ 16#2C# 16#4#] : 16#104A70#
16#290038# [ 16#30# 16#8#] : 16#B171420#
16#29003C# [ 16#34# 16#C#] : 16#9030#
16#290040# [ 16#38# 16#10#] : 16#290058#
16#290044# [ 16#3C# 16#14#] : 16#BFFFB8#
16#290048# [ 16#40# 16#0#] : 16#290060# *
16#29004C# [ 16#44# 16#4#] : 16#1044DC#
16#290050# [ 16#48# 16#8#] : 16#9009030#
16#290054# [ 16#4C# 16#C#] : 16#16DCDC#
16#290058# [ 16#50# 16#10#] : 16#290068#
16#29005C# [ 16#54# 16#14#] : 16#BFFFE8#
16#290060# [ 16#58# 16#0#] : 16#290078# *
16#290064# [ 16#5C# 16#4#] : 16#104104#
16#290068# [ 16#60# 16#8#] : 16#10#
16#29006C# [ 16#64# 16#C#] : 16#108A24#
16#290070# [ 16#68# 16#10#] : 16#290078#
16#290074# [ 16#6C# 16#14#] : 16#B85210#
16#290078# [ 16#70# 16#0#] : 16#290088# *
16#29007C# [ 16#74# 16#4#] : 16#518#
16#290080# [ 16#78# 16#8#] : 16#EEEEEEEE#
16#290084# [ 16#7C# 16#C#] : 16#EEEEEEEE#
16#290088# [ 16#80# 16#0#] : 16#B851F8# *
16#29008C# [ 16#84# 16#4#] : 16#EEEEEEEE#
16#290090# [ 16#88# 16#8#] : 16#0#
16#290094# [ 16#8C# 16#C#] : 16#EEEEEEEE#
16#290098# [ 16#90# 16#10#] : 16#EEEEEEEE#
16#29009C# [ 16#94# 16#14#] : 16#EEEEEEEE#
16#2900A0# [ 16#98# 16#18#] : 16#26FCE8#
16#2900A4# [ 16#9C# 16#1C#] : 16#EEEEEEEE#
16#2900A8# [ 16#A0# 16#20#] : 16#0#
16#2900AC# [ 16#A4# 16#24#] : 16#0#
16#2900B0# [ 16#A8# 16#28#] : 16#1#
16#2900B4# [ 16#AC# 16#2C#] : 16#28D690#
16#2900B8# [ 16#B0# 16#30#] : 16#2830F0#
16#2900BC# [ 16#B4# 16#34#] : 16#1#
16#2900C0# [ 16#B8# 16#38#] : 16#0#
16#2900C4# [ 16#BC# 16#3C#] : 16#B#
16#2900C8# [ 16#C0# 16#40#] : 16#0#
16#2900CC# [ 16#C4# 16#44#] : 16#B85200#
16#2900D0# [ 16#C8# 16#48#] : 16#B85200#
16#2900D4# [ 16#CC# 16#4C#] : 16#0#
16#2900D8# [ 16#D0# 16#50#] : 16#EEEEEEEE#
16#2900DC# [ 16#D4# 16#54#] : 16#EEEEEEEE#
16#2900E0# [ 16#D8# 16#58#] : 16#EEEEEEEE#
16#2900E4# [ 16#DC# 16#5C#] : 16#EEEEEEEE#
16#2900E8# [ 16#E0# 16#60#] : 16#EEEEEEEE#
16#2900EC# [ 16#E4# 16#64#] : 16#EEEEEEEE#
16#2900F0# [ 16#E8# 16#68#] : 16#EEEEEEEE#
16#2900F4# [ 16#EC# 16#6C#] : 16#EEEEEEEE#
16#2900F8# [ 16#F0# 16#70#] : 16#EEEEEEEE#
16#2900FC# [ 16#F4# 16#74#] : 16#EEEEEEEE#
16#290100# [ 16#F8# 16#78#] : 16#EEEEEEEE#
16#290104# [ 16#FC# 16#7C#] : 16#EEEEEEEE#
16#290108# [ 16#100# 16#80#] : 16#EEEEEEEE#
16#29010C# [ 16#104# 16#84#] : 16#EEEEEEEE#
16#290110# [ 16#108# 16#88#] : 16#EEEEEEEE#
16#290114# [ 16#10C# 16#8C#] : 16#EEEEEEEE#
16#290118# [ 16#110# 16#90#] : 16#EEEEEEEE#
16#29011C# [ 16#114# 16#94#] : 16#EEEEEEEE#
16#290120# [ 16#118# 16#98#] : 16#B030#
16#290124# [ 16#11C# 16#9C#] : 16#26FCE8#
16#290128# [ 16#120# 16#A0#] : 16#26FF14#
16#29012C# [ 16#124# 16#A4#] : 16#26FC80#
16#290130# [ 16#128# 16#A8#] : 16#42000000#
16#290134# [ 16#12C# 16#AC#] : 16#20#
16#290138# [ 16#130# 16#B0#] : 16#EEEEEEEE#
16#29013C# [ 16#134# 16#B4#] : 16#EEEEEEEE#
16#290140# [ 16#138# 16#B8#] : 16#EEEEEEEE#
16#290144# [ 16#13C# 16#BC#] : 16#EEEEEEEE#
16#290148# [ 16#140# 16#C0#] : 16#0#
16#29014C# [ 16#144# 16#C4#] : 16#8#
16#290150# [ 16#148# 16#C8#] : 16#290148#
16#290154# [ 16#14C# 16#CC#] : 16#8C0061#
16#290158# [ 16#150# 16#D0#] : 16#E00008#
16#29015C# [ 16#154# 16#D4#] : 16#FFB1F0#
16#290160# [ 16#158# 16#D8#] : 16#0#
16#290164# [ 16#15C# 16#DC#] : 16#0#
16#290168# [ 16#160# 16#E0#] : 16#0#
16#29016C# [ 16#164# 16#E4#] : 16#0#
16#290170# [ 16#168# 16#E8#] : 16#0#
16#290174# [ 16#16C# 16#EC#] : 16#0#
16#290178# [ 16#170# 16#F0#] : 16#0#
16#29017C# [ 16#174# 16#F4#] : 16#0#
16#290180# [ 16#178# 16#F8#] : 16#0#
16#290184# [ 16#17C# 16#FC#] : 16#0#
16#290188# [ 16#180# 16#100#] : 16#0#
16#29018C# [ 16#184# 16#104#] : 16#0#
16#290190# [ 16#188# 16#108#] : 16#0#
16#290194# [ 16#18C# 16#10C#] : 16#0#
16:50:55 ::: [ apex_execute has finished ]
Below is the hand annotated output from auxclk_test on a mvme2603 board running Tornado2. Here is the full accounting with the value for Pc_Offset:
frame_0     Profiling_Isr        80 bytes
frame_1     ???                  8 bytes
frame_2     ???                  24 bytes
frame_3     ???                  24 bytes
frame_4     ???                  24 bytes
frame_5     ???                  16 bytes
frame_6     exception context    164 bytes to PC
------------------------------------------------
Pc_Offset:                       340 total bytes
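The accounting above is plain addition, and the 164-byte figure follows from the register layout described earlier (6 miscellaneous registers, 32 GPRs, then MSR, LR, and CTR ahead of the PC, each 4 bytes wide). As a cross-check, here is a short Python sketch; the function and constant names are hypothetical and exist only for this illustration.

```python
WORD_BYTES = 4  # 32-bit registers and stack words


def pc_offset_in_context(misc_regs=6, gprs=32, regs_before_pc=3):
    """Byte offset of the PC inside the exception context frame.

    regs_before_pc counts MSR, LR, and CTR, which precede the PC.
    """
    return (misc_regs + gprs + regs_before_pc) * WORD_BYTES


def pc_offset(isr_frame_bytes, intermediate_frame_bytes):
    """Total Pc_Offset: the ISR frame, the frames between it and the
    exception context, and the distance to the PC inside that context."""
    return isr_frame_bytes + sum(intermediate_frame_bytes) + pc_offset_in_context()


# Frame sizes taken from the mvme2603 / Tornado 2 accounting above.
mv2603_offset = pc_offset(80, [8, 24, 24, 24, 16])
```

For this board and release the result is 340, matching the hand-computed total. On another BSP or Tornado release, only the list of intermediate frame sizes should need to change.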
See below for more detail on the stack structure:
16#290008# [ 16#0# 16#0#] : 16#290028# *  Auxclk.Handler - IGNORE!
16#29000C# [ 16#4# 16#4#] : 16#26F968#
16#290010# [ 16#8# 16#8#] : 16#290040#
16#290014# [ 16#C# 16#C#] : 16#0#
16#290018# [ 16#10# 16#10#] : 16#8#
16#29001C# [ 16#14# 16#14#] : 16#FFEF98#
16#290020# [ 16#18# 16#18#] : 16#B85220#
16#290024# [ 16#1C# 16#1C#] : 16#FE000847#
16#290028# [ 16#20# 16#0#] : 16#290030# *  (8 bytes)
16#29002C# [ 16#24# 16#4#] : 16#104B3C#
16#290030# [ 16#28# 16#0#] : 16#290048# *  (24 bytes)
16#290034# [ 16#2C# 16#4#] : 16#104A70#
16#290038# [ 16#30# 16#8#] : 16#B171420#
16#29003C# [ 16#34# 16#C#] : 16#9030#
16#290040# [ 16#38# 16#10#] : 16#290058#
16#290044# [ 16#3C# 16#14#] : 16#BFFFB8#
16#290048# [ 16#40# 16#0#] : 16#290060# *  (24 bytes)
16#29004C# [ 16#44# 16#4#] : 16#1044DC#
16#290050# [ 16#48# 16#8#] : 16#9009030#
16#290054# [ 16#4C# 16#C#] : 16#16DCDC#
16#290058# [ 16#50# 16#10#] : 16#290068#
16#29005C# [ 16#54# 16#14#] : 16#BFFFE8#
16#290060# [ 16#58# 16#0#] : 16#290078# *  (24 bytes)
16#290064# [ 16#5C# 16#4#] : 16#104104#
16#290068# [ 16#60# 16#8#] : 16#10#
16#29006C# [ 16#64# 16#C#] : 16#108A24#
16#290070# [ 16#68# 16#10#] : 16#290078#
16#290074# [ 16#6C# 16#14#] : 16#B85210#
16#290078# [ 16#70# 16#0#] : 16#290088# *  (16 bytes)
16#29007C# [ 16#74# 16#4#] : 16#518#
16#290080# [ 16#78# 16#8#] : 16#EEEEEEEE#
16#290084# [ 16#7C# 16#C#] : 16#EEEEEEEE#
16#290088# [ 16#80# 16#0#] : 16#B851F8# *  (164 bytes to PC)
16#29008C# [ 16#84# 16#4#] : 16#EEEEEEEE#
16#290090# [ 16#88# 16#8#] : 16#0#
16#290094# [ 16#8C# 16#C#] : 16#EEEEEEEE#
16#290098# [ 16#90# 16#10#] : 16#EEEEEEEE#
16#29009C# [ 16#94# 16#14#] : 16#EEEEEEEE#
16#2900A0# [ 16#98# 16#18#] : 16#26FCE8#  GPR0
16#2900A4# [ 16#9C# 16#1C#] : 16#EEEEEEEE#  GPR1
16#2900A8# [ 16#A0# 16#20#] : 16#0#
16#2900AC# [ 16#A4# 16#24#] : 16#0#
16#2900B0# [ 16#A8# 16#28#] : 16#1#
16#2900B4# [ 16#AC# 16#2C#] : 16#28D690#
16#2900B8# [ 16#B0# 16#30#] : 16#2830F0#
16#2900BC# [ 16#B4# 16#34#] : 16#1#
16#2900C0# [ 16#B8# 16#38#] : 16#0#
16#2900C4# [ 16#BC# 16#3C#] : 16#B#
16#2900C8# [ 16#C0# 16#40#] : 16#0#
16#2900CC# [ 16#C4# 16#44#] : 16#B85200#
16#2900D0# [ 16#C8# 16#48#] : 16#B85200#
16#2900D4# [ 16#CC# 16#4C#] : 16#0#
16#2900D8# [ 16#D0# 16#50#] : 16#EEEEEEEE#
16#2900DC# [ 16#D4# 16#54#] : 16#EEEEEEEE#
16#2900E0# [ 16#D8# 16#58#] : 16#EEEEEEEE#
16#2900E4# [ 16#DC# 16#5C#] : 16#EEEEEEEE#
16#2900E8# [ 16#E0# 16#60#] : 16#EEEEEEEE#
16#2900EC# [ 16#E4# 16#64#] : 16#EEEEEEEE#
16#2900F0# [ 16#E8# 16#68#] : 16#EEEEEEEE#
16#2900F4# [ 16#EC# 16#6C#] : 16#EEEEEEEE#
16#2900F8# [ 16#F0# 16#70#] : 16#EEEEEEEE#
16#2900FC# [ 16#F4# 16#74#] : 16#EEEEEEEE#
16#290100# [ 16#F8# 16#78#] : 16#EEEEEEEE#
16#290104# [ 16#FC# 16#7C#] : 16#EEEEEEEE#
16#290108# [ 16#100# 16#80#] : 16#EEEEEEEE#
16#29010C# [ 16#104# 16#84#] : 16#EEEEEEEE#
16#290110# [ 16#108# 16#88#] : 16#EEEEEEEE#
16#290114# [ 16#10C# 16#8C#] : 16#EEEEEEEE#
16#290118# [ 16#110# 16#90#] : 16#EEEEEEEE#  GPR30
16#29011C# [ 16#114# 16#94#] : 16#EEEEEEEE#  GPR31
16#290120# [ 16#118# 16#98#] : 16#B030#  MSR
16#290124# [ 16#11C# 16#9C#] : 16#26FCE8#  LR
16#290128# [ 16#120# 16#A0#] : 16#26FF14#  CTR
16#29012C# [ 16#124# 16#A4#] : 16#26FC80#  PC !!! (164 bytes down)
16#290130# [ 16#128# 16#A8#] : 16#42000000#
16#290134# [ 16#12C# 16#AC#] : 16#20#
16#290138# [ 16#130# 16#B0#] : 16#EEEEEEEE#
16#29013C# [ 16#134# 16#B4#] : 16#EEEEEEEE#
16#290140# [ 16#138# 16#B8#] : 16#EEEEEEEE#
16#290144# [ 16#13C# 16#BC#] : 16#EEEEEEEE#
16#290148# [ 16#140# 16#C0#] : 16#0#
16#29014C# [ 16#144# 16#C4#] : 16#8#
16#290150# [ 16#148# 16#C8#] : 16#290148#
16#290154# [ 16#14C# 16#CC#] : 16#8C0061#
16#290158# [ 16#150# 16#D0#] : 16#E00008#
16#29015C# [ 16#154# 16#D4#] : 16#FFB1F0#
16#290160# [ 16#158# 16#D8#] : 16#0#
16#290164# [ 16#15C# 16#DC#] : 16#0#
16#290168# [ 16#160# 16#E0#] : 16#0#
16#29016C# [ 16#164# 16#E4#] : 16#0#
16#290170# [ 16#168# 16#E8#] : 16#0#
16#290174# [ 16#16C# 16#EC#] : 16#0#
16#290178# [ 16#170# 16#F0#] : 16#0#
16#29017C# [ 16#174# 16#F4#] : 16#0#
16#290180# [ 16#178# 16#F8#] : 16#0#
16#290184# [ 16#17C# 16#FC#] : 16#0#
16#290188# [ 16#180# 16#100#] : 16#0#
16#29018C# [ 16#184# 16#104#] : 16#0#
16#290190# [ 16#188# 16#108#] : 16#0#
16#290194# [ 16#18C# 16#10C#] : 16#0#
16:50:55 ::: [ apex_execute has finished ]
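The frame sizes in the annotations are simply the distances between consecutive starred addresses, so they can be recovered from a raw dump mechanically. The Python sketch below is a hypothetical host-side helper, not a tool shipped in ada_examples.ss/prof. It assumes the entry layout shown in the dumps above, and its sample text is an excerpt of that mvme2603 dump.

```python
import re

# One dump entry: address, two bracketed offsets, stored value, optional '*'.
ENTRY = re.compile(
    r"16#([0-9A-F]+)#\s*\[\s*16#[0-9A-F]+#\s+16#[0-9A-F]+#\]"
    r"\s*:\s*16#[0-9A-F]+#(\s*\*)?"
)


def frame_boundaries(dump_text):
    """Addresses of the entries that auxclk_test marked with '*'."""
    return [int(m.group(1), 16) for m in ENTRY.finditer(dump_text) if m.group(2)]


def intermediate_frame_sizes(dump_text):
    """Sizes of the frames between Auxclk.Handler and the exception context."""
    bounds = frame_boundaries(dump_text)
    sizes = [later - earlier for earlier, later in zip(bounds, bounds[1:])]
    return sizes[1:]  # frame 0 belongs to Auxclk.Handler and is not counted


# Excerpt of the raw mvme2603 / Tornado 2 dump (most unstarred entries omitted).
SAMPLE = """
16#290008# [ 16#0# 16#0#] : 16#290028# *
16#29000C# [ 16#4# 16#4#] : 16#26F968#
16#290028# [ 16#20# 16#0#] : 16#290030# *
16#29002C# [ 16#24# 16#4#] : 16#104B3C#
16#290030# [ 16#28# 16#0#] : 16#290048# *
16#290048# [ 16#40# 16#0#] : 16#290060# *
16#290060# [ 16#58# 16#0#] : 16#290078# *
16#290078# [ 16#70# 16#0#] : 16#290088# *
16#290088# [ 16#80# 16#0#] : 16#B851F8# *
"""

sizes = intermediate_frame_sizes(SAMPLE)
# 80 bytes for Profiling_Isr and 164 bytes to the PC, as stated in the text.
total = 80 + sum(sizes) + 164
```

The pattern is applied with finditer rather than line by line, so it should also cope with a dump whose line breaks were lost. Always confirm the last starred entry really is the exception context frame before trusting the total.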
Configuration Options
v_usr_conf.2.ada in your Board Specific Processor view (in usr_conf.ss) contains the profiling configuration parameters and subprograms.
- Profile_Intr_Number
Must match the kernel timer interrupt vector number. If the kernel timer is not used, this must be the vector number of the timer used.
- Profile_Bytes_Per_Count
Number of bytes associated with each count. 2 is the minimum and default value for i386; 4 is the minimum and default value for other targets. Increase this value to decrease the storage required for the PC counts. However, if it is set too large, multiple small procedures collapse onto the same count and apex_profile cannot distinguish between them. If you are interested only in procedure performance, you can safely increase this value to 8 bytes, which is the minimum procedure size. If you are interested in source-line profiling, keep it at the minimum instruction size.
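To see the storage trade-off in numbers, the following Python sketch estimates the memory taken by the count array for a given text size. It is an illustration only: the helper name is hypothetical, the 64 KB text size is invented, and each count is assumed to be a 16-bit unsigned short as in the array declaration shown earlier.

```python
COUNT_BYTES = 2  # assumption: unsigned short counts are 16 bits on the target


def counts_storage_bytes(text_bytes, bytes_per_count):
    """Bytes of target memory needed for the PC count array.

    text_bytes stands for Pc_End - Pc_Start; the entry count follows the
    Profile_Counts_Size expression, (Pc_End - Pc_Start)/Bytes_Per_Count + 1.
    """
    entries = text_bytes // bytes_per_count + 1
    return entries * COUNT_BYTES


# Hypothetical 64 KB text section at two settings of Profile_Bytes_Per_Count.
at_4 = counts_storage_bytes(0x10000, 4)
at_8 = counts_storage_bytes(0x10000, 8)
```

Doubling Profile_Bytes_Per_Count roughly halves the array (32770 bytes versus 16386 bytes here) at the cost of coarser attribution.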
Example 1
The following output was generated on an Apex Embedded MIPS system. Your exact output is dependent on your target processor.
The following is an example of profiling output from apex_profile.
%time  cumsecs  name
 41.7     0.05  pr
 33.3     0.09  pr.bb
 25.0     0.12  pr.aa
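The cumsecs column appears to be a running total, so the time attributed to each symbol is the difference between consecutive rows. The Python sketch below works through the three rows above under that reading, which is an interpretation of the sample output and not a documented guarantee; the helper name is hypothetical.

```python
# (%time, cumsecs, name) rows copied from the sample output above.
ROWS = [(41.7, 0.05, "pr"), (33.3, 0.09, "pr.bb"), (25.0, 0.12, "pr.aa")]


def per_symbol_seconds(rows):
    """Turn the running cumsecs total into seconds per symbol."""
    result = []
    previous = 0.0
    for _percent, cumulative, name in rows:
        # Round to the two decimals that the report itself prints.
        result.append((name, round(cumulative - previous, 2)))
        previous = cumulative
    return result


seconds = per_symbol_seconds(ROWS)
total_seconds = ROWS[-1][1]
# Recompute %time from the per-symbol seconds as a consistency check.
percents = [round(100 * secs / total_seconds, 1) for _name, secs in seconds]
```

The recomputed percentages agree with the %time column, which supports reading cumsecs as cumulative.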
Example 2
You can disassemble the program using the command-line apex_profile with the -disassemble option or using the Apex Disassembler from either the GUI or the command line.
The following output was generated using the -disassemble option to apex_profile. The effect of this option is to generate a mon.list file in your current view. This file is a text file with line/time pairs, but the line information is encoded and is not directly readable.
If you disassemble the program with Compile > Show > Disassemble menu selection or the command line:
% apex disassemble pr.2.ada
A disassembly file is generated in .Rational/Compilation with the suffix .asm.
The following corresponds to the example above. The lines "Profile percent = " present information from mon.list.
Note: If the source corresponding to a line is repeated, so is the percentage. For example, look at the line
for I in 1 .. 100000 loop Profile percent = 8.33
This generates two chunks of code.
for I in 1 .. 100000 loop Profile percent = 8.33
_R_pr.qnvjfwvsohxxnfbnu3:
PR'BODY'PROLOGUE
00000: addiu t4,sp,0ffc8    t4 <- sp - 56
00004: tltu t4,s7,09
00008: addu sp,t4,$0    sp <- t4
0000c: sw s0,010(sp)    16(sp) <- s0
00010: sw s1,014(sp)    20(sp) <- s1
00014: sw fp,018(sp)    24(sp) <- fp
00018: sw ra,01c(sp)    28(sp) <- ra
0001c: addiu fp,sp,038    fp <- sp + 56
TEXT_IO.PUT_LINE ("Starting");
00020: lui a0,00    a0 <- 000000 @_R_pr.qnvjfwvsohxxnfbnu3..D1
00024: addiu a0,a0,00    a0 <- a0 @_R_pr.qnvjfwvsohxxnfbnu3..D1
00028: lui a1,00    a1 <- 000000 @__DOPE$0108
0002c: addiu a1,a1,00    a1 <- a1 @__DOPE$0108
00030: lui t0,00    t0 <- 000000 @_R_put_line.h9aa51crw716j6e9jb..SA
00034: lw t0,00(t0)    t0 <- 0(t0) @_R_put_line.h9aa51crw716j6e9jb..SA
00038: jalr ra,t0    -> t0, ret addr: ra
0003c: #nop
A := 0.0;
00040: add t9,$0,$0    t9 <- $0
00044: sw t9,-04(fp)    -4(fp) <- t9
for I in 1 .. 100000 loop    Profile percent = 8.33
00048: addi s0,$0,01    s0 <- 1
AA (A);    Profile percent = 16.67
0004c: lwc1 f12,-04(fp)    f12 <- -4(fp)
00050: addu a1,fp,$0    a1 <- fp
00054: jal 00    -> _R_pr.qnvjfwvsohxxnfbnu3 @_R_aa.pr.5
00058: #nop
0005c: swc1 f12,-04(fp)    -4(fp) <- f12
00060: nop
for I in 1 .. 100000 loop    Profile percent = 8.33
00064: addiu t3,s0,01    t3 <- s0 + 1
00068: addu s0,t3,$0    s0 <- t3
0006c: lui t1,02    t1 <- 020000
00070: addiu t1,t1,086a0    t1 <- t1 - 31072
00074: slt t0,t1,s0    t0 <- t1 < s0
00078: beq t0,$0,-48    -> 04c
0007c: #nop
TEXT_IO.PUT_LINE ("after aa");
00080: lui a0,00    a0 <- 000000 @_R_pr.qnvjfwvsohxxnfbnu3..D1
00084: addiu a0,a0,00    a0 <- a0 @_R_pr.qnvjfwvsohxxnfbnu3..D1
00088: lui a1,00    a1 <- 000000 @__DOPE$0108
0008c: addiu a1,a1,00    a1 <- a1 @__DOPE$0108
00090: lui t8,00    t8 <- 000000 @_R_put_line.h9aa51crw716j6e9jb..SA
00094: lw t8,00(t8)    t8 <- 0(t8) @_R_put_line.h9aa51crw716j6e9jb..SA
00098: jalr ra,t8    -> t8, ret addr: ra
0009c: #nop
for I in 1 .. 500000 loop    Profile percent = 8.33
000a0: addi s1,$0,01    s1 <- 1
BB (A);    Profile percent = 8.33
000a4: lwc1 f12,-04(fp)    f12 <- -4(fp)
000a8: addu a1,fp,$0    a1 <- fp
000ac: jal 00    -> _R_pr.qnvjfwvsohxxnfbnu3 @_R_bb.pr.8
000b0: #nop
000b4: swc1 f12,-04(fp)    -4(fp) <- f12
000b8: nop
for I in 1 .. 500000 loop    Profile percent = 8.33
000bc: addiu t2,s1,01    t2 <- s1 + 1
000c0: addu s1,t2,$0    s1 <- t2
000c4: lui t9,08    t9 <- 080000
000c8: addiu t9,t9,0a120    t9 <- t9 - 24288
000cc: slt t8,t9,s1    t8 <- t9 < s1
000d0: beq t8,$0,-48    -> 0a4
000d4: #nop
if (A > 0.0) then
000d8: lwc1 f10,-04(fp)    f10 <- -4(fp)
000dc: add t1,$0,$0    t1 <- $0
000e0: mtc1 t1,f8    f8 <- t1
000e4: nop
000e8: c_lt_s f8,f10    condition <- f8 < f10
000ec: nop
000f0: bc1f 52    -> 0128
000f4: #nop
TEXT_IO.PUT_LINE ("Hi");
000f8: lui a0,00    a0 <- 000000 @_R_pr.qnvjfwvsohxxnfbnu3..D1
000fc: addiu a0,a0,00    a0 <- a0 @_R_pr.qnvjfwvsohxxnfbnu3..D1
00100: lui a1,00    a1 <- 000000 @__DOPE$0102
00104: addiu a1,a1,00    a1 <- a1 @__DOPE$0102
00108: lui t7,00    t7 <- 000000 @_R_put_line.h9aa51crw716j6e9jb..SA
0010c: lw t7,00(t7)    t7 <- 0(t7) @_R_put_line.h9aa51crw716j6e9jb..SA
00110: jalr ra,t7    -> t7, ret addr: ra
00114: #nop
B := A;
00118: lw t6,-04(fp)    t6 <- -4(fp)
0011c: sw t6,-08(fp)    -8(fp) <- t6
00120: j 0158    -> _R_pr.qnvjfwvsohxxnfbnu3+0158 @_R_pr.qnvjfwvsohxxnfbnu3
00124: #nop
TEXT_IO.PUT_LINE ("Ho");
00128: lui a0,00    a0 <- 000000 @_R_pr.qnvjfwvsohxxnfbnu3..D1
0012c: addiu a0,a0,00    a0 <- a0 @_R_pr.qnvjfwvsohxxnfbnu3..D1
00130: lui a1,00    a1 <- 000000 @__DOPE$0102
00134: addiu a1,a1,00    a1 <- a1 @__DOPE$0102
00138: lui t5,00    t5 <- 000000 @_R_put_line.h9aa51crw716j6e9jb..SA
0013c: lw t5,00(t5)    t5 <- 0(t5) @_R_put_line.h9aa51crw716j6e9jb..SA
00140: jalr ra,t5    -> t5, ret addr: ra
00144: #nop
B := -A;
00148: lwc1 f4,-04(fp)    f4 <- -4(fp)
0014c: neg_s f6,f4    f6 <- NEG(f4)
00150: swc1 f6,-08(fp)    -8(fp) <- f6
00154: nop
TEXT_IO.PUT_LINE ("ending");
00158: lui a0,00    a0 <- 000000 @_R_pr.qnvjfwvsohxxnfbnu3..D1
0015c: addiu a0,a0,00    a0 <- a0 @_R_pr.qnvjfwvsohxxnfbnu3..D1
00160: lui a1,00    a1 <- 000000 @__DOPE$0106
00164: addiu a1,a1,00    a1 <- a1 @__DOPE$0106
00168: lui t4,00    t4 <- 000000 @_R_put_line.h9aa51crw716j6e9jb..SA
0016c: lw t4,00(t4)    t4 <- 0(t4) @_R_put_line.h9aa51crw716j6e9jb..SA
00170: jalr ra,t4    -> t4, ret addr: ra
00174: #nop
PR'BODY'EPILOGUE
00178: addu t4,fp,$0    t4 <- fp
0017c: lw s0,-028(t4)    s0 <- -40(t4)
00180: lw s1,-024(t4)    s1 <- -36(t4)
00184: lw fp,-020(t4)    fp <- -32(t4)
00188: lw ra,-01c(t4)    ra <- -28(t4)
0018c: nop
00190: addu sp,t4,$0    sp <- t4
00194: jr ra
00198: #nop
_R_aa.pr.5:
PR.AA'BODY'PROLOGUE    Profile percent = 16.67
0019c: addiu t4,sp,0fff0    t4 <- sp - 16
001a0: tltu t4,s7,09
001a4: addu sp,t4,$0    sp <- t4
001a8: sw fp,00(sp)    0(sp) <- fp
001ac: addiu fp,sp,010    fp <- sp + 16
001b0: nop
AI := AI + 1.0;
001b4: lui t0,03f80    t0 <- 03f800000
001b8: mtc1 t0,f4    f4 <- t0
001bc: nop
001c0: add_s f12,f12,f4    f12 <- f12 + f4
PR.AA'BODY'EPILOGUE    Profile percent = 8.33
001c4: addu t4,fp,$0    t4 <- fp
001c8: lw fp,-010(t4)    fp <- -16(t4)
001cc: nop
001d0: addu sp,t4,$0    sp <- t4
001d4: jr ra
001d8: #nop
_R_bb.pr.8:
PR.BB'BODY'PROLOGUE    Profile percent = 16.67
001dc: addiu t4,sp,0fff0    t4 <- sp - 16
001e0: tltu t4,s7,09
001e4: addu sp,t4,$0    sp <- t4
001e8: sw fp,00(sp)    0(sp) <- fp
001ec: addiu fp,sp,010    fp <- sp + 16
001f0: nop
BI := BI - 1.0;    Profile percent = 0.00
001f4: lui t0,03f80    t0 <- 03f800000
001f8: mtc1 t0,f4    f4 <- t0
001fc: nop
00200: sub_s f12,f12,f4    f12 <- f12 - f4
PR.BB'BODY'EPILOGUE    Profile percent = 8.33
00204: addu t4,fp,$0    t4 <- fp
00208: lw fp,-010(t4)    fp <- -16(t4)
0020c: nop
00210: addu sp,t4,$0    sp <- t4
00214: jr ra
00218: #nop
0021c: nop
Rational Software Corporation http://www.rational.com support@rational.com techpubs@rational.com Copyright © 1993-2002, Rational Software Corporation. All rights reserved.