Usage: splat -i file [-n file] [-o file] [-k kexList] [-d[bfta]] [-l address]
             [-c class] [-s[acelmsS]] [-C#] [-S#] [-t start] [-T stop] [-V]
       splat -h [topic]
       splat -j
       splat -v

Flags & parameters:
  -i inputfile   AIX trace file (REQUIRED).
  -n namefile    File containing output of gennames command.
  -o outputfile  File to write reports to (DEFAULT: stdout).
  -k kexList     Comma-delimited list of kernel extensions (DEFAULT: all).
                 Valid only if a namefile has been provided.
  -d detail      Detail can be one of:
                   [b]asic:    summary and lock detail (DEFAULT)
                   [f]unction: basic + function detail
                   [t]hread:   basic + thread detail
                   [a]ll:      basic + function + thread detail
  -c class       If the user supplies a decimal lock class index, splat will
                 only report activity for locks in that class.
                 [Class values are found in /usr/include/sys/lockname.h]
  -l address     If the user supplies a hexadecimal lock address, splat will
                 filter the trace file for lock hooks containing that
                 address and produce a report solely for that lock.
  -s criteria    Sort the lock, function, and thread reports by the
                 following criteria:
                   a : acquisitions
                   c : percent CPU hold time
                   e : percent elapsed hold time
                   l : lock address, function address, or thread ID
                   m : miss rate
                   s : spin count
                   S : percent CPU spin hold time (DEFAULT)
  -C cpus        Specify the number of CPUs present for this trace.
  -S count       The maximum number of entries in each report (DEFAULT: 10).
  -t starttime   Time offset in seconds from the beginning of the trace to
                 start analyzing trace data (DEFAULT: 0.0 seconds).
  -T stoptime    Time offset in seconds from the beginning of the trace to
                 stop analyzing trace data (DEFAULT: the end of the trace).
  -V             Enables verbose mode in splat, providing feedback while
                 splat is analyzing a trace file.
  -h [topic]     Help on usage or a specific topic. Valid topics are:
                   all overview input names reports sorting
  -v             Splat version and date of build.
  -j             Print a list of trace hooks used by splat.
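As an illustration of how the flags above combine, the following Python sketch assembles a splat argument list. It is not part of splat: the helper name build_splat_args and the file names trace.raw and gennames.out are hypothetical, and only flags listed in the usage text are emitted, in the attached or separated form shown in the synopsis.

```python
DETAIL_LEVELS = set("bfta")      # letters accepted by -d[bfta]
SORT_CRITERIA = set("acelmsS")   # letters accepted by -s[acelmsS]


def build_splat_args(trace, names=None, out=None, detail=None,
                     sort=None, count=None, start=None, stop=None):
    """Assemble a splat argument list (hypothetical helper, not part of splat)."""
    if detail is not None and detail not in DETAIL_LEVELS:
        raise ValueError(f"unknown detail level: {detail!r}")
    if sort is not None and sort not in SORT_CRITERIA:
        raise ValueError(f"unknown sort criterion: {sort!r}")

    args = ["splat", "-i", trace]          # -i is the only required flag
    if names is not None:
        args += ["-n", names]              # gennames output
    if out is not None:
        args += ["-o", out]                # report file instead of stdout
    if detail is not None:
        args.append("-d" + detail)         # letter is attached: -da
    if sort is not None:
        args.append("-s" + sort)           # letter is attached: -sa
    if count is not None:
        args.append(f"-S{count}")          # number is attached: -S20
    if start is not None:
        args += ["-t", str(start)]         # start offset in seconds
    if stop is not None:
        args += ["-T", str(stop)]          # stop offset in seconds
    return args


# Hypothetical file names, for illustration only.
example = build_splat_args("trace.raw", names="gennames.out",
                           detail="a", sort="a", count=20)
print(" ".join(example))
# prints: splat -i trace.raw -n gennames.out -da -sa -S20
```

The resulting list could be passed to subprocess.run on a system where splat is installed; the sketch itself only builds the list and never invokes the tool.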

SPLAT OVERVIEW
--------------
Splat (Simple Performance Lock Analysis Tool) is a software tool which
post-processes AIX trace files to produce kernel simple_lock usage reports.
Splat does not produce reports for kernel complex locks or for user-level
mutex locks.

The following is a list of available help topics and a brief summary of
each:

    OVERVIEW  This text.
    INPUT     AIX trace hooks required in order to acquire useful output
              from splat.
    NAMES     What name utilities can be used to cause splat to map
              addresses to human-readable symbols.
    REPORTS   A description of each report that splat can produce and the
              formulas used to calculate reported values.
    SORTING   A list of all the available sorting options and how they are
              applied to splat's output.

SPLAT INPUT
-----------
Splat takes as primary input an AIX trace file which has been collected
with the AIX trace command. Before analyzing a trace with splat, you will
need to make sure that the trace is collected with an adequate set of
hooks.

First, kernel simple locks will not emit trace hooks unless lock event
reporting is enabled. This is accomplished by executing the following as
root:

    bosboot -ad /dev/hdiskX -L

where "hdiskX" is the disk where hd5 resides, and then rebooting. When the
machine reboots, lock event reporting will be enabled.

Splat requires that the following trace hooks be included in a trace:

    001  TRACE ON
    002  TRACE OFF
    106  DISPATCH
    10C  DISPATCH IDLE PROCESS
    10E  RELOCK
    112  LOCK
    113  UNLOCK
    46D  WAIT LOCK
    600  HKWD_PTHREAD_SCHEDULER
    603  HKWD_PTHREAD_TIMER
    605  HKWD_PTHREAD_VPSLEEP
    606  HKWD_PTHREAD_COND
    607  HKWD_PTHREAD_MUTEX
    608  HKWD_PTHREAD_RWLOCK
    609  HKWD_PTHREAD_GENERAL

SPLAT NAMES
-----------
Splat can take the output of gennames as an optional input and use it to
map lock and function addresses to human-readable symbols. The gennames
system utility began shipping with AIX 4.3.1.
If gennames output is not available, the output of trcnm -l may be
substituted, but it will only map addresses that reside in /unix and not
those in other kernel extensions or shared libraries.

If name output is not available, splat will use the local file
/usr/include/sys/lockname.h to identify lockclass symbols. Lockclasses and
offsets can be used to identify a lock broadly, but not as specifically as
the actual symbol. Additionally, this mapping may not be completely
accurate if the AIX trace file is post-processed on a machine installed
with a different version of AIX.

SPLAT REPORTS
-------------
The report generated by splat consists of a report summary, a lock summary
report section, and a list of lock detail reports, each of which may have
an associated function detail and/or thread detail report.

Report Summary
^^^^^^^^^^^^^^
The report summary consists of the following elements:

  - splat version information.
  - The trace command used to collect the trace.
  - The host that the trace was taken on.
  - The date that the trace was taken on.
  - The duration of the trace in seconds.
  - The estimated number of CPUs.
  - The combined elapsed duration of the trace in seconds (the duration of
    the trace multiplied by the number of CPUs identified during the
    trace).
  - Start time, which is the offset in seconds from the beginning of the
    trace at which trace statistics begin to be gathered.
  - Stop time, which is the offset in seconds from the beginning of the
    trace at which trace statistics stop being gathered.
  - Total number of acquisitions during the trace.
  - Acquisitions per second, which is computed by dividing the total number
    of lock acquisitions by the real-time duration of the trace.
  - % of Total Spin Time, which is the summation of all lock spin hold
    times, divided by the combined trace duration in seconds, multiplied
    by 100. The current goal is to have this value be less than 10% of the
    total trace duration.
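
The derived values in the report summary can be sketched as follows. This is a minimal illustration in Python, not splat's implementation: the function and parameter names are assumptions, and the spin-time figure is assumed to follow the usual percentage convention (ratio multiplied by 100).

```python
def report_summary(duration_s, cpus, total_acquisitions, total_spin_s):
    """Compute the derived report-summary values (illustrative names only)."""
    # Combined elapsed duration: trace duration multiplied by CPU count.
    combined_s = duration_s * cpus
    # Acquisitions per second uses the real-time (not combined) duration.
    acquisitions_per_sec = total_acquisitions / duration_s
    # Sum of all lock spin hold times relative to the combined duration.
    pct_total_spin = total_spin_s / combined_s * 100
    return {
        "combined_duration_s": combined_s,
        "acquisitions_per_sec": acquisitions_per_sec,
        "pct_total_spin_time": pct_total_spin,
    }


# Made-up numbers: a 10-second trace on 4 CPUs.
summary = report_summary(duration_s=10.0, cpus=4,
                         total_acquisitions=50000, total_spin_s=2.0)
print(summary)
```

With these made-up inputs the combined duration is 40 seconds, so 2 seconds of total spin time is 5% of the combined trace duration, which is under the 10% goal.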

Lock Summary
^^^^^^^^^^^^
The lock summary report has the following fields:

Lock
    The name, lockclass or address of the lock.
Acquisitions
    The number of successful lock attempts for this lock, minus the number
    of times a thread was preempted while holding this lock.
Spins
    The number of unsuccessful lock attempts for this lock, minus the
    number of times a thread was undispatched while spinning.
Wait
    The number of unsuccessful lock attempts that resulted in the
    attempting thread going to sleep to wait for the lock to become
    available.
%Miss
    Spins divided by the sum of Acquisitions and Spins, multiplied by 100.
%Total
    Acquisitions divided by the total number of all lock acquisitions,
    multiplied by 100.
Locks/CSec
    Acquisitions divided by the combined elapsed duration in seconds.
Percent HoldTime
    CPU
        The percent of combined elapsed trace time that threads held the
        lock in question while dispatched.
        DISPATCHED_HOLDTIME_IN_SECONDS divided by combined trace duration,
        multiplied by 100.
    Elaps(ed)
        The percent of combined elapsed trace time that threads held the
        lock while dispatched and sleeping.
        UNDISPATCHED_AND_DISPATCHED_HOLDTIME_IN_SECONDS divided by combined
        trace duration, multiplied by 100.
    Spin
        The percent of combined elapsed trace time that threads spun while
        waiting to acquire this lock. SPIN_HOLDTIME_IN_SECONDS divided by
        combined trace duration, multiplied by 100.
Kernel Symbol
    If the output of gennames is used as optional input, splat can make a
    reasonable guess about which kernel extension a particular lock or
    function address belongs to.

The lock summary report defaults to a list of ten locks, sorted in
descending order by percent spin holdtime (the tenth field). The length of
the summary report can be adjusted using the "-S" switch. The sorted order
of the summary report (and all other reports) can be set with the "-s"
switch, whose options are described in the SORTING help section,
"splat -h sorting".
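
The computed columns of a lock summary row can be sketched in Python as shown here. This is an illustration of the formulas in the field descriptions, not splat's code: all names are assumptions, and %Miss is read as Spins divided by the sum of Acquisitions and Spins.

```python
def lock_summary_row(acquisitions, spins, total_acquisitions, combined_s,
                     cpu_hold_s, elapsed_hold_s, spin_hold_s):
    """Compute the derived lock-summary columns (illustrative names only)."""
    attempts = acquisitions + spins
    return {
        # Assumed reading: Spins / (Acquisitions + Spins) * 100
        "pct_miss": spins / attempts * 100 if attempts else 0.0,
        # Share of all lock acquisitions seen in the trace
        "pct_total": acquisitions / total_acquisitions * 100,
        # Acquisitions per combined (duration * CPUs) second
        "locks_per_csec": acquisitions / combined_s,
        # Hold time while dispatched, as a percent of combined duration
        "pct_cpu": cpu_hold_s / combined_s * 100,
        # Hold time while dispatched or undispatched
        "pct_elapsed": elapsed_hold_s / combined_s * 100,
        # Time spent spinning for the lock
        "pct_spin": spin_hold_s / combined_s * 100,
    }


# Made-up numbers for a single lock in a trace with 40 combined seconds.
row = lock_summary_row(acquisitions=900, spins=100, total_acquisitions=9000,
                       combined_s=40.0, cpu_hold_s=4.0, elapsed_hold_s=6.0,
                       spin_hold_s=1.0)
for name, value in row.items():
    print(f"{name:15s} {value:8.2f}")
```

Every percent column except %Miss and %Total is normalized by the combined duration, so a lock held continuously on one CPU of a four-CPU system would show 25% rather than 100%.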

Lock Detail
^^^^^^^^^^^
The lock detail report consists of the following fields:

LOCK
    The address (in hexadecimal) of the lock.
NAME
    The symbol mapping for that address (if available).
CLASS
    The lockclass name (if available) and hexadecimal offset used to
    allocate this lock (lock_alloc() kernel service).
KEX
    The kernel address space that splat thinks this lock belongs to
    (generated if name data is available).
Acquisitions
    The number of successful lock attempts for this lock.
Miss Rate
    The number of unsuccessful lock attempts divided by Acquisitions plus
    unsuccessful lock attempts, multiplied by 100.
Spin Count
    The number of unsuccessful lock attempts.
Wait Count
    The number of unsuccessful lock attempts that resulted in the
    attempting thread going to sleep to wait for the lock to become
    available.
Busy Count
    The number of simple_lock_try() calls that returned busy.
Seconds Held
    CPU
        The total time in seconds that this lock was held by dispatched
        threads.
    Elapsed
        The total time in seconds that this lock was held by both
        dispatched and undispatched threads.
    NOTE: neither of these two values should exceed the total real elapsed
    duration of the trace.
Percent HoldTime
    CPU
        The percent of combined elapsed trace time that threads held the
        lock in question while dispatched.
        DISPATCHED_HOLDTIME_IN_SECONDS divided by trace duration,
        multiplied by 100.
    Elaps(ed)
        The percent of combined elapsed trace time that threads held the
        lock while dispatched and sleeping.
        UNDISPATCHED_AND_DISPATCHED_HOLDTIME_IN_SECONDS divided by trace
        duration, multiplied by 100.
    Spin
        The percent of combined elapsed trace time that threads spun while
        waiting to acquire this lock. SPIN_HOLDTIME_IN_SECONDS divided by
        trace duration, multiplied by 100.
%Enabled
    The ratio of acquisitions of this lock that occurred with interrupts
    enabled to the total number of acquisitions. The number in parentheses
    is the number of enabled acquisitions.
%Disabled
    The ratio of acquisitions of this lock that occurred with interrupts
    disabled to the total number of acquisitions. The number in
    parentheses is the number of disabled acquisitions.
SpinQ
    Splat keeps track of the minimum, maximum and average depth of the
    spin queue (the threads spinning, waiting for a lock to become
    available).
WaitQ
    As with the spin queue, splat also tracks the minimum, maximum and
    average depth of the queue of threads waiting for a lock to become
    available.

Lock Activity w/Interrupts Enabled (mSecs)
Lock Activity w/Interrupts Disabled (mSecs)
    These two sections of the lock detail report are dumps of the raw data
    that splat collects for each lock, with times expressed in
    milliseconds. The five states LOCK, SPIN, WAIT, UNDISP(atched) and
    PREEMPT are the five basic states of splat's simple_lock finite state
    machine. The count for each state is the number of times a thread's
    actions resulted in a transition into that state. The durations in
    milliseconds show the minimum, maximum, average and total amounts of
    time that a lock request spent in that state.

    LOCK:    this state represents a thread successfully acquiring a lock.
    SPIN:    this state represents a thread unsuccessfully trying to
             acquire a lock.
    WAIT:    this state represents a spinning thread (in SPIN) going to
             sleep (voluntarily) after exceeding the thread's spin
             threshold.
    UNDISP:  this state represents a spinning thread (in SPIN) becoming
             undispatched (involuntarily) before exceeding the thread's
             spin threshold.
    PREEMPT: this state represents a thread being undispatched while
             holding a lock.

Function Detail
^^^^^^^^^^^^^^^
The function detail report consists of the following fields:

Function Name
    The name or return address of the function which called simple_lock,
    simple_lock_try, simple_unlock, disable_lock or unlock_enable.
Acquisitions
    The number of successful lock attempts for this lock.
Miss Rate
    The number of unsuccessful lock attempts divided by Acquisitions,
    multiplied by 100.
Spin Count
    The number of unsuccessful lock attempts.
Wait Count
    The number of unsuccessful lock attempts that resulted in the
    attempting thread going to sleep to wait for the lock to become
    available.
Busy Count
    The number of simple_lock_try() calls that returned busy.
Seconds Held
    CPU
        The total time in seconds that this lock was held by dispatched
        threads.
    Elapsed
        The total time in seconds that this lock was held by both
        dispatched and undispatched threads.
    NOTE: neither of these two values should exceed the total real elapsed
    duration of the trace.
Percent HoldTime
    CPU
        The percent of combined elapsed trace time that threads held the
        lock in question while dispatched.
        DISPATCHED_HOLDTIME_IN_SECONDS divided by trace duration,
        multiplied by 100.
    Elaps(ed)
        The percent of combined elapsed trace time that threads held the
        lock while dispatched and sleeping.
        UNDISPATCHED_AND_DISPATCHED_HOLDTIME_IN_SECONDS divided by trace
        duration, multiplied by 100.
    Spin
        The percent of combined elapsed trace time that threads spun while
        waiting to acquire this lock. SPIN_HOLDTIME_IN_SECONDS divided by
        combined trace duration, multiplied by 100.
Return Address
    The calling function's return address in hexadecimal.
Start Address
    The start address of the calling function in hexadecimal.
Offset
    The offset from the function start address in hexadecimal.

Thread Detail
^^^^^^^^^^^^^
The thread detail report consists of the following fields:

ThreadID
    Thread identifier.
Acquisitions
    The number of successful lock attempts for this lock.
Miss Rate
    The number of unsuccessful lock attempts divided by Acquisitions,
    multiplied by 100.
Spin Count
    The number of unsuccessful lock attempts.
Wait Count
    The number of unsuccessful lock attempts that resulted in the
    attempting thread going to sleep to wait for the lock to become
    available.
Busy Count
    The number of simple_lock_try() calls that returned busy.
Seconds Held
    CPU
        The total time in seconds that this lock was held by dispatched
        threads.
    Elapsed
        The total time in seconds that this lock was held by both
        dispatched and undispatched threads.
    NOTE: neither of these two values should exceed the total real elapsed
    duration of the trace.
Percent HoldTime
    CPU
        The percent of combined elapsed trace time that threads held the
        lock in question while dispatched.
        DISPATCHED_HOLDTIME_IN_SECONDS divided by trace duration,
        multiplied by 100.
    Elaps(ed)
        The percent of combined elapsed trace time that threads held the
        lock while dispatched and sleeping.
        UNDISPATCHED_AND_DISPATCHED_HOLDTIME_IN_SECONDS divided by trace
        duration, multiplied by 100.
    Spin
        The percent of combined elapsed trace time that threads spun while
        waiting to acquire this lock. SPIN_HOLDTIME_IN_SECONDS divided by
        combined trace duration, multiplied by 100.

SPLAT SORTING
-------------
Splat allows the user to specify which criterion is used to sort the
summary and lock detail reports using the "-s" option. The default sorting
criterion is percent spin hold time, which is the ratio of the time that
threads spent spinning for a lock to the combined duration of the trace.
Using "-s", the sort criterion can be changed to one of the following:

    a   Acquisitions; the number of times a thread successfully acquired
        a lock.
    c   Percent CPU hold time; the ratio of CPU hold time to the combined
        trace duration.
    e   Percent elapsed hold time; the ratio of elapsed hold time to the
        combined trace duration.
    l   Location; the address of the lock or function, or the ID of a
        thread.
    m   Miss rate; the ratio of missed lock attempts to the number of
        acquisitions.
    s   Spin count; the number of unsuccessful lock attempts that result
        in a thread spinning while waiting for the lock.
    S   Percent CPU spin hold time (default).

Splat will use the specified criterion to sort the lock reports in
descending order.
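
The sorting behavior described above, combined with the "-S" entry limit, can be sketched in Python as follows. This is an assumption-laden illustration rather than splat's implementation: the record layout, the field names and the sort_report helper are hypothetical.

```python
# Map each documented -s letter to a (hypothetical) record field.
SORT_FIELDS = {
    "a": "acquisitions",
    "c": "pct_cpu",
    "e": "pct_elapsed",
    "l": "location",
    "m": "miss_rate",
    "s": "spin_count",
    "S": "pct_spin",     # documented default
}


def sort_report(rows, criterion="S", limit=10):
    """Sort rows in descending order by one criterion, keep `limit` rows."""
    field = SORT_FIELDS[criterion]
    return sorted(rows, key=lambda row: row[field], reverse=True)[:limit]


# Made-up lock records carrying only the fields used below.
locks = [
    {"location": 0x2000, "acquisitions": 120, "pct_spin": 0.5},
    {"location": 0x1000, "acquisitions": 900, "pct_spin": 0.1},
    {"location": 0x3000, "acquisitions": 40, "pct_spin": 2.3},
]

by_spin = sort_report(locks)                # default: percent spin hold time
by_acq = sort_report(locks, "a", limit=2)   # like -sa -S2

print([hex(r["location"]) for r in by_spin])
print([hex(r["location"]) for r in by_acq])
```

The limit is applied after sorting, so a short report always contains the highest-ranked entries for the chosen criterion.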