The Simple Performance Lock Analysis Tool (splat) is a software tool that generates reports on the use of synchronization locks. These include the simple and complex locks provided by the AIX kernel as well as user-level mutexes, read and write locks, and condition variables provided by the PThread library. The splat tool is not currently equipped to analyze the behavior of the Virtual Memory Manager (VMM) and PMAP locks used in the AIX kernel.
The syntax for the splat command is as follows:
splat [-i file] [-n file] [-o file] [-k kexList] [-d [bfta]] [-l address] [-c class] [-s [acelmsS]] [-C#] [-S#] [-t start] [-T stop]
splat -h [topic]
splat -j
The splat tool takes as input an AIX trace log file or (for an SMP trace) a set of log files, and preferably a names file produced by the gennames command. The procedure for generating these files is shown in the trace section. When you run the trace command, you will usually use the flag -J splat to capture only the events analyzed by splat, or omit the -J flag to capture all events. The significant trace hooks are shown in the following table:
Hook ID | Event name | Event explanation |
---|---|---|
106 | HKWD_KERN_DISPATCH | The thread is dispatched from the run queue to a processor. |
10C | HKWD_KERN_IDLE | The idle process has been dispatched. |
10E | HKWD_KERN_RELOCK | One thread is suspended while another is dispatched; the ownership of a RunQ lock is transferred from the first to the second. |
112 | HKWD_KERN_LOCK | The thread attempts to secure a kernel lock; the sub-hook shows what happened. |
113 | HKWD_KERN_UNLOCK | A kernel lock is released. |
46D | HKWD_KERN_WAITLOCK | The thread is enqueued to wait on a kernel lock. |
606 | HKWD_PTHREAD_COND | Operations on a Condition Variable. |
607 | HKWD_PTHREAD_MUTEX | Operations on a Mutex. |
608 | HKWD_PTHREAD_RWLOCK | Operations on a Read/Write Lock. |
609 | HKWD_PTHREAD_GENERAL | Operations on a PThread. |
In some cases, you can use the trace tool to capture the entire execution of a workload, while in other cases you will capture only an interval of the execution. The execution interval is the entire time that a workload runs. This interval can be arbitrarily long for server workloads that run continuously. The trace interval is the time actually captured in the trace log file by the trace command. The length of this trace interval is limited by how large a trace log file will fit on the file system.
In contrast, the analysis interval is the portion of the trace interval that is analyzed by the splat command. The -t and -T options tell the splat command to start and stop its analysis a given number of seconds after the first event in the trace. By default, the splat command analyzes the entire trace, so the analysis interval is the same as the trace interval.
To most accurately estimate the effect of lock activity on the computation, you will usually want to capture the longest trace interval that you can, and analyze that entire interval with the splat command. The -t and -T flags are usually used for debugging purposes to study the behavior of the splat command across a few events in the trace.
As a rule, either use large buffers when collecting a trace, or limit the captured events to the ones you need to run the splat command.
The splat command uses the events in the trace to reconstruct the activities of threads and locks in the original system. If part of the trace is missing, it is because one of the following situations exists:
- Tracing was stopped explicitly on one or more of the processors before the end of the workload.
- One or more of the processors stopped tracing because its trace buffer filled before the log file could be written.
- The trace was collected in a mode in which the trace buffer wraps around, so the earliest records were overwritten.
In any of the above cases, the splat command will not be able to correctly analyze all the events across the trace interval. The policy of splat is to finish its analysis at the first point of discontinuity in the trace, issue a warning message, and generate its report. In the first two cases, the message is as follows:
TRACE OFF record read at 0.567201 seconds. One or more of the CPUs has stopped tracing. You may want to generate a longer trace using larger buffers and re-run splat.
In the third case, the message is as follows:
TRACEBUFFER WRAPAROUND record read at 0.567201 seconds. The input trace has some records missing; splat finishes analyzing at this point. You may want to re-generate the trace using larger buffers and re-run splat.
Some versions of the AIX kernel or PThread library may be incompletely instrumented, so the traces will be missing events. The splat command may not provide correct results in this case.
The lock instrumentation in the kernel and PThread library is what captures the information for each lock event. Data addresses are used to identify locks; instruction addresses are used to identify the point of execution. These addresses are captured in the event records in the trace, and used by the splat command to identify the locks and the functions that operate on them.
However, these addresses are not of much use to the programmer, who would rather know the names of the lock and function declarations so that they can be located in the program source files. The conversion of names to addresses is determined by the compiler and loader, and can be captured in a file using the gennames utility. The gennames utility also captures the contents of the /usr/include/sys/lockname.h file, which declares classes of kernel locks.
This gennames output file is passed to the splat command with the -n flag. When splat reports on a kernel lock, it provides the best identification that it can.
Kernel locks that are declared are resolved by name. Locks that are created dynamically are identified by class if their class name is given when they are created. The libpthreads.a instrumentation is not equipped to capture names or classes of PThread synchronizers, so they are always identified by address only.
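As an illustration of how a dynamically created kernel lock acquires a class, the following sketch shows a kernel extension registering a simple lock with the lock_alloc kernel service before initializing it. The driver name, the class index MY_DRIVER_LOCK_CLASS, and the exact header and type names are assumptions made for the sketch; real class indices are the ones declared in /usr/include/sys/lockname.h.

```c
/*
 * Sketch only: give a dynamically created simple lock a class so that
 * analysis tools such as splat can identify it by class rather than by
 * raw address.  MY_DRIVER_LOCK_CLASS is a hypothetical class index;
 * real indices are declared in /usr/include/sys/lockname.h.  Type and
 * header names follow <sys/lock_def.h> and may vary by AIX release.
 */
#include <sys/types.h>
#include <sys/lock_def.h>
#include <sys/lockname.h>

#define MY_DRIVER_LOCK_CLASS   42      /* hypothetical class index */

static Simple_lock my_state_lock;      /* the lock word */

void
my_driver_init(void)
{
        /* Register the lock word under its class, then initialize it. */
        lock_alloc(&my_state_lock, LOCK_ALLOC_PIN, MY_DRIVER_LOCK_CLASS, 0);
        simple_lock_init(&my_state_lock);
}
```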
The report generated by the splat command consists of an execution summary, a gross lock summary, and a per-lock summary, followed by a list of lock detail reports that optionally includes a function detail or a thread detail report.
The following example shows a sample of the Execution summary. This report is generated by default when using the splat command.
*****************************************************************************************
splat Cmd:  splat -sa -da -S100 -i trace.cooked -n gennames -o splat.out
Trace Cmd:  trace -C all -aj 600,603,605,606,607,608,609 -T 20000000 -L 200000000 -o CONDVAR.raw
Trace Host: darkwing (0054451E4C00)  AIX 5.2
Trace Date: Thu Sep 27 11:26:16 2002

Elapsed Real Time: 0.098167
Number of CPUs Traced: 1     (Observed):0
Cumulative CPU Time: 0.098167

                                               start                   stop
                                    --------------------   --------------------
trace interval      (absolute tics)            967436752              969072535
                    (relative tics)                    0                1635783
                    (absolute secs)            58.057947              58.156114
                    (relative secs)             0.000000               0.098167
analysis interval   (absolute tics)            967436752              969072535
              (trace-relative tics)                    0                1635783
               (self-relative tics)                    0                1635783
                    (absolute secs)            58.057947              58.156114
              (trace-relative secs)             0.000000               0.098167
               (self-relative secs)             0.000000               0.098167
**************************************************************************************
The execution summary consists of the following elements:
- The command line used to run the splat command (splat Cmd) and the trace command used to collect the trace (Trace Cmd).
- The host on which the trace was taken (Trace Host) and the date on which it was taken (Trace Date).
- The elapsed real time of the trace, in seconds.
- The number of processors traced and the number observed to be active during the trace.
- The cumulative processor time, which is the elapsed real time multiplied by the number of processors traced.
- The boundaries of the trace interval and of the analysis interval, in timer tics and in seconds, expressed both in absolute terms and relative to the start of the trace (and, for the analysis interval, relative to its own start).
The following example shows a sample of the gross lock summary report. This report is generated by default when using the splat command.
***************************************************************************************
                               Unique      Acquisitions    Acq. or Passes    % Total System
                   Total      Addresses    (or Passes)       per Second         spin Time
                 ---------    ---------    ------------    --------------     ------------
AIX (all) Locks:       523          523         1323045        72175.7768         0.003986
  RunQ:                  2            2          487178        26576.9121         0.000000
  Simple:              480          480          824898        45000.4754         0.003986
  Complex:              41           41           10969          598.3894         0.000000
PThread CondVar:         7            6          160623         8762.4305         0.000000
  Mutex:               128          116         1927771       105165.2585        10.280745 *
  RWLock:                0            0               0            0.0000         0.000000
                                                                              ( spin time goal )
***************************************************************************************
The gross lock summary report table consists of the following columns:
Total | The number of AIX Kernel locks, followed by the number of each type of AIX Kernel lock: RunQ, Simple, and Complex. Under some conditions, this total will be larger than the sum of the RunQ, Simple, and Complex counts, because there may not be enough activity on a lock to determine its type. This is followed by the number of PThread condition-variables, the number of PThread mutexes, and the number of PThread read/write locks. |
Unique Addresses | The number of unique addresses observed for each synchronizer type. Under some conditions, a lock will be destroyed and re-created at the same address; splat produces a separate lock detail report for each instance because the usage may be different. |
Acquisitions (or Passes) | For locks, the total number of times acquired during the analysis interval; for PThread condition-variables, the total number of times the condition passed during the analysis interval. |
Acq. or Passes (per Second) | Acquisitions or passes per second, which is the total number of acquisitions or passes divided by the elapsed real time of the trace. |
% Total System spin Time | The cumulative time spent spinning on each synchronizer type, divided by the cumulative processor time, multiplied by 100 percent. The general goal is to spin for less than 10 percent of the processor time; a message to this effect is printed at the bottom of the table. Any entry in this column that exceeds 10 percent is marked with an asterisk (*). |
The following example shows a sample of the per-lock summary report. This report is generated by default when using the splat command.
*********************************************************************************************************
                        100 max entries, Summary sorted by Acquisitions:

                             T     Acqui-
 Lock Names,                 y    sitions                                                 Percent Holdtime
 Class, or Address           p         or                              Locks or       Real      Real     Comb    Kernel
                             e     Passes  Spins  Wait  %Miss   %Total   / CSec        CPU     Elapse    Spin    Symbol
 **************************  *   ********  *****  ****  ******  ******  *********   *******  *******  *******  ********
 PROC_INT_CLASS.0003         Q     486490      0     0  0.0000 36.7705  26539.380    5.3532  100.000   0.0000  unix
 THREAD_LOCK_CLASS.0012      S     323277      0     0  0.0000 24.4343  17635.658    6.8216   6.8216   0.0000  libc
 THREAD_LOCK_CLASS.0118      S     323094      0     0  0.0000 24.4205  17625.674    6.7887   6.7887   0.0000  libc
 ELIST_CLASS.003C            S      80453      0     0  0.0000  6.0809   4388.934    1.0564   1.0564   0.0000  unix
 ELIST_CLASS.0044            S      80419      0     0  0.0000  6.0783   4387.080    1.1299   1.1299   0.0000  unix
 tod_lock                    C      10229      0     0  0.0000  0.7731    558.020    0.2212   0.2212   0.0000  unix
 LDATA_CONTROL_LOCK.0000     S       1833      0     0  0.0000  0.1385     99.995    0.0204   0.0204   0.0000  unix
 U_TIMER_CLASS.0014          S       1514      0     0  0.0000  0.1144     82.593    0.0536   0.0536   0.0000  netinet
     ( ... lines omitted ... )
 000000002FF22B70            L     368838      0   N/A  0.0000 100.000   9622.964   99.9865  99.9865   0.0000
 00000000F00C3D74            M     160625      0     0  0.0000 14.2831   8762.540   99.7702  99.7702   0.0000
 00000000200017E8            M     160625    175     0  0.1088 14.2831   8762.540   42.9371  42.9371   0.1487
 0000000020001820            V     160623      0   624  0.0000 100.000   1271.728       N/A      N/A      N/A
 00000000F00C3750            M         37      0     0  0.0000  0.0033      2.018    0.0037   0.0037   0.0000
 00000000F00C3800            M         30      0     0  0.0000  0.0027      1.637    0.0698   0.0698   0.0000
     ( ... lines omitted ... )
************************************************************************************************
The first line indicates the maximum number of locks to report (100 in this case, but we show only 14 of the entries here) as specified by the -S 100 flag. The report also indicates that the entries are sorted by the total number of acquisitions or passes, as specified by the -sa flag. The various Kernel locks and PThread synchronizers are treated as two separate lists in this report, so the report would produce the top 100 Kernel locks sorted by acquisitions, followed by the top 100 PThread synchronizers sorted by acquisitions or passes.
The per-lock summary table consists of the following columns:
Lock Names, Class, or Address | The name, class, or address of the lock, depending on whether the splat command could map the address from a name file. |
Type | The type of the lock, identified by one of the following letters: Q (a RunQ lock), S (a simple kernel lock), C (a complex kernel lock), M (a PThread mutex), V (a PThread condition-variable), or L (a PThread read/write lock). |
Acquisitions or Passes | The number of times that the lock was acquired or the condition passed, during the analysis interval. |
Spins | The number of times the lock (or condition-variable) was spun on during the analysis interval. |
Wait | The number of times that a thread was driven into a wait state for that lock or condition-variable during the analysis interval. |
%Miss | The percentage of access attempts that resulted in a spin as opposed to a successful acquisition or pass. |
%Total | The percentage of all acquisitions that were made to this lock, out of all acquisitions to all locks of this type. All AIX locks (RunQ, simple, and complex) are treated as being the same type for this calculation. The PThread synchronizers mutex, condition-variable, and read/write lock are all distinct types. |
Locks or Passes / CSec | The number of times that the lock (or condition-variable) was acquired (or passed) divided by the cumulative processor time. This is a measure of the acquisition frequency of the lock. |
Real CPU | The percentage of the cumulative processor time that the lock was held by an executing thread. This is not applicable to condition-variables because they are not held. |
Real Elapse | The percentage of the elapsed real time that the lock was held by any thread at all, whether running or suspended. Note that this definition is not applicable to condition-variables because they are not held. |
Comb Spin | The percentage of the cumulative processor time that executing threads spent spinning on the lock. Because the PThread library waits on condition-variables rather than spinning, no spin time is reported for them. |
Kernel Symbol | The name of the kernel-extension or library (or /unix for the kernel) that the lock was defined in. This information is not recoverable for PThreads. |
By default, the splat command prints a lock detail report for each entry in the summary report. The AIX Kernel locks can be either simple or complex.
The RunQ lock is a special case of the simple lock, although its pattern of usage will differ markedly from other lock types. The splat command distinguishes it from the other simple locks to ease its analysis.
In an AIX SIMPLE lock report, the first line starts with either [AIX SIMPLE Lock] or [AIX RunQ lock]. If the gennames output file allows, the ADDRESS is also converted into a lock NAME and CLASS, and the containing kernel extension (KEX) is identified as well. The CLASS is printed with an eight hex-digit extension indicating how many locks of this class were allocated prior to it.
The SIMPLE lock report fields are as follows:
Acquisitions | The number of times that the lock was acquired in the analysis interval (this includes successful simple_lock_try calls). |
Miss Rate | The percentage of attempts that failed to acquire the lock. |
Spin Count | The number of unsuccessful attempts to acquire the lock. |
Wait Count | The number of times that a thread was forced into a suspended wait state, waiting for the lock to come available. |
Busy Count | The number of simple_lock_try calls that returned busy. |
Seconds Held | This field contains the following sub-fields: CPU (the total time, in seconds, that the lock was held by an executing thread) and Elapse(d) (the total time, in seconds, that the lock was held by any thread, whether running or suspended). |
Percent Held | This field contains the following sub-fields: Real CPU (the percentage of the cumulative processor time that the lock was held by an executing thread), Real Elapsed (the percentage of the elapsed real time that the lock was held by any thread, whether running or suspended), Comb(ined) Spin (the percentage of the cumulative processor time that executing threads spent spinning while trying to acquire this lock), and Real Wait (the percentage of elapsed real time that any thread spent waiting on this lock; simultaneous waits by two or more threads are counted only once). |
%Enabled | The percentage of acquisitions of this lock that occurred while interrupts were enabled. In parentheses is the total number of acquisitions made while interrupts were enabled. |
%Disabled | The percentage of acquisitions of this lock that occurred while interrupts were disabled. In parentheses is the total number of acquisitions made while interrupts were disabled. |
SpinQ | The minimum, maximum, and average number of threads spinning on the lock, whether executing or suspended, across the analysis interval. |
WaitQ | The minimum, maximum, and average number of threads waiting on the lock, across the analysis interval. |
The Lock Activity with Interrupts Enabled (milliseconds) and Lock Activity with Interrupts Disabled (milliseconds) sections contain information on the time that threads spend in each of the lock states.
The states that a thread can be in (with respect to a given simple or complex lock) are as follows:
(no lock reference) | The thread is running, does not hold this lock, and is not attempting to acquire this lock. |
LOCK | The thread has successfully acquired the lock and is currently executing. |
SPIN | The thread is executing and unsuccessfully attempting to acquire the lock. |
UNDISP | The thread has become undispatched while unsuccessfully attempting to acquire the lock. |
WAIT | The thread has been suspended until the lock comes available. It does not necessarily acquire the lock at that time, but instead goes back to a SPIN state. |
PREEMPT | The thread is holding this lock and has become undispatched. |
The Lock Activity sections of the report measure the intervals of time (in milliseconds) that each thread spends in each of the states for this lock. The columns report the number of times that a thread entered the given state, followed by the maximum, minimum, and average time that a thread spent in the state once entered, followed by the total time that all threads spent in that state. These sections distinguish whether interrupts were enabled or disabled at the time that the thread was in the given state.
A thread can acquire a lock prior to the beginning of the analysis interval and release the lock during the analysis interval. When the splat command observes the lock being released, it recognizes that the lock had been held during the analysis interval up to that point and counts the time as part of the state-machine statistics. For this reason, the number of times that a lock state was entered, as reported by the state-machine statistics, can be larger than the number of acquisitions of the lock observed in the analysis interval.
RunQ locks are used to protect resources in the thread management logic. These locks are acquired a large number of times and are only held briefly each time. A thread need not be executing to acquire or release a RunQ lock. Further, a thread may spin on a RunQ lock, but it will not go into an UNDISP or WAIT state on the lock. You will see a dramatic difference between the statistics for RunQ versus other simple locks.
The functional detail report is obtained by using the -df or -da options of splat.
The columns are defined as follows:
Function Name | The name of the function that acquired or attempted to acquire this lock (with a call to one of the functions simple_lock, simple_lock_try, simple_unlock, disable_lock, or unlock_enable), if it could be resolved. A brief usage sketch of these functions appears after this table. |
Acquisitions | The number of times that the function was able to acquire this lock. |
Miss Rate | The percentage of acquisition attempts that failed. |
Spin Count | The number of unsuccessful attempts by the function to acquire this lock. |
Wait Count | The number of times that any thread was forced to wait on the lock, using a call to this function to acquire the lock. |
Busy Count | The number of times the function tried to acquire the lock without success (that is, calls to simple_lock_try that were returned busy). |
Percent Held of Total Time | Contains the following sub-fields: CPU (the percentage of the cumulative processor time that the lock was held by an executing thread that acquired it through this function), Elapse(d) (the percentage of the elapsed real time that the lock was held by any thread that acquired it through this function), Spin (the percentage of the cumulative processor time spent spinning on the lock in calls through this function), and Wait (the percentage of elapsed real time spent waiting for the lock in calls through this function). |
Return Address | The return address to this calling function, in hexadecimal. |
Start Address | The start address to this calling function, in hexadecimal. |
Offset | The offset from the function start address to the return address, in hexadecimal. |
The functions are ordered by the same sorting criterion as the locks, controlled by the -s option of splat. Further, the number of functions listed is controlled by the -S parameter. The default is the top ten functions.
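For reference, the following is a minimal sketch of the call patterns behind the function names above, using the kernel simple-lock services listed in the Function Name description. The lock, counter, routine names, and the INTIODONE interrupt priority are illustrative assumptions; real code would use a lock registered with lock_alloc and a priority appropriate to its own interrupt environment.

```c
/*
 * Sketch only: typical call patterns for the kernel simple-lock services
 * that splat attributes to calling functions.  The lock, the counter,
 * the routine names, and the INTIODONE priority are illustrative.
 */
#include <sys/types.h>
#include <sys/lock_def.h>
#include <sys/intr.h>

static Simple_lock my_lock;
static int         my_counter;

void
update_counter(void)
{
        simple_lock(&my_lock);            /* may spin until acquired       */
        my_counter++;
        simple_unlock(&my_lock);
}

boolean_t
try_update_counter(void)
{
        if (!simple_lock_try(&my_lock))   /* a failed try is reported      */
                return FALSE;             /* under Busy Count              */
        my_counter++;
        simple_unlock(&my_lock);
        return TRUE;
}

void
update_counter_disabled(void)
{
        int old_priority;

        /* disable_lock() raises the interrupt priority and acquires the
         * lock; unlock_enable() releases it and restores the priority.   */
        old_priority = disable_lock(INTIODONE, &my_lock);
        my_counter++;
        unlock_enable(old_priority, &my_lock);
}
```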
The Thread Detail report is obtained by using the -dt or -da options of splat.
At any point in time, a single thread is either running or it is not, and when it runs, it runs on only one processor. Some of the composite statistics are measured relative to the cumulative processor time because they measure activities that can happen simultaneously on more than one processor, so the magnitude of those measurements can be proportional to the number of processors in the system. In contrast, the thread statistics are generally measured relative to the elapsed real time, which bounds both the amount of time that a single processor spends processing and the amount of time that a single thread spends in an executing or suspended state. For example, a 10-second trace of a four-processor system represents 40 seconds of cumulative processor time, but no single thread can account for more than 10 seconds of elapsed real time.
The Thread Detail report columns are defined as follows:
ThreadID | The thread identifier. |
Acquisitions | The number of times that this thread acquired the lock. |
Miss Rate | The percentage of acquisition attempts by the thread that failed to secure the lock. |
Spin Count | The number of unsuccessful attempts by this thread to secure the lock. |
Wait Count | The number of times that this thread was forced to wait until the lock came available. |
Busy Count | The number of simple_lock_try() calls that returned busy. |
Percent Held of Total Time | Consists of the following sub-fields: CPU (the percentage of the cumulative processor time that this thread held the lock while executing), Elapse(d) (the percentage of the elapsed real time that this thread held the lock, whether running or suspended), Spin (the percentage of the cumulative processor time that this thread spent spinning on the lock), and Wait (the percentage of elapsed real time that this thread spent waiting on the lock). |
The AIX complex lock supports recursive locking, where a thread can acquire the lock more than once before releasing it, and it differentiates between write-locking, which is exclusive, and read-locking, which is not exclusive.
This report begins with [AIX COMPLEX Lock]. Most of the entries are identical to the simple lock report, while some of them are differentiated by read/write/upgrade. For example, the SpinQ and WaitQ statistics include the minimum, maximum, and average number of threads spinning or waiting on the lock. They also include the minimum, maximum, and average number of threads attempting to acquire the lock for reading versus writing. Because an arbitrary number of threads can hold the lock for reading, the report includes the minimum, maximum, and average number of readers in the LockQ that holds the lock.
A thread may hold a lock for writing; this is exclusive and prevents any other thread from securing the lock for reading or for writing. The thread downgrades the lock by simultaneously releasing it for writing and acquiring it for reading; this allows other threads to also acquire the lock for reading. The reverse of this operation is an upgrade; if the thread holds the lock for reading and no other thread holds it as well, the thread simultaneously releases the lock for reading and acquires it for writing. The upgrade operation may require that the thread wait until other threads release their read-locks. The downgrade operation does not.
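As an illustration of the downgrade operation, the following sketch uses the AIX complex-lock kernel services (lock_init, lock_write, lock_write_to_read, and lock_done). The table and routine names are assumptions made for the sketch, and the exact type and header names may vary by AIX release.

```c
/*
 * Sketch only: write-acquire a complex lock, then downgrade it to a read
 * lock so that other readers can proceed.  The table being protected and
 * the routine names are illustrative.
 */
#include <sys/types.h>
#include <sys/lock_def.h>

static Complex_lock table_lock;
static int          table_generation;

void
table_init(void)
{
        lock_init(&table_lock, TRUE);       /* TRUE: callers may sleep      */
}

int
rebuild_and_scan_table(void)
{
        int generation;

        lock_write(&table_lock);            /* exclusive: rebuild           */
        table_generation++;
        lock_write_to_read(&table_lock);    /* downgrade without contention */
        generation = table_generation;      /* shared: other readers may    */
                                            /* now proceed concurrently     */
        lock_done(&table_lock);             /* release the read-acquisition */
        return generation;
}
```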
A thread may acquire the lock to some recursive depth; it must release the lock the same number of times to free it. This is useful in library code where a lock must be secured at each entry-point to the library; a thread will secure the lock once as it enters the library, and internal calls to the library entry-points simply re-secure the lock, and release it when returning from the call. The minimum, maximum, and average recursion depths of any thread holding this lock are reported in the table.
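The library entry-point pattern described above might look like the following sketch, assuming the lock_set_recursive kernel service is used to permit recursive acquisition of the complex lock; the library routine names are hypothetical.

```c
/*
 * Sketch only: a library that secures one complex lock at every exported
 * entry point.  lock_set_recursive() permits the nested acquisitions;
 * the routine names are hypothetical.
 */
#include <sys/types.h>
#include <sys/lock_def.h>

static Complex_lock lib_lock;

void
lib_init(void)
{
        lock_init(&lib_lock, TRUE);
        lock_set_recursive(&lib_lock);      /* allow nested acquisitions   */
}

static void
lib_internal_step(void)
{
        lock_write(&lib_lock);              /* recursion depth becomes 2   */
        /* ... internal work ... */
        lock_done(&lib_lock);               /* back to depth 1             */
}

void
lib_entry_point(void)
{
        lock_write(&lib_lock);              /* depth 1                     */
        lib_internal_step();                /* re-secures the lock         */
        lock_done(&lib_lock);               /* depth 0: the lock is free   */
}
```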
A thread holding a recursive write-lock is not allowed to downgrade it, because the downgrade is intended to apply only to the last write-acquisition of the lock, and the prior acquisitions had a real reason to keep the acquisition exclusive. Instead, the lock is marked as being in the downgraded state, which is erased when this latest acquisition is released or upgraded. A thread holding a recursive read-lock can only upgrade the latest acquisition of the lock, in which case the lock is marked as being upgraded. The thread will have to wait until the lock is released by any other threads holding it for reading. The minimum, maximum, and average recursion-depths of any thread holding this lock in an upgraded or downgraded state are reported in the table.
The Lock Activity report also breaks the time down based on what task the lock is being secured for (reading, writing, or upgrading).
No time is reported to perform a downgrade because this is performed without any contention. The upgrade state is only reported for the case where a recursive read-lock is upgraded. Otherwise, the thread activity is measured as releasing a read-lock and acquiring a write-lock.
The function and thread details also break down the acquisition, spin, and wait counts by whether the lock is to be acquired for reading or writing.
By default, splat prints a detailed report for each PThread entry in the summary report. The PThread synchronizers are of the following types: mutex, read/write lock, and condition-variable. The mutex and read/write lock are related to the AIX complex lock, and you can see the similarities in the lock detail reports. The condition-variable differs significantly from a lock, and this is reflected in the report details.
The PThread library instrumentation does not provide names or classes of synchronizers, so the addresses are the only way we have to identify them. Under certain conditions, the instrumentation can capture the return addresses of the function call stack, and these addresses are used with the gennames output to identify the call chains when these synchronizers are created. The creation and deletion times of the synchronizer can sometimes be determined as well, along with the ID of the PThread that created them.
The PThread mutex is similar to an AIX simple lock in that only one thread can acquire the lock, and is like an AIX complex lock in that it can be held recursively.
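For reference, the following is a minimal POSIX sketch of a mutex created with the recursive attribute, so that a nested acquisition raises the recursion depth and the mutex must be unlocked the same number of times. The account routines and variables are illustrative, and error handling is omitted for brevity.

```c
/* Sketch only: a recursive PThread mutex.  Error handling is omitted. */
#include <pthread.h>

static pthread_mutex_t account_mutex;
static long            balance;

void
account_init(void)
{
        pthread_mutexattr_t attr;

        pthread_mutexattr_init(&attr);
        pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);
        pthread_mutex_init(&account_mutex, &attr);
        pthread_mutexattr_destroy(&attr);
}

void
credit(long amount)
{
        pthread_mutex_lock(&account_mutex);      /* depth 1, or 2 if nested  */
        balance += amount;
        pthread_mutex_unlock(&account_mutex);
}

int
try_transfer(long amount)
{
        if (pthread_mutex_trylock(&account_mutex) != 0)
                return -1;                       /* busy: no acquisition     */
        credit(amount);                          /* re-acquires the mutex    */
        pthread_mutex_unlock(&account_mutex);
        return 0;
}
```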
In addition to the common header information and the [PThread MUTEX] identifier, this report lists the following lock details:
Acquisitions | The number of times that the lock was acquired in the analysis interval. |
Miss Rate | The percentage of attempts that failed to acquire the lock. |
Spin Count | The number of unsuccessful attempts to acquire the lock. |
Wait Count | The number of times that a thread was forced into a suspended wait state waiting for the lock to come available. |
Busy Count | The number of trylock calls that returned busy. |
Seconds Held | This field contains the following sub-fields: CPU (the total time, in seconds, that the mutex was held by an executing thread) and Elapse(d) (the total time, in seconds, that the mutex was held by any thread, whether running or suspended). |
Percent Held | This field contains the following sub-fields: Real CPU (the percentage of the cumulative processor time that the mutex was held by an executing thread), Real Elapsed (the percentage of the elapsed real time that the mutex was held by any thread, whether running or suspended), Comb(ined) Spin (the percentage of the cumulative processor time that executing threads spent spinning while trying to acquire the mutex), and Real Wait (the percentage of elapsed real time that any thread spent waiting on the mutex; simultaneous waits are counted only once). |
Depth | This field contains the following sub-fields: Min (the minimum recursion depth at which any thread held the mutex), Max (the maximum recursion depth), and Avg (the average recursion depth). |
If the -dt or -da options are used, splat also reports the thread detail for the mutex. The columns are defined as follows:
PThreadID | The PThread identifier. |
Acquisitions | The number of times that this thread acquired the lock. |
Miss Rate | The percentage of acquisition attempts by the thread that failed to secure the lock. |
Spin Count | The number of unsuccessful attempts by this thread to secure the lock. |
Wait Count | The number of times that this thread was forced to wait until the lock came available. |
Busy Count | The number of times this thread tried and failed to acquire the mutex (calls to pthread_mutex_trylock that returned busy). |
Percent Held of Total Time | This field contains the following sub-fields: CPU (the percentage of the cumulative processor time that this thread held the mutex while executing), Elapse(d) (the percentage of the elapsed real time that this thread held the mutex, whether running or suspended), Spin (the percentage of the cumulative processor time that this thread spent spinning on the mutex), and Wait (the percentage of elapsed real time that this thread spent waiting on the mutex). |
The PThread read/write lock is similar to an AIX complex lock in that it can be acquired for reading or writing. Writing is exclusive: only a single thread can acquire the lock for writing, and no other thread can hold the lock for reading or writing at that point. Reading is not exclusive, so more than one thread can hold the lock for reading. Reading is recursive in that a single thread can hold multiple read-acquisitions on the lock; writing is not recursive.
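For reference, the following is a minimal POSIX sketch of a read/write lock in which one writer excludes all other threads while any number of readers can hold the lock at once. The table routines and variables are illustrative, and error handling is omitted for brevity.

```c
/* Sketch only: exclusive write access, shared read access.  Error
 * handling is omitted.
 */
#include <pthread.h>

static pthread_rwlock_t table_lock = PTHREAD_RWLOCK_INITIALIZER;
static int              table_size;

void
grow_table(void)
{
        pthread_rwlock_wrlock(&table_lock);   /* exclusive write acquisition */
        table_size++;
        pthread_rwlock_unlock(&table_lock);
}

int
read_table_size(void)
{
        int size;

        pthread_rwlock_rdlock(&table_lock);   /* shared: many readers may    */
        size = table_size;                    /* hold the lock at once       */
        pthread_rwlock_unlock(&table_lock);
        return size;
}
```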
In addition to the common header information and the [PThread RWLock] identifier, this report lists the following lock details:
Acquisitions | The number of times that the lock was acquired in the analysis interval. |
Miss Rate | The percentage of attempts that failed to acquire the lock. |
Spin Count | The number of unsuccessful attempts to acquire the lock. |
Wait Count | The current PThread implementation does not force threads to wait for read/write locks. This reports the number of times a thread, spinning on this lock, is undispatched. |
Seconds Held | This field contains the following sub-fields: CPU (the total time, in seconds, that the lock was held by an executing thread) and Elapse(d) (the total time, in seconds, that the lock was held by any thread, whether running or suspended). |
Percent Held | This field contains the following sub-fields: Real CPU (the percentage of the cumulative processor time that the lock was held by an executing thread), Real Elapsed (the percentage of the elapsed real time that the lock was held by any thread, whether running or suspended), Comb(ined) Spin (the percentage of the cumulative processor time that executing threads spent spinning while trying to acquire the lock), and Real Wait (the percentage of elapsed real time that any thread spent waiting on the lock; simultaneous waits are counted only once). |
Depth | This field contains the following sub-fields: Min (the minimum recursion depth at which any thread held the lock for reading), Max (the maximum recursion depth), and Avg (the average recursion depth). |
If the -dt or -da options are used, splat also reports the thread detail for the read/write lock. The columns are defined as follows:
PThreadID | The PThread identifier. |
Acquisitions | The number of times that this thread acquired the lock. |
Miss Rate | The percentage of acquisition attempts by the thread that failed to secure the lock. |
Spin Count | The number of unsuccessful attempts by this thread to secure the lock. |
Wait Count | The number of times this thread was forced to wait until the lock came available. |
Busy Count | The number of times this thread tried and failed to acquire the lock (calls to pthread_rwlock_tryrdlock or pthread_rwlock_trywrlock that returned busy). |
Percent Held of Total Time | This field contains the following sub-fields: CPU (the percentage of the cumulative processor time that this thread held the lock while executing), Elapse(d) (the percentage of the elapsed real time that this thread held the lock, whether running or suspended), Spin (the percentage of the cumulative processor time that this thread spent spinning on the lock), and Wait (the percentage of elapsed real time that this thread spent waiting on the lock). |
The PThread condition-variable is a synchronizer, but not a lock. A PThread is suspended until a signal indicates that the condition now holds.
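For reference, the following is a minimal POSIX sketch of the condition-variable pattern: a waiter suspends in pthread_cond_wait until a signal indicates that the condition may now hold, and re-tests the condition when it wakes. The queue variables and routine names are illustrative, and error handling is omitted for brevity.

```c
/* Sketch only: the classic condition-variable wait/signal pattern.
 * Error handling is omitted.
 */
#include <pthread.h>

static pthread_mutex_t queue_mutex    = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  queue_nonempty = PTHREAD_COND_INITIALIZER;
static int             queued_items;

void
consume_item(void)
{
        pthread_mutex_lock(&queue_mutex);
        while (queued_items == 0)                    /* re-test: a wakeup does */
                pthread_cond_wait(&queue_nonempty,   /* not guarantee that the */
                                  &queue_mutex);     /* condition holds        */
        queued_items--;
        pthread_mutex_unlock(&queue_mutex);
}

void
produce_item(void)
{
        pthread_mutex_lock(&queue_mutex);
        queued_items++;
        pthread_cond_signal(&queue_nonempty);        /* signal that the        */
        pthread_mutex_unlock(&queue_mutex);          /* condition now holds    */
}
```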
In addition to the common header information and the [PThread CondVar] identifier, this report lists the following details:
Passes | The number of times that the condition was signaled to hold during the analysis interval. |
Fail Rate | The percentage of times that the condition was tested and was not found to be true. |
Spin Count | The number of times that the condition was tested and was not found to be true. |
Wait Count | The number of times that a thread was forced into a suspended wait state waiting for the condition to be signaled. |
Spin / Wait Time | This field contains the following sub-fields: Comb Spin (the total time, in seconds, that threads spent spinning while waiting for the condition) and Comb Wait (the total time, in seconds, that threads spent in a suspended wait for the condition to be signaled). |
Depth | This field contains the following sub-fields: Min (the minimum number of threads spinning or waiting on the condition at any time in the analysis interval), Max (the maximum number of threads spinning or waiting), and Avg (the average number of threads spinning or waiting). |
If the -dt or -da options are used, splat reports the thread detail as described below:
PThreadID | The PThread identifier. |
Passes | The number of times that this thread was notified that the condition passed. |
Fail Rate | The percentage of times the thread checked the condition and did not find it to be true. |
Spin Count | The number of times the thread checked the condition and did not find it to be true. |
Wait Count | The number of times this thread was forced to wait until the condition became true. |
Percent Total Time | This field contains the following sub-fields: Spin (the percentage of the elapsed real time that this thread spent spinning while waiting for the condition) and Wait (the percentage of the elapsed real time that this thread spent in a suspended wait for the condition to be signaled). |