The Simple Performance Lock Analysis Tool (splat) is a software tool that generates reports on the use of synchronization locks. These include the simple and complex locks provided by the AIX kernel as well as user-level mutexes, read and write locks, and condition variables provided by the PThread library. The splat tool is not currently equipped to analyze the behavior of the Virtual Memory Manager (VMM) and PMAP locks used in the AIX kernel.
The syntax for the splat command is as follows:
splat [-i file] [-n file] [-o file] [-k kexList] [-d [bfta]] [-l address] [-c class] [-s [acelmsS]] [-C#] [-S#] [-t start] [-T stop]
splat -h [topic]
splat -j
The splat tool takes as input an AIX trace log file or (for an SMP trace) a set of log files, and preferably a names file produced by the gennames command. The procedure for generating these files is shown in the trace section. When you run the trace command, you will usually use the flag -J splat to capture only the events analyzed by splat, or omit the -J flag to capture all events. The significant trace hooks are shown in the following table:
Hook ID | Event name | Event explanation |
---|---|---|
106 | HKWD_KERN_DISPATCH | The thread is dispatched from the run queue to a processor. |
10C | HKWD_KERN_IDLE | The idle process has been dispatched. |
10E | HKWD_KERN_RELOCK | One thread is suspended while another is dispatched; the ownership of a RunQ lock is transferred from the first to the second. |
112 | HKWD_KERN_LOCK | The thread attempts to secure a kernel lock; the sub-hook shows what happened. |
113 | HKWD_KERN_UNLOCK | A kernel lock is released. |
46D | HKWD_KERN_WAITLOCK | The thread is enqueued to wait on a kernel lock. |
606 | HKWD_PTHREAD_COND | Operations on a Condition Variable. |
607 | HKWD_PTHREAD_MUTEX | Operations on a Mutex. |
608 | HKWD_PTHREAD_RWLOCK | Operations on a Read/Write Lock. |
609 | HKWD_PTHREAD_GENERAL | Operations on a PThread. |
In some cases, you can use the trace tool to capture the entire execution of a workload, while in other cases you will capture only an interval of the execution. The execution interval is the entire time that a workload runs. This interval can be arbitrarily long for server workloads that run continuously. The trace interval is the time actually captured in the trace log file by the trace command. The length of this trace interval is limited by how large a trace log file will fit on the file system.
In contrast, the analysis interval is the portion of the trace interval that is analyzed by the splat command. The -t and -T options tell the splat command to start and stop its analysis a given number of seconds after the first event in the trace. By default, the splat command analyzes the entire trace, so the analysis interval is the same as the trace interval.
To most accurately estimate the effect of lock activity on the computation, you will usually want to capture the longest trace interval that you can, and analyze that entire interval with the splat command. The -t and -T flags are usually used for debugging purposes to study the behavior of the splat command across a few events in the trace.
As a rule, either use large buffers when collecting a trace, or limit the captured events to the ones you need to run the splat command.
The splat command uses the events in the trace to reconstruct the activities of threads and locks in the original system. If part of the trace is missing, it is because one of the following situations exists:
- Tracing was stopped explicitly on one or more of the processors before the end of the workload.
- One or more of the processors stopped tracing because its trace buffer filled before the log file could be written.
- The trace was collected in a mode in which the trace buffer wraps around, so the earliest records were overwritten.
In any of the above cases, the splat command will not be able to correctly analyze all the events across the trace interval. The policy of splat is to finish its analysis at the first point of discontinuity in the trace, issue a warning message, and generate its report. In the first two cases, the message is as follows:
TRACE OFF record read at 0.567201 seconds. One or more of the CPUs has stopped tracing. You may want to generate a longer trace using larger buffers and re-run splat.
In the third case, the message is as follows:
TRACEBUFFER WRAPAROUND record read at 0.567201 seconds. The input trace has some records missing; splat finishes analyzing at this point. You may want to re-generate the trace using larger buffers and re-run splat.
Some versions of the AIX kernel or PThread library may be incompletely instrumented, so the traces will be missing events. The splat command may not provide correct results in this case.
The lock instrumentation in the kernel and PThread library is what captures the information for each lock event. Data addresses are used to identify locks; instruction addresses are used to identify the point of execution. These addresses are captured in the event records in the trace, and used by the splat command to identify the locks and the functions that operate on them.
However, these addresses are not of much use to the programmer, who would rather know the names of the lock and function declarations so that they can be located in the program source files. The conversion of names to addresses is determined by the compiler and loader, and can be captured in a file using the gennames utility. The gennames utility also captures the contents of the /usr/include/sys/lockname.h file, which declares classes of kernel locks.
This gennames output file is passed to the splat command with the -n flag. When splat reports on a kernel lock, it provides the best identification that it can.
Kernel locks that are declared are resolved by name. Locks that are created dynamically are identified by class if their class name is given when they are created. The libpthreads.a instrumentation is not equipped to capture names or classes of PThread synchronizers, so they are always identified by address only.
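As an illustration of how a dynamically created kernel lock acquires a class, the following sketch shows a kernel extension registering a simple lock with the lock_alloc kernel service before initializing it. The driver name, the class index MY_DRIVER_LOCK_CLASS, and the exact header and type names are assumptions made for the sketch; real class indices are the ones declared in /usr/include/sys/lockname.h.

```c
/*
 * Sketch only: give a dynamically created simple lock a class so that
 * analysis tools such as splat can identify it by class rather than by
 * raw address.  MY_DRIVER_LOCK_CLASS is a hypothetical class index;
 * real indices are declared in /usr/include/sys/lockname.h.  Type and
 * header names follow <sys/lock_def.h> and may vary by AIX release.
 */
#include <sys/types.h>
#include <sys/lock_def.h>
#include <sys/lockname.h>

#define MY_DRIVER_LOCK_CLASS   42      /* hypothetical class index */

static Simple_lock my_state_lock;      /* the lock word */

void
my_driver_init(void)
{
        /* Register the lock word under its class, then initialize it. */
        lock_alloc(&my_state_lock, LOCK_ALLOC_PIN, MY_DRIVER_LOCK_CLASS, 0);
        simple_lock_init(&my_state_lock);
}
```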
The report generated by the splat command consists of an execution summary, a gross lock summary, and a per-lock summary, followed by a list of lock detail reports that optionally includes a function detail or a thread detail report.
The following example shows a sample of the Execution summary. This report is generated by default when using the splat command.
*****************************************************************************************
splat Cmd:  splat -sa -da -S100 -i trace.cooked -n gennames -o splat.out
Trace Cmd:  trace -C all -aj 600,603,605,606,607,608,609 -T 20000000 -L 200000000 -o CONDVAR.raw
Trace Host: darkwing (0054451E4C00)  AIX 5.2
Trace Date: Thu Sep 27 11:26:16 2002

Elapsed Real Time: 0.098167
Number of CPUs Traced: 1     (Observed):0
Cumulative CPU Time: 0.098167

                                               start                   stop
                                    --------------------   --------------------
trace interval      (absolute tics)            967436752              969072535
                    (relative tics)                    0                1635783
                    (absolute secs)            58.057947              58.156114
                    (relative secs)             0.000000               0.098167
analysis interval   (absolute tics)            967436752              969072535
              (trace-relative tics)                    0                1635783
               (self-relative tics)                    0                1635783
                    (absolute secs)            58.057947              58.156114
              (trace-relative secs)             0.000000               0.098167
               (self-relative secs)             0.000000               0.098167
**************************************************************************************
The execution summary consists of the following elements:
- The command line used to run the splat command (splat Cmd) and the trace command used to collect the trace (Trace Cmd).
- The host on which the trace was taken (Trace Host) and the date on which it was taken (Trace Date).
- The elapsed real time of the trace, in seconds.
- The number of processors traced and the number observed to be active during the trace.
- The cumulative processor time, which is the elapsed real time multiplied by the number of processors traced.
- The boundaries of the trace interval and of the analysis interval, in timer tics and in seconds, expressed both in absolute terms and relative to the start of the trace (and, for the analysis interval, relative to its own start).
The following example shows a sample of the gross lock summary report. This report is generated by default when using the splat command.
***************************************************************************************
                               Unique      Acquisitions    Acq. or Passes    % Total System
                   Total      Addresses    (or Passes)       per Second         spin Time
                 ---------    ---------    ------------    --------------     ------------
AIX (all) Locks:       523          523         1323045        72175.7768         0.003986
  RunQ:                  2            2          487178        26576.9121         0.000000
  Simple:              480          480          824898        45000.4754         0.003986
  Complex:              41           41           10969          598.3894         0.000000
PThread CondVar:         7            6          160623         8762.4305         0.000000
  Mutex:               128          116         1927771       105165.2585        10.280745 *
  RWLock:                0            0               0            0.0000         0.000000
                                                                              ( spin time goal )
***************************************************************************************
The gross lock summary report table consists of the following columns:
Total | The number of AIX Kernel locks, followed by the number of each type of AIX Kernel lock: RunQ, Simple, and Complex. Under some conditions, this total will be larger than the sum of the RunQ, Simple, and Complex counts, because there may not be enough activity on a lock to determine its type. This is followed by the number of PThread condition-variables, the number of PThread mutexes, and the number of PThread read/write locks. |
Unique Addresses | The number of unique addresses observed for each synchronizer type. Under some conditions, a lock will be destroyed and re-created at the same address; splat produces a separate lock detail report for each instance because the usage may be different. |
Acquisitions (or Passes) | For locks, the total number of times acquired during the analysis interval; for PThread condition-variables, the total number of times the condition passed during the analysis interval. |
Acq. or Passes (per Second) | Acquisitions or passes per second, which is the total number of acquisitions or passes divided by the elapsed real time of the trace. |
% Total System spin Time | The cumulative time spent spinning on each synchronizer type, divided by the cumulative processor time, multiplied by 100 percent. The general goal is to spin for less than 10 percent of the processor time; a message to this effect is printed at the bottom of the table. Any entry in this column that exceeds 10 percent is marked with an asterisk (*). |
The following example shows a sample of the per-lock summary report. This report is generated by default when using the splat command.
*********************************************************************************************************
                        100 max entries, Summary sorted by Acquisitions:

                             T     Acqui-
 Lock Names,                 y    sitions                                                 Percent Holdtime
 Class, or Address           p         or                              Locks or       Real      Real     Comb    Kernel
                             e     Passes  Spins  Wait  %Miss   %Total   / CSec        CPU     Elapse    Spin    Symbol
 **************************  *   ********  *****  ****  ******  ******  *********   *******  *******  *******  ********
 PROC_INT_CLASS.0003         Q     486490      0     0  0.0000 36.7705  26539.380    5.3532  100.000   0.0000  unix
 THREAD_LOCK_CLASS.0012      S     323277      0     0  0.0000 24.4343  17635.658    6.8216   6.8216   0.0000  libc
 THREAD_LOCK_CLASS.0118      S     323094      0     0  0.0000 24.4205  17625.674    6.7887   6.7887   0.0000  libc
 ELIST_CLASS.003C            S      80453      0     0  0.0000  6.0809   4388.934    1.0564   1.0564   0.0000  unix
 ELIST_CLASS.0044            S      80419      0     0  0.0000  6.0783   4387.080    1.1299   1.1299   0.0000  unix
 tod_lock                    C      10229      0     0  0.0000  0.7731    558.020    0.2212   0.2212   0.0000  unix
 LDATA_CONTROL_LOCK.0000     S       1833      0     0  0.0000  0.1385     99.995    0.0204   0.0204   0.0000  unix
 U_TIMER_CLASS.0014          S       1514      0     0  0.0000  0.1144     82.593    0.0536   0.0536   0.0000  netinet
     ( ... lines omitted ... )
 000000002FF22B70            L     368838      0   N/A  0.0000 100.000   9622.964   99.9865  99.9865   0.0000
 00000000F00C3D74            M     160625      0     0  0.0000 14.2831   8762.540   99.7702  99.7702   0.0000
 00000000200017E8            M     160625    175     0  0.1088 14.2831   8762.540   42.9371  42.9371   0.1487
 0000000020001820            V     160623      0   624  0.0000 100.000   1271.728       N/A      N/A      N/A
 00000000F00C3750            M         37      0     0  0.0000  0.0033      2.018    0.0037   0.0037   0.0000
 00000000F00C3800            M         30      0     0  0.0000  0.0027      1.637    0.0698   0.0698   0.0000
     ( ... lines omitted ... )
************************************************************************************************
The first line indicates the maximum number of locks to report (100 in this case, but we show only 14 of the entries here) as specified by the -S 100 flag. The report also indicates that the entries are sorted by the total number of acquisitions or passes, as specified by the -sa flag. The various Kernel locks and PThread synchronizers are treated as two separate lists in this report, so the report would produce the top 100 Kernel locks sorted by acquisitions, followed by the top 100 PThread synchronizers sorted by acquisitions or passes.
The per-lock summary table consists of the following columns:
Lock Names, Class, or Address | The name, class, or address of the lock, depending on whether the splat command could map the address from a name file. |
Type | The type of the lock, identified by one of the following letters: Q (a RunQ lock), S (a simple kernel lock), C (a complex kernel lock), M (a PThread mutex), V (a PThread condition-variable), or L (a PThread read/write lock). |
Acquisitions or Passes | The number of times that the lock was acquired or the condition passed, during the analysis interval. |
Spins | The number of times the lock (or condition-variable) was spun on during the analysis interval. |
Wait | The number of times that a thread was driven into a wait state for that lock or condition-variable during the analysis interval. |
%Miss | The percentage of access attempts that resulted in a spin as opposed to a successful acquisition or pass. |
%Total | The percentage of all acquisitions that were made to this lock, out of all acquisitions to all locks of this type. All AIX locks (RunQ, simple, and complex) are treated as being the same type for this calculation. The PThread synchronizers mutex, condition-variable, and read/write lock are all distinct types. |
Locks or Passes / CSec | The number of times that the lock (or condition-variable) was acquired (or passed) divided by the cumulative processor time. This is a measure of the acquisition frequency of the lock. |
Real CPU | The percentage of the cumulative processor time that the lock was held by an executing thread. This is not applicable to condition-variables because they are not held. |
Real Elapse | The percentage of the elapsed real time that the lock was held by any thread at all, whether running or suspended. Note that this definition is not applicable to condition-variables because they are not held. |
Comb Spin | The percentage of the cumulative processor time that executing threads spent spinning on the lock. Because the PThread library waits on condition-variables rather than spinning, no spin time is reported for them. |
Kernel Symbol | The name of the kernel-extension or library (or /unix for the kernel) that the lock was defined in. This information is not recoverable for PThreads. |
By default, the splat command prints a lock detail report for each entry in the summary report. The AIX Kernel locks can be either simple or complex.
The RunQ lock is a special case of the simple lock, although its pattern of usage will differ markedly from other lock types. The splat command distinguishes it from the other simple locks to ease its analysis.
In an AIX SIMPLE lock report, the first line starts with either [AIX SIMPLE Lock] or [AIX RunQ lock]. If the gennames output file allows, the ADDRESS is also converted into a lock NAME and CLASS, and the containing kernel extension (KEX) is identified as well. The CLASS is printed with an eight hex-digit extension indicating how many locks of this class were allocated prior to it.
The SIMPLE lock report fields are as follows:
Acquisitions | The number of times that the lock was acquired in the analysis interval (this includes successful simple_lock_try calls). |
Miss Rate | The percentage of attempts that failed to acquire the lock. |
Spin Count | The number of unsuccessful attempts to acquire the lock. |
Wait Count | The number of times that a thread was forced into a suspended wait state, waiting for the lock to come available. |
Busy Count | The number of simple_lock_try calls that returned busy. |
Seconds Held | This field contains the following sub-fields: CPU (the total time, in seconds, that the lock was held by an executing thread) and Elapse(d) (the total time, in seconds, that the lock was held by any thread, whether running or suspended). |
Percent Held | This field contains the following sub-fields: Real CPU (the percentage of the cumulative processor time that the lock was held by an executing thread), Real Elapsed (the percentage of the elapsed real time that the lock was held by any thread, whether running or suspended), Comb(ined) Spin (the percentage of the cumulative processor time that executing threads spent spinning while trying to acquire this lock), and Real Wait (the percentage of elapsed real time that any thread spent waiting on this lock; simultaneous waits by two or more threads are counted only once). |
%Enabled | The percentage of acquisitions of this lock that occurred while interrupts were enabled. In parentheses is the total number of acquisitions made while interrupts were enabled. |
%Disabled | The percentage of acquisitions of this lock that occurred while interrupts were disabled. In parentheses is the total number of acquisitions made while interrupts were disabled. |
SpinQ | The minimum, maximum, and average number of threads spinning on the lock, whether executing or suspended, across the analysis interval. |
WaitQ | The minimum, maximum, and average number of threads waiting on the lock, across the analysis interval. |
The Lock Activity with Interrupts Enabled (milliseconds) and Lock Activity with Interrupts Disabled (milliseconds) sections contain information on the time that threads spend in each of the lock states.
The states that a thread can be in (with respect to a given simple or complex lock) are as follows:
(no lock reference) | The thread is running, does not hold this lock, and is not attempting to acquire this lock. |
LOCK | The thread has successfully acquired the lock and is currently executing. |
SPIN | The thread is executing and unsuccessfully attempting to acquire the lock. |
UNDISP | The thread has become undispatched while unsuccessfully attempting to acquire the lock. |
WAIT | The thread has been suspended until the lock comes available. It does not necessarily acquire the lock at that time, but instead goes back to a SPIN state. |
PREEMPT | The thread is holding this lock and has become undispatched. |
The Lock Activity sections of the report measure the intervals of time (in milliseconds) that each thread spends in each of the states for this lock. The columns report the number of times that a thread entered the given state, followed by the maximum, minimum, and average time that a thread spent in the state once entered, followed by the total time that all threads spent in that state. These sections distinguish whether interrupts were enabled or disabled at the time that the thread was in the given state.
A thread can acquire a lock prior to the beginning of the analysis interval and release the lock during the analysis interval. When the splat command observes the lock being released, it recognizes that the lock had been held during the analysis interval up to that point and counts the time as part of the state-machine statistics. For this reason, the number of times that a lock state was entered, as reported by the state-machine statistics, can be larger than the number of acquisitions of the lock observed in the analysis interval.
RunQ locks are used to protect resources in the thread management logic. These locks are acquired a large number of times and are only held briefly each time. A thread need not be executing to acquire or release a RunQ lock. Further, a thread may spin on a RunQ lock, but it will not go into an UNDISP or WAIT state on the lock. You will see a dramatic difference between the statistics for RunQ versus other simple locks.
The functional detail report is obtained by using the -df or -da options of splat.
The columns are defined as follows:
Function Name | The name of the function that acquired or attempted to acquire this lock (with a call to one of the functions simple_lock, simple_lock_try, simple_unlock, disable_lock, or unlock_enable), if it could be resolved. A brief usage sketch of these functions appears after this table. |
Acquisitions | The number of times that the function was able to acquire this lock. |
Miss Rate | The percentage of acquisition attempts that failed. |
Spin Count | The number of unsuccessful attempts by the function to acquire this lock. |
Wait Count | The number of times that any thread was forced to wait on the lock, using a call to this function to acquire the lock. |
Busy Count | The number of times the function tried to acquire the lock without success (that is, calls to simple_lock_try that were returned busy). |
Percent Held of Total Time | Contains the following sub-fields: CPU (the percentage of the cumulative processor time that the lock was held by an executing thread that acquired it through this function), Elapse(d) (the percentage of the elapsed real time that the lock was held by any thread that acquired it through this function), Spin (the percentage of the cumulative processor time spent spinning on the lock in calls through this function), and Wait (the percentage of elapsed real time spent waiting for the lock in calls through this function). |
Return Address | The return address to this calling function, in hexadecimal. |
Start Address | The start address to this calling function, in hexadecimal. |
Offset | The offset from the function start address to the return address, in hexadecimal. |
The functions are ordered by the same sorting criterion as the locks, controlled by the -s option of splat. Further, the number of functions listed is controlled by the -S parameter. The default is the top ten functions.
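For reference, the following is a minimal sketch of the call patterns behind the function names above, using the kernel simple-lock services listed in the Function Name description. The lock, counter, routine names, and the INTIODONE interrupt priority are illustrative assumptions; real code would use a lock registered with lock_alloc and a priority appropriate to its own interrupt environment.

```c
/*
 * Sketch only: typical call patterns for the kernel simple-lock services
 * that splat attributes to calling functions.  The lock, the counter,
 * the routine names, and the INTIODONE priority are illustrative.
 */
#include <sys/types.h>
#include <sys/lock_def.h>
#include <sys/intr.h>

static Simple_lock my_lock;
static int         my_counter;

void
update_counter(void)
{
        simple_lock(&my_lock);            /* may spin until acquired       */
        my_counter++;
        simple_unlock(&my_lock);
}

boolean_t
try_update_counter(void)
{
        if (!simple_lock_try(&my_lock))   /* a failed try is reported      */
                return FALSE;             /* under Busy Count              */
        my_counter++;
        simple_unlock(&my_lock);
        return TRUE;
}

void
update_counter_disabled(void)
{
        int old_priority;

        /* disable_lock() raises the interrupt priority and acquires the
         * lock; unlock_enable() releases it and restores the priority.   */
        old_priority = disable_lock(INTIODONE, &my_lock);
        my_counter++;
        unlock_enable(old_priority, &my_lock);
}
```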
The Thread Detail report is obtained by using the -dt or -da options of splat.
At any point in time, a single thread is either running or it is not, and when it runs, it runs on only one processor. Some of the composite statistics are measured relative to the cumulative processor time because they measure activities that can happen simultaneously on more than one processor, so the magnitude of those measurements can be proportional to the number of processors in the system. In contrast, the thread statistics are generally measured relative to the elapsed real time, which bounds both the amount of time that a single processor spends processing and the amount of time that a single thread spends in an executing or suspended state. For example, a 10-second trace of a four-processor system represents 40 seconds of cumulative processor time, but no single thread can account for more than 10 seconds of elapsed real time.
The Thread Detail report columns are defined as follows:
ThreadID | The thread identifier. |
Acquisitions | The number of times that this thread acquired the lock. |
Miss Rate | The percentage of acquisition attempts by the thread that failed to secure the lock. |
Spin Count | The number of unsuccessful attempts by this thread to secure the lock. |
Wait Count | The number of times that this thread was forced to wait until the lock came available. |
Busy Count | The number of simple_lock_try() calls that returned busy. |
Percent Held of Total Time | Consists of the following sub-fields: CPU (the percentage of the cumulative processor time that this thread held the lock while executing), Elapse(d) (the percentage of the elapsed real time that this thread held the lock, whether running or suspended), Spin (the percentage of the cumulative processor time that this thread spent spinning on the lock), and Wait (the percentage of elapsed real time that this thread spent waiting on the lock). |
The AIX complex lock supports recursive locking, where a thread can acquire the lock more than once before releasing it, and it differentiates between write-locking, which is exclusive, and read-locking, which is not exclusive.
This report begins with [AIX COMPLEX Lock]. Most of the entries are identical to the simple lock report, while some of them are differentiated by read/write/upgrade. For example, the SpinQ and WaitQ statistics include the minimum, maximum, and average number of threads spinning or waiting on the lock. They also include the minimum, maximum, and average number of threads attempting to acquire the lock for reading versus writing. Because an arbitrary number of threads can hold the lock for reading, the report includes the minimum, maximum, and average number of readers in the LockQ that holds the lock.
A thread may hold a lock for writing; this is exclusive and prevents any other thread from securing the lock for reading or for writing. The thread downgrades the lock by simultaneously releasing it for writing and acquiring it for reading; this allows other threads to also acquire the lock for reading. The reverse of this operation is an upgrade; if the thread holds the lock for reading and no other thread holds it as well, the thread simultaneously releases the lock for reading and acquires it for writing. The upgrade operation may require that the thread wait until other threads release their read-locks. The downgrade operation does not.
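As an illustration of the downgrade operation, the following sketch uses the AIX complex-lock kernel services (lock_init, lock_write, lock_write_to_read, and lock_done). The table and routine names are assumptions made for the sketch, and the exact type and header names may vary by AIX release.

```c
/*
 * Sketch only: write-acquire a complex lock, then downgrade it to a read
 * lock so that other readers can proceed.  The table being protected and
 * the routine names are illustrative.
 */
#include <sys/types.h>
#include <sys/lock_def.h>

static Complex_lock table_lock;
static int          table_generation;

void
table_init(void)
{
        lock_init(&table_lock, TRUE);       /* TRUE: callers may sleep      */
}

int
rebuild_and_scan_table(void)
{
        int generation;

        lock_write(&table_lock);            /* exclusive: rebuild           */
        table_generation++;
        lock_write_to_read(&table_lock);    /* downgrade without contention */
        generation = table_generation;      /* shared: other readers may    */
                                            /* now proceed concurrently     */
        lock_done(&table_lock);             /* release the read-acquisition */
        return generation;
}
```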
A thread may acquire the lock to some recursive depth; it must release the lock the same number of times to free it. This is useful in library code where a lock must be secured at each entry-point to the library; a thread will secure the lock once as it enters the library, and internal calls to the library entry-points simply re-secure the lock, and release it when returning from the call. The minimum, maximum, and average recursion depths of any thread holding this lock are reported in the table.
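The library entry-point pattern described above might look like the following sketch, assuming the lock_set_recursive kernel service is used to permit recursive acquisition of the complex lock; the library routine names are hypothetical.

```c
/*
 * Sketch only: a library that secures one complex lock at every exported
 * entry point.  lock_set_recursive() permits the nested acquisitions;
 * the routine names are hypothetical.
 */
#include <sys/types.h>
#include <sys/lock_def.h>

static Complex_lock lib_lock;

void
lib_init(void)
{
        lock_init(&lib_lock, TRUE);
        lock_set_recursive(&lib_lock);      /* allow nested acquisitions   */
}

static void
lib_internal_step(void)
{
        lock_write(&lib_lock);              /* recursion depth becomes 2   */
        /* ... internal work ... */
        lock_done(&lib_lock);               /* back to depth 1             */
}

void
lib_entry_point(void)
{
        lock_write(&lib_lock);              /* depth 1                     */
        lib_internal_step();                /* re-secures the lock         */
        lock_done(&lib_lock);               /* depth 0: the lock is free   */
}
```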
A thread holding a recursive write-lock is not allowed to downgrade it, because the downgrade is intended to apply only to the last write-acquisition of the lock, and the prior acquisitions had a real reason to keep the acquisition exclusive. Instead, the lock is marked as being in the downgraded state, which is erased when this latest acquisition is released or upgraded. A thread holding a recursive read-lock can only upgrade the latest acquisition of the lock, in which case the lock is marked as being upgraded. The thread will have to wait until the lock is released by any other threads holding it for reading. The minimum, maximum, and average recursion-depths of any thread holding this lock in an upgraded or downgraded state are reported in the table.
The Lock Activity report also breaks the time down based on what task the lock is being secured for (reading, writing, or upgrading).
No time is reported to perform a downgrade because this is performed without any contention. The upgrade state is only reported for the case where a recursive read-lock is upgraded. Otherwise, the thread activity is measured as releasing a read-lock and acquiring a write-lock.
The function and thread details also break down the acquisition, spin, and wait counts by whether the lock is to be acquired for reading or writing.
By default, splat prints a detailed report for each PThread entry in the summary report. The PThread synchronizers are of the following types: mutex, read/write lock, and condition-variable. The mutex and read/write lock are related to the AIX complex lock, and you can see the similarities in the lock detail reports. The condition-variable differs significantly from a lock, and this is reflected in the report details.
The PThread library instrumentation does not provide names or classes of synchronizers, so the addresses are the only way we have to identify them. Under certain conditions, the instrumentation can capture the return addresses of the function call stack, and these addresses are used with the gennames output to identify the call chains when these synchronizers are created. The creation and deletion times of the synchronizer can sometimes be determined as well, along with the ID of the PThread that created them.
The PThread mutex is similar to an AIX simple lock in that only one thread can acquire the lock, and is like an AIX complex lock in that it can be held recursively.
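For reference, the following is a minimal POSIX sketch of a mutex created with the recursive attribute, so that a nested acquisition raises the recursion depth and the mutex must be unlocked the same number of times. The account routines and variables are illustrative, and error handling is omitted for brevity.

```c
/* Sketch only: a recursive PThread mutex.  Error handling is omitted. */
#include <pthread.h>

static pthread_mutex_t account_mutex;
static long            balance;

void
account_init(void)
{
        pthread_mutexattr_t attr;

        pthread_mutexattr_init(&attr);
        pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);
        pthread_mutex_init(&account_mutex, &attr);
        pthread_mutexattr_destroy(&attr);
}

void
credit(long amount)
{
        pthread_mutex_lock(&account_mutex);      /* depth 1, or 2 if nested  */
        balance += amount;
        pthread_mutex_unlock(&account_mutex);
}

int
try_transfer(long amount)
{
        if (pthread_mutex_trylock(&account_mutex) != 0)
                return -1;                       /* busy: no acquisition     */
        credit(amount);                          /* re-acquires the mutex    */
        pthread_mutex_unlock(&account_mutex);
        return 0;
}
```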
In addition to the common header information and the [PThread MUTEX] identifier, this report lists the following lock details:
Acquisitions | The number of times that the lock was acquired in the analysis interval. |
Miss Rate | The percentage of attempts that failed to acquire the lock. |
Spin Count | The number of unsuccessful attempts to acquire the lock. |
Wait Count | The number of times that a thread was forced into a suspended wait state waiting for the lock to come available. |
Busy Count | The number of trylock calls that returned busy. |
Seconds Held | This field contains the following sub-fields: CPU (the total time, in seconds, that the mutex was held by an executing thread) and Elapse(d) (the total time, in seconds, that the mutex was held by any thread, whether running or suspended). |
Percent Held | This field contains the following sub-fields: Real CPU (the percentage of the cumulative processor time that the mutex was held by an executing thread), Real Elapsed (the percentage of the elapsed real time that the mutex was held by any thread, whether running or suspended), Comb(ined) Spin (the percentage of the cumulative processor time that executing threads spent spinning while trying to acquire the mutex), and Real Wait (the percentage of elapsed real time that any thread spent waiting on the mutex; simultaneous waits are counted only once). |
Depth | This field contains the following sub-fields: Min (the minimum recursion depth at which any thread held the mutex), Max (the maximum recursion depth), and Avg (the average recursion depth). |
If the -dt or -da options are used, splat also reports the thread detail for the mutex. The columns are defined as follows:
PThreadID | The PThread identifier. |
Acquisitions | The number of times that this thread acquired the lock. |
Miss Rate | The percentage of acquisition attempts by the thread that failed to secure the lock. |
Spin Count | The number of unsuccessful attempts by this thread to secure the lock. |
Wait Count | The number of times that this thread was forced to wait until the lock came available. |
Busy Count | The number of times this thread tried and failed to acquire the mutex (calls to pthread_mutex_trylock that returned busy). |
Percent Held of Total Time | This field contains the following sub-fields: CPU (the percentage of the cumulative processor time that this thread held the mutex while executing), Elapse(d) (the percentage of the elapsed real time that this thread held the mutex, whether running or suspended), Spin (the percentage of the cumulative processor time that this thread spent spinning on the mutex), and Wait (the percentage of elapsed real time that this thread spent waiting on the mutex). |
The PThread read/write lock is similar to an AIX complex lock in that it can be acquired for reading or writing. Writing is exclusive: only a single thread can acquire the lock for writing, and no other thread can hold the lock for reading or writing at that point. Reading is not exclusive, so more than one thread can hold the lock for reading. Reading is recursive in that a single thread can hold multiple read-acquisitions on the lock; writing is not recursive.
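For reference, the following is a minimal POSIX sketch of a read/write lock in which one writer excludes all other threads while any number of readers can hold the lock at once. The table routines and variables are illustrative, and error handling is omitted for brevity.

```c
/* Sketch only: exclusive write access, shared read access.  Error
 * handling is omitted.
 */
#include <pthread.h>

static pthread_rwlock_t table_lock = PTHREAD_RWLOCK_INITIALIZER;
static int              table_size;

void
grow_table(void)
{
        pthread_rwlock_wrlock(&table_lock);   /* exclusive write acquisition */
        table_size++;
        pthread_rwlock_unlock(&table_lock);
}

int
read_table_size(void)
{
        int size;

        pthread_rwlock_rdlock(&table_lock);   /* shared: many readers may    */
        size = table_size;                    /* hold the lock at once       */
        pthread_rwlock_unlock(&table_lock);
        return size;
}
```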
In addition to the common header information and the [PThread RWLock] identifier, this report lists the following lock details:
Acquisitions | The number of times that the lock was acquired in the analysis interval. |
Miss Rate | The percentage of attempts that failed to acquire the lock. |
Spin Count | The number of unsuccessful attempts to acquire the lock. |
Wait Count | The current PThread implementation does not force threads to wait for read/write locks. This reports the number of times a thread, spinning on this lock, is undispatched. |
Seconds Held | This field contains the following sub-fields: CPU (the total time, in seconds, that the lock was held by an executing thread) and Elapse(d) (the total time, in seconds, that the lock was held by any thread, whether running or suspended). |
Percent Held | This field contains the following sub-fields: Real CPU (the percentage of the cumulative processor time that the lock was held by an executing thread), Real Elapsed (the percentage of the elapsed real time that the lock was held by any thread, whether running or suspended), Comb(ined) Spin (the percentage of the cumulative processor time that executing threads spent spinning while trying to acquire the lock), and Real Wait (the percentage of elapsed real time that any thread spent waiting on the lock; simultaneous waits are counted only once). |
Depth | This field contains the following sub-fields: Min (the minimum recursion depth at which any thread held the lock for reading), Max (the maximum recursion depth), and Avg (the average recursion depth). |
If the -dt or -da options are used, splat also reports the thread detail for the read/write lock. The columns are defined as follows:
PThreadID | The PThread identifier. |
Acquisitions | The number of times that this thread acquired the lock. |
Miss Rate | The percentage of acquisition attempts by the thread that failed to secure the lock. |
Spin Count | The number of unsuccessful attempts by this thread to secure the lock. |
Wait Count | The number of times this thread was forced to wait until the lock came available. |
Busy Count | The number of times this thread tried and failed to acquire the lock (calls to pthread_rwlock_tryrdlock or pthread_rwlock_trywrlock that returned busy). |
Percent Held of Total Time | This field contains the following sub-fields: CPU (the percentage of the cumulative processor time that this thread held the lock while executing), Elapse(d) (the percentage of the elapsed real time that this thread held the lock, whether running or suspended), Spin (the percentage of the cumulative processor time that this thread spent spinning on the lock), and Wait (the percentage of elapsed real time that this thread spent waiting on the lock). |
The PThread condition-variable is a synchronizer, but not a lock. A PThread is suspended until a signal indicates that the condition now holds.
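For reference, the following is a minimal POSIX sketch of the condition-variable pattern: a waiter suspends in pthread_cond_wait until a signal indicates that the condition may now hold, and re-tests the condition when it wakes. The queue variables and routine names are illustrative, and error handling is omitted for brevity.

```c
/* Sketch only: the classic condition-variable wait/signal pattern.
 * Error handling is omitted.
 */
#include <pthread.h>

static pthread_mutex_t queue_mutex    = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  queue_nonempty = PTHREAD_COND_INITIALIZER;
static int             queued_items;

void
consume_item(void)
{
        pthread_mutex_lock(&queue_mutex);
        while (queued_items == 0)                    /* re-test: a wakeup does */
                pthread_cond_wait(&queue_nonempty,   /* not guarantee that the */
                                  &queue_mutex);     /* condition holds        */
        queued_items--;
        pthread_mutex_unlock(&queue_mutex);
}

void
produce_item(void)
{
        pthread_mutex_lock(&queue_mutex);
        queued_items++;
        pthread_cond_signal(&queue_nonempty);        /* signal that the        */
        pthread_mutex_unlock(&queue_mutex);          /* condition now holds    */
}
```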
In addition to the common header information and the [PThread CondVar] identifier, this report lists the following details:
Passes | The number of times that the condition was signaled to hold during the analysis interval. |
Fail Rate | The percentage of times that the condition was tested and was not found to be true. |
Spin Count | The number of times that the condition was tested and was not found to be true. |
Wait Count | The number of times that a thread was forced into a suspended wait state waiting for the condition to be signaled. |
Spin / Wait Time | This field contains the following sub-fields: Comb Spin (the total time, in seconds, that threads spent spinning while waiting for the condition) and Comb Wait (the total time, in seconds, that threads spent in a suspended wait for the condition to be signaled). |
Depth | This field contains the following sub-fields: Min (the minimum number of threads spinning or waiting on the condition at any time in the analysis interval), Max (the maximum number of threads spinning or waiting), and Avg (the average number of threads spinning or waiting). |
If the -dt or -da options are used, splat reports the thread detail as described below:
PThreadID | The PThread identifier. |
Passes | The number of times that this thread was notified that the condition passed. |
Fail Rate | The percentage of times the thread checked the condition and did not find it to be true. |
Spin Count | The number of times the thread checked the condition and did not find it to be true. |
Wait Count | The number of times this thread was forced to wait until the condition became true. |
Percent Total Time | This field contains the following sub-fields: Spin (the percentage of the elapsed real time that this thread spent spinning while waiting for the condition) and Wait (the percentage of the elapsed real time that this thread spent in a suspended wait for the condition to be signaled). |