The CPU Utilization Reporting Tool (curt) command converts an AIX trace file into a number of statistics related to CPU utilization and either process or thread activity. These statistics ease the tracking of specific application activity. curt works with both uniprocessor and multiprocessor AIX Version 4 and AIX Version 5 traces.
The syntax for the curt command is as follows:
curt -i inputfile [-o outputfile] [-n gennamesfile] [-m trcnmfile] [-a pidnamefile] [-f timestamp] [-l timestamp] [-ehpst]
gennamesfile | The names file as produced by gennames. |
inputfile | The AIX trace file to be processed by the curt command. |
outputfile | The name of the output file created by the curt command. |
pidnamefile | If the trace process name table is not accurate, or if more descriptive names are desired, use the -a flag to specify a PID to process name mapping file. This is a file with lines consisting of a process ID (in decimal) followed by a space, then an ASCII string to use as the name for that process. |
timestamp | The time in seconds at which to start and stop the trace file processing. |
trcnmfile | The names file as produced by trcnm. |
A raw (unformatted) system trace from AIX Version 4 or AIX Version 5 is read by the curt command to produce summaries, as well as first and second level interrupt handlers. This summary information is useful for determining which application, system call, or interrupt handler is using most of the CPU time and is a candidate to be optimized to improve system performance.
The following table lists the minimum trace hooks required for the curt command. Using only these trace hooks will limit the size of the trace file. However, other events on the system may not be captured in this case. This is significant if you intend to analyze the trace in more detail.
Hook ID | Event Name | Event Explanation |
---|---|---|
100 | HKWD_KERN_FLIH | Occurrence of a first level interrupt, such as an I/O interrupt, a data access page fault, or a timer interrupt (scheduler). |
101 | HKWD_KERN_SVC | A thread has issued a system call. |
102 | HKWD_KERN_SLIH | Occurrence of a second level interrupt, that is, first level I/O interrupts are being passed on to the second level interrupt handler which then is working directly with the device driver. |
103 | HKWD_KERN_SLIHRET | Return from a second level interrupt to the caller (usually a first level interrupt handler). |
104 | HKWD_KERN_SYSCRET | Return from a system call to the caller (usually a thread). |
106 | HKWD_KERN_DISPATCH | A thread has been dispatched from the run queue to a CPU. |
10C | HKWD_KERN_IDLE | The idle process has been dispatched. |
119 | HKWD_KERN_PIDSIG | A signal has been sent to a process. |
134 | HKWD_SYSC_EXECVE | An exec supervisor call (SVC) has been issued by a (forked) process. |
135 | HKWD_SYSC__EXIT | An exit supervisor call (SVC) has been issued by a process. |
139 | HKWD_SYSC_FORK | A fork SVC has been issued by a process. |
200 | HKWD_KERN_RESUME | A dispatched thread is being resumed on the CPU. |
210 | HKWD_KERN_INITP | A kernel process has been created. |
38F | HKWD_DR | A processor has been added/removed. |
465 | HKWD_SYSC_CRTHREAD | A thread_create SVC has been issued by a process. |
Trace hooks 119 and 135 are used to report on the time spent in the exit system call. Trace hooks 134, 139, 210, and 465 are used to keep track of TIDs, PIDs and process names.
Preparing curt output is a four-stage process. The trace file used in the example was generated using the following process:
# HOOKS="100,101,102,103,104,106,10C,119,134,135,139,200,210,38F,465" # SIZE="1000000" # export HOOKS SIZE # trace -n -C all -d -j $HOOKS -L $SIZE -T $SIZE -afo trace.raw # trcon ; sleep 5 ; trcstop # unset HOOKS SIZE # ls trace.raw* trace.raw trace.raw-0 trace.raw-1 trace.raw-2 trace.raw-3 # trcrpt -C all -r trace.raw > trace.r # rm trace.raw* # ls trace* trace.r # gennames > gennames.out # trcnm > trace.nm
The following is an overview of the reports that can be generated by the curt command.
To create additional, specialized reports, run the curt command using the following flags:
-e | Produces a report that includes statistics and additional information on the System Calls Summary Report. The additional information pertains to the total, average, maximum, and minimum elapsed times that a system call was running. |
-s | Produces a report that includes a list of errors returned by system calls. |
-t | Produces a report that includes a detailed report on thread status that includes the amount of CPU time the thread was in application and system call mode, what system calls the thread made, processor affinity, the number of times the thread was dispatched, and to which CPU(s) it was dispatched. The report also includes dispatch wait time and details of interrupts. |
-p | Produces a report that includes a detailed report on process status that includes the amount of CPU time the process was in application and system call mode, which threads were in the process, and what system calls the process made. |
This section explains the default report created by the curt command, as follows:
# curt -i trace.r -m trace.nm -n gennames.out -o curt.out
The curt command output always includes this default report in its output, even if one of the flags described in the previous section is used.
The report is divided into the following sections:
The first information in the report is the time and date when this particular curt command was run, including the syntax of the curt command line that produced the report.
The General Information section also contains some information about the AIX trace file that was processed by curt. This information consists of the trace file's name, size, and its creation date. The command used to invoke the AIX trace facility and gather the trace file is displayed at the end of the report.
The following is a sample of this output:
Run on Fri May 25 11:08:46 2001 Command line was: curt -i trace.r -m trace.nm -n gennames.out -o curt.out ---- AIX trace file name = trace.r AIX trace file size = 1632496 AIX trace file created = Fri May 25 11:04:33 2001 Command used to gather AIX trace was: trace -n -C all -d -j 100,101,102,103,104,106,10C,134,139,200,465 -L 1000000 -T 1000000 -afo trace.raw
The next part of the default report is the System Summary produced by the curt command. The following is a sample:
System Summary -------------- processing percent percent total time total time busy time (msec) (incl. idle) (excl. idle) processing category =========== =========== =========== =================== 14998.65 73.46 92.98 APPLICATION 591.59 2.90 3.66 SYSCALL 48.33 0.24 0.30 KPROC 486.19 2.38 3.00 FLIH 49.10 0.24 0.30 SLIH 8.83 0.04 0.05 DISPATCH (all procs. incl. IDLE) 1.04 0.01 0.01 IDLE DISPATCH (only IDLE proc.) ----------- ---------- ------- 16182.69 79.26 100.00 CPU(s) busy time 4234.76 20.74 IDLE ----------- ---------- 20417.45 TOTAL Avg. Thread Affinity = 0.99
This portion of the report describes the time spent by the system as a whole (all CPUs) in various execution modes.
The System Summary has the following fields:
processing total time | Total time in milliseconds for the corresponding processing category. |
percent total time | Time from the first column as a percentage of the sum of total trace elapsed time for all processors. This includes whatever amount of time each processor spent running the IDLE process. |
percent busy time | Time from the first column as a percentage of the sum of total trace elapsed time for all processors without including the time each processor spent executing the IDLE process. |
Avg. Thread Affinity | Probability that a thread was dispatched to the same processor on which it last executed. |
The possible execution modes or processing categories are interpreted as follows:
APPLICATION | The sum of times spent by all processors in User (that is, non-privileged) mode. |
SYSCALL | The sum of times spent by all processors doing System Calls. This is the portion of time that a processor spends executing in the kernel code providing services directly requested by a user process. |
KPROC | The sum of times spent by all processors executing kernel processes other than the IDLE process. This is the portion of time that a processor spends executing specially created dispatchable processes that only execute kernel code. |
FLIH | The sum of times spent by all processors executing FLIHs (First Level Interrupt Handlers). |
SLIH | The sum of times spent by all processors executing SLIHs (Second Level Interrupt Handlers). |
DISPATCH | The sum of times spent by all processors executing the AIX dispatch code. This sum includes the time spent dispatching all threads (that is, it includes dispatches of the IDLE process). |
IDLE DISPATCH | The sum of times spent by all processors executing the AIX dispatch code where the process being dispatched was the IDLE process. Because the DISPATCH category includes the IDLE DISPATCH category's time, the IDLE DISPATCH category's time is not separately added to calculate either CPU(s) busy time or TOTAL (see below). |
CPU(s) busy time | The sum of times spent by all processors executing in APPLICATION, SYSCALL, KPROC, FLIH, SLIH, and DISPATCH modes. |
IDLE | The sum of times spent by all processors executing the IDLE process. |
TOTAL | The sum of CPU(s) busy time and IDLE. |
The System Summary example indicates that the CPU is spending most of its time in application mode. We still have 4234.76 ms of IDLE time so we have enough CPU to run our applications. If there was insufficient CPU power, we would not expect to see any IDLE time. The Avg. Thread Affinity value is 0.99 showing good processor affinity; that is, threads returning to the same processor when they are ready to be rerun.
This part of the curt output follows the System Summary and is essentially the same information but presented on a processor-by-processor basis. The same description that was given for the System Summary applies here, except that this report covers only one processor.
Below is a sample of this output:
Processor Summary processor number 0 --------------------------------------- processing percent percent total time total time busy time (msec) (incl. idle) (excl. idle) processing category =========== =========== =========== =================== 45.07 0.88 5.16 APPLICATION 591.39 11.58 67.71 SYSCALL 47.83 0.94 5.48 KPROC 173.78 3.40 19.90 FLIH 9.27 0.18 1.06 SLIH 6.07 0.12 0.70 DISPATCH (all procs. incl. IDLE) 1.04 0.02 0.12 IDLE DISPATCH (only IDLE proc.) ----------- ---------- ------- 873.42 17.10 100.00 CPU(s) busy time 4232.92 82.90 IDLE ----------- ---------- 5106.34 TOTAL Avg. Thread Affinity = 0.98 Total number of process dispatches = 1620 Total number of idle dispatches = 782 Processor Summary processor number 1 --------------------------------------- processing percent percent total time total time busy time (msec) (incl. idle) (excl. idle) processing category =========== =========== =========== =================== 4985.81 97.70 97.70 APPLICATION 0.09 0.00 0.00 SYSCALL 0.00 0.00 0.00 KPROC 103.86 2.04 2.04 FLIH 12.54 0.25 0.25 SLIH 0.97 0.02 0.02 DISPATCH (all procs. incl. IDLE) 0.00 0.00 0.00 IDLE DISPATCH (only IDLE proc.) ----------- ---------- ------- 5103.26 100.00 100.00 CPU(s) busy time 0.00 0.00 IDLE ----------- ---------- 5103.26 TOTAL Avg. Thread Affinity = 0.99 Total number of process dispatches = 516 Total number of idle dispatches = 0 ...(lines omitted)...
The Total number of process dispatches refers to how many times AIX dispatched any non-IDLE process on this processor. The Total number of idle dispatches gives the count of IDLE process dispatches.
The Application Summary (by Tid) shows an output of all the threads that were running on the system during the time of trace collection and their CPU consumption. The thread that consumed the most CPU time during the time of the trace collection is at the top of the list.
Application Summary (by Tid) ---------------------------- -- processing total (msec) -- -- percent of total processing time -- combined application syscall combined application syscall name (Pid Tid) ======== =========== ======= ======== =========== ======= =================== 4986.2355 4986.2355 0.0000 24.4214 24.4214 0.0000 cpu(18418 32437) 4985.8051 4985.8051 0.0000 24.4193 24.4193 0.0000 cpu(19128 33557) 4982.0331 4982.0331 0.0000 24.4009 24.4009 0.0000 cpu(18894 28671) 83.8436 2.5062 81.3374 0.4106 0.0123 0.3984 disp+work(20390 28397) 72.5809 2.7269 69.8540 0.3555 0.0134 0.3421 disp+work(18584 32777) 69.8023 2.5351 67.2672 0.3419 0.0124 0.3295 disp+work(19916 33033) 63.6399 2.5032 61.1368 0.3117 0.0123 0.2994 disp+work(17580 30199) 63.5906 2.2187 61.3719 0.3115 0.0109 0.3006 disp+work(20154 34321) 62.1134 3.3125 58.8009 0.3042 0.0162 0.2880 disp+work(21424 31493) 60.0789 2.0590 58.0199 0.2943 0.0101 0.2842 disp+work(21992 32539) ...(lines omitted)...
The output is divided into two main sections, of which one shows the total processing time of the thread in milliseconds (processing total (msec)), and the other shows the CPU time the thread has consumed, expressed as a percentage of the total CPU time (percent of total processing time).
The Application Summary (by Tid) has the following fields:
name (Pid Tid) | The name of the process associated with the thread, its process id, and its thread id. |
processing total (msec)
combined | The total amount of CPU time, expressed in milliseconds, that the thread was running in either application mode or system call mode. |
application | The amount of CPU time, expressed in milliseconds, that the thread spent in application mode. |
syscall | The amount of CPU time, expressed in milliseconds, that the thread spent in system call mode. |
percent of total processing time
combined | The amount of CPU time that the thread was running, expressed as percentage of the total processing time. |
application | The amount of CPU time that the thread the thread spent in application mode, expressed as percentage of the total processing time. |
syscall | The amount of CPU time that the thread spent in system call mode, expressed as percentage of the total processing time. |
In this example, we can investigate why the system is spending so much time in application mode by looking at the Application Summary (by Tid), where we can see the top three processes of the report are named cpu, a test program that uses a great deal of CPU time. The report shows again that the CPU spent most of its time in application mode running the "cpu" process. Therefore the "cpu" process is a candidate to be optimized to improve system performance.
The Application Summary (by Pid) has the same content as the Application Summary (by Tid), except that the threads that belong to each process are consolidated and the process that consumed the most CPU time during the monitoring period is at the beginning of the list.
The column name (PID) (Thread Count) shows the process name, its process ID, and the number of threads that belong to this process and that have been accumulated for this line of data.
Application Summary (by Pid) ---------------------------- -- processing total (msec) -- -- percent of total processing time -- combined application syscall combined application syscall name (Pid)(Thread Count) ======== =========== ======= ======== =========== ======= =================== 4986.2355 4986.2355 0.0000 24.4214 24.4214 0.0000 cpu(18418)(1) 4985.8051 4985.8051 0.0000 24.4193 24.4193 0.0000 cpu(19128)(1) 4982.0331 4982.0331 0.0000 24.4009 24.4009 0.0000 cpu(18894)(1) 83.8436 2.5062 81.3374 0.4106 0.0123 0.3984 disp+work(20390)(1) 72.5809 2.7269 69.8540 0.3555 0.0134 0.3421 disp+work(18584)(1) 69.8023 2.5351 67.2672 0.3419 0.0124 0.3295 disp+work(19916)(1) 63.6399 2.5032 61.1368 0.3117 0.0123 0.2994 disp+work(17580)(1) 63.5906 2.2187 61.3719 0.3115 0.0109 0.3006 disp+work(20154)(1) 62.1134 3.3125 58.8009 0.3042 0.0162 0.2880 disp+work(21424)(1) 60.0789 2.0590 58.0199 0.2943 0.0101 0.2842 disp+work(21992)(1) ...(lines omitted)...
The Application Summary (by process type) consolidates all processes of the same name and sorts them in descending order of combined processing time.
The name (thread count) column shows the name of the process, and the number of threads that belong to this process name (type) and were running on the system during the monitoring period.
Application Summary (by process type) ----------------------------------------------- -- processing total (msec) -- -- percent of total processing time -- combined application syscall combined application syscall name (thread count) ======== =========== ======= ======== =========== ======= ================== 14954.0738 14954.0738 0.0000 73.2416 73.2416 0.0000 cpu(3) 573.9466 21.2609 552.6857 2.8111 0.1041 2.7069 disp+work(9) 20.9568 5.5820 15.3748 0.1026 0.0273 0.0753 trcstop(1) 10.6151 2.4241 8.1909 0.0520 0.0119 0.0401 i4llmd(1) 8.7146 5.3062 3.4084 0.0427 0.0260 0.0167 dtgreet(1) 7.6063 1.4893 6.1171 0.0373 0.0073 0.0300 sleep(1) ...(lines omitted)...
The Kproc Summary (by Tid) shows an output of all the kernel process threads that were running on the system during the time of trace collection and their CPU consumption. The thread that consumed the most CPU time during the time of the trace collection is at the beginning of the list.
Kproc Summary (by Tid) ----------------------- -- processing total (msec) -- -- percent of total time -- combined operation kernel combined operation kernel name (Pid Tid Type) ======== ========= ====== ======== ========= ====== =================== 4232.9216 0.0000 4232.9216 20.7319 0.0000 20.7319 wait(516 517 W) 30.4374 0.0000 30.4374 0.1491 0.0000 0.1491 lrud(1548 1549 -) ...(lines omitted)... Kproc Types ----------- Type Function Operation ==== ============================ ========================== W idle thread -
The Kproc Summary has the following fields:
name (Pid Tid Type) | The name of the kernel process associated with the thread, its process ID, its thread ID, and its type. The kproc type is defined in the Kproc Types listing following the Kproc Summary. |
processing total (msec)
combined | The total amount of CPU time, expressed in milliseconds, that the thread was running in either operation or kernel mode. |
operation | The amount of CPU time, expressed in milliseconds, that the thread spent in operation mode. |
kernel | The amount of CPU time, expressed in milliseconds, that the thread spent in kernel mode. |
percent of total time
combined | The amount of CPU time that the thread was running, expressed as percentage of the total processing time. |
operation | The amount of CPU time that the thread the thread spent in operation mode, expressed as percentage of the total processing time. |
kernel | The amount of CPU time that the thread spent in kernel mode, expressed as percentage of the total processing time. |
Kproc Types
Type | A single letter to be used as an index into this listing. |
Function | A description of the nominal function of this type of kernel process. |
Operation | A description of the operation mode for this type of kernel process. |
The System Calls Summary provides a list of all the system calls that have completed execution on the system during the monitoring period. The list is sorted by the total CPU time in milliseconds consumed by each type of system call.
System Calls Summary -------------------- Count Total Time % sys Avg Time Min Time Max Time SVC (Address) (msec) time (msec) (msec) (msec) ======== =========== ====== ======== ======== ======== ================ 605 355.4475 1.74% 0.5875 0.0482 4.5626 kwrite(4259c4) 733 196.3752 0.96% 0.2679 0.0042 2.9948 kread(4259e8) 3 9.2217 0.05% 3.0739 2.8888 3.3418 execve(1c95d8) 38 7.6013 0.04% 0.2000 0.0051 1.6137 __loadx(1c9608) 1244 4.4574 0.02% 0.0036 0.0010 0.0143 lseek(425a60) 45 4.3917 0.02% 0.0976 0.0248 0.1810 access(507860) 63 3.3929 0.02% 0.0539 0.0294 0.0719 _select(4e0ee4) 2 2.6761 0.01% 1.3380 1.3338 1.3423 kfork(1c95c8) 207 2.3958 0.01% 0.0116 0.0030 0.1135 _poll(4e0ecc) 228 1.1583 0.01% 0.0051 0.0011 0.2436 kioctl(4e07ac) 9 0.8136 0.00% 0.0904 0.0842 0.0988 .smtcheckinit(1b245a8) 5 0.5437 0.00% 0.1087 0.0696 0.1777 open(4e08d8) 15 0.3553 0.00% 0.0237 0.0120 0.0322 .smtcheckinit(1b245cc) 2 0.2692 0.00% 0.1346 0.1339 0.1353 statx(4e0950) 33 0.2350 0.00% 0.0071 0.0009 0.0210 _sigaction(1cada4) 1 0.1999 0.00% 0.1999 0.1999 0.1999 kwaitpid(1cab64) 102 0.1954 0.00% 0.0019 0.0013 0.0178 klseek(425a48) ...(lines omitted)...
The System Calls Summary has the following fields:
Count | The number of times that a system call of a certain type (see SVC (Address)) has been used (called) during the monitoring period. |
Total Time (msec) | The total CPU time that the system spent processing these system calls, expressed in milliseconds. |
% sys time | The total CPU time that the system spent processing these system calls, expressed as a percentage of the total processing time. |
Avg Time (msec) | The average CPU time that the system spent processing one system call of this type, expressed in milliseconds. |
Min Time (msec) | The minimum CPU time that the system needed to process one system call of this type, expressed in milliseconds. |
Max Time (msec) | The maximum CPU time that the system needed to process one system call of this type, expressed in milliseconds. |
SVC (Address) | The name of the system call and its kernel address. |
The Pending System Calls Summary provides a list of all the system calls that have been executed on the system during the monitoring period but have not completed. The list is sorted by TID.
Pending System Calls Summary ---------------------------- Accumulated SVC (Address) Procname (Pid Tid) Time (msec) ============ ========================= ========================== 0.0656 _select(4e0ee4) sendmail(7844 5001) 0.0452 _select(4e0ee4) syslogd(7514 8591) 0.0712 _select(4e0ee4) snmpd(5426 9293) 0.0156 kioctl(4e07ac) trcstop(47210 18379) 0.0274 kwaitpid(1cab64) ksh(20276 44359) 0.0567 kread4259e8) ksh(23342 50873) ...(lines omitted)...
The Pending System Calls Summary has the following fields:
Accumulated Time(msec) | The accumulated CPU time that the system spent processing the pending system call, expressed in milliseconds. |
SVC (Address) | The name of the system call and its kernel address. |
Procname (Pid Tid) | The name of the process associated with the thread which made the system call, its process ID, and the thread ID. |
The FLIH (First Level Interrupt Handler) Summary lists all first level interrupt handlers that were called during the monitoring period.
The Global Flih Summary lists the total of first level interrupts on the system, while the Per CPU Flih Summary lists the first level interrupts per CPU.
Global Flih Summary ------------------- Count Total Time Avg Time Min Time Max Time Flih Type (msec) (msec) (msec) (msec) ====== =========== =========== =========== =========== ========= 2183 203.5524 0.0932 0.0041 0.4576 31(DECR_INTR) 946 102.4195 0.1083 0.0063 0.6590 3(DATA_ACC_PG_FLT) 12 1.6720 0.1393 0.0828 0.3366 32(QUEUED_INTR) 1058 183.6655 0.1736 0.0039 0.7001 5(IO_INTR) Per CPU Flih Summary -------------------- CPU Number 0: Count Total Time Avg Time Min Time Max Time Flih Type (msec) (msec) (msec) (msec) ====== =========== =========== =========== =========== ========= 635 39.8413 0.0627 0.0041 0.4576 31(DECR_INTR) 936 101.4960 0.1084 0.0063 0.6590 3(DATA_ACC_PG_FLT) 9 1.3946 0.1550 0.0851 0.3366 32(QUEUED_INTR) 266 33.4247 0.1257 0.0039 0.4319 5(IO_INTR) CPU Number 1: Count Total Time Avg Time Min Time Max Time Flih Type (msec) (msec) (msec) (msec) ====== =========== =========== =========== =========== ========= 4 0.2405 0.0601 0.0517 0.0735 3(DATA_ACC_PG_FLT) 258 49.2098 0.1907 0.0060 0.5076 5(IO_INTR) 515 55.3714 0.1075 0.0080 0.3696 31(DECR_INTR) Pending Flih Summary -------------------- Accumulated Time (msec) Flih Type ======================== ================ 0.0123 5(IO_INTR) ...(lines omitted)...
The FLIH Summary report has the following fields:
Count | The number of times that a first level interrupt of a certain type (see Flih Type) occurred during the monitoring period. |
Total Time (msec) | The total CPU time that the system spent processing these first level interrupts, expressed in milliseconds. |
Avg Time (msec) | The average CPU time that the system spent processing one first level interrupt of this type, expressed in milliseconds. |
Min Time (msec) | The minimum CPU time the system needed to process one first level interrupt of this type, expressed in milliseconds. |
Max Time (msec) | The maximum CPU time that the system needed to process one first level interrupt of this type, expressed in milliseconds. |
Flih Type | The number and name of the first level interrupt. |
Accumulated Time (msec) | The accumulated CPU time that the system spent processing the pending first level interrupt, expressed in milliseconds. |
The following are FLIH types that were depicted in the above example:
DATA_ACC_PG_FLT | Data access page fault |
QUEUED_INTR | Queued interrupt |
DECR_INTR | Decrementer interrupt |
IO_INTR | I/O interrupt |
The Second level interrupt handler (SLIH) Summary lists all second level interrupt handlers that were called during the monitoring period.
The Global Slih Summary lists the total of second level interrupts on the system, while the Per CPU Slih Summary lists the second level interrupts per CPU.
Global Slih Summary ------------------- Count Total Time Avg Time Min Time Max Time Slih Name(Address) (msec) (msec) (msec) (msec) ====== =========== =========== =========== =========== ================= 43 7.0434 0.1638 0.0284 0.3763 s_scsiddpin(1a99104) 1015 42.0601 0.0414 0.0096 0.0913 ssapin(1990490) Per CPU Slih Summary -------------------- CPU Number 0: Count Total Time Avg Time Min Time Max Time Slih Name(Address) (msec) (msec) (msec) (msec) ====== =========== =========== =========== =========== ================= 8 1.3500 0.1688 0.0289 0.3087 s_scsiddpin(1a99104) 258 7.9232 0.0307 0.0096 0.0733 ssapin(1990490) CPU Number 1: Count Total Time Avg Time Min Time Max Time Slih Name(Address) (msec) (msec) (msec) (msec) ====== =========== =========== =========== =========== ================= 10 1.2685 0.1268 0.0579 0.2818 s_scsiddpin(1a99104) 248 11.2759 0.0455 0.0138 0.0641 ssapin(1990490) ...(lines omitted)...
The SLIH Summary report has the following fields:
Count | The number of times that each second level interrupt handler was called during the monitoring period. |
Total Time (msec) | The total CPU time that the system spent processing these second level interrupts, expressed in milliseconds. |
Avg Time (msec) | The average CPU time that the system spent processing one second level interrupt of this type, expressed in milliseconds. |
Min Time (msec) | The minimum CPU time that the system needed to process one second level interrupt of this type, expressed in milliseconds. |
Max Time (msec) | The maximum CPU time that the system needed to process one second level interrupt of this type, expressed in milliseconds. |
Slih Name (Address) | The module name and kernel address of the second level interrupt. |
The report generated with the -e flag includes the data shown in the default report, and also includes additional information in the System Calls Summary and the Pending System Calls Summary. The additional information in the System Calls Summary includes the total, average, maximum, and minimum elapsed time that a system call was running. The additional information in the Pending System Calls Summary is the accumulated elapsed time for the pending system calls. The following is an example of the additional information reported by using the -e flag:
# curt -e -i trace.r -m trace.nm -n gennames.out -o curt.out # cat curt.out ...(lines omitted)... System Calls Summary -------------------- Count Total % sys Avg Min Max Tot Avg Min Max SVC (Address) Time time Time Time Time ETime ETime ETime ETime (msec) (msec) (msec) (msec) (msec) (msec) (msec) (msec) ===== ======== ===== ====== ====== ====== ========== ========= ========= ========= ====================== 605 355.4475 1.74% 0.5875 0.0482 4.5626 31172.7658 51.5252 0.0482 422.2323 kwrite(4259c4) 733 196.3752 0.96% 0.2679 0.0042 2.9948 12967.9407 17.6916 0.0042 265.1204 kread(4259e8) 3 9.2217 0.05% 3.0739 2.8888 3.3418 57.2051 19.0684 4.5475 40.0557 execve(1c95d8) 38 7.6013 0.04% 0.2000 0.0051 1.6137 12.5002 0.3290 0.0051 3.3120 __loadx(1c9608) 1244 4.4574 0.02% 0.0036 0.0010 0.0143 4.4574 0.0036 0.0010 0.0143 lseek(425a60) 45 4.3917 0.02% 0.0976 0.0248 0.1810 4.6636 0.1036 0.0248 0.3037 access(507860) 63 3.3929 0.02% 0.0539 0.0294 0.0719 5006.0887 79.4617 0.0294 100.4802 _select(4e0ee4) 2 2.6761 0.01% 1.3380 1.3338 1.3423 45.5026 22.7513 7.5745 37.9281 kfork(1c95c8) 207 2.3958 0.01% 0.0116 0.0030 0.1135 4494.9249 21.7146 0.0030 499.1363 _poll(4e0ecc) 228 1.1583 0.01% 0.0051 0.0011 0.2436 1.1583 0.0051 0.0011 0.2436 kioctl(4e07ac) 9 0.8136 0.00% 0.0904 0.0842 0.0988 4498.7472 499.8608 499.8052 499.8898 .smtcheckinit(1b245a8) 5 0.5437 0.00% 0.1087 0.0696 0.1777 0.5437 0.1087 0.0696 0.1777 open(4e08d8) 15 0.3553 0.00% 0.0237 0.0120 0.0322 0.3553 0.0237 0.0120 0.0322 .smtcheckinit(1b245cc) 2 0.2692 0.00% 0.1346 0.1339 0.1353 0.2692 0.1346 0.1339 0.1353 statx(4e0950) 33 0.2350 0.00% 0.0071 0.0009 0.0210 0.2350 0.0071 0.0009 0.0210 _sigaction(1cada4) 1 0.1999 0.00% 0.1999 0.1999 0.1999 5019.0588 5019.0588 5019.0588 5019.0588 kwaitpid(1cab64) 102 0.1954 0.00% 0.0019 0.0013 0.0178 0.5427 0.0053 0.0013 0.3650 klseek(425a48) ...(lines omitted)... Pending System Calls Summary ---------------------------- Accumulated Accumulated SVC (Address) Procname (Pid Tid) Time (msec) ETime (msec) ============ ============ ========================= ========================= 0.0855 93.6498 kread(4259e8) oracle(143984 48841) ...(lines omitted)...
The System Calls Summary in the preceding example has the following fields in addition to the default System Calls Summary:
Tot ETime (msec) | The total amount of time from when each instance of the system call was started until it completed. This time will include any time spent servicing interrupts, running other processes, and so forth. |
Avg ETime (msec) | The average amount of time from when the system call was started until it completed. This time will include any time spent servicing interrupts, running other processes, and so forth. |
Min ETime (msec) | The minimum amount of time from when the system call was started until it completed. This time will include any time spent servicing interrupts, running other processes, and so forth. |
Max ETime (msec) | The maximum amount of time from when the system call was started until it completed. This time will include any time spent servicing interrupts, running other processes, and so forth. |
Accumulated ETime (msec) | The total amount of time from when the pending system call was started until the end of the trace. This time will include any time spent servicing interrupts, running other processes, and so forth. |
The preceding example report shows that the maximum elapsed time for the kwrite system call was 422.2323 msec, but the maximum CPU time was 4.5626 msec. If this amount of overhead time is unusual for the device being written to, further analysis is needed.
The report generated with the -s flag includes the data shown in the default report, plus it includes data on errors returned by system calls as shown by the following:
# curt -s -i trace.r -m trace.nm -n gennames.out -o curt.out # cat curt.out ...(lines omitted)... Errors Returned by System Calls ------------------------------ Errors (errno : count : description) returned for System Call: kioctl(4e07ac) 25 : 15 : "Not a typewriter" Errors (errno : count : description) returned for System Call: execve(1c95d8) 2 : 2 : "No such file or directory" ...(lines omitted)...
If a large number of errors of a specific type or on a specific system call point to a system or application problem, other debug measures can be used to determine and fix the problem.
The report generated with the -t flag includes the data shown in the default report, and also includes a detailed report on thread status that includes the amount of time the thread was in application and system call mode, what system calls the thread made, processor affinity, the number of times the thread was dispatched, and to which CPU(s) it was dispatched. The report also includes dispatch wait time and details of interrupts:
...(lines omitted)... -------------------------------------------------------------------------------- Report for Thread Id: 48841 (hex bec9) Pid: 143984 (hex 23270) Process Name: oracle --------------------- Total Application Time (ms): 70.324465 Total System Call Time (ms): 53.014910 Thread System Call Data Count Total Time Avg Time Min Time Max Time SVC (Address) (msec) (msec) (msec) (msec) ======== =========== =========== =========== =========== ================ 69 34.0819 0.4939 0.1666 1.2762 kwrite(169ff8) 77 12.0026 0.1559 0.0474 0.2889 kread(16a01c) 510 4.9743 0.0098 0.0029 0.0467 times(f1e14) 73 1.2045 0.0165 0.0105 0.0306 select(1d1704) 68 0.6000 0.0088 0.0023 0.0445 lseek(16a094) 12 0.1516 0.0126 0.0071 0.0241 getrusage(f1be0) No Errors Returned by System Calls Pending System Calls Summary ---------------------------- Accumulated SVC (Address) Time (msec) ============ ========================== 0.1420 kread(16a01c) processor affinity: 0.583333 Dispatch Histogram for thread (CPUid : times_dispatched). CPU 0 : 23 CPU 1 : 23 CPU 2 : 9 CPU 3 : 9 CPU 4 : 8 CPU 5 : 14 CPU 6 : 17 CPU 7 : 19 CPU 8 : 1 CPU 9 : 4 CPU 10 : 1 CPU 11 : 4 total number of dispatches: 131 total number of redispatches due to interupts being disabled: 1 avg. dispatch wait time (ms): 8.273515 Data on Interrupts that Occured while Thread was Running Type of Interrupt Count =============================== ============================ Data Access Page Faults (DSI): 115 Instr. Fetch Page Faults (ISI): 0 Align. Error Interrupts: 0 IO (external) Interrupts: 0 Program Check Interrupts: 0 FP Unavailable Interrupts: 0 FP Imprecise Interrupts: 0 RunMode Interrupts: 0 Decrementer Interrupts: 18 Queued (Soft level) Interrupts: 15 ...(lines omitted)...
The information in the threads summary includes the following:
Thread ID | The Thread ID of the thread. |
Process ID | The Process ID that the thread belongs to. |
Process Name | The process name, if known, that the thread belongs to. |
Total Application Time (ms) | The amount of time, expressed in milliseconds, that the thread spent in application mode. |
Total System Call Time (ms) | The amount of time, expressed in milliseconds, that the thread spent in system call mode. |
Thread System Call Data | A system call summary for the thread; this has the same fields as the global System Call Summary. It also includes elapsed time if the -e flag is specified and error information if the -s flag is specified. |
Pending System Calls Summary | If the thread was executing a system call at the end of the trace, a pending system call summary will be printed. This has the Accumulated Time and Supervisor Call (SVC Address) fields. It also includes elapsed time if the -e flag is specified. |
processor affinity | The process affinity, which is the probability that, for any dispatch of the thread, the thread was dispatched to the same processor on which it last executed. |
Dispatch Histogram for thread | Shows the number of times the thread was dispatched to each CPU in the system. |
total number of dispatches | The total number of times the thread was dispatched (not including redispatches). |
total number of redispatches due to interrupts being disabled | The number of redispatches due to interrupts being disabled, which is when the dispatch code is forced to dispatch the same thread that is currently running on that particular CPU because the thread had disabled some interrupts. This total is only reported if the value is non-zero. |
avg. dispatch wait time (ms) | The average dispatch wait time is the average elapsed time for the thread from being undispatched and its next dispatch. |
Data on Interrupts that occurred while Thread was Running | Count of how many times each type of FLIH occurred while this thread was executing. |
When a report is generated using the -p flag, it gives detailed information about each process found in the trace. The following example shows the report generated for the router process (PID 129190).
...(lines omitted)... Process Details for Pid: 129190 Process Name: router 7 Tids for this Pid: 245889 245631 244599 82843 78701 75347 28941 Total Application Time (ms): 124.023749 Total System Call Time (ms): 8.948695 Process System Call Data Count Total Time % sys Avg Time Min Time Max Time SVC (Address) (msec) time (msec) (msec) (msec) ======== =========== ====== ======== ======== ======== ================ 93 3.6829 0.05% 0.0396 0.0060 0.3077 kread(19731c) 23 2.2395 0.03% 0.0974 0.0090 0.4537 kwrite(1972f8) 30 0.8885 0.01% 0.0296 0.0073 0.0460 select(208c5c) 1 0.5933 0.01% 0.5933 0.5933 0.5933 fsync(1972a4) 106 0.4902 0.01% 0.0046 0.0035 0.0105 klseek(19737c) 13 0.3285 0.00% 0.0253 0.0130 0.0387 semctl(2089e0) 6 0.2513 0.00% 0.0419 0.0238 0.0650 semop(2089c8) 3 0.1223 0.00% 0.0408 0.0127 0.0730 statx(2086d4) 1 0.0793 0.00% 0.0793 0.0793 0.0793 send(11e1ec) 9 0.0679 0.00% 0.0075 0.0053 0.0147 fstatx(2086c8) 4 0.0524 0.00% 0.0131 0.0023 0.0348 kfcntl(22aa14) 5 0.0448 0.00% 0.0090 0.0086 0.0096 yield(11dbec) 3 0.0444 0.00% 0.0148 0.0049 0.0219 recv(11e1b0) 1 0.0355 0.00% 0.0355 0.0355 0.0355 open(208674) 1 0.0281 0.00% 0.0281 0.0281 0.0281 close(19728c) Pending System Calls Summary ---------------------------- Accumulated SVC (Address) Tid Time (msec) ============ ========================= ================ 0.0452 select(208c5c) 245889 0.0425 select(208c5c) 78701 0.0285 select(208c5c) 82843 0.0284 select(208c5c) 245631 0.0274 select(208c5c) 244599 0.0179 select(208c5c) 75347 ...(lines omitted)...
The -p flag process information includes the Process ID and name, and a count and list of the thread IDs belonging to the process. The total application and system call time for all the threads of the process is given. Lastly, it includes summary reports of all the completed and pending system calls for the threads of the process.