Performance Management Guide

SMP Tools

All performance tools of the operating system support SMP machines. Some performance tools provide individual processor utilization statistics. Other performance tools average out the utilization statistics for all processors and display only the averages.

This section describes the tools that are supported only on SMP systems. For details on all other performance tools, see the appropriate chapters.

The bindprocessor Command

Use the bindprocessor command to bind or unbind the kernel threads of a process to a processor. Root authority is necessary to bind or unbind threads in processes that you do not own.

Note
The bindprocessor command is meant for multiprocessor systems. Although it will also work on uniprocessor systems, binding has no effect on such systems.

To query the available processors, run the following:

# bindprocessor -q
The available processors are:  0 1 2 3

The output shows the logical processor numbers of the available processors. Use these numbers to identify a processor in the bindprocessor commands that follow.
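
A script can check this list before it attempts a bind. The following is a minimal sketch, not part of the operating system: the function name processor_available is hypothetical, and the code assumes the one-line output format shown above.

```shell
# Reads `bindprocessor -q` output on stdin.
# Succeeds (exit status 0) if the processor number in $1 is listed.
processor_available() {
    awk -v want="$1" '
        /available processors/ {
            sub(/^[^:]*:/, "")    # drop the text up to the colon
            for (i = 1; i <= NF; i++) if ($i == want) found = 1
        }
        END { exit found ? 0 : 1 }
    '
}

# Possible use on an AIX system (not executed here):
#   bindprocessor -q | processor_available 4 || echo "processor 4 is not available"
```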

To bind a process whose PID is 14596 to processor 1, run the following:

# bindprocessor 14596 1

No message is returned if the command is successful. To verify whether a process is bound to a processor, use the ps -mo THREAD command as explained in Using the ps Command:

# ps -mo THREAD
USER   PID  PPID    TID ST  CP PRI SC    WCHAN        F     TT BND COMMAND
root  3292  7130      - A    1  60  1        -   240001  pts/0   - -ksh
   -     -     -  14309 S    1  60  1        -      400      -   - -
root 14596  3292      - A   73 100  1        -   200001  pts/0   1 /tmp/cpubound
   -     -     -  15629 R   73 100  1        -        0      -   1 -
root 15606  3292      - A   74 101  1        -   200001  pts/0   - /tmp/cpubound
   -     -     -  16895 R   74 101  1        -        0      -   - -
root 16634  3292      - A   73 100  1        -   200001  pts/0   - /tmp/cpubound
   -     -     -  15107 R   73 100  1        -        0      -   - -
root 18048  3292      - A   14  67  1        -   200001  pts/0   - ps -mo THREAD
   -     -     -  17801 R   14  67  1        -        0      -   - -

The BND column shows the number of the processor to which the process is bound, or a dash (-) if the process is not bound.
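
When many processes are listed, the bound ones can be picked out with a short filter. The following is a sketch only: the function name bound_processes is hypothetical, and the code assumes the column layout shown above, in which process rows carry a numeric PID and thread rows show a dash in that column.

```shell
# Reads `ps -mo THREAD` output on stdin and prints "PID processor"
# for every process row whose BND column is not a dash.
bound_processes() {
    awk '
        # Header row: remember which column is labelled BND.
        $1 == "USER" { for (i = 1; i <= NF; i++) if ($i == "BND") bnd = i; next }
        # Process rows carry a numeric PID; thread rows show "-" there.
        bnd && $2 ~ /^[0-9]+$/ && $bnd != "-" { print $2, $bnd }
    '
}

# Possible use on an AIX system (not executed here):
#   ps -mo THREAD | bound_processes
```

Applied to the listing above, the filter reports only PID 14596 on processor 1.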

To unbind a process whose PID is 14596, use the following command:

# bindprocessor -u 14596
# ps -mo THREAD
USER   PID  PPID    TID ST  CP PRI SC    WCHAN        F     TT BND COMMAND
root  3292  7130      - A    2  61  1        -   240001  pts/0   - -ksh
   -     -     -  14309 S    2  61  1        -      400      -   - -
root 14596  3292      - A  120 124  1        -   200001  pts/0   - /tmp/cpubound
   -     -     -  15629 R  120 124  1        -        0      -   - -
root 15606  3292      - A  120 124  1        -   200001  pts/0   - /tmp/cpubound
   -     -     -  16895 R  120 124  1        -        0      -   - -
root 16634  3292      - A  120 124  0        -   200001  pts/0   - /tmp/cpubound
   -     -     -  15107 R  120 124  0        -        0      -   - -
root 18052  3292      - A   12  66  1        -   200001  pts/0   - ps -mo THREAD
   -     -     -  17805 R   12  66  1        -        0      -   - -

When the bindprocessor command is used on a process, all of its threads will then be bound to one processor and unbound from their former processor. Unbinding the process will also unbind all its threads. You cannot bind or unbind an individual thread using the bindprocessor command.

However, within a program, you can use the bindprocessor() function call to bind individual threads. If the bindprocessor() function is used in a program to bind threads to processors, the threads remain with these processors and cannot be unbound individually with the bindprocessor command. If the bindprocessor command is then used on that process, all of its threads are bound to one processor and unbound from their respective former processors. Unbinding the whole process also unbinds all of its threads.

A process cannot be bound until it is started; that is, it must exist in order to be bound. When a process does not exist, the following error displays:

# bindprocessor 7359 1
1730-002: Process 7359 does not match an existing process

When a processor does not exist, the following error displays:

# bindprocessor 7358 4
1730-001: Processor 4 is not available

Note
Do not use the bindprocessor command on the kproc wait processes.

Considerations

Binding can be useful for CPU-intensive programs that experience few interrupts. It can sometimes be counterproductive for ordinary programs because it may delay the redispatch of a thread after an I/O until the processor to which the thread is bound becomes available. If the thread has been blocked for the duration of an I/O operation, it is unlikely that much of its processing context remains in the caches of the processor to which it is bound. The thread would probably be better served if it were dispatched to the next available processor.

Binding does not prevent other processes from being dispatched on the processor on which you bound your process. Binding is different from partitioning. Without Workload Manager (WLM), introduced in AIX 4.3.3, it is not possible to dedicate a set of processors to a specific workload and another set of processors to another workload. Therefore, a higher priority process might be dispatched on the processor where you bound your process. In this case, your process will not be dispatched on other processors, and therefore, you will not always increase the performance of the bound process. Better results may be achieved if you increase the priority of the bound process.

If you bind a process on a heavily loaded system, you might decrease its performance because when a processor becomes idle, the process will not be able to run on the idle processor if it is not the processor on which the process is bound.

If the process is multithreaded, binding the process binds all of its threads to the same processor. The process therefore cannot take advantage of multiprocessing, and its performance will not improve.

Note
Use process binding with care, because it disrupts the natural load balancing provided by AIX Version 4, and the overall performance of the system could degrade. If the workload of the machine changes from that which is monitored when making the initial binding, system performance can suffer. If you use the bindprocessor command, take care to monitor the machine regularly because the environment might change, making the bound process adversely affect system performance.

The lockstat Command

The lockstat command is only available in AIX Version 4.

As described earlier in this chapter, using locks and finding the right lock granularity is one of the big challenges in an MP operating system. You need a way to determine whether locks are causing a problem on the system (for example, lock contention). The lockstat command displays lock-contention statistics for operating system locks on SMP systems.

To determine whether the lockstat command is installed and available, run the following command:

# lslpp -lI perfagent.tools

Before you use the lockstat command, you must enable lock instrumentation. As root, create a new boot image by running the bosboot command with the -L option. Assuming that the boot disk is hdisk0, run the following:

# bosboot -a -d /dev/hdisk0 -L

After you run the command, reboot the machine to enable lock instrumentation. At this time, you can use the lockstat command to look at the locking activity.

It is only possible to see which kernel locks are generated by the workload. Application locks cannot be seen directly with the lockstat command. However, they can be seen indirectly. In that case, check the application for bottlenecks, such as:

The lockstat command can be CPU-intensive because lock instrumentation involves overhead, which is why it is not turned on by default. The overhead of enabling lock instrumentation is typically 3-5 percent. Also note that trace buffers fill up much more quickly when this option is in use, because many locks are being used.

AIX Version 4 defines subsystems composed of lock classes in /usr/include/sys/lockname.h. Each time an operating system developer needs to acquire a lock, the developer selects or creates a lock class, which serves to identify the lock.

The lockstat command generates a report for each kernel lock that meets all specified conditions. When no conditions are specified, the default values are used. The following are the parameters that can be used to filter the data collected:

-a
Displays a supplementary list showing the most requested (or active) locks, regardless of the conditions defined by other flags.
-c LockCount
Specifies how many times a lock must be requested during an interval in order to be displayed. A lock request is a lock operation which in some cases cannot be satisfied immediately. All lock requests are counted. The default is 200.
-b BlockRatio
Specifies a block ratio. When a lock request is not satisfied, it is said to be blocked. A lock must have a block ratio that is higher than BlockRatio to appear in the list. The default BlockRatio is 5 percent.
-n CheckCount
Specifies the number of locks that are to be checked. The lockstat command sorts locks according to lock activity. This parameter determines how many of the most active locks will be subject to further checking. Limiting the number of locks that are checked reduces the load that the lockstat command places on the system, particularly if the command is executed in intervals. The default value is 40.
-p LockRate
Specifies a percentage of the activity of the most-requested lock in the kernel. Only locks that are more active than this percentage will be listed. The default value is 2, which means that the only locks listed are those requested at least 2 percent as often as the most active lock.
-t MaxLocks
Specifies the maximum number of locks to be displayed. The default is 10.
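
The filter flags can be combined in a single invocation. The following sketch wraps one such combination in a shell function. The function name lockstat_contended and the threshold values are illustrative assumptions, not recommendations, and the sketch assumes that an interval and count may follow the flags on the command line.

```shell
# Hypothetical wrapper: list at most 5 locks that were requested at least
# 100 times in the interval and whose block ratio exceeds 2 percent.
# Any further arguments (for example, an interval and a count) are
# passed through to lockstat unchanged.
lockstat_contended() {
    lockstat -c 100 -b 2 -t 5 "$@"
}
```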

If the lockstat command is executed with no options, an output similar to the following is displayed:

# lockstat
Subsys  Name                     Ocn   Ref/s   %Ref   %Block  %Sleep
____________________________________________________________________

PFS     IRDWR_LOCK_CLASS        259   75356   37.49    9.44    0.21
PROC    PROC_INT_CLASS            1   12842    6.39   17.75    0.00

The first column is the subsystem (Subsys) to which the lock belongs. Some common subsystems are as follows:

PROC
Scheduler, dispatcher or interrupt handlers
VMM
Pages, segment and free list
TCP
Sockets, NFS
PFS
I-nodes, i-cache

Next, the symbolic name of the lock class is shown. Some common classes are as follows:

TOD_LOCK_CLASS
All interrupts that need the Time-of-Day (TOD) timer
PROC_INT_CLASS
Interrupts for processes
U_TIMER_CLASS
Per-process timer lock
VMM_LOCK_VMKER
Free list
VMM_LOCK_PDT
Paging device table
VMM_LOCK_LV
Per paging space
ICACHE_LOCK_CLASS
I-node cache

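The other columns are as follows:

Ocn
Occurrence number of the lock within its class
Ref/s
Reference rate, that is, the number of lock requests per second
%Ref
Reference rate expressed as a percentage of all lock requests
%Block
Percentage of lock requests that were blocked, that is, could not be satisfied immediately
%Sleep
Percentage of lock requests that caused the requesting thread to sleep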

As a guideline, be concerned if a lock has a reference rate (Ref/s) above 10000. In the example, both classes shown have a very high rate. In this case, you may want to use the vmstat command to investigate further. Refer to The vmstat Command for more information. If the output of the vmstat command shows a significant amount of CPU idle time when the system seems subjectively to be running slowly, the delays might be due to kernel lock contention, because lock requestors go into blocked mode. Lock contention wastes cycles because a thread may be spinning on a busy lock or sleeping until the lock is granted, and the wasted cycles degrade system performance. Improper designs may even lead to deadlocks.
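
The 10000 guideline can be applied to saved lockstat output with a short filter. The following is a minimal sketch that assumes the seven-column report layout shown above; the function name busy_locks is hypothetical.

```shell
# Reads lockstat output on stdin and prints subsystem, lock class, and Ref/s
# for every lock whose Ref/s exceeds the limit (first argument, default 10000).
busy_locks() {
    awk -v limit="${1:-10000}" '
        # Data rows have seven columns with a numeric Ref/s in column 4;
        # the header and separator lines do not match.
        NF == 7 && $4 ~ /^[0-9]+$/ && $4 + 0 > limit + 0 { print $1, $2, $4 }
    '
}

# Possible use on an AIX system (not executed here):
#   lockstat -a | busy_locks
```

A different limit can be passed as the first argument, for example busy_locks 20000.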

The lockstat command output does not indicate exactly which application is causing a problem for the system. Lock-contention problems can be solved only at the source-code level. For example, if your application has a high number of processes that read and write a single message queue, you might have lock contention for the inter-process communication (IPC) subsystem. Adding more message queues may reduce the level of lock contention.

In this example, many instances of a process that opens the same file for read-only access were running simultaneously on the system. In the operating system, every time a file is accessed, its i-node is updated with the last access time. That is the reason for the high reference rate observed for the lock class IRDWR_LOCK_CLASS. Many threads were trying to update the i-node of the same file concurrently.

When the lockstat command is run without options, only the locks with %Block above 5 percent are listed. You can change this behavior by specifying another BlockRatio with the -b option, as follows:

# lockstat -b 1
Subsys  Name                     Ocn   Ref/s   %Ref   %Block  %Sleep
____________________________________________________________________

PFS     IRDWR_LOCK_CLASS        258   95660   60.22   69.15    0.16
PROC    PROC_INT_CLASS            1    5798    3.65    4.73    0.00
PROC    PROC_INT_CLASS            2    2359    1.48    1.02    0.00

In this case, all locks with %Block above 1 percent are shown.

If no lock has a %Block value above the BlockRatio threshold, the output is as follows:

# lockstat
No Contention

However, this might also indicate that the lock instrumentation has not been activated.

The -a option additionally lists the 10 most-requested (or active) locks, as follows:

# lockstat -a
Subsys  Name                     Ocn   Ref/s   %Ref   %Block  %Sleep
____________________________________________________________________

PFS     IRDWR_LOCK_CLASS        259   75356   37.49    9.44    0.21
PROC    PROC_INT_CLASS            1   12842    6.39   17.75    0.00

First 10 largest reference rate locks :

Subsys  Name                     Ocn   Ref/s   %Ref   %Block  %Sleep
____________________________________________________________________

PFS     IRDWR_LOCK_CLASS        259   75356   37.49    9.44    0.21
PROC    PROC_INT_CLASS            1   12842    6.39   17.75    0.00
PROC    TOD_LOCK_CLASS           --    5949    2.96    1.68    0.00
PROC    PROC_INT_CLASS            2    5288    2.63    3.97    0.00
XPSE    PSE_OPEN_LOCK            --    4498    2.24    0.87    0.00
IOS     SELPOLL_LOCK_CLASS       --    4276    2.13    3.20    0.00
XPSE    PSE_SQH_LOCK             95    4223    2.10    0.62    0.00
XPSE    PSE_SQH_LOCK            105    4213    2.10    0.50    0.00
XPSE    PSE_SQH_LOCK             75    3585    1.78    0.31    0.00
XPTY    PTY_LOCK_CLASS            6    3336    1.66    0.00    0.00

The meaning of the fields is the same as in the previous example. The first table is a list of locks with %Block above 5 percent. A list of the top 10 reference-rate locks, sorted in decreasing order, is then provided. The number of locks in the most-requested list can be changed with the -t option, as follows:

# lockstat -a -t 3
Subsys  Name                     Ocn   Ref/s   %Ref   %Block  %Sleep
____________________________________________________________________

PFS     IRDWR_LOCK_CLASS        259   75356   37.49    9.44    0.21
PROC    PROC_INT_CLASS            1   12842    6.39   17.75    0.00

First 3 largest reference rate locks :

Subsys  Name                     Ocn   Ref/s   %Ref   %Block  %Sleep
____________________________________________________________________

PFS     IRDWR_LOCK_CLASS        259   75356   37.49    9.44    0.21
PROC    PROC_INT_CLASS            1   12842    6.39   17.75    0.00
PROC    TOD_LOCK_CLASS           --    5949    2.96    1.68    0.00

In the previous example, the -t option specifies that only the top three reference-rate locks will be shown.

If the output of the lockstat -a command looks similar to the following:

No Contention

First 10 largest reference rate locks :

Subsys  Name                     Ocn   Ref/s   %Ref   %Block  %Sleep
____________________________________________________________________

then an empty most-requested lock list means that the lock instrumentation has not been enabled. It can be enabled by executing the bosboot command as explained at the beginning of this section.

The lockstat command can also be run in intervals, as follows:

# lockstat 10 100

The first number passed in the command line specifies the amount of time (in seconds) between each report. Each report contains statistics collected during the interval since the previous report. If no interval is specified, the system gives information covering an interval of one second and then exits. The second number determines the number of reports generated. The second number can only be specified if an interval is given.

Note
Under excessive lock contention on large SMPs, the lockstat command does not scale well and might not return in the time period specified.

The schedtune -s Command

If a thread wants to acquire a lock when another thread currently owns that lock and is running on another CPU, the thread that wants the lock will spin on the CPU until the owner thread releases the lock. Prior to AIX 4.3.1, this thread would spin indefinitely. In AIX 4.3.1, the thread spins up to a certain value as specified by a tunable parameter called MAXSPIN.

The default value of MAXSPIN was previously 0xFFFFFFFF (4294967295 in decimal) on SMP systems and 1 on UP systems. In AIX 4.3.1, the default value of MAXSPIN is 0x4000 (16384) for SMP systems and remains at 1 on UP systems. If you notice more idle or I/O wait time on a system that had not shown this previously, threads might be going to sleep more often. If this is causing a performance problem, set MAXSPIN to a higher value, or to -1, which means to spin up to 0xFFFFFFFF times.
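
The relationship between these values can be checked with shell arithmetic. The following sketch is illustrative only: the function name maxspin_count is hypothetical, the code models -1 as an unsigned 32-bit value (which is how -1 corresponds to 0xFFFFFFFF), and it assumes a shell whose arithmetic is wider than 32 bits.

```shell
# Prints, in decimal, the spin count that a MAXSPIN setting stands for.
# Accepts decimal or hexadecimal input; -1 maps to 0xFFFFFFFF.
maxspin_count() {
    echo $(( $1 & 0xFFFFFFFF ))
}
```

With these assumptions, maxspin_count 0x4000 prints 16384 and maxspin_count -1 prints 4294967295.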

To change the number of times a thread spins before going to sleep, use the -s option of the schedtune command. To reduce CPU usage that might be caused by excessive spins, reduce the value of MAXSPIN as follows:

# /usr/samples/kernel/schedtune -s 8192

You may observe an increase in context-switching. If context-switching becomes the bottleneck, increase MAXSPIN.

To determine whether the schedtune command is installed and available, run the following command:

# lslpp -lI bos.adt.samples

To change the value, you must be the root user.
