Performance Overview of the AIX CPU Scheduler

AIX Versions 3.2 and 4 Performance Tuning Guide

The addition of thread support to AIX Version 4 has resulted in extensive changes to the CPU scheduler. Conceptually, the scheduling algorithm and priority scheme are similar to those of AIX Version 3.2.5, but the addition of thread support required many detail-level changes. Although the net behavioral change for unchanged applications running on uniprocessors may be small, anyone concerned with performance tuning should understand the changes and the opportunities.

AIX Version 4 Thread Support

A thread can be thought of as a low-overhead process. It is a dispatchable entity that requires fewer resources to create than an AIX process. The fundamental dispatchable entity of the AIX Version 4 scheduler is the thread.

This does not mean that processes have ceased to exist. In fact, workloads migrated directly from earlier releases of AIX will create and manage processes as before. Each new process will be created with a single thread that has its parent process's priority and contends for the CPU with the threads of other processes. The process owns the resources used in execution; the thread owns only its current state.

When new or modified applications take advantage of AIX thread support to create additional threads, those threads are created within the context of the process. They share the process's private segment and other resources.

A user thread within a process has a specified contention scope. If the contention scope is global, the thread contends for CPU time with all other threads in the system. (The thread that is created when a process is created has global contention scope.) If the contention scope is local, the thread contends with the other threads within the process to be the recipient of the process's share of CPU time.

The algorithm for determining which thread should be run next is called a scheduling policy.

Scheduling Policy for Threads with Local or Global Contention Scope

In AIX Version 4, there are three possible values for thread scheduling policy:

FIFO	Once a thread with this policy is scheduled, it runs to completion unless it is blocked, it voluntarily yields control of the CPU, or a higher-priority thread becomes dispatchable. Only fixed-priority threads can have a FIFO scheduling policy.
RR	This is similar to the AIX Version 3 scheduler round-robin scheme based on 10 ms time slices. When an RR thread has control at the end of the time slice, it moves to the tail of the queue of dispatchable threads of its priority. Only fixed-priority threads can have an RR scheduling policy.
OTHER	This policy is defined by POSIX 1003.4a as implementation-defined. In AIX Version 4, this policy is defined to be equivalent to RR, except that it applies to threads with nonfixed priority. The recalculation of the running thread's priority value at each clock interrupt means that a thread may lose control because its priority value has risen above that of another dispatchable thread. This is the AIX Version 3 behavior.
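The difference between FIFO and RR for threads of equal fixed priority can be illustrated with a toy simulation. This is only a sketch, not AIX code: the function name run_same_priority and the (name, ticks_needed) tuples are hypothetical, one time slice is taken to be one clock tick, and blocking, yielding, and preemption by higher-priority threads are deliberately ignored.

```python
from collections import deque


def run_same_priority(threads, policy):
    """Return the name of the thread that runs in each tick.

    threads: list of (name, ticks_needed) pairs, all at the same fixed priority.
    policy:  "FIFO" or "RR".
    """
    queue = deque(threads)
    timeline = []
    while queue:
        name, remaining = queue.popleft()
        if policy == "FIFO":
            # FIFO: the thread keeps the CPU until it has finished.
            timeline.extend([name] * remaining)
        elif policy == "RR":
            # RR: run for one time slice, then go to the tail of the queue.
            timeline.append(name)
            if remaining > 1:
                queue.append((name, remaining - 1))
        else:
            raise ValueError("unknown policy: " + policy)
    return timeline


work = [("A", 3), ("B", 2)]
print(run_same_priority(work, "FIFO"))  # ['A', 'A', 'A', 'B', 'B']
print(run_same_priority(work, "RR"))    # ['A', 'B', 'A', 'B', 'A']
```

Under FIFO, thread B does not run until A has finished; under RR, the two threads alternate at each time-slice boundary.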

Threads are primarily of interest for applications that currently consist of several asynchronous processes. These applications might impose a lighter load on the system if converted to a multithread structure.

Process and Thread Priority

The priority management tools in AIX Version 3.2.5 manipulate process priority. In AIX Version 4, process priority is simply a precursor to thread priority. When fork() is called, a process and a thread to run in it are created. The thread has the priority that would have been attributed to the process in Version 3.2.5. The following general discussion applies to both versions.

The kernel maintains a priority value (sometimes termed the scheduling priority) for each thread. The priority value is a positive integer and varies inversely with the importance of the associated thread. That is, a smaller priority value indicates a more important thread. When the scheduler is looking for a thread to dispatch, it chooses the dispatchable thread with the smallest priority value.

A thread can be fixed-priority or nonfixed-priority. The priority value of a fixed-priority thread is constant, while the priority value of a nonfixed-priority thread is the sum of the minimum priority level for user threads (a constant 40), the thread's nice value (20 by default, optionally set by the nice or renice command), and its CPU-usage penalty. The figure "How the Priority Value is Determined" illustrates some of the ways in which the priority value can change.
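The sum described above can be written out as a short sketch. The constants 40 and 20 come from the text; the function and constant names are illustrative only and do not correspond to any AIX interface.

```python
BASE_USER_PRIORITY = 40  # minimum priority level for user threads (from the text)
DEFAULT_NICE = 20        # default nice value (from the text)


def priority_value(cpu_penalty, nice=DEFAULT_NICE):
    """Priority value of a nonfixed-priority thread: base + nice + CPU penalty."""
    return BASE_USER_PRIORITY + nice + cpu_penalty


print(priority_value(0))            # 60: default nice value, no recent CPU use
print(priority_value(10))           # 70: the same thread after accruing a penalty of 10
print(priority_value(10, nice=25))  # 75: a larger nice value makes the thread less favored
```

Because a smaller priority value means a more important thread, both a larger nice value and a larger CPU penalty make the thread less likely to be dispatched next.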

The nice value of a thread is set when the thread is created and is constant over the life of the thread, unless explicitly changed by the user via the renice command or the setpri, setpriority, or nice system calls.

The CPU penalty is an integer that is calculated from the recent CPU usage of a thread. The recent CPU usage increases by 1 each time the thread is in control of the CPU at the end of a 10 ms clock tick, up to a maximum value of 120. Once per second, the recent CPU usage values for all threads are reduced. The result is that:

The priority of a nonfixed-priority thread decreases (that is, its priority value increases) as its recent CPU usage increases, and vice versa. This implies that, on average, the more time slices a thread has been allocated recently, the less likely it is that the thread will be allocated the next time slice.
The priority of a nonfixed-priority thread decreases (that is, its priority value increases) as its nice value increases, and vice versa.
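The interaction of tick accrual, the once-per-second reduction, and the priority value can be sketched as follows. The cap of 120, the 10 ms tick, and the constants 40 and 20 come from the text. The two divisors are assumptions made for illustration (penalty taken as half of the recent CPU usage, and usage halved at each one-second decay); the actual calculation is described in "Tuning the Process-Priority-Value Calculation with schedtune". All names are hypothetical.

```python
MAX_RECENT_CPU = 120    # cap on recent CPU usage (from the text)
TICKS_PER_SECOND = 100  # one clock tick every 10 ms (from the text)
PENALTY_DIVISOR = 2     # ASSUMPTION: penalty = recent CPU usage / 2
DECAY_DIVISOR = 2       # ASSUMPTION: usage is halved once per second


def tick(recent_cpu):
    """Charge one tick to the thread in control of the CPU, up to the cap."""
    return min(recent_cpu + 1, MAX_RECENT_CPU)


def decay(recent_cpu):
    """Once-per-second reduction applied to every thread."""
    return recent_cpu // DECAY_DIVISOR


def priority_value(recent_cpu, nice=20):
    """Base (40) + nice value + CPU penalty."""
    return 40 + nice + recent_cpu // PENALTY_DIVISOR


def simulate_busy_second(recent_cpu):
    """Recent CPU usage after the thread holds the CPU for 100 ticks."""
    for _ in range(TICKS_PER_SECOND):
        recent_cpu = tick(recent_cpu)
    return recent_cpu


usage = simulate_busy_second(0)
print(usage, priority_value(usage))  # 100 110
usage = decay(usage)
print(usage, priority_value(usage))  # 50 85
usage = simulate_busy_second(usage)
print(usage, priority_value(usage))  # 120 120 (usage capped at 120)
```

Under these assumed factors, a CPU-bound thread with the default nice value drifts from a priority value of 60 toward 120, while a thread that stays off the CPU sees its usage, and therefore its priority value, fall back at each decay.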

The priority of a thread can be fixed at a certain value via the setpri subroutine. The priority value, nice value, and short-term CPU-usage values for a process can be displayed with the ps command.

See "Controlling Contention for the CPU" for a more detailed discussion of the use of the nice and renice commands.

See "Tuning the Process-Priority-Value Calculation with schedtune" for the details of the calculation of the CPU penalty and the decay of the recent CPU usage values.

AIX Scheduler Run Queue

The scheduler maintains a run queue of all of the threads that are ready to be dispatched. The figure labelled "Run Queue" depicts the run queue symbolically.

All the dispatchable threads of a given priority occupy consecutive positions in the run queue.

When a thread is moved to the "end of the run queue" (for example, when the thread has control at the end of a time slice), it is moved to a position after the last thread in the queue that has the same priority value.
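The ordering rules above can be modelled with a sorted list. This is a conceptual sketch only; the RunQueue class and its methods are hypothetical and say nothing about how the AIX kernel actually stores the queue.

```python
import bisect


class RunQueue:
    """Toy run queue ordered by priority value (smallest value first)."""

    def __init__(self):
        self._priorities = []  # priority values, kept sorted
        self._threads = []     # thread names, parallel to _priorities

    def enqueue(self, name, priority_value):
        # bisect_right places the thread after the last entry that has
        # the same priority value, as described in the text.
        pos = bisect.bisect_right(self._priorities, priority_value)
        self._priorities.insert(pos, priority_value)
        self._threads.insert(pos, name)

    def dispatch(self):
        # The dispatchable thread with the smallest priority value runs next.
        self._priorities.pop(0)
        return self._threads.pop(0)

    def snapshot(self):
        return list(zip(self._threads, self._priorities))


rq = RunQueue()
rq.enqueue("A", 60)
rq.enqueue("B", 70)
rq.enqueue("C", 60)
print(rq.snapshot())  # [('A', 60), ('C', 60), ('B', 70)]

running = rq.dispatch()  # 'A' runs for a time slice
rq.enqueue(running, 60)  # then moves to the "end of the run queue"
print(rq.snapshot())     # [('C', 60), ('A', 60), ('B', 70)]
```

Thread A returns behind C, which has the same priority value, but remains ahead of B, whose priority value is larger.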

Scheduler CPU Time Slice

The CPU time slice is the period between recalculations of the priority value. Normally, recalculation is done at each tick of the system clock, that is, every 10 milliseconds. The -t option of the schedtune command can be used to increase the number of clock ticks between recalculations, increasing the length of the time slice in 10-millisecond increments. Keep in mind that the time slice is not a guaranteed amount of processor time. It is the longest time that a thread can be in control before it faces the possibility of being replaced by another thread. There are many ways in which a thread can lose control of the CPU before it has had control for a full time slice.
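The point that a time slice is an upper bound rather than a guarantee can be made concrete with a small sketch. The 10 ms tick comes from the text; the function names and the ticks_per_slice parameter are hypothetical, and the sketch does not model how a particular schedtune -t value maps to a tick count.

```python
TICK_MS = 10  # length of one system clock tick (from the text)


def time_slice_ms(ticks_per_slice=1):
    """Length of the time slice for a given number of ticks per slice."""
    return ticks_per_slice * TICK_MS


def uninterrupted_ms(ticks_per_slice, ms_until_block=None, ms_until_preempt=None):
    """Time a thread runs before it blocks, is preempted, or reaches the
    end of its time slice, whichever comes first."""
    limits = [time_slice_ms(ticks_per_slice)]
    if ms_until_block is not None:
        limits.append(ms_until_block)
    if ms_until_preempt is not None:
        limits.append(ms_until_preempt)
    return min(limits)


print(time_slice_ms())                         # 10
print(time_slice_ms(3))                        # 30
print(uninterrupted_ms(3, ms_until_block=12))  # 12: the thread blocks before the slice ends
```

Lengthening the slice only raises the ceiling; a thread that blocks on I/O or is displaced by a higher-priority thread still gives up the CPU early.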

Related Information

"AIX Resource Management Overview"

"Monitoring and Tuning CPU Use"

The nice command, ps command, renice command.

The setpri subroutine.

The getpriority, setpriority, and nice subroutines.