
Commands Reference, Volume 5

schedo Command

Purpose

Manages CPU scheduler tunable parameters.

Syntax

schedo [ -p | -r ] { -o Tunable[=Newvalue]}

schedo [ -p | -r ] { -d Tunable }

schedo [ -p | -r ] -D

schedo [ -p | -r ] -a

schedo -?

schedo -h Tunable

schedo -L [Tunable ]

Note
Multiple -o, -d, and -L flags are allowed.

Description

Note
The schedo command can only be executed by root.

Use the schedo command to configure scheduler tuning parameters. This command sets or displays current or next boot values for all scheduler tuning parameters. This command can also make permanent changes or defer changes until the next reboot. Whether the command sets or displays a parameter is determined by the accompanying flag. The -o flag performs both actions. It can either display the value of a parameter or set a new value for a parameter.

Priority-Calculation Parameters

The priority of most user processes varies with the amount of CPU time the process has used recently. The CPU scheduler's priority calculations are based on two parameters that are set with schedo, sched_R and sched_D. The sched_R and sched_D values are in thirty-seconds (1/32); that is, the formula used by the scheduler to calculate the amount to be added to a process's priority value as a penalty for recent CPU use is:

CPU penalty = (recently used CPU value of the process) * (r/32)

and the once-per-second recalculation of the recently used CPU value of each process is:

new recently used CPU value = (old recently used CPU value of the process) * (d/32)

Both r (sched_R parameter) and d (sched_D parameter) have default values of 16. This maintains the CPU scheduling behavior of previous versions of the operating system. Before experimenting with these values, you should be familiar with "Tuning the CPU scheduler" in the Performance Management Guide.
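
The two formulas above can be worked through numerically. The following is a minimal sketch, not the kernel's implementation: the function names and the starting value of 64 are hypothetical, and exact fractions are used because the text does not say how the scheduler rounds.

```python
from fractions import Fraction

def cpu_penalty(recent_cpu, r=16):
    """Penalty added to the priority value: recent CPU * (r/32)."""
    return recent_cpu * Fraction(r, 32)

def decay_recent_cpu(recent_cpu, d=16):
    """Once-per-second recalculation: recent CPU * (d/32)."""
    return recent_cpu * Fraction(d, 32)

# Hypothetical process with a recently used CPU value of 64.
recent = 64
print("penalty with default r=16:", cpu_penalty(recent))
print("penalty with r=8:", cpu_penalty(recent, r=8))

# With the default d=16 the recently used CPU value halves every second.
for second in range(1, 4):
    recent = decay_recent_cpu(recent)
    print("after", second, "second(s):", recent)
```

A smaller r shrinks the penalty for the same recent CPU use, and a smaller d makes the recorded CPU use fade more quickly.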

Memory-Load-Control Parameters

The operating system scheduler performs memory load control by suspending processes when memory is over committed. The system does not swap out processes; instead pages are stolen as they are needed to fulfill the current memory requirements. Typically, pages are stolen from suspended processes. Memory is considered over committed when the following condition is met:

p * h > s where:
p is the number of pages written to paging space in the last second
h is an integer specified by the v_repage_hi parameter
s is the number of page steals that have occurred in the last second

A process is suspended when memory is over committed and the following condition is met:

r * p > f where:
r is the number of repages that the process has accumulated in the last second
p is an integer specified by the v_repage_proc parameter
f is the number of page faults that the process has experienced in the last second

In addition, fixed-priority processes and kernel processes are exempt from being suspended.

The term repages refers to the number of pages belonging to the process, which were reclaimed and are soon after referenced again by the process.
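
The two conditions can be expressed as simple predicates. This is a sketch under stated assumptions: the comparison is taken to be "greater than", following the Diagnosis text of the v_repage_hi entry, and the same operator is assumed for the per-process test. The function and argument names are hypothetical.

```python
def memory_overcommitted(pages_out, v_repage_hi, page_steals):
    """System-wide test: p * h > s (operator assumed, see lead-in)."""
    return pages_out * v_repage_hi > page_steals

def should_suspend(overcommitted, repages, v_repage_proc, page_faults,
                   fixed_priority=False, kernel_process=False):
    """Per-process test: r * p > f, applied only while memory is over committed."""
    if not overcommitted:
        return False
    if fixed_priority or kernel_process:
        # Fixed-priority and kernel processes are exempt from suspension.
        return False
    return repages * v_repage_proc > page_faults

# Hypothetical one-second sample: 100 page-outs, 500 page steals.
over = memory_overcommitted(pages_out=100, v_repage_hi=6, page_steals=500)
print(over)
print(should_suspend(over, repages=30, v_repage_proc=4, page_faults=100))
```

With v_repage_hi set to 0 the left-hand side is always 0, so under this reading the system-wide test never fires.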

The user also can specify a minimum multiprogramming level with the v_min_process parameter. Doing so ensures that a minimum number of processes remain active throughout the process-suspension period. Active processes are those that are runnable and waiting for page I/O. Processes that are waiting for events and processes that are suspended are not considered active, nor is the wait process considered active.

Suspended processes can be added back into the mix when the system has stayed below the over committed threshold for n seconds, where n is specified by the v_sec_wait parameter. Processes are added back into the system based, first, on their priority and, second, on the length of their suspension period.

Before experimenting with these values, you should be thoroughly familiar with "Tuning VMM Memory Load Control" in the Performance Management Guide.

Time-Slice-Increment Parameter

The schedo command can also be used to change the amount of time the operating system allows a given process to run before the dispatcher is called to choose another process to run (the time slice). The default value for this interval is a single clock tick (10 milliseconds). The timeslice tuning parameter allows the user to specify the number of clock ticks by which the time slice length is to be increased.

In AIX Version 4, this parameter only applies to threads with the SCHED_RR scheduling policy. See Scheduling Policy for Threads.
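
As a worked example of the arithmetic, the sketch below converts a timeslice setting into milliseconds. It assumes the reading given in the timeslice entry under Tunable Parameters, where the value is the total number of clock ticks in a time slice and the default is 1. The function name is hypothetical.

```python
TICK_MS = 10  # one clock tick, as stated in the text above

def time_slice_ms(timeslice_ticks=1):
    """Length of a time slice in milliseconds (assumes value = total ticks)."""
    if timeslice_ticks < 1:
        raise ValueError("timeslice must be a positive integer")
    return timeslice_ticks * TICK_MS

print(time_slice_ms())    # default setting
print(time_slice_ms(5))   # a hypothetical setting of 5 ticks
```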

fork() Retry Interval Parameter

If a fork() subroutine call fails because there is not enough paging space available to create a new process, the system retries the call after waiting for a specified period of time. That interval is set with the pacefork tuning parameter.
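
To get a feel for the delay involved, the sketch below estimates the longest time a caller might wait. It assumes that the system waits pacefork clock ticks before each retry, that there are five retries (as stated in the pacefork entry under Tunable Parameters), and that a tick is 10 milliseconds. The function name is hypothetical.

```python
TICK_MS = 10        # clock tick length given in the Time-Slice section
FORK_RETRIES = 5    # retry count given in the pacefork tunable entry

def worst_case_fork_wait_ms(pacefork_ticks=10, retries=FORK_RETRIES):
    """Total wait if every retry is preceded by a full pacefork interval."""
    return retries * pacefork_ticks * TICK_MS

print(worst_case_fork_wait_ms())      # default pacefork of 10 ticks
print(worst_case_fork_wait_ms(100))   # a hypothetical larger interval
```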

Attention: Misuse of this command can cause performance degradation or operating-system failure. Be sure that you have studied the appropriate tuning sections in the AIX 5L Version 5.2 Performance Management Guide before using schedo to change system parameters.

Flags

-h Tunable Displays help about the Tunable parameter.
-a Displays the current, reboot (when used in conjunction with -r) or permanent (when used in conjunction with -p) value for all tunable parameters, one per line in pairs Tunable = Value. For the permanent option, a value is only displayed for a parameter if its reboot and current values are equal. Otherwise NONE displays as the value.
-d Tunable Resets Tunable to its default value. If a tunable needs to be changed (that is, it is currently not set to its default value) and is of type Bosboot or Reboot, or is of type Incremental and has been changed from its default value, and -r is not used in combination, it will not be changed but a warning displays.
-D Resets all tunables to their default value. If tunables needing to be changed are of type Bosboot or Reboot, or are of type Incremental and have been changed from their default value, and -r is not used in combination, they will not be changed but a warning displays.
-o Tunable [=Newvalue] Displays the value or sets Tunable to Newvalue. If a tunable needs to be changed (the specified value is different than current value), and is of type Bosboot or Reboot, or if it is of type Incremental and its current value is bigger than the specified value, and -r is not used in combination, it will not be changed but a warning displays.

When -r is used in combination without a new value, the nextboot value for tunable is displayed. When -p is used in combination without a new value, a value displays only if the current and next boot values for tunable are the same. Otherwise NONE displays as the value.

-p Makes changes apply to both current and reboot values, when used in combination with -o, -d or -D, that is, turns on the updating of the /etc/tunables/nextboot file in addition to the updating of the current value. These combinations cannot be used on Reboot and Bosboot type parameters because their current value can't be changed.

When used with -a or -o without specifying a new value, values are displayed only if the current and next boot values for a parameter are the same. Otherwise NONE displays as the value.

-r Makes changes apply to reboot values when used in combination with -o, -d or -D, that is, turns on the updating of the /etc/tunables/nextboot file. If any parameter of type Bosboot is changed, the user will be prompted to run bosboot.

When used with -a or -o without specifying a new value, next boot values for tunables display instead of current values.

-L [ Tunable ] Lists the characteristics of one or all tunables, one per line, using the following format:
Name   Current  Default  Reboot   Minimum  Maximum  Unit  Type  Dependencies
       value    value    value    value    value
----------------------------------------------------------------------------

param1 5        2        4        1        10       MB/s     I  param2
                                                                param3

Parameters can be of the following types:
D for Dynamic if tunable can be changed at any time
S for Static if tunable can never be changed
R for Reboot if tunable can only be changed during reboot sequence.
B for Bosboot if tunable can only be changed by running bosboot and rebooting machine
M for Mount if tunable is only effective for filesystems or directory mountings
I for Incremental if tunable can only be incremented, except at boot time

Note
The current set of parameters managed by schedo only includes Dynamic types.
-? Displays the schedo command usage statement.
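
Scripts that consume the output of the -a flag need to handle the Tunable = Value pairs and the NONE placeholder. The following is a minimal parsing sketch. The sample text is made up to match the description above and is not captured from a real system, so actual output may differ in spacing or content. The function name is hypothetical.

```python
def parse_schedo_pairs(text):
    """Parse 'Tunable = Value' lines into a dict; NONE becomes None."""
    values = {}
    for line in text.splitlines():
        if "=" not in line:
            continue  # skip blank or unrelated lines
        name, _, raw = line.partition("=")
        name, raw = name.strip(), raw.strip()
        if not name:
            continue
        if raw == "NONE":
            values[name] = None
        else:
            try:
                values[name] = int(raw)
            except ValueError:
                values[name] = raw  # keep non-numeric values as text
    return values

# Made-up sample shaped like the documented format.
sample = """sched_D = 16
sched_R = 16
v_sec_wait = NONE
"""
print(parse_schedo_pairs(sample))
```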

Any change (with -o, -d or -D) to a parameter of type Mount results in a message displaying to warn the user that the change is only effective for future mountings.

Any attempt to change (with -o, -d or -D) a parameter of type Bosboot or Reboot without -r, results in an error message.

Any attempt to change (with -o, -d or -D but without -r) the current value of a parameter of type Incremental with a new value smaller than the current value, results in an error message.
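
The type rules above can be summarized as a small decision function. This is an illustrative sketch of the rules as described in this section, not a model of the command's internals. The function name, return shape, and note wording are hypothetical, and pre 5.2 compatibility mode is not modeled.

```python
TYPE_NAMES = {
    "D": "Dynamic", "S": "Static", "R": "Reboot",
    "B": "Bosboot", "M": "Mount", "I": "Incremental",
}

def check_change(tunable_type, current, new, use_r=False):
    """Return (allowed, note) for changing a tunable from current to new."""
    if tunable_type not in TYPE_NAMES:
        raise ValueError("unknown tunable type: %r" % (tunable_type,))
    if tunable_type == "S":
        return False, "Static tunables can never be changed"
    if tunable_type in ("R", "B") and not use_r:
        return False, TYPE_NAMES[tunable_type] + " tunables require -r"
    if tunable_type == "I" and not use_r and new < current:
        return False, "Incremental tunables cannot be decreased without -r"
    if tunable_type == "M":
        return True, "change is only effective for future mountings"
    if tunable_type == "B":
        return True, "run bosboot and reboot for the change to take effect"
    return True, ""

print(check_change("D", 2, 4))
print(check_change("R", 2, 4))
print(check_change("I", 5, 4))
```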

Compatibility Mode

When running in pre 5.2 compatibility mode (controlled by the pre520tune attribute of sys0, see Introduction to AIX 5L Version 5.2 tunable parameter setting in the Performance Management Guide), reboot values for parameters, except those of type Bosboot, are not really meaningful because in this mode they are not applied at boot time.

In pre 5.2 compatibility mode, setting reboot values to tuning parameters continues to be achieved by embedding calls to tuning commands in scripts called during the boot sequence. Parameters of type Reboot can therefore be set without the -r flag, so that existing scripts continue to work.

This mode is automatically turned ON when a machine is MIGRATED to AIX 5L Version 5.2. For complete installations, it is turned OFF and the reboot values for parameters are set by applying the content of the /etc/tunables/nextboot file during the reboot sequence. Only in that mode are the -r and -p flags fully functional. See AIX 5L Version 5.2 kernel tuning in the Performance Tools Guide and Reference for details about the new 5.2 mode.

Tunable Parameters

affinity_lim
Purpose:
Sets the number of intervening dispatches after which the SCHED_FIFO2 policy no longer favors a thread.
Values:
  • Default: 7
  • Range: 0 to 100
  • Type: Dynamic
Diagnosis:
N/A
Tuning:
Once a thread is running with SCHED_FIFO2 policy, tuning of this variable may or may not have an effect on the performance of the thread and workload. Ideal values should be determined by trial and error.
Refer To:
Scheduling Policy for Threads
idle_migration_barrier
Purpose:
Used to determine when threads can be migrated to other processors.
Values:
  • Default: 4
  • Range: 0 to 100
  • Type: Dynamic
Diagnosis:
N/A
Tuning:
This value is divided by 16 and then multiplied by the load average. The resulting value is used to determine if jobs should be migrated to other nodes (essentially does load balancing).
fixed_pri_global
Purpose:
Keep fixed priority threads on global run queue.
Values:
  • Default: 0
  • Range: 0 to 1
  • Type: Dynamic
Diagnosis:
N/A
Tuning:
If 1, then fixed priority threads are placed on the global run queue.
Refer To:
Scheduler Run Queue
maxspin
Purpose:
Sets the number of times to spin on a kernel lock before going to sleep.
Values:
  • Default: 1 on uniprocessor systems, -1 on MP systems, which means to spin up to 2^32 times
  • Range: -1 to 2^32
  • Type: Dynamic
Diagnosis:
N/A
Tuning:
Increasing the value or setting it to -1 on MP systems may reduce idle time; however, it may also waste CPU time in some situations. Increasing it on uniprocessor systems is not recommended.
Refer To:
The schedtune -s Command
pacefork
Purpose:
The number of clock ticks to wait before retrying a fork call that has failed for lack of paging space.
Values:
  • Default: 10
  • Range: a positive number of clock ticks bigger than 10
  • Type: Dynamic
Diagnosis:
System is running out of paging space and process cannot be forked.
Tuning:
The system will retry a failed fork five times. For example, if a fork() subroutine call fails because there is not enough paging space available to create a new process, the system retries the call after waiting the specified number of clock ticks.
Refer To:
Tuning the fork() Retry Interval Parameter with schedtune
sched_D
Purpose:
Sets the short-term CPU usage decay rate.
Values:
  • Default: 16
  • Range: 0 to 32
  • Type: Dynamic
Diagnosis:
N/A
Tuning:
The default is to decay short-term CPU usage by 1/2 (16/32) every second. Decreasing this value enables foreground processes to avoid competition with background processes for a longer time.
Refer To:
Tuning the Thread-Priority-Value Calculation
sched_R
Purpose:
Sets the weighting factor for short-term CPU usage in priority calculations.
Values:
  • Default: 16
  • Range: 0 to 32
  • Type: Dynamic
Diagnosis:
Run: ps al. If you find that the PRI column has priority values for foreground processes (those with NI values of 20) that are higher than the PRI values of some background processes (NI values > 20), you can reduce the r value.
Tuning:
The default is to include 1/2 (16/32) of the short term CPU usage in the priority calculation. Decreasing this value makes it easier for foreground processes to compete.
Refer To:
Tuning the Thread-Priority-Value Calculation
timeslice
Purpose:
The number of clock ticks a thread can run before it is put back on the run queue.
Values:
  • Default: 1
  • Range: a positive integer value
  • Type: Dynamic
Diagnosis:
N/A
Tuning:
Increasing this value can reduce overhead of dispatching threads. The value refers to the total number of clock ticks in a timeslice and only affects fixed-priority processes.
Refer To:
Modifying the Scheduler Time Slice with the schedtune Command
%usDelta
Purpose:
Used to adjust system clock with each clock tick in the correction range -1 to +1 seconds.
Values:
  • Default: 100
  • Range: 0 to 100
  • Type: Dynamic
Diagnosis:
N/A
Tuning:
This is used to adjust clock drifts.
v_exempt_secs
Purpose:
Sets the number of seconds that a recently resumed process that was previously suspended is exempt from suspension.
Values:
  • Default: 2
  • Range: 0 or a positive number
  • Type: Dynamic
Diagnosis:
N/A
Tuning:
This parameter is only examined if thrashing is occurring.
Refer To:
VMM Memory Load Control Facility and Tuning VMM Memory Load Control with the schedtune Command
v_min_process
Purpose:
Sets the minimum number of processes that are exempt from suspension.
Values:
  • Default: 2
  • Range: 0 or a positive number
  • Type: Dynamic
Diagnosis:
N/A
Tuning:
This number is in addition to kernel processes, processes with fixed priority less than 60, processes with pinned memory, or processes awaiting events. This parameter is only examined if there are threads on the suspended queue.
Refer To:
VMM Memory Load Control Facility and Tuning VMM Memory Load Control with the schedtune Command
v_repage_hi
Purpose:
Sets the system wide criteria used to determine when process suspension begins and ends (system is thrashing).
Values:
  • Default: 6 unless system RAM is 128 MB or more (in this case it is 0)
  • Range: 0 or a positive number
  • Type: Dynamic
Diagnosis:
If v_repage_hi * page_outs/sec is > page_steals, then processes may get suspended.
Tuning:
If the system is paging and causing the scheduler to think it is thrashing, but thrashing is not actually occurring, then it may be useful to desensitize the algorithm by decreasing the v_repage_hi value or setting it to 0.
Refer To:
VMM Memory Load Control Facility and Tuning VMM Memory Load Control with the schedtune Command
v_repage_proc
Purpose:
Sets the per-process criterion used to determine which processes to suspend.
Values:
  • Default: 4
  • Range: 0 or a positive number
  • Type: Dynamic
Diagnosis:
N/A
Tuning:
This requires a higher level of repaging by a given process before it is a candidate for suspension by memory load control. This parameter is examined only if thrashing is occurring.
Refer To:
VMM Memory Load Control Facility and Tuning VMM Memory Load Control with the schedtune Command
v_sec_wait
Purpose:
Sets the number of seconds to wait after thrashing ends before making suspended processes runnable.
Values:
  • Default: 1
  • Range: 0 or a positive number
  • Type: Dynamic
Diagnosis:
N/A
Tuning:
This parameter is examined only if thrashing is occurring.
Refer To:
VMM Memory Load Control Facility and Tuning VMM Memory Load Control with the schedtune Command

Examples

  1. To list the current and reboot value, range, unit, type and dependencies of all tunable parameters managed by the schedo command, type:
    schedo -L  
  2. To reset v_sec_wait to default, type:
    schedo -d v_sec_wait 
  3. To display help on sched_R, type:
    schedo -h sched_R
  4. To set v_min_process to 4 after the next reboot, type:
    schedo -r -o v_min_process=4 
  5. To permanently reset all schedo tunable parameters to default, type:
    schedo -p -D 
  6. To list the reboot value for all schedo parameters, type:
    schedo -r -a
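
When these invocations are generated from a script, it can help to build the argument list in one place. The sketch below only assembles the arguments and does not run anything. The helper name and its action and scope keywords are hypothetical, while the flags themselves are the ones documented in the Syntax and Flags sections. On an AIX system the resulting list could be passed to a process launcher by a root user.

```python
SCOPE_FLAGS = {None: [], "reboot": ["-r"], "permanent": ["-p"]}

def schedo_args(action, tunable=None, value=None, scope=None):
    """Build an argument list for schedo from a few keyword choices."""
    if scope not in SCOPE_FLAGS:
        raise ValueError("scope must be None, 'reboot' or 'permanent'")
    args = ["schedo"] + SCOPE_FLAGS[scope]
    if action == "set":
        if tunable is None or value is None:
            raise ValueError("'set' needs a tunable and a value")
        args += ["-o", "%s=%s" % (tunable, value)]
    elif action == "show":
        if tunable is None:
            raise ValueError("'show' needs a tunable")
        args += ["-o", tunable]
    elif action == "reset":
        if tunable is None:
            raise ValueError("'reset' needs a tunable")
        args += ["-d", tunable]
    elif action == "reset_all":
        args += ["-D"]
    elif action == "show_all":
        args += ["-a"]
    else:
        raise ValueError("unknown action: %r" % (action,))
    return args

# Rebuild some of the numbered examples above.
print(" ".join(schedo_args("reset", "v_sec_wait")))
print(" ".join(schedo_args("set", "v_min_process", 4, scope="reboot")))
print(" ".join(schedo_args("reset_all", scope="permanent")))
```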

Related Information

The vmo command, ioo command, no command, nfso command, tunsave command, tunrestore command, tuncheck command, and tundefault command.

Kernel Tuning in AIX 5L Version 5.2 Performance Tools Guide and Reference.
