Commands Reference, Volume 5
schedo Command
Purpose
Manages CPU scheduler tunable parameters.
Syntax
schedo [ -p | -r ] { -o Tunable[=Newvalue]}
schedo [ -p | -r ] { -d Tunable }
schedo [ -p | -r ] -D
schedo [ -p | -r ] -a
schedo -?
schedo -h Tunable
schedo -L [Tunable ]
Note
Multiple -o, -d, and -L flags are allowed.
Description
Note
The schedo command can only be executed
by root.
Use the schedo command to configure scheduler tuning
parameters. This command sets or displays current or next boot values for
all scheduler tuning parameters. This command can also make permanent changes
or defer changes until the next reboot. Whether the command sets or displays
a parameter is determined by the accompanying flag. The -o flag performs both actions. It can either display the value of a parameter
or set a new value for a parameter.
Priority-Calculation Parameters
The priority of most user processes varies with the amount of CPU time
the process has used recently. The CPU scheduler's priority calculations are
based on two parameters that are set with schedo, sched_R and sched_D. The sched_R and sched_D values are in thirty-seconds
(1/32); that is, the formula used by the scheduler to calculate the amount
to be added to a process's priority value as a penalty for recent CPU use
is:
CPU penalty = (recently used CPU value of the process) * (r/32)
and the once-per-second recalculation of the recently used CPU value of
each process is:
new recently used CPU value = (old recently used CPU value of the process) * (d/32)
Both r (sched_R parameter) and d (sched_D parameter) have default values of 16. This maintains the CPU
scheduling behavior of previous versions of the operating system. Before experimenting
with these values, you should be familiar with "Tuning the CPU scheduler"
in the Performance Management Guide.
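As a rough illustration of the two formulas above, the following Python sketch computes the penalty and the once-per-second decay for given r and d values. The function names and the sample CPU-usage value of 64 are assumptions made for this example; they are not part of schedo, and the kernel's actual arithmetic (for example, its rounding) may differ.

```python
def cpu_penalty(recent_cpu, r=16):
    """Penalty added to the priority value: recent_cpu * (r/32)."""
    return recent_cpu * r / 32


def decay_recent_cpu(recent_cpu, d=16):
    """Once-per-second recalculation: recent_cpu * (d/32)."""
    return recent_cpu * d / 32


if __name__ == "__main__":
    usage = 64  # hypothetical recently used CPU value
    for second in range(4):
        # Show how the penalty shrinks as the usage value decays.
        print(second, usage, cpu_penalty(usage))
        usage = decay_recent_cpu(usage)
```

With the defaults, each second halves the recently used CPU value, and half of that value is charged as the penalty. A smaller r reduces the penalty for the same usage; a smaller d makes past usage fade faster.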
Memory-Load-Control Parameters
The operating system scheduler performs memory load control by suspending
processes when memory is over committed. The system does not swap out processes;
instead pages are stolen as they are needed to fulfill
the current memory requirements. Typically, pages are stolen from suspended
processes. Memory is considered over committed when the following condition
is met:
p * h > s
where:
p is the number of pages written to paging space in the last second
h is an integer specified by the v_repage_hi parameter
s is the number of page steals that have occurred in the last second
A process is suspended when memory is over committed and the following
condition is met:
r * p > f
where:
r is the number of repages that the process has accumulated in the last second
p is an integer specified by the v_repage_proc parameter
f is the number of page faults that the process has experienced in the last second
In addition, fixed-priority processes and kernel processes are exempt from
being suspended.
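The two tests can be sketched in Python as follows. This is a simplified model, not the kernel's implementation: it assumes that both comparisons are "greater than" (the v_repage_hi diagnosis text later in this page suggests this for the system-wide test), and the function and argument names are invented for the example. The default arguments are the documented defaults for v_repage_hi and v_repage_proc.

```python
def memory_overcommitted(pages_out, page_steals, v_repage_hi=6):
    """System-wide test: p * h > s (assumed comparison)."""
    return pages_out * v_repage_hi > page_steals


def should_suspend(repages, page_faults, v_repage_proc=4,
                   overcommitted=True, exempt=False):
    """Per-process test: r * p > f (assumed comparison).

    Applied only while memory is over committed, and never to exempt
    (fixed-priority or kernel) processes.
    """
    if exempt or not overcommitted:
        return False
    return repages * v_repage_proc > page_faults
```

Under this model, setting v_repage_hi to 0 makes the system-wide test always false, which matches the advice to set the parameter to 0 to desensitize the algorithm.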
The term repages refers to the number of pages belonging to the process,
which were reclaimed and are soon after referenced again by the process.
The user also can specify a minimum multiprogramming level with the v_min_process
parameter. Doing so ensures that a minimum number of processes remain active
throughout the process-suspension period. Active processes are those that
are runnable and waiting for page I/O. Processes that are waiting for events
and processes that are suspended are not considered active, nor is the wait
process considered active.
Suspended processes can be added back into the mix when the system has
stayed below the over committed threshold for n seconds, where n is specified
by the v_sec_wait parameter. Processes are added back into the system based,
first, on their priority and, second, on the length of their suspension period.
Before experimenting with these values, you should be thoroughly familiar
with "Tuning VMM Memory Load Control" in the Performance Management Guide.
Time-Slice-Increment Parameter
The schedo command can also be used to change
the amount of time the operating system allows a given process to run before
the dispatcher is called to choose another process to run (the time slice).
The default value for this interval is a single clock tick (10 milliseconds).
The timeslice tuning parameter allows the user to specify the number of clock
ticks by which the time slice length is to be increased.
In AIX Version 4, this parameter only applies to threads with the SCHED_RR
scheduling policy. See Scheduling Policy for Threads.
fork() Retry Interval Parameter
If a fork() subroutine call fails because there is
not enough paging space available to create a new process, the system retries
the call after waiting for a specified period of time. That interval is set
with the pacefork tuning parameter.
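To get a feel for the delay this implies, the following Python sketch converts a pacefork value into milliseconds. It assumes the 10-millisecond clock tick mentioned in the time-slice section and the five retries stated in the pacefork entry under Tunable Parameters; the function name is hypothetical.

```python
TICK_MS = 10  # one clock tick in milliseconds (assumed from the time-slice section)


def fork_retry_wait_ms(pacefork=10, retries=5):
    """Return (wait per retry, total wait over all retries) in milliseconds."""
    per_retry = pacefork * TICK_MS
    return per_retry, per_retry * retries
```

With the default of 10 clock ticks, this gives 100 milliseconds per retry and at most 500 milliseconds of waiting before the fork() call finally fails.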
Attention: Misuse of this command can cause performance
degradation or operating-system failure. Be sure that you have studied the
appropriate tuning sections in the AIX 5L Version 5.2 Performance Management Guide before using schedo to change system parameters.
Flags
-h Tunable |
Displays help about the Tunable parameter. |
-a |
Displays the current, reboot (when used in conjunction
with -r) or permanent (when used in conjunction with -p) value for all tunable parameters, one per line in pairs Tunable = Value. For the permanent
option, a value is only displayed for a parameter if its reboot and current
values are equal. Otherwise NONE displays as the value. |
-d Tunable |
Resets Tunable to its default
value. If a tunable needs to be changed (that is, it is currently not set
to its default value) and is of type Bosboot or Reboot, or is of type Incremental
and has been changed from its default value, and -r is not used in combination,
it is not changed but a warning is displayed. |
-D |
Resets all tunables to their default value. If tunables
needing to be changed are of type Bosboot or Reboot, or are of type Incremental
and have been changed from their default value, and -r is
not used in combination, they will not be changed but a warning displays. |
-o Tunable [=Newvalue] |
Displays the value or sets Tunable to Newvalue. If a tunable needs to be changed
(the specified value is different than current value), and is of type Bosboot
or Reboot, or if it is of type Incremental and its current value is bigger
than the specified value, and -r is not used in combination,
it will not be changed but a warning displays.
When -r is used in combination without a new value, the nextboot value for
tunable is displayed. When -p is used in combination
without a new value, a value displays only if the current and next boot values
for tunable are the same. Otherwise NONE displays as the value. |
-p |
Makes changes apply to both current and reboot values,
when used in combination with -o, -d or -D, that is, turns on the updating of the /etc/tunables/nextboot file in addition to the updating
of the current value. These combinations cannot be used on Reboot and Bosboot
type parameters because their current value can't be changed.
When used
with -a or -o without specifying
a new value, values are displayed only if the current and next boot values
for a parameter are the same. Otherwise NONE displays as the value. |
-r |
Makes changes apply to reboot values when used in combination
with -o, -d or -D, that is, turns on the updating of the /etc/tunables/nextboot file. If any parameter of type Bosboot is changed, the user will be
prompted to run bosboot.
When used with -a or -o without specifying a new value, next boot values for
tunables display instead of current values. |
-L [ Tunable ] |
Lists the characteristics of one or all tunables, one
per line, using the following format:
Name      Current  Default  Reboot  Minimum  Maximum  Unit  Type  Dependencies
          value    value    value   value    value
------------------------------------------------------------------------------
param1    5        2        4       1        10       MB/s  I     param2
                                                                  param3
Parameters can be of the following types:
D for Dynamic if tunable can be changed at any time
S for Static if tunable can never be changed
R for Reboot if tunable can only be changed during reboot sequence.
B for Bosboot if tunable can only be changed by running bosboot and rebooting machine
M for Mount if tunable is only effective for filesystems or directory mountings
I for Incremental if tunable can only be incremented, except at boot time
Note
The current set of parameters managed by schedo only includes Dynamic types. |
-? |
Displays the schedo command usage
statement. |
Any change (with -o, -d or -D) to a parameter of type Mount displays a message
warning the user that the change is only effective for future mountings.
Any attempt to change (with -o, -d or -D) a parameter of type Bosboot or Reboot
without -r results in an error message.
Any attempt to change (with -o, -d or -D but without -r)
the current value of a parameter of type Incremental with a new value smaller
than the current value results in an error message.
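The rules above can be summarized as a small decision function. The Python sketch below is a simplified model for illustration only: the function name and its arguments are hypothetical, it treats Static tunables as never changeable (per the type list under -L), and it ignores the -p flag and pre 5.2 compatibility mode.

```python
def change_is_applied(tunable_type, current, new, use_r=False):
    """Return True if a requested change would be accepted in this model."""
    if tunable_type == "Static":
        return False  # Static tunables can never be changed
    if use_r:
        return True   # with -r only the next boot value is updated
    if tunable_type in ("Bosboot", "Reboot"):
        return False  # the current value cannot be changed without -r
    if tunable_type == "Incremental" and new < current:
        return False  # Incremental tunables can only grow at run time
    return True
```

For example, lowering an Incremental tunable from 5 to 3 is refused without -r, while raising it from 3 to 5 is accepted.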
Compatibility Mode
When running in pre 5.2 compatibility mode (controlled by the pre520tune attribute of sys0, see Introduction
to AIX 5L Version 5.2 tunable parameter setting in the Performance Management
Guide), reboot values for parameters, except those of type Bosboot, are not
really meaningful because in this mode they are not applied at boot time.
In pre 5.2 compatibility mode, setting reboot values to tuning parameters
continues to be achieved by embedding calls to tuning commands in scripts
called during the boot sequence. Parameters of type Reboot can therefore be set without the -r flag, so that
existing scripts continue to work.
This mode is automatically turned ON when a machine is MIGRATED to AIX
5L Version 5.2. For complete installations, it is turned OFF and the reboot
values for parameters are set by applying the content of the /etc/tunables/nextboot file during the reboot sequence. Only in that
mode are the -r and -p flags fully
functional. See AIX 5L Version 5.2 kernel tuning in the Performance Tools
Guide and Reference for details about the new 5.2 mode.
Tunable Parameters
affinity_lim |
- Purpose:
- Sets the number of intervening dispatches after which the SCHED_FIFO2
policy no longer favors a thread.
- Values:
-
- Default: 7
- Range: 0 to 100
- Type: Dynamic
- Diagnosis:
- N/A
- Tuning:
- Once a thread is running with SCHED_FIFO2 policy, tuning of this variable
may or may not have an effect on the performance of the thread and workload.
Ideal values should be determined by trial and error.
- Refer To:
- Scheduling Policy for Threads
|
idle_migration_barrier |
- Purpose:
- Used to determine when threads can be migrated to other processors.
- Values:
-
- Default: 4
- Range: 0 to 100
- Type: Dynamic
- Diagnosis:
- N/A
- Tuning:
- This value is divided by 16 and then multiplied by the load average.
The resulting value is used to determine if jobs should be migrated to other
nodes (essentially does load balancing).
|
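The arithmetic described under Tuning for idle_migration_barrier can be sketched in Python as follows. The function name is hypothetical, and how the scheduler compares the resulting value against per-processor load is not described here, so the sketch stops at the computed value.

```python
def migration_threshold(load_average, idle_migration_barrier=4):
    """idle_migration_barrier divided by 16, then multiplied by the load average."""
    return idle_migration_barrier / 16 * load_average
```

With the default of 4, the result is one quarter of the load average; for a load average of 8.0 the value is 2.0.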
fixed_pri_global |
- Purpose:
- Keep fixed priority threads on global run queue.
- Values:
-
- Default: 0
- Range: 0 to 1
- Type: Dynamic
- Diagnosis:
- N/A
- Tuning:
- If 1, then fixed priority threads are placed on the global run queue.
- Refer To:
- Scheduler Run Queue
|
maxspin |
- Purpose:
- Sets the number of times to spin on a kernel lock before going to sleep.
- Values:
-
- Default: 1 on uniprocessor systems, -1 on MP systems, which means to spin
up to 2^32 times
- Range: -1 to 2^32
- Type: Dynamic
- Diagnosis:
- N/A
- Tuning:
- Increasing the value or setting it to -1 on MP systems may reduce idle
time; however, it may also waste CPU time in some situations. Increasing it
on uniprocessor systems is not recommended.
- Refer To:
- The schedtune -s Command
|
pacefork |
- Purpose:
- The number of clock ticks to wait before retrying a fork call
that has failed for lack of paging space.
- Values:
-
- Default: 10
- Range: a number of clock ticks greater than or equal to 10
- Type: Dynamic
- Diagnosis:
- System is running out of paging space and process cannot be forked.
- Tuning:
- The system will retry a failed fork five times. For example, if a fork() subroutine call fails because there is not enough
paging space available to create a new process, the system retries the call
after waiting the specified number of clock ticks.
- Refer To:
- Tuning the fork() Retry Interval Parameter
with schedtune
|
sched_D |
- Purpose:
- Sets the short-term CPU usage decay rate.
- Values:
-
- Default: 16
- Range: 0 to 32
- Type: Dynamic
- Diagnosis:
- N/A
- Tuning:
- The default is to decay short-term CPU usage by 1/2 (16/32) every second.
Decreasing this value enables foreground processes to avoid competition with
background processes for a longer time.
- Refer To:
- Tuning the Thread-Priority-Value Calculation
|
sched_R |
- Purpose:
- Sets the weighting factor for short-term CPU usage in priority calculations.
- Values:
-
- Default: 16
- Range: 0 to 32
- Type: Dynamic
- Diagnosis:
- Run: ps al. If you find that the PRI column has
priority values for foreground processes (those with NI values of 20) that
are higher than the PRI values of some background processes (NI values > 20),
you can reduce the r value.
- Tuning:
- The default is to include 1/2 (16/32) of the short term CPU usage in
the priority calculation. Decreasing this value makes it easier for foreground
processes to compete.
- Refer To:
- Tuning the Thread-Priority-Value Calculation
|
timeslice |
- Purpose:
- The number of clock ticks a thread can run before it is put back on
the run queue.
- Values:
-
- Default: 1
- Range: a positive integer value
- Type: Dynamic
- Diagnosis:
- N/A
- Tuning:
- Increasing this value can reduce overhead of dispatching threads. The
value refers to the total number of clock ticks in a timeslice and only affects
fixed-priority processes.
- Refer To:
- Modifying the Scheduler Time Slice with the
schedtune Command
|
%usDelta |
- Purpose:
- Used to adjust system clock with each clock tick in the correction range
-1 to +1 seconds.
- Values:
-
- Default: 100
- Range: 0 to 100
- Type: Dynamic
- Diagnosis:
- N/A
- Tuning:
- This is used to adjust clock drifts.
|
v_exempt_secs |
- Purpose:
- Sets the number of seconds that a recently resumed process that was
previously suspended is exempt from suspension.
- Values:
-
- Default: 2
- Range: 0 or a positive number
- Type: Dynamic
- Diagnosis:
- N/A
- Tuning:
- This parameter is only examined if thrashing is occurring.
- Refer To:
- VMM Memory Load Control Facility and Tuning VMM Memory Load Control with the schedtune Command
|
v_min_process |
- Purpose:
- Sets the minimum number of processes that are exempt from suspension.
- Values:
-
- Default: 2
- Range: 0 or a positive number
- Type: Dynamic
- Diagnosis:
- N/A
- Tuning:
- This number is in addition to kernel processes, processes with fixed
priority less than 60, processes with pinned memory, or processes awaiting
events. This parameter is only examined if there are threads on the suspended
queue.
- Refer To:
- VMM Memory Load Control Facility and Tuning VMM Memory Load Control with the schedtune Command
|
v_repage_hi |
- Purpose:
- Sets the system wide criteria used to determine when process suspension
begins and ends (system is thrashing).
- Values:
-
- Default: 6 unless system RAM is 128 MB or more (in this case it is 0)
- Range: 0 or a positive number
- Type: Dynamic
- Diagnosis:
- If v_repage_hi * page_outs/sec is > page_steals, then processes may
get suspended.
- Tuning:
- If the system is paging and causing the scheduler to think it is thrashing but
thrashing is not actually occurring, then it may be useful to desensitize
the algorithm by decreasing the v_repage_hi value or setting
it to 0.
- Refer To:
- VMM Memory Load Control Facility and Tuning VMM Memory Load Control with the schedtune Command
|
v_repage_proc |
- Purpose:
- Sets the per-process criterion used to determine which processes to
suspend.
- Values:
-
- Default: 4
- Range: 0 or a positive number
- Type: Dynamic
- Diagnosis:
- N/A
- Tuning:
- This requires a higher level of repaging by a given process before it
is a candidate for suspension by memory load control. This parameter is examined
only if thrashing is occurring.
- Refer To:
- VMM Memory Load Control Facility and Tuning VMM Memory Load Control with the schedtune Command
|
v_sec_wait |
- Purpose:
- Sets the number of seconds to wait after thrashing ends before making
suspended processes runnable.
- Values:
-
- Default: 1
- Range: 0 or a positive number
- Type: Dynamic
- Diagnosis:
- N/A
- Tuning:
- This parameter is examined only if thrashing is occurring.
- Refer To:
- VMM Memory Load Control Facility and Tuning VMM Memory Load Control with the schedtune Command
|
Examples
- To list the current and reboot value, range, unit, type and dependencies
of all tunable parameters managed by the schedo command, type:
schedo -L
- To reset v_sec_wait to default, type:
schedo -d v_sec_wait
- To display help on sched_R, type:
schedo -h sched_R
- To set v_min_process to 4 after the next reboot, type:
schedo -r -o v_min_process=4
- To permanently reset all schedo tunable parameters to default, type:
schedo -p -D
- To list the reboot value for all schedo parameters, type:
schedo -r -a
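Output from the -a flag is described above as one "Tunable = Value" pair per line, which makes it easy to post-process. The following Python sketch parses such text into a dictionary. The function name and the sample text are assumptions for illustration (the sample is not captured from a real system), and NONE is mapped to None to reflect the case where the -p flag finds differing current and reboot values.

```python
def parse_tunables(text):
    """Parse lines of the form 'Tunable = Value' into a dict of strings."""
    values = {}
    for line in text.splitlines():
        name, sep, value = line.partition("=")
        if not sep:
            continue  # skip lines that are not Tunable = Value pairs
        value = value.strip()
        values[name.strip()] = None if value == "NONE" else value
    return values


if __name__ == "__main__":
    # Hypothetical sample in the documented format.
    sample = "sched_D = 16\nsched_R = 16\nv_min_process = NONE\n"
    print(parse_tunables(sample))
```

Values are kept as strings so that the caller decides how to interpret each tunable.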
Related Information
The vmo command, ioo command, no command, nfso command, tunsave command, tunrestore command, tuncheck command, and tundefault command.
Kernel Tuning in AIX 5L Version 5.2 Performance Tools Guide and Reference.