IBM Run-time Options for Parallel Processing (C Only)

Run-time options affecting parallel processing are specified in the XLSMPOPTS environment variable. This environment variable, which must be set before you run an application, uses the following syntax:

XLSMPOPTS=option_and_args[:option_and_args][ ... ]

Parallelization run-time options can also be specified using OMP environment variables. When OMP and XLSMPOPTS run-time options conflict, OMP options will prevail.
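
As a minimal sketch of the syntax above, assuming a POSIX-style shell (sh, ksh, or bash): the option values and the program name in the comment are illustrative assumptions, not recommendations.

```shell
# Combine several run-time options in one colon-separated string.
# The values are arbitrary examples.
XLSMPOPTS="schedule=static:parthds=4"
export XLSMPOPTS

# OMP_NUM_THREADS is the standard OpenMP thread-count variable. If it is
# also set and conflicts with parthds, the rule stated above says the OMP
# setting prevails. Treating these two as the conflicting pair is an
# assumption made for this example.
OMP_NUM_THREADS=2
export OMP_NUM_THREADS

# Start the application only after the variables are set, for example:
# ./myprog    (hypothetical program name)
```

Under these settings, a thread count of 2 rather than 4 would be expected, because the OMP value takes precedence.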

Run-time options fall into different categories as described below.

Scheduling Algorithm Options
schedule=algorithm=[n]
This option specifies the scheduling algorithm used for loops that are not explicitly assigned a scheduling algorithm with the ibm schedule pragma.

Valid options for algorithm are:

  • guided
  • affinity
  • dynamic
  • static

If specified, the value of n must be an integer value of 1 or greater.

The default scheduling algorithm is static.

See #pragma ibm schedule Preprocessor Directive for a description of these algorithms.
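
For illustration, a schedule setting might look like the following sketch. It assumes a POSIX-style shell, and the value 4 for n is an arbitrary example.

```shell
# Select the dynamic algorithm with n = 4 for loops that carry no
# schedule pragma. When n is given it must be an integer of 1 or greater.
XLSMPOPTS="schedule=dynamic=4"
export XLSMPOPTS
```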

Parallel Environment Options
parthds=num
num represents the number of parallel threads requested, which is usually equivalent to the number of processors available on the system.

Some applications cannot use more threads than the maximum number of processors available. Other applications can experience significant performance improvements if they use more threads than there are processors. This option gives you full control over the number of user threads used to run your program.

The default value for num is the number of processors available on the system.

usrthds=num
num represents the number of user threads expected.

This option should be used if the program code explicitly creates threads, in which case num should be set to the number of threads created.

The default value for num is 0.

stack=num
num specifies the largest amount of space required for a thread's stack.

The default value for num is 32768.
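
The three parallel environment options can be combined in one setting. The sketch below assumes a POSIX-style shell and a hypothetical program that itself creates two threads. All numeric values are illustrative, and the stack value is written in the same unit as the default, which this section does not name.

```shell
# Request 8 parallel threads, declare the 2 threads the program creates
# explicitly, and double the default per-thread stack value (32768).
XLSMPOPTS="parthds=8:usrthds=2:stack=65536"
export XLSMPOPTS
```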

Performance Tuning Options
spins=num
num represents the number of loop spins before a yield occurs.

When a thread completes its work, the thread continues executing in a tight loop looking for new work. One complete scan of the work queue is done during each busy-wait state. An extended busy-wait state can make a particular application highly responsive, but can also harm the overall responsiveness of the system unless the thread is given instructions to periodically scan for and yield to requests from other applications.

A complete busy-wait state for benchmarking purposes can be forced by setting both spins and yields to 0.

The default value for num is 100.

yields=num
num represents the number of yields before a sleep occurs.

When a thread sleeps, it completely suspends execution until another thread signals that there is work to do. This provides better system utilization, but also adds extra system overhead for the application.

The default value for num is 100.

delays=num
num represents a period of do-nothing delay time between each scan of the work queue. Each unit of delay is achieved by running a single no-memory-access delay loop.

The default value for num is 500.
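
A sketch of two tuning settings, assuming a POSIX-style shell. DEFAULT_TUNING is a hypothetical helper variable introduced only for this example; the run-time library does not read it.

```shell
# Spell out the documented defaults explicitly: 100 spins before a yield,
# 100 yields before a sleep, and 500 delay units between work-queue scans.
DEFAULT_TUNING="spins=100:yields=100:delays=500"

# Force a complete busy-wait state for a benchmark run by setting both
# spins and yields to 0, as described above.
XLSMPOPTS="spins=0:yields=0"
export XLSMPOPTS

# To return to the default behavior afterwards:
# XLSMPOPTS="$DEFAULT_TUNING"; export XLSMPOPTS
```

The busy-wait setting suits dedicated benchmark machines. On a shared system it can reduce responsiveness for other applications.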

Dynamic Profiling Options
profilefreq=num
num represents the sampling rate at which each loop is revisited to determine appropriateness for parallel processing.

The run-time library uses dynamic profiling to tune the performance of automatically parallelized loops. Dynamic profiling gathers information about loop running times to determine whether the loop should be run sequentially or in parallel the next time through. Threshold running times are set by the parthreshold and seqthreshold dynamic profiling options, described below.

If num is 0, all profiling is turned off and the overhead associated with profiling is avoided. If num is greater than 0, the running time of the loop is monitored once every num times through the loop.

The default for num is 16. The maximum sampling rate is 32. Higher values of num are changed to 32.

parthreshold=mSec
mSec specifies the expected running time in milliseconds below which a loop must be run sequentially. mSec can be specified using decimal places.

If parthreshold is set to 0, a parallelized loop will never be serialized by the dynamic profiler.

The default value for mSec is 0.2 milliseconds.

seqthreshold=mSec
mSec specifies the expected running time in milliseconds beyond which a loop that has been serialized by the dynamic profiler must revert to being run in parallel mode again. mSec can be specified using decimal places.

The default value for mSec is 5 milliseconds.
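
The dynamic profiling options can be set together, as in the sketch below. It assumes a POSIX-style shell. PROFILING_OFF is a hypothetical helper variable used only in this example, and the threshold values are illustrative.

```shell
# Sample each loop once every 8 times through it, serialize loops expected
# to run for less than 0.5 ms, and return a serialized loop to parallel
# mode once its running time exceeds 10 ms.
XLSMPOPTS="profilefreq=8:parthreshold=0.5:seqthreshold=10"
export XLSMPOPTS

# Alternative: turn off all profiling and its overhead.
PROFILING_OFF="profilefreq=0"
# XLSMPOPTS="$PROFILING_OFF"; export XLSMPOPTS
```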

Note: You must use thread-safe compiler mode invocations when compiling parallelized program code.


Program Parallelization
Shared and Private Variables in a Parallel Environment
Countable Loops
Compiler Modes


Invoke the Compiler


#pragma Preprocessor Directives for Parallel Processing
Built-in Functions Used for Parallel Processing
OpenMP Run-time Options for Parallel Processing
smp Compiler Option