AIX Versions 3.2 and 4 Performance Tuning Guide

Tuning Asynchronous Connections for High-Speed Transfers

Async ports allow optional devices such as terminals, printers, fax machines, and modems to be connected to a computer. Async ports are provided by adapter devices such as the 8-, 16-, or 64-port IBM adapters or the 128-port Digiboard adapter, which reside on the Micro Channel and provide multiple asynchronous connections, typically RS232 or RS422. Many adapters, such as the three IBM async adapters mentioned above, were originally designed for servicing terminals and printers, and so are optimized for output (sends). Input processing (receives) is not as well optimized, perhaps because it was once assumed that users could not type very fast. This is not a great concern when data transmission is slow and irregular, as with keyboard input. It becomes a problem with raw-mode applications, where large blocks of input are transmitted by other computers and by devices such as fax machines.

This section discusses the performance of the various adapters when receiving and sending raw-mode file transfers. While some adapters have inherent limitations, we provide some guidelines and methods that can squeeze out better performance from those adapters for raw-mode transfers.

Measurement Objectives and Configurations

Our measurements had two objectives: to evaluate throughput, effective baud rate, and CPU utilization at various baud rates for the adapters and to determine the maximum number of ports that could be supported by each device at each baud rate.

Note: Our throughput measurements were made using raw-mode file-transfer workloads and are mainly useful for estimating performance of raw-mode devices, like fax machines and modems. These measurements do not apply to commercial multiuser configurations, which may incur significant CPU overhead for database accesses or screen control and are often gated by disk-I/O limitations.

In raw-mode processing, data is treated as a continuous stream; input bytes are not assembled into lines, and erase and kill processing are disabled. A minimum data-block size and a read timer are used to determine how the operating system processes the bytes received before passing them to the application.
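
As a minimal sketch of the two controls just described, the function below only prints a stty command that would select noncanonical input with a minimum block size and a read timer. It assumes a POSIX-style stty that accepts the min and time operands, which may not hold on every system; the function name raw_stty_cmd and its arguments are hypothetical and do not come from this guide.

```shell
# Print (do not run) a stty command that selects noncanonical input with
# a minimum block size and a read timer.
# Assumption: the target stty accepts the POSIX "min" and "time" operands.
# raw_stty_cmd and its arguments are hypothetical names.
raw_stty_cmd() {
    port=$1     # device path, for example /dev/tty1
    vmin=$2     # minimum number of characters per read (0-255)
    vtime=$3    # read timer, in tenths of a second
    echo "stty -icanon -isig -echo -opost min $vmin time $vtime < $port"
}

# Show the command for a 255-character block and a 0.5-second timer.
raw_stty_cmd /dev/tty1 255 5
```

Printing the command rather than running it lets the settings be reviewed before they are applied to a live port.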

Measurements were performed on the native, 8-, 16-, and 64-port adapters at 2400-, 9600-, 19,200- and 38,400-baud line speeds. (Because RS/6000 native async ports, the 8-port adapter, and the 16-port adapter are all serviced by the same device driver and have similar performance, they are referred to as one, the 8/16-port adapter.) The 128-port adapter was measured only at 9600, 19,200, and 38,400 baud.

All ports tested were configured and optimized as fast ports for raw-mode transfers (see the fastport.s shell script). A 128,000-character file was written on each TTY line by the driver, an RS/6000 Model 530, and simultaneously read by the system under test, another 530. Each 530 was configured with 32MB of RAM and a single 857MB disk drive.

The AIX performance-monitoring command, iostat (or sar, in the case of the 128-port adapter), was run in the background at a predetermined frequency to monitor system performance. Measurements were taken upon reaching steady state for a fixed interval of time in the flat portion of the throughput curve. In each test, the system load was gradually increased by adding active ports up to some maximum or until 100% CPU utilization was reached.

Three metrics that best describe peak performance characteristics--aggregate character throughput per second (char/sec) averaged over the measured interval, effective per-line baud rate, and CPU utilization--were measured for half-duplex receive and half-duplex send.

XON/XOFF pacing (async handshaking, no relation to AIX disk-I/O pacing), RTS/CTS pacing, and no pacing were tested. Pacing and handshaking refer to hardware or software mechanisms used in data communication to turn off transmission when the receiving device is unable to store the data it is receiving. We found that XON/XOFF pacing was appropriate for the 8/16-port adapters when receiving and for the 128-port adapter both sending and receiving. RTS/CTS was better for the 64-port adapter when receiving. No pacing was better for the 8/16- and 64-port adapters when sending.

Character throughput is the aggregate number of characters transmitted per second across all the lines. Line speeds (or baud rates) of 2400, 9600, 19,200, and 38,400, which are set through the software, are the optimum speed settings for transfer of data over TTY lines. While the baud rate is the peak line speed, measured in bits/second, the effective baud rate is always less, and is calculated as 10 times the character throughput divided by the number of lines. (The factor of 10 is used because it takes 10 bits to transfer one 8-bit character: typically a start bit, eight data bits, and a stop bit.)
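
The effective-baud-rate formula above can be sketched as a small shell function. The function name eff_baud is hypothetical; the sample inputs are the 64-port send figures at 9600 baud and the 8/16-port receive figures at 38,400 baud from the results table in this section. Integer arithmetic is used, so any fractional part is truncated.

```shell
# Effective per-line baud rate = 10 * aggregate char/sec / number of lines.
# eff_baud is a hypothetical helper name, not part of AIX.
eff_baud() {
    chars_per_sec=$1    # aggregate character throughput across all lines
    lines=$2            # number of active lines
    echo $(( chars_per_sec * 10 / lines ))
}

# 64-port adapter, send, 9600 baud: 53200 char/sec over 56 ports
eff_baud 53200 56

# 8/16-port adapter, receive, 38,400 baud: 10550 char/sec over 4 ports
eff_baud 10550 4
```

The two results, 9500 and 26375, correspond to the 9.5 and 26.4 Kb/sec entries in the table.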


The following table summarizes our results. "Max ports:" is the number of ports that can be supported by the adapter when the effective baud rate most closely approaches the line speed.

Line Speed     8/16-port:      64-port:        128-port:
-----------   --------------- --------------- ----------------
              Send    Receive Send    Receive Send     Receive
2400 baud
 Max ports:    32      16      64      64      N/A      N/A
 Char/sec:     7700    3800    15200   14720
 Eff. Kb/sec:  2.4     2.4     2.3     2.3
 CPU util. %:  5       32      9       76
9600 baud
 Max ports:    32      12      56      20      128      128
 Char/sec:     30700   11500   53200   19200   122200   122700
 Eff. Kb/sec:  9.6     9.6     9.5     9.6     9.6      9.6
 CPU util. %:  17      96      25      99      21       27
19,200 baud
 Max ports:    32      6       32      10      128      128
 Char/sec:     48900   11090   51200   18000   245400   245900
 Eff. Kb/sec:  15.3    18.5    16      18      19.2     19.2
 CPU util. %:  35      93      23      92      39       39
38,400 baud
 Max ports:    32      4       24      7       75       75
 Char/sec:     78400   10550   50400   15750   255200   255600
 Eff. Kb/sec:  24.5    26.4    21      22.5    34       34
 CPU util. %:  68      98      23      81      40       37

The 8/16 Async Port Adapter

8/16 Half-Duplex Send

The 8/16 half-duplex send measurements were made with no pacing, allowing the unimpeded outbound transmission of data. For the 8/16-port adapter, the RS/6000 processes approximately 1400 char/sec per 1% CPU utilization. The peak throughput of a single 16-port adapter is 48,000 char/sec.

8/16 Half-Duplex Receive

In this configuration, using XON/XOFF pacing, the RS/6000 processes about 120 char/sec per 1% CPU. The peak bandwidth is 11,000 char/sec at 100% CPU utilization for the 16-port async adapter.
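
The char/sec-per-1%-CPU figures quoted for the 8/16-port adapter can be approximated from the results table by dividing throughput by CPU utilization. The sketch below does this with rounded integer arithmetic; the function name chars_per_cpu is hypothetical, and the choice of table rows (receive at 9600 baud, send at 19,200 baud) is an assumption about which measurements best match the quoted figures.

```shell
# Characters processed per 1% of CPU, rounded to the nearest integer.
# chars_per_cpu is a hypothetical helper name.
chars_per_cpu() {
    chars_per_sec=$1    # aggregate char/sec from the results table
    cpu_pct=$2          # CPU utilization in percent (must be nonzero)
    echo $(( (chars_per_sec + cpu_pct / 2) / cpu_pct ))
}

# 8/16-port receive at 9600 baud: 11500 char/sec at 96% CPU
chars_per_cpu 11500 96

# 8/16-port send at 19,200 baud: 48900 char/sec at 35% CPU
chars_per_cpu 48900 35
```

The results, 120 and 1397, are in line with the "about 120" and "approximately 1400" figures given for receive and send.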

The 64-Port Async Adapter

The limiting device in 64-port async adapter systems is typically the 16-port concentrator box, of which there can be up to four. Concentrator saturation is a concern because as the concentrator box approaches overload, no additional throughput is accepted. The effective baud rate is lowered, and there is a noticeable slowdown in work. For the following measurements, four 16-port concentrators were connected to the 64 RS232 ports.

64 Half-Duplex Receive

The 64-port half-duplex receive measurements used RTS/CTS hardware pacing. In this configuration, the RS/6000 processes about 195 char/sec per 1% CPU. The peak bandwidth is 19,500 char/sec at 100% CPU utilization.

For half-duplex receive, a single 16-port concentrator box saturates at 8450 char/sec with 44% CPU. Once the concentrator is saturated, no additional throughput is possible until another concentrator is added. At 38,400 baud, the single-concentrator saturation point is four active ports with an effective rate of 22.5 Kbaud. At 19,200 baud the saturation point is five ports with an effective baud rate of 17 Kbaud. At 9600 baud saturation is at nine ports with an effective baud rate of 9.6 Kbaud. At 2400 baud the system supports all 64 ports with an effective baud rate of 2.3 Kbaud with no saturation point. Peak throughput is 14,800 chars/sec.

64 Half-Duplex Send

The 64-port half-duplex send measurements were made with no pacing, allowing the unimpeded outbound transmission of data with no flow-control restrictions. For the 64-port adapter, the RS/6000 processes approximately 2200 char/sec per 1% CPU utilization. The peak throughput of the 64-port adapter using all four concentrators is 54,500 char/sec.

A single concentrator box saturates at 13,300 char/sec with 6% CPU. At 38,400 baud it supports six ports with an effective baud rate of approximately 22 Kbaud. At 19,200 baud it supports eight ports with an effective baud rate of approximately 16.3 Kbaud.

The 128-Port Async Adapter

Up to seven 128-port Digiboard async adapters can be connected to a given RS/6000, for a total of 896 ports.

There are two synchronous-data-link-control (SDLC) links per adapter, with a combined capacity of 2.4 Mbaud. (The 64-port adapter has a four-channel SDLC with a combined capacity of 768 Kbaud.)

Other 128-port features that favorably affect data transmission speed and reduce CPU utilization are:

No concentrator saturation occurs in the 128-port async adapters, giving this adapter the advantage over the 64-port async-adapter systems.

For the measurements, eight 16-port concentrator boxes were connected to the 128 RS232 ports.

128 Half-Duplex Receive

Using XON/XOFF software pacing, this configuration processes about 6908 char/sec per 1% CPU. The peak throughput is 255,600 char/sec at 37% CPU utilization.

128 Half-Duplex Send

With no pacing the maximum rate at which this configuration can send data to a TTY device is approximately 5800 char/sec per 1% CPU utilization. The peak throughput of the 128-port adapter is 255,200 char/sec.

Async Port Tuning Techniques

The test configurations in this study used a number of hardware and software flow-control mechanisms and additional software techniques to optimize character-transmission rates. The guidelines below discuss these techniques. (A shell script containing appropriate stty commands to implement most of the techniques is given at the end of the section.)

fastport for Fast File Transfers

The fastport.s script is intended to condition a TTY port for fast file transfers in raw mode; for example, when a fax machine is to be connected. Using the script may improve CPU performance by a factor of 3 at 38,400 baud. fastport.s is not intended for the canonical processing that is used when interacting with a user at an async terminal, because canonical processing cannot be easily buffered. The bandwidth of the canonical read is too small for the fast-port settings to make a perceptible difference.

Any TTY port can be configured as a fast port. The improved performance is the result of reducing the number of interrupts to the CPU during the read cycle on a given TTY line.

  1. Create a TTY for the port using SMIT (Devices -> TTY -> Add a TTY), with Enable LOGIN=disable and BAUD rate=38,400.
  2. Create the Korn shell script named fastport.s, as follows:
    #            Configures a fastport for "raw" async I/O.
    set -x
    # the TTY number is the first command-line argument
    i=$1
    if [ -n "$i" ] && [ "$i" -le 100 ]
    then
    # for the native async ports and the 8-, 16-, and 64-port adapters
    # set vmin=255 and vtime=0.5 secs with the following stty
     stty -g </dev/tty$i |awk ' BEGIN { FS=":";OFS=":" } 
      { $5="ff";$6=5;print $0 } ' >foo
    # for a 128-port adapter, remove the preceding stty, then
    # uncomment and use the 
    # following stty instead to
    # set vmin=255 and vtime=0 to offload line discipline processing
    # stty -g </dev/tty$i |awk ' BEGIN { FS=":";OFS=":" } 
    #  { $5="ff";$6=0;print $0 } ' >foo
    # apply the modified settings to the port
     stty `cat foo ` </dev/tty$i
     sleep 2
    # set raw mode with minimal input and output processing
     stty -opost -icanon -isig -icrnl -echo -onlcr</dev/tty$i
     rm foo
    else
     echo "Usage is fastport.s < TTY number >"
    fi
  3. Invoke the script for TTY number with the command:
    fastport.s number
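
The awk step in fastport.s rewrites the fifth and sixth colon-separated fields of the saved terminal settings. The sketch below shows that field rewrite in isolation. The helper name set_min_time and the sample input string are made up for illustration; real stty -g output is system-specific, and the mapping of fields 5 and 6 to vmin and vtime is taken from the comments in the script rather than verified here.

```shell
# Rewrite fields 5 and 6 of a colon-separated settings string, as the
# awk step in fastport.s does. set_min_time is a hypothetical helper name.
set_min_time() {
    # $1 = new value for field 5 (vmin, written in hex)
    # $2 = new value for field 6 (vtime, in tenths of a second)
    awk -v vmin="$1" -v vtime="$2" 'BEGIN { FS=":"; OFS=":" }
        { $5 = vmin; $6 = vtime; print $0 }'
}

# Made-up sample input; it is NOT real "stty -g" output.
echo "0:0:0:0:4:0:1c" | set_min_time ff 5

# ff is hexadecimal for 255, the vmin value named in the script comments.
echo $(( 0xff ))
```

Running the rewrite on a copy of the settings string first makes it possible to confirm that only the two intended fields change before the result is handed back to stty.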
