
System Management Guide: Communications and Networks

TTY Troubleshooting

This section discusses troubleshooting the tty subsystem.

Respawning Too Rapidly Errors

The system counts the getty processes spawned for a particular tty within a short time period. If more than five getty processes are spawned in that period, the Respawning Too Rapidly error is displayed on the console and the system disables the port.

The tty stays disabled for about 19 minutes or until the system administrator enables the port again. At the end of the 19 minutes, the system automatically enables the port, resulting in the spawning of a new getty process.

Possible Causes

Each of these possible causes is explained in Procedures for Recovery.

Procedures for Recovery

Error Log Information and TTY Log Identifiers

The following sections discuss important error logging files and commands and common error report messages relating to ttys.

Important Error Logging Files and Commands

Command: errclear

This command deletes entries from the error log. Use errclear 0 to erase the entire log, or remove only the entries with specified error ID numbers, classes, or types.

Command: errpt

This command generates an error report from entries in the system error log. The most commonly used form of this command is errpt -a | pg, which generates a detailed report starting with the most recent errors.

File: /var/adm/ras/errlog

This file stores instances of errors and failures encountered by the system. The errlog file tends to grow large and, if not cleared on a regular basis, can occupy a significant amount of disk space. Use the errclear command mentioned previously to clean out this file.

File: /usr/include/sys/errids.h

The errids.h header file correlates error IDs with error labels.
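The lookup that errids.h supports can be sketched as a small shell function. This is a minimal sketch, not part of AIX: errid_label is a hypothetical helper name, and it assumes that the header defines each ID on a line of the form #define ERRID_<LABEL> 0x<hex id>.

```shell
# Sketch: map an error ID to its label using errids.h-style "#define" lines.
# Assumption: each relevant line looks like
#   #define ERRID_<LABEL> 0x<hex id>
# errid_label is a hypothetical helper, not an AIX command.
errid_label() {
    # $1    = error ID in hex, with or without a leading 0x, any case
    # stdin = contents of the header file
    want=$(printf '%s' "$1" | tr 'A-F' 'a-f' | sed 's/^0x//')
    awk -v want="$want" '
        $1 == "#define" && $2 ~ /^ERRID_/ {
            id = tolower($3)
            sub(/^0x/, "", id)
            if (id == want) {
                sub(/^ERRID_/, "", $2)   # strip the prefix, keep the label
                print $2
                found = 1
                exit
            }
        }
        END { exit (found ? 0 : 1) }     # non-zero status when no match
    '
}
```

Under those assumptions, a call such as errid_label ID < /usr/include/sys/errids.h, where ID is an identifier taken from an errpt report, prints the matching label or returns a non-zero status if the ID is not defined.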

Common Error Report Messages

Message: Core Dump
Description: Software program abnormally terminated
Comments: This error is logged when a software program abnormally ends and causes a core dump. Users might not be exiting applications correctly, the system might have been shut down while users were working in an application, or the user's terminal might have locked up and the application stopped.

Message: Errlog On
Description: Errdaemon turned on
Comments: This error is logged by the error daemon when the error logging is started. The system automatically turns off error logging during shutdown.

Message: Lion Box Died
Description: Lost communication with 64-port concentrator
Comments: This error is logged by the 64-port concentrator driver if communications with the concentrator are lost. If you receive this error, check the date and time stamp to see if a user might have caused this message to occur. A series of these errors can indicate a problem with the 64-port adapter or its associated hardware.

Message: Lion Buffero
Description: Buffer overrun: 64-port concentrator
Comments: This error occurs when the hardware buffer in a 64-port concentrator is overrun. If the device and cabling allow, try adding request to send (RTS) handshaking to the port and device. Also try lowering the baud rate.

Message: Lion Chunknumc
Description: Bad chunk count: 64-port controller
Comments: This error occurs when the value for the number of characters in a chunk does not match the actual values in the buffer. This error may indicate a problem with the hardware; try running diagnostics on the devices.

Message: Lion Hrdwre
Description: Cannot access memory on 64-port controller
Comments: This error is logged by the 64-port concentrator driver if it is unable to access memory on the 64-port controller.

Message: Lion Mem ADAP
Description: Cannot allocate memory: ADAP structure
Comments: This error is logged by the 64-port concentrator driver if the malloc routine for the adap structure fails.

Message: Lion Mem List
Description: Cannot allocate memory: TTYP_T List
Comments: This error is logged by the 64-port concentrator driver if the malloc routine for the ttyp_t list structure fails.

Message: Lion Pin ADAP
Description: Cannot pin memory: ADAP structure
Comments: This error is logged by the 64-port concentrator driver if the pin routine for the adap structure fails.

Message: SRC
Description: Software program error
Comments: This error is logged by the System Resource Controller (SRC) daemon in the event of some abnormal condition. Abnormal conditions are divided into three areas: failing subsystems, communication failures, and other failures.

Message: Lion Unkchunk
Description: Unknown error code from the 64-port concentrator
Comments: Error Code: Number of characters in the chunk received.

Message: TTY Badinput
Description: Bad cable or connection
Comments: The port is generating input faster than the system can consume it, and some of that input is being discarded. Usually, the bad input is caused by one or more RS-232 signals changing their status rapidly and repeatedly in a short period of time, causing your system to spend a lot of time in the interrupt handler. The signal errors are usually caused by a loose or broken connector; a bad, ungrounded, or unshielded cable; or by a "noisy" communications link.

Message: TTY Overrun
Description: Receiver overrun on input
Comments: Most TTY ports have a 16-character input FIFO, and the default setting specifies that an interrupt is posted after 14 characters have been received. This error is reported when the driver interrupt handler cleared the input FIFO and data has been lost. Potential solutions depend on the hardware you are using:
  • 8-port and 128-port adapters

    Verify that flow control is configured correctly. If it is, run diagnostics, and replace the hardware as appropriate.

  • Native ports

    If the problem happens on an idle system, move the workload to a different port. If that corrects the problem, upgrade the system firmware.

  • General solutions
    • Reduce the "RECEIVE trigger level" parameter for this port from 3 to either 2 or 1.
    • Reduce the line speed on this port.
    • Examine other devices and processes to try to reduce the time that the system is spending with interrupts disabled.

Message: TTY TTYHOG
Description: TTYHOG overrun
Comments: This error is usually caused by a mismatch in the flow control method being used between the transmitter and receiver. The TTY driver has made several attempts to ask the transmitter to pause, but the input has not stopped, causing the data to be discarded. Check the flow control methods configured on each end to make sure that the same method is being used on each.

Message: TTY Parerr
Description: Parity/Framing error on input
Comments: This error indicates parity errors on incoming data to asynchronous ports on a character-by-character basis. This is usually caused by a mismatch in line control parameters (parity, line speed, character size, or number of stop bits) between the transmitter and receiver. Line control parameters have to be set the same on both sides in order to communicate.

Message: TTY Prog PTR
Description: Driver internal error
Comments: This error is logged by the tty driver if the t_hptr pointer is null.
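
Several of the comments above depend on whether an error is isolated or part of a series. The following is a minimal sketch for counting repeated entries. It assumes the errpt summary layout (the report produced without the -a flag), with the identifier in the first column and the resource name in the fifth. count_by_resource is a hypothetical helper name, not an AIX command.

```shell
# Sketch: count errpt summary entries per identifier and resource name.
# Assumption: input is the errpt summary report, whose columns are
#   IDENTIFIER TIMESTAMP T C RESOURCE_NAME DESCRIPTION
# count_by_resource is a hypothetical helper name.
count_by_resource() {
    awk '
        NR == 1 && $1 == "IDENTIFIER" { next }   # skip the header line
        NF >= 5 { n[$1 " " $5]++ }               # key: identifier + resource
        END { for (k in n) print n[k], k }
    ' | sort -k1,1nr -k2                         # most frequent first
}
```

Under those assumptions, errpt | count_by_resource prints one line per identifier and resource pair, with the count first, so a port that logs the same error repeatedly appears at the top.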

Clear a Hung tty Port

In this example, assume that the hung tty port is tty0. You must have root authority to be able to complete this procedure.

  1. Determine whether the tty is currently handling any processes by typing the following:
    ps -lt tty0

    This should return results similar to the following:

         F S UID   PID  PPID   C PRI NI ADDR    SZ    WCHAN    TTY  TIME CMD
    240001 S 202 22566  3608   0  60 20 781a   444 70201e44   tty0  0:00 ksh

    The Process ID (PID) here is 22566. To kill this process, type the following:

    kill 22566
    Ensure that the process was successfully cleared by typing the command ps -lt tty0 again. If the process still exists, add the -9 flag to the kill command:
    kill -9 22566
    Note
    Do not use the -9 flag to kill an slattach process. Killing an slattach process with the -9 flag might cause a slip lock to remain in the /etc/locks directory. If this happens, delete the lock file to clean up after slattach.
  2. Determine if any process is attempting to use the tty by typing the following:
    ps -ef | grep tty0
    Note
    If the ps -ef | grep tty returns something similar to the following:
    root 19050      1    0    Mar 06       -  0:00 /usr/sbin/getty /dev/tty
    where the "-" is displayed between the date (Mar 06) and the time (0:00), this tty does not have the correct cable. This status indicates that the system login process (getty) is attempting to open this tty, and the open process is hanging because the RS-232 signal Data Carrier Detect (DCD) is not asserted. You can fix this by using the correct null modem adapter in the cabling. When getty can open the tty port, the "-" is replaced by the tty number. For more information on cables, see Attach the Modem with Appropriate Cables.
    Note
    The following command can be used to disable the login process on tty0.
    pdisable tty0
    If the process has been successfully cleared but the tty is still unresponsive, continue to the next step.
  3. Type the following command:
    fuser -k /dev/tty0
    This command kills any process found running on the port and displays its PID. If the tty is still unusable, continue to the next step.
  4. Use the strreset command to flush outgoing data from a port that is hung because data cannot be delivered after the connection to the remote end was lost.
    Note
    If the strreset command fixes the hung port, the port has a cable or configuration problem because the loss of the connection to the remote end should have caused buffered data to be flushed automatically.

    You need to first determine the major and minor device numbers for the tty by typing the following:

    ls -al /dev/tty0

    Your results should look similar to the following:

    crw-rw-rw-    1 root     system    18,   0 Nov  7  06:19 /dev/tty0

    This indicates that tty0 has a major device number of 18 and a minor device number of 0. Specify these numbers when using the strreset command as follows:

    /usr/sbin/strreset -M 18 -m 0
    If the tty is still unusable, continue to the next step.
  5. Detach and reattach the cable from the hung tty port. AIX uses the Data Carrier Detect (DCD) signal to determine the presence of a device attached to the port. Detaching and reattaching the cable drops DCD, which in many cases clears hung processes.

    To determine the location of the port on which the tty is configured, type the following command:

    lsdev -Cl tty0

    The results should look similar to the following:

    tty0    Available   00-00-S1-00  Asynchronous Terminal

    The third column in the above output indicates the location code of the tty. In this example, S1 indicates that the tty is configured on native serial port 1. For more information on interpreting location codes, see the Location Codes document in AIX 5L Version 5.2 System Management Concepts: Operating System and Devices.

    If the tty is still unusable, continue to the next step.

  6. Flush the port using stty-cxma. Type the following:
    /usr/lbin/tty/stty-cxma flush tty0

    This command is intended for the ttys configured on ports of the 8-port and 128-port adapters. In some cases, however, it can be used successfully to flush other tty ports.

    If the tty is still unusable, continue to the next step.

  7. On the keyboard of the hung terminal, hold down the Ctrl key and press Q. This will resume any suspended output by sending an Xon character.

    If the tty is still unusable, continue to the next step.

  8. A program will sometimes open a tty port, modify some attributes, and close the port without resetting the attributes to their original states. To correct this, bring the tty down to a DEFINED state and then make it available by typing the following:
    rmdev -l tty0
    This command leaves the information concerning the tty in the database but makes the tty unavailable on the system.

    The following command reactivates the tty:

    mkdev -l tty0
    If the tty is still unusable, consider moving the device to another port and configuring a tty at that location until the system can be rebooted. If rebooting does not clear the port, you most likely have a hardware problem. Check the error report for port hardware problems by entering the following:
    errpt -a | pg
    Note
    Some of the preceding commands might fail with a method error indicating that the device is busy. This happens because a process is still running on the tty. If none of the preceding steps frees the hung tty, reboot the AIX system as a last resort, which flushes the kernel so that the process goes away.
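
The values that the preceding steps read off the screen can also be extracted with small filters. This is a minimal sketch, assuming output in the same layout as the samples shown in steps 1, 2, and 4. All four function names are hypothetical helpers, not AIX commands, and strreset_cmd only prints the command line rather than running it.

```shell
# Sketch: pull the values the procedure needs out of command output.
# All function names are hypothetical helpers, not AIX commands.

# PIDs from "ps -lt <tty>" output (stdin): find the PID column by its header.
tty_pids() {
    awk '
        NR == 1 { for (i = 1; i <= NF; i++) if ($i == "PID") col = i; next }
        col     { print $col }
    '
}

# getty processes from "ps -ef" output (stdin) whose TTY column is "-",
# that is, getty is still waiting to open the port (see the note in step 2).
# The TTY column is taken to be two fields before the getty command.
getty_waiting() {
    awk '{
        for (i = 3; i <= NF; i++)
            if ($i ~ /(^|\/)getty$/ && $(i - 2) == "-") {
                print $2, $(i + 1)     # PID and the device getty is opening
                break
            }
    }'
}

# Major and minor device numbers from "ls -al /dev/<tty>" output (stdin).
# Assumes the character-device layout shown in step 4: "... 18,   0 Nov ...".
tty_major_minor() {
    awk '$1 ~ /^c/ {
        n = split($5, a, ",")
        if (n == 2 && a[2] != "") print a[1], a[2]   # "18,0" with no space
        else                      print a[1], $6     # "18," then "0"
    }'
}

# Build (but do not run) the strreset command line from "major minor" on stdin.
strreset_cmd() {
    read -r major minor
    echo "/usr/sbin/strreset -M $major -m $minor"
}
```

For example, ps -lt tty0 | tty_pids lists the PIDs to pass to kill, and ls -al /dev/tty0 | tty_major_minor | strreset_cmd prints the strreset command line for review before it is run as root.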
